Keynote Lectures

Robust Fitting in Computer Vision
Jiri Matas, Czech Technical University in Prague, Faculty of Electrical Engineering, Czech Republic

Neural Diffusion PDEs, Differential Geometry, and Graph Neural Networks
Michael Bronstein, Imperial College London, United Kingdom

Semantic Information Pursuit
René Vidal, The Johns Hopkins University, United States

Challenges and Opportunities in Autonomous Navigation
Jana Kosecka, George Mason University, United States

 

Robust Fitting in Computer Vision

Jiri Matas
Czech Technical University in Prague, Faculty of Electrical Engineering
Czech Republic
 

Brief Bio
Jiri Matas is a full professor at the Center for Machine Perception, Czech Technical University in Prague. He holds a PhD degree from the University of Surrey, UK (1995). He has published more than 200 papers in refereed journals and conferences. His publications have approximately 34000 citations registered in Google Scholar and 13000 in the Web of Science; his h-index is 65 (Google Scholar) and 43 (Clarivate Analytics Web of Science), respectively. He received best paper prizes at, among others, the British Machine Vision Conference in 2002 and 2005, the Asian Conference on Computer Vision in 2007, and the International Conference on Document Analysis and Recognition in 2015. J. Matas has served in various roles at major international computer vision conferences (e.g. ICCV, CVPR, ICPR, NIPS, ECCV), co-chairing ECCV 2004, ECCV 2016 and CVPR 2007. He is on the editorial board of IJCV and was the Associate Editor-in-Chief of IEEE TPAMI. He served on the computer science panel of the ERC. His research interests include visual tracking, object recognition, image matching and retrieval, sequential pattern recognition, and RANSAC-type optimization methods.


Abstract
In many computer vision applications, the following problem arises: a collection of observations is interpreted as multiple noisy instances of different classes of objects. The observations can be 2D keypoints in an RGB image, 3D points in lidar scans, 2D-2D point correspondences in stereo matching, trajectories in tracking, etc. The classes of objects can be 2D or 3D geometric entities such as lines and planes, or transformations and relations between images.
We will review recent techniques, mainly based on the RANSAC approach, for the estimation of single and multiple instances of such objects, and demonstrate their performance on a range of practical problems.
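As a concrete illustration of the RANSAC family of estimators mentioned above, here is a minimal sketch of vanilla RANSAC for fitting a single 2D line to points contaminated by outliers. The function name, iteration count and inlier threshold are illustrative choices, not the specific variants covered in the talk.

```python
import math
import random

def ransac_line(points, n_iters=500, inlier_thresh=0.05, seed=0):
    """Vanilla RANSAC for 2D line fitting: repeatedly fit a line to a
    minimal sample (2 points) and keep the model with the most inliers."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        # Line through the sample in implicit form a*x + b*y + c = 0.
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        if norm == 0:
            continue  # degenerate sample: coincident points
        c = -(a * x1 + b * y1)
        # Inliers are points within inlier_thresh of the line.
        inliers = [p for p in points
                   if abs(a * p[0] + b * p[1] + c) / norm < inlier_thresh]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a / norm, b / norm, c / norm), inliers
    return best_line, best_inliers

# Example: 20 collinear points plus two gross outliers; RANSAC recovers
# the dominant line and separates the outliers.
pts = [(t / 10.0, t / 10.0) for t in range(20)] + [(5.0, -3.0), (-2.0, 7.0)]
line, inliers = ransac_line(pts)
```

Multi-instance fitting, as discussed in the talk, extends this idea by repeatedly extracting dominant models or by optimizing over several model instances jointly.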




Neural Diffusion PDEs, Differential Geometry, and Graph Neural Networks

Michael Bronstein
Imperial College London
United Kingdom
 

Brief Bio
Michael Bronstein is the DeepMind Professor of AI at the University of Oxford and Head of Graph Learning Research at Twitter. He was previously a professor at Imperial College London and held visiting appointments at Stanford, MIT, and Harvard, and has also been affiliated with three Institutes for Advanced Study (at TUM as a Rudolf Diesel Fellow (2017-2019), at Harvard as a Radcliffe fellow (2017-2018), and at Princeton as a short-time scholar (2020)). Michael received his PhD from the Technion in 2007. He is the recipient of the Royal Society Wolfson Research Merit Award, Royal Academy of Engineering Silver Medal, five ERC grants, two Google Faculty Research Awards, and two Amazon AWS ML Research Awards. He is a Member of the Academia Europaea, Fellow of IEEE, IAPR, BCS, and ELLIS, ACM Distinguished Speaker, and World Economic Forum Young Scientist. In addition to his academic career, Michael is a serial entrepreneur and founder of multiple startup companies, including Novafora, Invision (acquired by Intel in 2012), Videocites, and Fabula AI (acquired by Twitter in 2019).


Abstract
In this talk, I will make connections between Graph Neural Networks (GNNs) and non-Euclidean diffusion equations. I will show that drawing on methods from the domain of differential geometry, it is possible to provide a principled view on such GNN architectural choices as positional encoding and graph rewiring as well as explain and remedy the phenomena of oversquashing and bottlenecks.
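To make the diffusion view concrete, the following toy sketch (my own illustration, not code from the talk) performs explicit-Euler steps of the graph heat equation dx/dt = -Lx on a three-node path graph, where L = D - A is the combinatorial graph Laplacian; stacking such steps is the PDE analogue of stacking message-passing layers in a GNN.

```python
def degree(adj):
    # Node degrees: row sums of the adjacency matrix.
    return [sum(row) for row in adj]

def heat_step(adj, x, tau=0.1):
    """One explicit-Euler step of the graph heat equation dx/dt = -L x,
    with L = D - A the combinatorial graph Laplacian."""
    d = degree(adj)
    n = len(x)
    # (L x)_i = d_i * x_i - sum_j A_ij * x_j
    lx = [d[i] * x[i] - sum(adj[i][j] * x[j] for j in range(n))
          for i in range(n)]
    return [x[i] - tau * lx[i] for i in range(n)]

# Path graph 0-1-2 with a "hot" node at one end: the feature diffuses
# toward the uniform state as steps accumulate, conserving its total.
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
x = [1.0, 0.0, 0.0]
for _ in range(50):
    x = heat_step(adj, x)
```

In this picture, choices such as graph rewiring amount to changing the domain on which the diffusion runs, which is one way bottlenecks and oversquashing can be analyzed and remedied.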




Semantic Information Pursuit

René Vidal
The Johns Hopkins University
United States
 

Brief Bio
Dr. René Vidal is the Herschel Seder Professor of Biomedical Engineering and the Director of the Mathematical Institute for Data Science (MINDS) at Johns Hopkins University. He is Associate Editor in Chief of TPAMI, an Amazon Scholar and a Chief Scientist at NORCE. He also directs the NSF-Simons Collaboration on the Mathematical Foundations of Deep Learning, the NSF-TRIPODS Institute on the Foundations of Graph and Deep Learning, and the Multidisciplinary University Research Initiative on Semantic Information. His lab has made seminal contributions to machine learning, computer vision and biomedical data science, including best paper awards for his work on generalized principal component analysis, sparse subspace clustering, motion segmentation, action recognition, and surgical skill assessment. His lab also creates new technologies for a variety of biomedical applications, including detection, classification and tracking of blood cells in holographic images, classification of embryonic cardiomyocytes in optical images, and assessment of surgical skill in surgical videos. Dr. Vidal is the recipient of the 2021 IEEE Edward J. McCluskey Technical Achievement Award, the 2018 D’Alembert Faculty Fellowship, the 2012 IAPR J.K. Aggarwal Prize, the 2009 ONR Young Investigator Award, the 2009 Sloan Research Fellowship, and the 2005 NSF CAREER Award. Dr. Vidal is a Fellow of the AIMBE, the IEEE, and the IAPR, and a member of the ACM and SIAM.


Abstract
In 1948, Shannon published a famous paper, which laid the foundations of information theory and led to a revolution in communication technologies. Critical to Shannon’s ideas was the notion that a signal can be represented in terms of “bits,” and that the information content of the signal can be measured by the minimum expected number of bits. However, while such a notion of information is well suited for tasks such as signal compression and reconstruction, it is not directly applicable to audio-visual scene interpretation tasks, because bits do not depend on the “semantic content” of the signal, such as words in a document, or objects in an image. In this talk, I will present a new measure of semantic information content called “semantic entropy”, which is defined as the minimum expected number of semantic queries about the data whose answers are sufficient for solving a given task (e.g., classification). I will also present an information-theoretic framework called “information pursuit” for deciding which queries to ask and in which order, which requires a probabilistic generative model relating data and questions to the task. Experiments on handwritten digit classification show, for example, that the translated MNIST dataset is harder to classify than the MNIST dataset. Joint work with Aditya Chattopadhyay, Benjamin Haeffele and Donald Geman.
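The greedy query-selection idea can be sketched in a toy setting (my own illustration; the talk's framework uses a probabilistic generative model, whereas here each class answers every query deterministically, so a query's information gain reduces to the entropy of its answer distribution under the current posterior). All names and the example queries are hypothetical.

```python
import math

def entropy(probs):
    # Shannon entropy in bits of a discrete distribution.
    return -sum(p * math.log2(p) for p in probs if p > 0)

def information_pursuit(prior, answer, true_class):
    """Greedy sketch of information pursuit: repeatedly ask the query
    whose answer is most informative about the class under the current
    posterior, then condition on the observed answer.  answer[q][c] is
    class c's (deterministic) 0/1 answer to query q."""
    posterior = dict(prior)
    asked = []
    queries = list(answer)
    while sum(1 for p in posterior.values() if p > 0) > 1 and queries:
        def gain(q):
            p_yes = sum(p for c, p in posterior.items()
                        if p > 0 and answer[q][c] == 1)
            return entropy([p_yes, 1 - p_yes])
        q = max(queries, key=gain)
        queries.remove(q)
        obs = answer[q][true_class]
        # Zero out classes inconsistent with the observed answer, renormalize.
        posterior = {c: (p if answer[q][c] == obs else 0.0)
                     for c, p in posterior.items()}
        z = sum(posterior.values())
        posterior = {c: p / z for c, p in posterior.items()}
        asked.append(q)
    return asked, posterior

# Toy example: four classes under a uniform prior, three binary queries.
prior = {c: 0.25 for c in "0123"}
answer = {"is_upper_half": {"0": 0, "1": 0, "2": 1, "3": 1},
          "is_odd":        {"0": 0, "1": 1, "2": 0, "3": 1},
          "is_three":      {"0": 0, "1": 0, "2": 0, "3": 1}}
asked, posterior = information_pursuit(prior, answer, true_class="3")
```

Here the greedy rule prefers the two balanced (1-bit) queries over the unbalanced one, so two questions suffice; the expected number of questions needed plays the role of the semantic entropy described above.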




Challenges and Opportunities in Autonomous Navigation

Jana Kosecka
George Mason University
United States
 

Brief Bio
Jana Kosecka is a Professor at the Department of Computer Science, George Mason University. She obtained her Ph.D. in Computer Science from the University of Pennsylvania. Following her PhD, she was a postdoctoral fellow in the EECS Department at the University of California, Berkeley. She is a recipient of the David Marr Prize and the National Science Foundation CAREER Award. Jana is the chair of the IEEE Technical Committee on Robot Perception, an Associate Editor of IEEE Robotics and Automation Letters and the International Journal of Computer Vision, and a former editor of IEEE Transactions on Pattern Analysis and Machine Intelligence. She has held visiting positions at Stanford University, Google and Nokia Research. She is a co-author of the monograph An Invitation to 3-D Vision: From Images to Geometric Models. Her general research interests are in computer vision and robotics; in particular, she is interested in 'seeing' systems engaged in autonomous tasks, the acquisition of static and dynamic models of environments by means of visual sensing, and human-computer interaction.


Abstract
Advancements in reliable navigation and mapping rest to a large extent on robust, efficient and scalable understanding of the surrounding environment. Success in recent years has been propelled by the use of machine learning techniques for capturing the geometry and semantics of the environment from video and range sensors. I will discuss approaches to object detection, pose recovery, 3D reconstruction and detailed semantic parsing using deep convolutional neural networks (CNNs), as well as the challenges of deploying these systems in real-world settings. To overcome the need for large amounts of labeled data for training object instance detectors, we use active self-supervision provided by a robot traversing an environment to generate training examples for learning novel object embeddings from unlabelled data. The object detectors trained in this manner achieve higher mAP than off-the-shelf detectors trained on this limited data. While perception and decision making are often considered separately, I will outline a few strategies for jointly optimizing perception and decision-making algorithms in the context of elementary navigation tasks. The presented explorations open interesting avenues for the control of embodied physical agents and general strategies for the design and development of general-purpose autonomous systems.
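One common way such self-supervision can be turned into a training signal (a generic sketch, not necessarily the method used in the work above): treat detections of the same object across adjacent viewpoints along the robot's traversal as positive pairs, detections of other objects as negatives, and train the embedding with a triplet margin loss. The function and margin value below are illustrative.

```python
def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet margin loss on squared L2 distances: pull together
    embeddings of the same object seen from adjacent viewpoints
    (anchor/positive), push away embeddings of other objects (negative)."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return max(0.0, dist2(anchor, positive) - dist2(anchor, negative) + margin)

# Same object from two viewpoints vs. a clearly different object: the
# margin is already satisfied, so this triplet contributes zero loss.
loss = triplet_loss([0.0, 0.0], [0.1, 0.0], [1.0, 0.0])
```

Embeddings trained this way can then score candidate detections of novel object instances without per-instance labels.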


