IFML Seminar: 12/06/24 - Unscripted Grounded Visual Learning

Stella Yu, Professor of Electrical Engineering and Computer Science, University of Michigan

12:15 - 1:15pm

The University of Texas at Austin
Gates Dell Complex (GDC 6.302)
2317 Speedway
Austin, TX 78712
United States

Abstract: Computer vision has made remarkable advances through data-driven learning of image-text associations. Large-scale vision and language models like CLIP, SAM, and ChatGPT can generate compelling descriptions of images. However, these models, trained with scripted data and limited grounding, often struggle to provide detailed visual evidence and to generalize across a diverse range of infrequent visual concepts during testing. In contrast, human infants develop robust visual understanding from limited experiences, even before acquiring language. This contrast raises crucial questions: What are we missing? Do we not see without naming our visual experiences? Can vision be developed entirely from visual data without predefined labels and semantic knowledge? I will present our research progress on how we can computationally learn to abstract and generalize visual concepts directly from images and videos.

Bio: Stella Yu received her Ph.D. from Carnegie Mellon University, where she studied robotics at the Robotics Institute and vision science at the Center for the Neural Basis of Cognition. Before joining the University of Michigan as a Full Professor of Electrical Engineering and Computer Science in Fall 2022, she was the Director of the Vision Group at the International Computer Science Institute, a Senior Fellow at the Berkeley Institute for Data Science, and a faculty member in Computer Science, Vision Science, and Cognitive and Brain Sciences at UC Berkeley. She is a recipient of the US NSF CAREER Award and the Clare Boothe Luce Professorship. Dr. Yu has been conducting cutting-edge research on unsupervised representation learning and open long-tailed recognition from natural data. She is actively extending these approaches to robotics, grounding action and perception by learning sensorimotor contingencies. Dr. Yu is interested not only in understanding visual perception from multiple perspectives, but also in using computer vision and machine learning to automate and exceed human expertise in practical applications. https://web.eecs.umich.edu/~stellayu/
