IFML Seminar: 10/17/24 - A Picture of the Prediction Space of Deep Neural Networks

Pratik Chaudhari, Assistant Professor, Electrical and Systems Engineering and Computer and Information Science, UPenn

The University of Texas at Austin
Gates Dell Complex (GDC 6.302)
2317 Speedway
Austin, TX 78712
United States

Abstract: I will argue that deep networks work well because of a characteristic structure in the space of learnable tasks. The input correlation matrix for typical tasks has a “sloppy” eigenspectrum where eigenvalues decay linearly on a logarithmic scale. As a consequence, the Hessian and the Fisher Information Matrix of a trained network also have a sloppy eigenspectrum. Using this idea, I will demonstrate an analytical, non-vacuous PAC-Bayes bound on the generalization error for general deep networks.
I will show that the training process in deep learning explores a remarkably low-dimensional manifold, with as few as three dimensions. Networks with a wide variety of architectures, sizes, optimization and regularization methods lie on the same manifold. Networks being trained on different tasks (e.g., different subsets of ImageNet) using different methods (e.g., supervised, transfer, meta, semi-supervised and self-supervised learning) also lie on the same low-dimensional manifold. I will show that typical tasks are highly redundant functions of their inputs. Many perception tasks, from visual recognition, semantic segmentation, optical flow and depth estimation to vocalization discrimination, can be predicted extremely well regardless of whether the data is projected onto the principal subspace where it varies the most, some intermediate subspace with moderate variability, or the bottom subspace where it varies the least.

Bio: Pratik Chaudhari is an Assistant Professor in Electrical and Systems Engineering and Computer and Information Science at the University of Pennsylvania. He is a core member of the GRASP Laboratory. From 2018 to 2019, he was a Senior Applied Scientist at Amazon Web Services and a Postdoctoral Scholar in Computing and Mathematical Sciences at Caltech. Pratik received his PhD in Computer Science from UCLA, and his Master's and Engineer's degrees in Aeronautics and Astronautics from MIT. He was part of NuTonomy Inc. (now Hyundai-Aptiv Motional) from 2014 to 2016. He is the recipient of the Amazon Machine Learning Research Award, the NSF CAREER Award and the Intel Rising Star Faculty Award.