[ABI team seminar] Talk by Dr. Pratik Chaudhari on "A Picture of the Prediction Space of Deep Networks"
Event details
This is an online seminar.
Registration is required.
Title:
A Picture of the Prediction Space of Deep Networks
Abstract:
Deep networks have many more parameters than the number of training data and can therefore overfit---and yet, they predict remarkably accurately in practice. Training such networks is a high-dimensional, large-scale and non-convex optimization problem and should be prohibitively difficult---and yet, it is quite tractable. This talk aims to shed light on these puzzles.
We will argue that deep networks generalize well because of a characteristic structure in the space of learnable tasks. The input correlation matrix for typical tasks has a “sloppy” eigenspectrum where, in addition to a few large eigenvalues, there is a large number of small eigenvalues that are distributed uniformly over a very large range. As a consequence, the Hessian and the Fisher Information Matrix of a trained network also have a sloppy eigenspectrum. Using these ideas, we will demonstrate an analytical non-vacuous PAC-Bayes generalization bound for general deep networks.
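To make the "sloppy eigenspectrum" concrete, here is a toy illustration (my own sketch, not code from the referenced papers): a matrix is sloppy when, beyond a few large eigenvalues, the remaining eigenvalues are spread roughly uniformly on a log scale across many orders of magnitude. The synthetic correlation matrix below is built to have exactly that property.

```python
# Illustrative sketch only: construct a synthetic "input correlation matrix"
# whose eigenvalues are uniform in log-space, the hallmark of sloppiness.
import numpy as np

rng = np.random.default_rng(0)
d = 50
# Eigenvalues decay geometrically from 1 down to 1e-8 (8 decades).
eigvals = np.logspace(0, -8, d)
# Random orthogonal eigenbasis via QR of a Gaussian matrix.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
C = Q @ np.diag(eigvals) @ Q.T

# Recover the spectrum and check its spread.
recovered = np.sort(np.linalg.eigvalsh(C))[::-1]
print(recovered[0] / recovered[-1])  # condition number ~1e8: a sloppy spectrum
```

In a real task one would replace `C` with the empirical correlation matrix of the input data; the talk's claim is that typical datasets exhibit this same log-uniform tail.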
We will next develop information-geometric techniques to analyze the trajectories of the predictions of deep networks during training. By examining the underlying high-dimensional probabilistic models, we will reveal that the training process explores an effectively low-dimensional manifold. Networks with a wide range of architectures and sizes, trained using different optimization methods, regularization techniques, data augmentation techniques, and weight initializations, lie on the same manifold in the prediction space. We will also show that predictions of networks trained on different tasks (e.g., different subsets of ImageNet) using different representation learning methods (e.g., supervised, meta-, semi-supervised, and contrastive learning) also lie on a low-dimensional manifold.
References:
1. Does the data induce capacity control in deep learning? Rubing Yang, Jialin Mao, and Pratik Chaudhari. [ICML '22] https://arxiv.org/abs/2110.14163
2. The Training Process of Many Deep Networks Explores the Same Low-Dimensional Manifold. Jialin Mao, Itay Griniasty, Han Kheng Teoh, Rahul Ramesh, Rubing Yang, Mark K. Transtrum, James P. Sethna, and Pratik Chaudhari. [2023 arXiv preprint] https://arxiv.org/abs/2305.01604
3. A picture of the space of typical learnable tasks. Rahul Ramesh, Jialin Mao, Itay Griniasty, Rubing Yang, Han Kheng Teoh, Mark K. Transtrum, James P. Sethna, and Pratik Chaudhari. [ICML '23] https://arxiv.org/abs/2210.17011
Bio:
Pratik Chaudhari is an Assistant Professor in Electrical and Systems Engineering and Computer and Information Science at the University of Pennsylvania. He is a core member of the GRASP Laboratory. From 2018 to 2019, he was a Senior Applied Scientist at Amazon Web Services and a Postdoctoral Scholar in Computing and Mathematical Sciences at Caltech. Pratik received his PhD (2018) in Computer Science from UCLA, and his Master's (2012) and Engineer's (2014) degrees in Aeronautics and Astronautics from MIT. He was a part of NuTonomy Inc. (now Hyundai-Aptiv Motional) from 2014 to 2016. He is the recipient of the Amazon Machine Learning Research Award (2020), the NSF CAREER Award (2022), and the Intel Rising Star Faculty Award (2022).