BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//https://techplay.jp//JP
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALDESC:[ABI team seminar] Talk by Dr. Pratik Chaudhari on "A Picture
  of the Prediction Space of Deep Networks"
X-WR-CALNAME:[ABI team seminar] Talk by Dr. Pratik Chaudhari on "A Picture
  of the Prediction Space of Deep Networks"
X-WR-TIMEZONE:Asia/Tokyo
BEGIN:VTIMEZONE
TZID:Asia/Tokyo
BEGIN:STANDARD
DTSTART:19700101T000000
TZOFFSETFROM:+0900
TZOFFSETTO:+0900
TZNAME:JST
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
UID:933524@techplay.jp
SUMMARY:[ABI team seminar] Talk by Dr. Pratik Chaudhari on "A Picture of
  the Prediction Space of Deep Networks"
DTSTART;TZID=Asia/Tokyo:20240216T100000
DTEND;TZID=Asia/Tokyo:20240216T120000
DTSTAMP:20260505T160550Z
CREATED:20240125T060138Z
DESCRIPTION:Event details here\nhttps://techplay.jp/event/933524?utm_medi
 um=referral&utm_source=ics&utm_campaign=ics\n\nThis is an onlin
 e seminar.\nRegistration is required.\n\nTitle:\nA Picture of the Predict
 ion Space of Deep Networks\n\nAbstract:\nDeep networks have many more par
 ameters than the number of training data and can therefore overfit---and 
 yet\, they predict remarkably accurately in practice. Training such netwo
 rks is a high-dimensional\, large-scale and non-convex optimization probl
 em and should be prohibitively difficult---and yet\, it is quite tractabl
 e. This talk aims to shed light on these puzzles.\n\nWe will argue that d
 eep networks generalize well because of a characteristic structure in the
  space of learnable tasks. The input correlation matrix for typical tasks
 has a "sloppy" eigenspectrum where\, in addition to a few large eige
 nvalues\, there is a large number of small eigenvalues that are distribut
 ed uniformly over a very large range. As a consequence\, the Hessian and 
 the Fisher Information Matrix of a trained network also have a sloppy eig
 enspectrum. Using these ideas\, we will demonstrate an analytical non-vac
 uous PAC-Bayes generalization bound for general deep networks.\n\nWe will
  next develop information-geometric techniques to analyze the trajectorie
 s of the predictions of deep networks during training. By examining the u
 nderlying high-dimensional probabilistic models\, we will reveal that the
  training process explores an effectively low dimensional manifold. Netwo
 rks with a wide range of architectures\, sizes\, trained using different 
 optimization methods\, regularization techniques\, data augmentation tech
 niques\, and weight initializations lie on the same manifold in the predi
 ction space. We will also show that predictions of networks being trained
  on different tasks (e.g.\, different subsets of ImageNet) using differen
 t representation learning methods (e.g.\, supervised\, meta-\, semi super
 vised and contrastive learning) also lie on a low-dimensional manifold.\n
 \nReferences:\n1. Does the data induce capacity control in deep learning?
  Rubing Yang\, Jialin Mao\, and Pratik Chaudhari. [ICML '22] https://arxi
 v.org/abs/2110.14163\n2. The Training Process of Many Deep Networks Explo
 res the Same Low-Dimensional Manifold. Jialin Mao\, Itay Griniasty\, Han 
 Kheng Teoh\, Rahul Ramesh\, Rubing Yang\, Mark K. Transtrum\, James P. Se
 thna\, and Pratik Chaudhari. [2023 arXiv preprint] https://arxiv.org/abs/
 2305.01604\n3. A picture of the space of typical learnable tasks. Rahul R
 amesh\, Jialin Mao\, Itay Griniasty\, Rubing Yang\, Han Kheng Teoh\, Mark
 Transtrum\, James P. Sethna\, and Pratik Chaudhari. [ICML '23] https:/
 /arxiv.org/abs/2210.17011\n\nBio:\nPratik Chaudhari is an Assistant Profe
 ssor in Electrical and Systems Engineering and Computer and Information S
 cience at the University of Pennsylvania. He is a core member of the GRAS
 P Laboratory. From 2018-19\, he was a Senior Applied Scientist at Amazon 
 Web Services and a Postdoctoral Scholar in Computing and Mathematical Sci
 ences at Caltech. Pratik received his PhD (2018) in Computer Science from
  UCLA\, and his Master's (2012) and Engineer's (2014) degrees in Aeronaut
 ics and Astronautics from MIT. He was a part of NuTonomy Inc. (now Hyunda
 i-Aptiv Motional) from 2014-16. He is the recipient of the Amazon Machine
  Learning Research Award (2020)\, NSF CAREER award (2022) and the Intel R
 ising Star Faculty Award (2022).
LOCATION:Online
URL:https://techplay.jp/event/933524?utm_medium=referral&utm_source=ics&utm
 _campaign=ics
END:VEVENT
END:VCALENDAR
