Abstract: | With big data and big computation, deep learning has achieved breakthroughs in computer vision, natural language processing, speech, and other areas. At the same time, researchers are investigating how to alleviate hyper-parameter tuning efforts, why deep neural networks (DNNs) generalize well, and how to make deep learning perform better on out-of-distribution (OOD) prediction. In this talk, I will introduce our recent research on optimization, generalization, and OOD prediction in deep learning. First, I will present a new group-invariant optimization framework for ReLU neural networks, in which the positive-scaling redundancy is removed; then, I will present our work on the implicit bias of the stochastic optimization algorithms widely used in deep learning; finally, I will discuss how to improve OOD prediction by incorporating "causal" invariance. |