Seminars & Colloquia
- Colloquium: Generalization Properties of Overparametrized Models
- Posted by saarc, 2024-09-06
- Date: October 8, 2024
- Venue: E6-1, Room 1401
- Speaker: Prof. Youngtak Sohn (손영탁)
Title: Generalization Properties of Overparametrized Models
Abstract: Modern machine learning methods such as multi-layer neural networks often have millions of parameters and achieve near-zero training error. Nevertheless, they maintain strong generalization capabilities, challenging traditional statistical theories based on the uniform law of large numbers. Motivated by this phenomenon, we consider high-dimensional binary classification with linearly separable data. For Gaussian covariates, we characterize the linear classification problems for which the minimum-norm interpolating prediction rule, namely max-margin classification, has near-optimal generalization error.
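For orientation, one standard formulation of the objects named above (the notation $n$, $d$, $x_i$, $y_i$, $\theta$ is ours, not the abstract's): given linearly separable data $(x_1, y_1), \dots, (x_n, y_n) \in \mathbb{R}^d \times \{-1, +1\}$, the max-margin classifier is

$$\hat{\theta}_{\mathrm{MM}} = \arg\max_{\|\theta\|_2 = 1} \ \min_{1 \le i \le n} y_i \langle x_i, \theta \rangle,$$

which, up to rescaling, coincides with the minimum-norm interpolator

$$\hat{\theta}_{\mathrm{MN}} = \arg\min \big\{ \|\theta\|_2 \, : \, y_i \langle x_i, \theta \rangle \ge 1 \ \text{for all } i \le n \big\}.$$

This equivalence is the sense in which max-margin classification is the "minimum-norm interpolating prediction rule" referred to in the abstract.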
In the second part of the talk, we consider max-margin classification with non-Gaussian covariates. In particular, we leverage universality arguments to characterize the generalization error of the non-linear random features model, a two-layer neural network with random first-layer weights. In the wide-network limit, where the number of neurons tends to infinity, we show how non-linear max-margin classification with random features collapses to a linear classifier with a soft-margin objective.
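As a point of reference (again in our own notation, with $N$ neurons, activation $\sigma$, and random weights $w_j$, none of which appear in the original abstract), the random features model referred to here takes the form

$$f(x) = \sum_{j=1}^{N} \theta_j \, \sigma(\langle w_j, x \rangle),$$

where the first-layer weights $w_1, \dots, w_N$ are drawn at random and then kept fixed, and only the second-layer coefficients $\theta = (\theta_1, \dots, \theta_N)$ are trained, here by maximizing the margin over the feature map $x \mapsto (\sigma(\langle w_j, x \rangle))_{j \le N}$.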