UE Model selection for large-scale learning

Degrees incorporating this pedagogical element:


When estimating the parameters of a statistical model, careful calibration is essential for good performance. This course focuses on data-driven selection of estimators. In particular, we will consider the calibration of tuning parameters (e.g., the regularization parameter in regularized empirical risk minimization, as in the Lasso or Ridge estimators) and model selection (where each estimator minimizes the empirical risk over a specified model, such as mixture models with varying numbers of clusters).
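As a minimal sketch of tuning-parameter calibration, the example below selects the Ridge regularization parameter by k-fold cross-validation on synthetic data; the data-generating process, grid of candidate values, and fold count are all illustrative assumptions, not part of the course material.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (illustrative assumption): y = X @ beta + noise
n, p = 100, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]          # only 3 informative coefficients
y = X @ beta + rng.normal(scale=0.5, size=n)

def ridge_fit(X, y, lam):
    """Closed-form Ridge estimator: argmin ||y - Xb||^2 + lam * ||b||^2."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def cv_error(X, y, lam, k=5):
    """k-fold cross-validated mean squared prediction error for a given lambda."""
    folds = np.array_split(np.arange(len(y)), k)
    err = 0.0
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        b = ridge_fit(X[train], y[train], lam)
        err += np.mean((y[test] - X[test] @ b) ** 2)
    return err / k

# Calibrate lambda on a logarithmic grid by minimizing the CV error
lambdas = np.logspace(-3, 3, 25)
errors = [cv_error(X, y, lam) for lam in lambdas]
best_lam = lambdas[int(np.argmin(errors))]
print(f"selected lambda: {best_lam:.4g}")
```

Cross-validation is one data-driven calibration method among those the course compares; the same grid-search skeleton applies with any other selection criterion in place of `cv_error`.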

We will focus on penalized empirical risk criteria, where the penalty may be deterministic (as in BIC or ICL) or estimated from the data (as in the slope heuristic).
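A short sketch of penalized empirical risk with a deterministic penalty: below, BIC selects the degree of a polynomial regression, trading goodness of fit against a penalty proportional to the number of parameters. The data and the candidate degrees are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (illustrative assumption): a true quadratic trend plus noise
n = 200
x = rng.uniform(-1, 1, size=n)
y = 1.0 - 2.0 * x + 3.0 * x**2 + rng.normal(scale=0.3, size=n)

def bic(degree):
    """BIC criterion: n * log(RSS / n) + (#params) * log(n) for a degree-d fit."""
    X = np.vander(x, degree + 1)                       # polynomial design matrix
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)       # empirical risk minimizer
    rss = np.sum((y - X @ coef) ** 2)
    k = degree + 1                                     # model dimension
    return n * np.log(rss / n) + k * np.log(n)

# Model selection: minimize the penalized criterion over candidate degrees
degrees = range(0, 8)
scores = [bic(d) for d in degrees]
best = int(np.argmin(scores))
print(f"degree selected by BIC: {best}")
```

Here each candidate degree plays the role of a model; the course studies when such deterministic penalties are theoretically justified and when a data-driven penalty (e.g., the slope heuristic) is preferable.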

Recommended prerequisites

Basic knowledge of probability and statistics

Targeted skills


  • When model selection is needed.
  • What can be proved theoretically about existing methods.
  • How these results can guide the choice of a criterion for a specific statistical problem in practice.
  • How the theory can serve to define new selection procedures.


Bibliography

- T. Hastie, R. Tibshirani and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction
- P. Bühlmann and S. van de Geer, Statistics for High-Dimensional Data: Methods, Theory and Applications
- P. Massart, Concentration Inequalities and Model Selection