UE Machine learning fundamentals

Diplômes intégrant cet élément pédagogique :

Descriptif

Machine Learning is one of the key areas of Artificial Intelligence and it concerns the study and the development of quantitative models that enables a computer to perform tasks without being explicitly programmed to do them. Learning in this context is hence to recognize complex forms and to make intelligent decisions. Given all existing entries, the difficulty of this task lies in the fact that all possible decisions is usually very complex to enumerate. To get around that, machine learning algorithms are designed in order to gain knowledge on the problem to be addressed based on a limited set of observed data extracted from this problem. To illustrate this principle, consider the supervised learning task, where the prediction function, which infers a predicted output for a given input, is learned over a finite set of labeled training examples, where each instance of this set is a pair constituted of a vector characterizing an observation in a given vector space, and an associated desired response for that instance (also called desired output). After the training step, the function returned by the algorithm is sought to give predictions on new examples, which have not been used in the learning process, with the lowest probability of error. The underlying assumption in this case is that the examples are, in general, representative of the prediction problem on which the function will be applied. We expect that the learning algorithm produces a function that will have a good generalization performance and not the one that is able to perfectly reproduce the outputs associated to the training examples. Guarantees of learnability of this process were studied in the theory of machine learning largely initiated by Vladimir Vapnik. These guarantees are dependent on the size of the training set and the complexity of the class of functions where the algorithm searches for the prediction function. Emerging technologies, particularly those related to the development of Internet, reshaped the domain of machine learning with new learning frameworks that have been studied to better tackle the related problems. One of these frameworks concerns the problem of learning with partially labeled data, or semi-supervised learning, which development is motivated by the effort that has to be made to construct labeled training sets for some problems, while large amount of unlabeled data can be gathered easily for these problems. The inherent assumption, in this case, is that unlabeled data contain relevant information about the task that has to be solved, and that it is a natural idea to try to extract this information so as to provide the learning algorithm more evidence. From these facts were born a number of works that intended to use a small amount of labeled data simultaneously with a large amount of unlabeled data to learn a prediction function.

The intent of this course is to propose a broad introduction to the field of Machine Learning, including discussions of each of the major frameworks, supervised, unsupervised and semi-supervised learning.

Program

Part I. Supervised Learning

This part gives an overview of foundations of supervised learning. We will see that learning is an inductive process where a general rule is to be found from a finite set of labeled observations by minimizing the empirical risk of the rule over that set. The study of consistency gives conditions that, in the limit of infinite sample sizes, the minimizer of the empirical risk will lead to a value of the risk that is as good as the best attainable risk. The direct minimization of the empirical risk is not tractable as the latter is not derivative, hence learning algorithms find the parameters of the learning rule by minimizing a convex upper-bound (or surrogate) of the empirical risk. We present, classical strategies for unconstrained convex optimization: gradient descente, Quasi-Newton approach, and conjugate gradient descente. We present classical learning algorithms for binary classification: the perceptron, logistic regression and boosting by linking the development of these models to the Empirical Risk Minimization framework as well as the Multi-class classification paradigm. Particularly, we present Multi-Layer Perceptron as well as the back-propagation algorithm that is in use in deep learning.

Part II. Unsupervised and semi-supervised Learning

We will present generative models for clustering as well as two powerful tools for parameter estimation namely Expectation-Maximization (EM) and Classification Expectation-Maximization (CEM) algorithms. In the context of Big Data, labeling observations for learning is a tedious task. Semiu-supervised paradigm aims at learining with few labeled and a huge amount of unlabeled data. In this part we review the three families of techniques proposed in semi-supervised learning, that is Graphical, Generative and Discriminant models.

Pré-requis recommandés

Statistics and probability (BSc)

Compétences visées

Understanding of fundamental notions in Machine Learning (inference, ERM and SRM principles, generalization bounds, classical learning models, unsupervised learning, semi-supervised learning.

Bibliographie

[1] Massih-Reza Amini - Apprentissage Machine de la théorie à la pratique, Eyrolles, 2015.
[2] Christopher Bishop - Neural Networks for Pattern Recognition, Oxford University Press, 1995.
[3] Richard Duda, Peter Hart & David Strok - Pattern Classification, John Wiley & Sons, 1997.
[4] John Shawe-Taylor & Nello Cristianini - Kernel Methods for Pattern Analysis, Cambridge University Press, 2004.
[5] Colin McDiarmid - On the method of bounded differences,Surveys in Combinatorics, 141:148-188, 1989. 
[6] Mehryar Mohri, Afshin Rostamzadeh & Ameet Talwalker - Foundations of Machine Learning, MIT Press, 2012.
[7] Bernhard Schölkopf & Alexander J. Smola - Learning with Kernels, MIT Press, 2002.
[8] Vladimir Kolchinskii - Rademacher penalties and structural risk minimization, IEEE Transactions on Information Theory, 47(5):1902–1914, 2001.

Informations complémentaires

Méthode d'enseignement : En présence
Lieu(x) : Grenoble
Langue(s) : Anglais