UE Fundamentals of probabilistic data mining

Degrees incorporating this pedagogical element:


This lecture introduces fundamental concepts and associated numerical methods in model-based clustering, classification and models with latent structure. These approaches are particularly relevant for modelling random vectors, sequences or graphs and for accounting for data heterogeneity; they also illustrate general principles of statistical modelling. The following topics are addressed:

At the end of the course, the student will have a basic knowledge of the most common probabilistic models with latent variables. The student will thus be able to perform model-based clustering, analyse and segment time series with hidden Markov models, build a graphical model associated with a given distribution, represent numerical multivariate data with missing coordinates in planes, and work with state-of-the-art non-linear regression models based on variational autoencoders.

Recommended prerequisite

Fundamental principles in probability theory (conditioning) and statistics (maximum likelihood estimator and its usual asymptotic properties).