Optimization for Machine Learning

6 ECTS.

This course reviews the mathematical foundations of machine learning and the underlying algorithmic methods, and showcases some modern applications of a broad range of optimization techniques.

Optimization is at the heart of most recent advances in machine learning. This includes, of course, the most basic methods (linear regression, SVM and kernel methods). It is also key to the recent explosion of deep learning, whose methods are state-of-the-art approaches for solving supervised and unsupervised problems in imaging, vision and natural language processing.

The course consists of both classical lectures and numerical sessions in Python. The first part covers the basic methods of smooth optimization (gradient descent) and convex optimization (optimality conditions, constrained optimization, duality). The second part features more advanced methods (non-smooth optimization, SDP, interior-point and proximal methods). The last part covers large-scale methods (stochastic gradient descent), automatic differentiation (using modern Python frameworks) and their application to neural networks (shallow and deep nets).
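
As a rough illustration of the kind of method covered in the first part, here is a minimal sketch of gradient descent on a least-squares problem. It is not taken from the course material: the function name `gradient_descent`, the random test data, the iteration count and the constant step size 1/L are all assumptions chosen for this example.

```python
import numpy as np


def gradient_descent(A, b, n_iters=500):
    """Minimize f(x) = 0.5 * ||A x - b||^2 with a constant step size.

    Hypothetical helper written for illustration only.
    """
    # The gradient of f is A^T (A x - b). It is Lipschitz with constant
    # L = (largest singular value of A)^2, so a step of 1/L is safe.
    L = np.linalg.norm(A, 2) ** 2
    step = 1.0 / L
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - b)
        x = x - step * grad
    return x


# Synthetic, well-conditioned problem (assumed data, fixed seed).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
b = rng.standard_normal(50)

x_gd = gradient_descent(A, b)
# Reference solution from NumPy's least-squares solver, for comparison.
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
```

On a well-conditioned problem like this one, the iterates should end up very close to the solution returned by `np.linalg.lstsq`. For ill-conditioned matrices the same loop converges much more slowly, which is one motivation for the more advanced methods of the later parts.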

Location:

Lectures will not be at Université Paris-Dauphine, but at ENS, 29 rue d’Ulm, in the 5th district of Paris. More precisely, lectures will be:

in room U209 on Tuesdays (except for November 19, room U207)
in room Paul Langevin on Thursdays

Lecturers:

– Vincent Duval (INRIA)
– Robert Gower (Telecom Paris)
– Gabriel Peyré (CNRS and ENS)
– Clément Royer (Dauphine)
– Alessandro Rudi (INRIA)
– Irene Waldspurger (CNRS and Dauphine)

References:

Theory and algorithms:

Convex Optimization, Boyd and Vandenberghe
Introduction to matrix numerical analysis and optimization, Philippe Ciarlet
Proximal algorithms, N. Parikh and S. Boyd
Introduction to Nonlinear Optimization – Theory, Algorithms and Applications, Amir Beck

Numerics:

Python and Jupyter installation: use only Python 3 with the Anaconda distribution.
The Numerical Tours of Signal Processing, Gabriel Peyré
Scikit-learn tutorial #1 and Scikit-learn tutorial #2, Fabian Pedregosa, Jake VanderPlas
Reverse-mode automatic differentiation: a tutorial
Convolutional Neural Networks for Visual Recognition
Christopher Olah, Blog