Machine learning with kernel methods, Spring 2017

Julien Mairal and Jean-Philippe Vert

MSc Mathematics, Vision, Learning (MVA) (ENS Cachan)

MSc Mathematics for Life Sciences (MathSV) (University Paris South, Ecole Polytechnique, ENS Cachan)

MSc Mathematiques, Apprentissage et Sciences Humaines (MASH) (PSL Research University, Paris-Dauphine University, ENS Paris)

RESULTS

Slides (new version!)

Slides are frequently updated. Please let us know if you spot typos!

Outline

Many problems in real-world applications of machine learning can be formalized as classical statistical problems, e.g., pattern recognition, regression or dimension reduction, with the caveat that the data are often not vectors of numbers. For example, protein sequences and structures in computational biology, text and XML documents in web mining, segmented pictures in image processing, or time series in speech recognition and finance, have particular structures which contain relevant information for the statistical problem but can hardly be encoded into finite-dimensional vector representations.

Kernel methods are a class of algorithms well suited for such problems. Indeed, they extend the applicability of many statistical methods initially designed for vectors to virtually any type of data, without the need for explicit vectorization of the data. The price to pay for this extension to non-vectors is the need to define a so-called positive definite kernel function between the objects, formally equivalent to an implicit vectorization of the data. The "art" of kernel design for various objects has witnessed important advances in recent years, resulting in many state-of-the-art algorithms and successful applications in many domains.
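As an illustration of a positive definite kernel on non-vector data, here is a minimal sketch of a k-mer counting ("spectrum") kernel on strings. The toy sequences, the choice k=2, and the function names are assumptions made for this example and are not taken from the course material. The kernel value is the inner product of k-mer count vectors, so the Gram matrix should be symmetric with non-negative eigenvalues, which the last lines check numerically.

```python
from collections import Counter

import numpy as np


def spectrum_features(s, k=2):
    """Count every length-k substring of s (the implicit feature map)."""
    return Counter(s[i:i + k] for i in range(len(s) - k + 1))


def spectrum_kernel(s, t, k=2):
    """Inner product of the k-mer count vectors of s and t."""
    fs, ft = spectrum_features(s, k), spectrum_features(t, k)
    # A Counter returns 0 for k-mers it has not seen, so only shared k-mers contribute.
    return sum(count * ft[kmer] for kmer, count in fs.items())


def gram_matrix(items, kernel):
    """Symmetric matrix of pairwise kernel values."""
    n = len(items)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = kernel(items[i], items[j])
    return K


# Hypothetical toy sequences, chosen only for illustration.
sequences = ["GATTACA", "ATTACCA", "CATCAT", "GGGG"]
K = gram_matrix(sequences, spectrum_kernel)

# Positive semidefiniteness check: the smallest eigenvalue should be
# non-negative up to floating-point error.
smallest_eigenvalue = np.linalg.eigvalsh(K).min()
print(K)
print(smallest_eigenvalue >= -1e-9)
```

Any algorithm that only needs pairwise inner products can then operate on K directly, without ever writing the strings as explicit vectors.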

The goal of this course is to present the mathematical foundations of kernel methods, as well as the main approaches that have emerged so far in kernel design. We will start with a presentation of the theory of positive definite kernels and reproducing kernel Hilbert spaces, which will allow us to introduce several kernel methods including kernel principal component analysis and support vector machines. Then we will come back to the problem of defining the kernel. We will present the main results about Mercer kernels and semigroup kernels, as well as a few examples of kernels for strings and graphs, taken from applications in computational biology, text processing and image analysis. Finally, we will touch upon topics of active research, such as large-scale kernel methods and deep kernel machines.
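To give a flavor of one of the methods named above, here is a minimal sketch of kernel principal component analysis written with NumPy. It assumes a Gaussian kernel with bandwidth sigma and random toy data; both are choices made for this example, not prescriptions from the course. The Gram matrix is centered in feature space, then the projections of the training points are read off its leading eigenvectors.

```python
import numpy as np


def gaussian_kernel_matrix(X, sigma=1.0):
    """Gram matrix of the Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    # Clip tiny negative values caused by rounding before exponentiating.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma ** 2))


def kernel_pca(K, n_components=2):
    """Project the training points onto the leading kernel principal components."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    Kc = H @ K @ H                           # Gram matrix of centered features
    eigvals, eigvecs = np.linalg.eigh(Kc)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # With Kc = U diag(lambda) U^T, the projection of the training points
    # onto component j is sqrt(lambda_j) * u_j.
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))


# Hypothetical toy data: 20 points in 3 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
K = gaussian_kernel_matrix(X, sigma=1.0)
Z = kernel_pca(K, n_components=2)
print(Z.shape)  # (20, 2)
```

Because the function only takes a Gram matrix as input, the same code applies unchanged to kernels defined on strings or graphs.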

References

N. Aronszajn, "Theory of reproducing kernels", Transactions of the American Mathematical Society, 68:337-404, 1950.
C. Berg, J.P.R. Christensen and P. Ressel, "Harmonic analysis on semi-groups", Springer, 1994.
N. Cristianini and J. Shawe-Taylor, "Kernel Methods for Pattern Analysis", Cambridge University Press, 2004.
B. Schölkopf and A. Smola, "Learning with kernels", MIT Press, 2002.
B. Schölkopf, K. Tsuda and J.-P. Vert, "Kernel methods in computational biology", MIT Press, 2004.
V. Vapnik, "Statistical Learning Theory", Wiley, 1998.

Schedule

Lectures (in English) take place at ENS Cachan in amphi Marie Curie, 1-4pm.

Date	Lecturer	Topic	Slides
Jan 11	JPV	Positive definite kernel, RKHS, Aronszajn's theorem	1-45
Jan 18	JM	Kernel trick, Representer theorem, kernel ridge regression	46-95
Jan 25	JPV	Supervised classification, Kernel logistic regression, large margin classifiers, SVM	96-156
Feb 1	JM	Unsupervised analysis, kernel PCA, kernel CCA, kernel K-means, large-scale optimization	159-194, 532-551
Feb 8	JPV	Green, Mercer, Herglotz and Bochner kernels	195-269
Mar 1	JM	Kernels from generative models, string kernels	290-371
Mar 8	JPV	Graph kernels, kernels on graphs	393-491
Mar 15	JM	Large-scale kernel machines, deep kernel learning	552-624

Evaluation

The final grade will be an average of the homeworks (60%) and the project (40%). You can solve each homework and the project alone or with up to two other students; however, you should not team up with the same person twice.

Homework 1 (due Jan 25)
Homework 2 (due Feb 8, please indicate on your homework your affiliation, MVA, MSV, MASH, master X...)
Data Challenge (due March 10th, follow the instructions there)
Link to upload the data challenge report + code, due March 12th.
See this note on changes in the course evaluation

PhD, Internship offers

Next generation tools for image based transcriptomics, at MINES ParisTech and Institut Curie
