Machine learning in computational biology

Jean-Philippe Vert

ENSAE
Spring 2015

Modern technologies such as DNA microarrays and high-throughput sequencing are revolutionising biology and medical research. By allowing the collection of large numbers of molecular-level measurements on living organisms, they pave the way to a quantitative and rational analysis of biological systems. Unsurprisingly, statistics and machine learning play an important role in this revolution. By processing large collections of datasets, they make it possible to extract new biological knowledge and infer predictive models.

The goal of this course is to present a few modern statistical learning techniques, and to touch upon a selected panel of applications in computational and systems biology. We will study in particular support vector machines (SVM) and kernels, as well as feature selection techniques including lasso regression. Applications include protein annotation, virtual screening in drug design, prognostic and predictive models for personalised medicine in oncology, and gene network inference in systems biology.

Slides

Schedule

When                              What
Friday, Feb 6, 4:30pm-7:30pm      Introduction, learning in high dimension, ridge regression and ridge logistic regression
Friday, Feb 13, 4:30pm-7:30pm     SVM, kernel trick, kernel ridge regression
Friday, Mar 6, 4:30pm-7:30pm      Kernel methods
Friday, Mar 13, 4:30pm-7:30pm     Kernels with network information, data integration with kernels, string kernels
Friday, Mar 20, 4:30pm-6:30pm     Sparsity: feature selection, lasso, atomic norms
Friday, Mar 27, 4:30pm-6:30pm     No course: work on the project

Project: Prostate cancer DREAM challenge

To validate the course you must participate in the Prostate cancer DREAM challenge competition. This is a crowd-sourcing experiment to improve our ability to treat prostate cancer.

Students should form teams of up to 4 and register officially for the challenge in order to participate. At the end, each team should submit predictions to the challenge and provide a detailed report describing what it did.

Although each team should participate on its own, discussion among teams is welcome.

A final report in PDF should be sent by each team to Jean-Philippe.Vert@mines-paristech.fr before May 13, 2015. Teams are encouraged to participate in the challenge until the end: in addition to validating this course, you have the opportunity to directly impact cancer treatment, and get a wonderful story to put on your resume!


