Modern technologies like DNA microarrays or high-throughput sequencing are revolutionising biology and medical research. By allowing the collection of large numbers of molecular-level measurements on living organisms, they pave the way to a quantitative and rational analysis of biological systems. Unsurprisingly, statistics and machine learning play an important role in this revolution. By processing large collections of datasets, they make it possible to extract new biological knowledge and infer predictive models.
The goal of this course is to present a few modern statistical learning techniques and to touch upon a selection of applications in computational and systems biology. We will study in particular support vector machines (SVM) and kernels, as well as feature selection techniques including lasso regression. Applications include protein annotation, virtual screening in drug design, prognostic and predictive models for personalised medicine in oncology, and gene network inference in systems biology.
| When | What |
|---|---|
| Friday, Feb 6, 4:30pm-7:30pm | Introduction, learning in high dimension, ridge regression and ridge logistic regression |
| Friday, Feb 13, 4:30pm-7:30pm | SVM, kernel trick, kernel ridge regression |
| Friday, Mar 6, 4:30pm-7:30pm | Kernel methods |
| Friday, Mar 13, 4:30pm-7:30pm | Kernels with network information, data integration with kernels, string kernels |
| Friday, Mar 20, 4:30pm-6:30pm | Sparsity: feature selection, lasso, atomic norms |
| Friday, Mar 27, 4:30pm-6:30pm | No class: work on the project |
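To give a flavour of the sparsity session listed above, here is a minimal sketch of lasso regression used for feature selection. It is not taken from the course material: the solver (cyclic coordinate descent), the function names `soft_threshold` and `lasso_cd`, the penalty value `lam=0.2`, and the synthetic data are all assumptions chosen for illustration. The objective minimised is (1/2n)||y - Xw||^2 + lam*||w||_1.

```python
import random


def soft_threshold(z, t):
    """Shrink z toward zero by t; values in [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Lasso by cyclic coordinate descent (illustrative, pure Python)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw, since w starts at zero
    # Mean squared value of each feature column.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Synthetic data (hypothetical): only the first two of ten features matter.
random.seed(0)
n, p = 50, 10
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
true_w = [2.0, -3.0] + [0.0] * (p - 2)
y = [sum(X[i][j] * true_w[j] for j in range(p)) + random.gauss(0, 0.1) for i in range(n)]

w = lasso_cd(X, y, lam=0.2)
selected = [j for j in range(p) if w[j] != 0.0]
print("selected features:", selected)
```

The soft-thresholding step is what sets some coefficients exactly to zero, so the features with non-zero weights form the selected subset. Larger values of `lam` select fewer features, at the cost of shrinking the remaining coefficients more strongly.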
To validate the course you must participate in the Prostate Cancer DREAM Challenge. This is a crowdsourcing experiment to improve our ability to treat prostate cancer.
Students should form teams of up to 4 and officially register for the challenge in order to participate. At the end, each team should submit predictions to the challenge and provide a detailed report describing what it did.
Although each team should work independently, discussion among teams is welcome.
Each team should send a final report in PDF to Jean-Philippe.Vert@mines-paristech.fr before May 13, 2015. Teams are encouraged to take part in the challenge until the end: in addition to validating this course, you have the opportunity to directly impact cancer treatment, and to gain a wonderful story for your résumé!