T109 - Data analysis

Biology Master, ENS
Year : 1 (M1)
Semester : 1 (S1)

Course code : BIO-M1-T109-S1

Course name : Data analysis

Schedule : 2020 provisional schedule

Coordinators :
Emeline Perthame (Institut Pasteur)
Lucie Zinger (ENS)

ECTS : 3

*** Registration required : course with a limited number of students***
To enrol, contact the course coordinators before September.

Keywords : Statistical inference (estimation, hypothesis testing), linear regression, analysis of variance, multivariate analyses (Principal Components Analysis, clustering)

Prerequisites for the course :
Basics in R/Markdown programming
Basics in statistics (e.g. sampling, random variables, discrete and continuous distributions, quantiles).
Newly arrived M1/M2 students wishing to enrol in this course must also enrol in, and attend, the course “BIO-M2-E01-S1 Training in mathematics and computer science”.

Course objectives and description :
Biological data are often complex and challenging to analyze because of non-normal distributions, nonlinear relationships, spatial/temporal structures, and high dimensionality, particularly in the era of Big Data.
This course introduces students to key concepts and statistical tools for the experimental design and analysis of biological data. More specifically, students will become familiar with hypothesis testing, univariate statistical tests (e.g. ANOVA), linear models, and descriptive multivariate analyses such as Principal Component Analysis (PCA) and clustering. These methods will be illustrated with current questions and data types in biology (e.g. “omics” data), and their associated analytical challenges will be introduced.
The course will alternate between theoretical sessions and computer exercises on small datasets using RStudio. Students will be assigned a small project involving the concepts and tools covered in the course.
Note that the afternoon of Thursday 1st October will be an open session for answering your questions.

Assessment / evaluation :
- A short written report on the project, to be sent to the coordinators before 12:00 pm on 11th October.

Course material (hand-outs, online presentation available, …) :
This year the course will be given either onsite (default) or remotely, depending on the COVID-19 situation. In both cases, students will need a login from the Biology Department of the ENS to view the course material and/or participate in the practicals. More information will be sent to attendees by email.

Suggested readings related to the module content (textbook chapters, reviews, articles) :
French :
Poinsot, D. (2005). Statistiques pour statophobes. Université de Rennes.
Millot, G. (2018). Comprendre et réaliser les tests statistiques à l’aide de R : manuel de biostatistique. De Boeck Supérieur.
Pagès, J. (2010). Statistiques générales pour utilisateurs. Presses Universitaires de Rennes.
English equivalents :
Van Emden, H. (2012). Statistics for terrified biologists. John Wiley & Sons.
Crawley, M. J. (2005). Statistics : An Introduction using R.
Holmes, S., Huber, W., & Martin, T. (2017). Modern statistics for modern biology.
Online course in English :
http://rafalab.github.io/pages/harvardx.html