Biomedical data mining with R

This course is designed for those with an interest in data analysis and data exploration, particularly of biomedical data. It is a hands-on course in which several real-life biomedical datasets will be explored.

Schedule: May 14, 21, 28; 9:00 – 13:00 and 14:00 – 18:00

Instructors:
Miguel Rocha, Associate Professor, Dept. Informatics, Univ. Minho (coordinator)
Pedro Ferreira, Assistant Professor, Faculdade de Ciências, Univ. Porto; Researcher, i3S

Description:
Biomedical data is currently being generated at an unprecedented scale. Making sense of the data obtained from different scientific experiments is a big challenge.
This course is designed for those with an interest in data analysis and data exploration, particularly of biomedical data. It provides an introduction to the R system, a language and environment for statistical computing and graphics. The course then discusses different approaches to summarizing and visualizing datasets, performing statistical tests, and learning patterns from data, including the application of classical and state-of-the-art methods.
This is a hands-on course in which several real-life biomedical datasets will be explored. Participants are welcome to bring their own datasets for an initial exploratory analysis and discussion.
No prior programming experience is required, but participants should be eager to learn and explore the potential of the R system.
Each participant should bring a laptop.

Total number of hours: 24

PROGRAM

Session 1:
- Introduction to the R system (including installation): vectors, strings, matrices, lists, data frames
- Reading and writing data
- Gentle introduction to programming in R: function definition, control flow
- Brief introduction to data analysis: tasks, data representation
- Data exploration: descriptive statistical functions, pre-processing
- Visualization: graphics in R

Session 2:
- Statistical Analysis (review): distributions, modeling, hypothesis testing, inference
- Multiple Regression Analysis and correlations
- Dimensionality Reduction: Principal Component Analysis, Multi-Dimensional Scaling
- Clustering: k-means clustering, hierarchical clustering

Session 3:
- Introduction to Machine learning / predictive analysis
- Time series analysis
- Survival analysis
- Case studies; Bring your data: exploring participants' datasets

Organization: NEBIUM – Núcleo de Estudos de Bioinformática da Universidade do Minho

Contact (to register and get instructions on payment): mrocha@di.uminho.pt
