Applied Machine Learning for Biological Data

Machine learning has become an important tool for analysing biological and genomic data, helping researchers uncover patterns, make predictions, and gain new insights from complex datasets. From identifying cell types to predicting disease outcomes, these methods are increasingly used across modern life sciences. At the same time, applying machine learning in practice requires more than just theory: it involves choosing the right approach, working with real data, and understanding how to evaluate results.

This course provides a hands-on introduction to applying machine learning methods to biological data using Python. You will work with real-world datasets and learn how to build, evaluate, and improve models, from basic data handling with NumPy and Pandas to machine learning techniques and deep learning with PyTorch. The course also introduces reproducible workflows and modern computational approaches, including containerisation and GPU-accelerated analysis.

Learning outcomes

  • Work with biological datasets using Python tools such as NumPy and Pandas
  • Apply machine learning methods including classification, regression, and clustering
  • Evaluate and improve models using validation, tuning, and appropriate metrics
  • Build reproducible workflows, including basic deep learning and scalable analysis approaches

Target audience

  • Anyone interested in applying machine learning to biological or genomic data
  • Anyone interested in working with real datasets using Python
  • Anyone looking to explore classification, regression, clustering, or deep learning in a biological context
  • Anyone curious about building reproducible and scalable analysis workflows

This course is designed for participants with some experience in Python and data analysis who want to extend their skills towards machine learning.

Requirements

  • A PC or laptop with an up-to-date browser. Chrome, Safari, and Firefox are all supported; some older browsers, including Internet Explorer version 9, may not be.
    • Ideally a two-screen setup, so you can follow the workshop while trying things out on your own

Training material

These recordings from previous workshops allow you to revisit the course content or work through it at your own pace.

Your trainers

  • Sabry Razick (University of Oslo)
  • Pubudu Saneth Samarakoon (University of Oslo)
  • Burcin Buket Ogul (University of Oslo)
  • Milan De Cauwer (SINTEF Norway)
  • Katarzyna Michalowska (SINTEF Norway)
  • Elias Myklebust (Simula Research Laboratory)

Here you can explore the written material and exercises, which are available in several languages.