Introductory Machine Learning in Biostatistics

Machine learning for health data, clinical prediction and biostatistical modelling.

A structured course for students who want to understand prediction modelling, validation, overfitting, calibration, clinical usefulness and responsible machine learning in medical research.

The course combines statistical thinking, R-based modelling, applied interpretation and health-data examples so students learn not only how models are fitted, but how they should be judged.

Course aim

Prediction, validation and interpretation for health data.

The course treats medical machine learning as a disciplined biostatistical workflow, not as a collection of algorithms.

Course snapshot

Core modules: 5

Structured lessons: 25

Browser-based practice: R

Applied case studies: 5

Biostatistical prediction thinking

The course does not treat machine learning as button-clicking. It explains what a prediction target is, when predictors are measured, how outcomes are defined and why validation must match the clinical question.

Validation before complexity

Students learn why a simple validated model can be more useful than a complex model that leaks information, overfits or performs poorly on new patients.

R-based practical learning

Selected lessons include R and WebR-style practice so students can connect theory with real modelling workflows while still focusing on interpretation.

Course structure

Five modules from foundations to applied medical ML.

Start with the language of prediction, then move through supervised learning, model evaluation, regularisation, ensembles and applied health-data case studies.

What makes this course different

Designed for responsible prediction, not shortcuts.

Clinical prediction rather than generic machine learning

Validation, calibration and usefulness explained carefully

Overfitting and data leakage treated as central topics

R-based workflow with interpretation-first teaching

Case studies based on health-data-style modelling questions

Clear links between statistics, biostatistics and ML

Case studies

Applied medical ML reports.

Case studies turn the modelling workflow into report-style interpretation with figures, metrics, clinical judgement, limitations and transparent conclusions.

Start the course

Begin with Module 1: Foundations of Machine Learning in Biostatistics.

Module 1 covers prediction thinking, predictor timing, validation, overfitting, leakage and the responsible reporting workflow.