High-performance computing (HPC) systems are essential for running large-scale analyses and supporting modern research workflows. While many researchers learn how to use HPC clusters, fewer understand what happens behind the scenes to keep these systems reliable, efficient, and scalable. Managing users, software, workloads, and resources in a shared environment requires a different level of understanding, one that goes beyond simply submitting jobs.
This advanced course focuses on the system administration side of HPC. You will work with the tools and practices used to operate and maintain clusters, including workload management with Slurm, containerised applications, automated configuration, and system monitoring. Through hands-on exercises, you will learn how to manage users and software environments, support scientific applications, and ensure that HPC systems run smoothly in research or industry settings.



These recordings from previous workshops allow you to revisit the course content or work through it at your own pace.
Here you can explore the written material and exercises, which are available in several languages.