System administration for HPC

High-performance computing (HPC) systems are essential for running large-scale analyses and supporting modern research workflows. While many researchers learn how to use HPC clusters, fewer understand what happens behind the scenes to keep these systems reliable, efficient, and scalable. Managing users, software, workloads, and resources in a shared environment requires a different level of understanding, one that goes beyond simply submitting jobs.

This advanced course focuses on the system administration side of HPC. You will work with the tools and practices used to operate and maintain clusters, including workload management with Slurm, containerised applications, automated configuration, and system monitoring. Through hands-on exercises, you will learn how to manage users and software environments, support scientific applications, and ensure that HPC systems run smoothly in research or industry settings.

Learning outcomes

Manage users, filesystems, services, and software in an HPC environment
Configure and operate workloads using a scheduler such as Slurm
Deploy and maintain scientific applications using containers and software environments
Automate and monitor systems using modern tools for configuration and performance tracking

Target audience

Anyone interested in working with HPC systems or supporting research computing
System administrators, technical staff members, and researchers moving into HPC
Anyone who wants to understand how HPC systems are built, maintained, and operated
This course is designed for participants with some experience in Linux or the Unix shell who want to take the next step towards managing HPC systems.
- If you do not yet have this experience, you can familiarise yourself with the basics through our introductory HPC and Unix course

Requirements

A PC or laptop with an up-to-date web browser and access to a terminal with SSH capabilities
- Ideally, a two-screen setup so you can follow the workshop while trying the exercises yourself

Training material

Video recordings

These recordings from previous workshops allow you to revisit the course content or work through it at your own pace.

Your trainers

Alan O'Cais (University of Barcelona)
Helena Vela (HPCNow! - Do IT Now Group)

Written material

Here you can explore the written material and exercises, which are available in several languages.