System administration for HPC

High-performance computing (HPC) systems are essential for running large-scale analyses and supporting modern research workflows. While many researchers learn how to use HPC clusters, fewer understand what happens behind the scenes to keep these systems reliable, efficient, and scalable. Managing users, software, workloads, and resources in a shared environment requires a different level of understanding, one that goes beyond simply submitting jobs.

This advanced course focuses on the system administration side of HPC. You will work with the tools and practices used to operate and maintain clusters, including workload management with Slurm, containerised applications, automated configuration, and system monitoring. Through hands-on exercises, you will learn how to manage users and software environments, support scientific applications, and ensure that HPC systems run smoothly in research or industry settings.

Learning outcomes

  • Manage users, filesystems, services, and software in an HPC environment
  • Configure and operate workloads using a scheduler such as Slurm
  • Deploy and maintain scientific applications using containers and software environments
  • Automate and monitor systems using modern tools for configuration and performance tracking

Target audience

This course is for you if you:

  • are interested in working with HPC systems or supporting research computing
  • identify as a system administrator, technical staff member, or researcher moving into HPC
  • want to understand how HPC systems are built, maintained, and operated

The course is designed for participants with some experience in Linux or the Unix shell who want to take the next step towards managing HPC systems.

Requirements

  • A PC or laptop with an up-to-date web browser and access to a terminal with SSH capabilities
    • Ideally a two-screen setup, so you can follow the workshop on one screen while trying things out yourself on the other

Training material

These recordings from previous workshops allow you to revisit the course content or work through it at your own pace.

Here you can explore the written material and exercises, which are available in several languages.

Your trainers

  • Alan O'Cais (University of Barcelona)
  • Helena Vela (HPCNow! - Do IT Now Group)