Patatrack or: How CMS Learned to Stop Worrying and Love the GPU (Andrea Bocci, CERN)
amphi Charpak
The CMS High Level Trigger experience with GPUs and performance portability
To enhance computational efficiency and leverage a broader range of computing resources, the CMS software framework (CMSSW) has been extended to offload parts of the physics reconstruction to NVIDIA GPUs. This approach has been successfully deployed in the CMS High Level Trigger (HLT) since the start of Run 3. However, managing a heterogeneous computing farm that relies equally on CPUs and GPUs introduces additional challenges compared to a traditional, CPU-only setup. This seminar will begin with an overview of CMS's decision to adopt GPUs for the HLT in Run 3, the challenges encountered, and the major accomplishments. It will then explore how this strategy is evolving for Run 4 and beyond to further improve performance, efficiency, and maintainability.
To support multiple back-ends while avoiding the need to develop, validate, and maintain separate implementations of reconstruction algorithms for each, CMS has adopted the Alpaka performance portability library. Alpaka (Abstraction Library for Parallel Kernel Acceleration) is a header-only C++ library that enables performance portability across different architectures by abstracting parallel execution. It supports both serial and parallel execution on CPUs, as well as highly parallel execution on NVIDIA, AMD, and Intel GPUs.
This talk will explore how Alpaka is integrated into the CMS software and build system to maintain a single code base, compile for multiple back-ends using different toolchains, and dynamically select the most suitable back-end at runtime. It will also compare the CPU and GPU performance of the CMS High Level Trigger. Finally, it will compare the power efficiency of GPUs and CPUs, demonstrating how the Alpaka-based implementation enables both high performance and energy-efficient computing across diverse architectures.