30 September 2024 to 3 October 2024
Toulouse, France
Europe/Paris timezone

Optimizing PyTorch: Accelerating Training and Inference with Compilation, Custom Kernels, and Beyond

3 Oct 2024, 11:40
25m
Le Village, Auditorium (Toulouse, France)

31 Allées Jules Guesde, 31000 TOULOUSE
Oral presentation

Speaker

Mr Alvaro Moran (Hugging Face)

Description

This talk explores techniques for optimizing both training and inference in PyTorch so that models run faster and more efficiently. We'll look at how PyTorch's torch.compile accelerates workflows by fusing operations and generating optimized code, which reduces runtime overhead. We'll also cover custom kernels written with tools such as Triton, Pallas, and CUDA, which give fine-grained control over GPU and TPU execution for performance-critical tasks. Finally, we'll give an overview of further methods, including mixed precision, memory optimization strategies, and distributed training, all aimed at getting the best performance from large-scale machine learning models.
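
As a rough illustration of the operator fusion mentioned in the description, the following pure-Python sketch contrasts three separate elementwise passes with a single combined pass. It is not taken from the talk and is not how torch.compile is implemented: the function names `unfused` and `fused` and the scale/shift/ReLU example are assumptions chosen for illustration only. In PyTorch itself, fusion is performed by a compiler backend that generates kernels; the sketch only shows why avoiding intermediate results can reduce overhead.

```python
from typing import List


def unfused(xs: List[float], scale: float, shift: float) -> List[float]:
    """Three passes over the data, each materializing an intermediate list.

    This loosely mirrors running three separate elementwise operations,
    each of which reads its input and writes a full-size output.
    """
    scaled = [x * scale for x in xs]        # pass 1: multiply
    shifted = [s + shift for s in scaled]   # pass 2: add
    return [max(v, 0.0) for v in shifted]   # pass 3: ReLU


def fused(xs: List[float], scale: float, shift: float) -> List[float]:
    """One pass over the data with no intermediate lists.

    The same arithmetic is applied per element in the same order,
    so the result matches `unfused` exactly.
    """
    return [max(x * scale + shift, 0.0) for x in xs]


if __name__ == "__main__":
    data = [-2.0, -0.5, 0.0, 1.5, 3.0]
    # Both versions compute relu(x * scale + shift) for every element.
    assert unfused(data, 2.0, 1.0) == fused(data, 2.0, 1.0)
    print(fused(data, 2.0, 1.0))
```

Both functions return the same values; the fused version simply touches each element once instead of three times, which is the kind of memory-traffic saving that fused kernels aim for on real hardware.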

Contribution length: Short

Author

Mr Alvaro Moran (Hugging Face)
