26–28 Nov. 2025
LPC Caen and GANIL
Time zone: Europe/Paris

JetParticle-JEPA: Self-Supervised Representation Learning for Jet Tagging in High-Energy Physics

28 Nov. 2025, 12:40
20m
G. Iltis (LPC Caen)

6 Bd Maréchal Juin, 14000 Caen
Analysis: event classification, statistical analysis and inference, anomaly detection
Transformers and Attention-Based Models

Speaker

Guillaume Letellier (Université de Caen Normandie | GREYC)

Description

The Large Hadron Collider (LHC) is designed to probe the limits of the Standard Model and search for new phenomena. Machine Learning (ML) has become a powerful tool in this endeavor, particularly for jet tagging tasks. Large-scale datasets such as JetClass, which contains over 100 million simulated jet events, enable not only the training of supervised models but also the development of foundation models that leverage self-supervised learning (SSL) to uncover the underlying structure of the data without relying on labels.

Our approach, JetParticle-JEPA, is a novel framework based on the Joint Embedding Predictive Architecture (JEPA). The model is trained to predict the properties of masked particles within a jet from their surrounding context, and it does so in a latent representation space rather than by directly reconstructing the input. This design encourages the discovery of abstract, physically meaningful features of jet structure. At its core, JetParticle-JEPA builds upon the Particle Transformer (ParT) architecture, which is well suited to the permutation-invariant nature of particle clouds and naturally incorporates pairwise particle interactions. To keep the learned representations grounded in physics, we add specialized output heads for particle identification and trajectory displacement, together with physics-constrained losses. These components serve both to evaluate whether the model captures fundamental physics principles and to inject explicit inductive biases that guide its learning.
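As an illustration of the latent-space objective described above, the following is a minimal NumPy sketch of a generic JEPA-style loss on a single toy jet. It is not the JetParticle-JEPA implementation: the single-layer `encode` function stands in for ParT, and the names `jepa_loss` and `ema_update`, the mean pooling, the use of the first two features as a positional query, and the momentum value are all assumptions made for the example. No gradient step is shown; only the loss and the target-encoder update are sketched.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w):
    """Toy per-particle encoder: one linear layer + tanh (hypothetical stand-in for ParT)."""
    return np.tanh(x @ w)

def jepa_loss(jet, mask, w_ctx, w_tgt, w_pred):
    """Mean squared error between predicted and target latents of masked particles.

    jet:  (n_particles, n_features) array of particle features
    mask: (n_particles,) boolean array, True = particle hidden from the context
    """
    # Target branch sees the full jet; in training its output would carry no gradient.
    targets = encode(jet, w_tgt)[mask]
    # Context branch sees only the visible particles; mean pooling keeps it
    # invariant to the order in which particles are listed.
    context = encode(jet[~mask], w_ctx).mean(axis=0)
    # Assumption: the first two features act as a positional query telling
    # the predictor which masked particle it should describe.
    queries = jet[mask][:, :2]
    pred_in = np.concatenate(
        [np.tile(context, (queries.shape[0], 1)), queries], axis=1
    )
    preds = pred_in @ w_pred
    # The comparison happens between latent vectors, not raw particle features.
    return float(np.mean((preds - targets) ** 2))

def ema_update(w_tgt, w_ctx, momentum=0.99):
    """Move the target encoder slowly toward the context encoder (assumed momentum)."""
    return momentum * w_tgt + (1.0 - momentum) * w_ctx

# Toy jet: 16 particles with 4 features each, 4 of them masked.
n_particles, n_features, d_latent = 16, 4, 8
jet = rng.normal(size=(n_particles, n_features))
mask = np.zeros(n_particles, dtype=bool)
mask[rng.choice(n_particles, size=4, replace=False)] = True

w_ctx = rng.normal(scale=0.1, size=(n_features, d_latent))
w_tgt = w_ctx.copy()
w_pred = rng.normal(scale=0.1, size=(d_latent + 2, d_latent))

loss = jepa_loss(jet, mask, w_ctx, w_tgt, w_pred)
w_tgt = ema_update(w_tgt, w_ctx)
print(f"latent prediction loss: {loss:.4f}")
```

In this sketch the loss is unchanged when the particles of the jet are reordered, since both the pooled context and the average over masked particles ignore ordering. A full model would replace the toy encoder with an attention-based one and add the auxiliary heads and physics-constrained terms mentioned above.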

Although still under development, preliminary results already surpass concurrent approaches such as J-JEPA and HEP-JEPA on benchmark jet tagging tasks. By learning directly from the data in a self-supervised manner, JetParticle-JEPA offers a promising path toward more accurate and reliable ML models for particle physics, ultimately accelerating the pace of scientific discovery.

References:
- Qu, H., Li, C., & Qian, S. (2022, June). Particle transformer for jet tagging. In International Conference on Machine Learning (pp. 18281-18292). PMLR.
- Katel, S., Li, H., Zhao, Z., Kansal, R., Mokhtar, F., & Duarte, J. (2024). Learning symmetry-independent jet representations via jet-based joint embedding predictive architecture. arXiv preprint arXiv:2412.05333.
- Bardhan, J., Agrawal, R., Tilak, A., Neeraj, C., & Mitra, S. (2025). HEP-JEPA: A foundation model for collider physics using joint embedding predictive architecture. arXiv preprint arXiv:2502.03933.

Author

Guillaume Letellier (Université de Caen Normandie | GREYC)
