Description
Advances in machine learning, particularly Large Language Models (LLMs), enable more efficient interaction with complex data through tokenization and next-token prediction, providing a novel framework for analyzing high-energy physics datasets. This talk presents and compares various approaches to structuring particle physics data as token sequences, allowing LLM-inspired models to learn event distributions and detect anomalies via next-token (or masked-token) prediction in proton-proton collisions at the Large Hadron Collider (LHC). By training solely on background events, the model learns to reconstruct the expected Standard Model (SM) processes. During inference, both background and signal events are processed, and deviations in reconstruction scores flag anomalous events, offering a data-driven approach to distinguishing processes or uncovering physics beyond the Standard Model (BSM). This technique is particularly relevant for exploring rare or unexpected signatures, such as four-top-quark production or supersymmetric (SUSY) processes. The method is tested using simulated LHC Run 2 (√s = 13 TeV) proton-proton collision data from the Dark Machines Collaboration, replicating ATLAS conditions and specifically targeting SM and BSM four-top-quark final states. The event tokenization strategies presented in this talk not only enable anomaly detection but also represent a potential new approach to training a foundation model at the LHC. By integrating state-of-the-art ML techniques with fundamental physics principles, this approach paves the way for more adaptive, data-driven methods in particle physics, potentially enhancing future searches for new physics at the LHC and beyond.
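The core scoring idea described above can be illustrated with a minimal sketch. This is not the talk's implementation: here a Laplace-smoothed bigram model stands in for the trained LLM, event tokenization is assumed to yield integer sequences, and the per-event mean negative log-likelihood of next-token prediction serves as the anomaly score. Events that the background-only model predicts poorly receive higher scores.

```python
# Hypothetical sketch: a bigram next-token model trained on background-only
# token sequences, with per-event mean negative log-likelihood as the
# anomaly score. All names here are illustrative assumptions.
import math
from collections import defaultdict

def train_bigram(background_events, vocab_size, alpha=1.0):
    """Count token transitions over background events (Laplace-smoothed)."""
    counts = defaultdict(lambda: defaultdict(float))
    for event in background_events:
        for prev, nxt in zip(event, event[1:]):
            counts[prev][nxt] += 1.0
    # Convert counts to next-token probability tables.
    probs = {}
    for prev, row in counts.items():
        total = sum(row.values()) + alpha * vocab_size
        probs[prev] = {t: (row.get(t, 0.0) + alpha) / total
                       for t in range(vocab_size)}
    return probs, vocab_size

def anomaly_score(event, model):
    """Mean negative log-likelihood of the event under the background model."""
    probs, vocab_size = model
    nll = 0.0
    for prev, nxt in zip(event, event[1:]):
        row = probs.get(prev)
        p = row[nxt] if row else 1.0 / vocab_size  # unseen context: uniform
        nll -= math.log(p)
    return nll / max(len(event) - 1, 1)

# Toy usage: background events follow a regular token pattern; an event
# with unusual transitions should receive a higher anomaly score.
background = [[0, 1, 2, 3]] * 50 + [[0, 1, 2, 2]] * 5
model = train_bigram(background, vocab_size=4)
bg_score = anomaly_score([0, 1, 2, 3], model)   # background-like event
sig_score = anomaly_score([3, 0, 0, 1], model)  # signal-like event
assert sig_score > bg_score
```

In the LLM setting, the bigram table is replaced by an autoregressive transformer, but the decision rule is the same: rank events by how surprising they are to a model that has only ever seen background.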