6–11 Jul 2025
PALAIS DU PHARO, Marseille, France
Europe/Paris timezone

Event Tokenization and Next-Token Prediction for Anomaly Detection at the LHC

Not scheduled
20m

Parallel T16 - AI for HEP (special topic 2025)

Description

Advances in machine learning, particularly large language models (LLMs), offer a novel framework for analyzing high-energy physics datasets through tokenization and next-token prediction. This talk presents and compares several approaches to structuring particle physics data as token sequences, allowing LLM-inspired models to learn event distributions and detect anomalies via next-token (or masked-token) prediction in proton-proton collisions at the Large Hadron Collider (LHC). Trained solely on background events, the model learns to reconstruct the expected Standard Model (SM) processes. During inference, both background and signal events are processed, and deviations in reconstruction scores flag anomalous events, offering a data-driven approach to distinguishing processes or uncovering physics beyond the Standard Model (BSM). This technique is particularly relevant for exploring rare or unexpected signatures, such as four-top-quark production or supersymmetric (SUSY) processes. The method is tested on simulated LHC Run 2 (√s = 13 TeV) proton-proton collision data from the Dark Machines Collaboration, replicating ATLAS conditions, specifically targeting SM and BSM four-top-quark final states. Beyond anomaly detection, the event tokenization strategies presented in this talk represent a potential new approach to training a foundation model at the LHC. By combining state-of-the-art ML techniques with fundamental physics principles, this approach paves the way for more adaptive data-driven methods in particle physics, potentially enhancing future searches for new physics at the LHC and beyond.
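The workflow described in the abstract (tokenize events, train on background only, score events by how poorly the next token is predicted) can be illustrated with a toy sketch. This is not the authors' model. The token layout (one token per object, encoding object type and a pT bin), the bin edges, the function names, and the toy events are all hypothetical, and a smoothed bigram count model stands in for the LLM-inspired network purely to keep the example self-contained.

```python
import numpy as np

# Hypothetical vocabulary: one token per (object type, pT bin),
# plus two special tokens marking the start and end of an event.
PT_EDGES = np.array([50.0, 100.0, 200.0, 400.0])  # GeV, illustrative only
N_PT_BINS = len(PT_EDGES) + 1
OBJECT_TYPES = {"jet": 0, "bjet": 1, "electron": 2, "muon": 3}
N_OBJ_TOKENS = len(OBJECT_TYPES) * N_PT_BINS
BOS, EOS = N_OBJ_TOKENS, N_OBJ_TOKENS + 1
VOCAB_SIZE = N_OBJ_TOKENS + 2


def tokenize_event(objects):
    """Turn [(type_name, pt_gev), ...] into a pT-ordered token sequence."""
    ordered = sorted(objects, key=lambda obj: -obj[1])
    tokens = [BOS]
    for name, pt in ordered:
        pt_bin = int(np.digitize(pt, PT_EDGES))
        tokens.append(OBJECT_TYPES[name] * N_PT_BINS + pt_bin)
    tokens.append(EOS)
    return tokens


def fit_bigram(sequences, alpha=1.0):
    """Estimate P(next | current) from background-only sequences.

    Add-alpha smoothing keeps unseen transitions at a small nonzero
    probability. A stand-in for a trained next-token network.
    """
    counts = np.full((VOCAB_SIZE, VOCAB_SIZE), alpha)
    for seq in sequences:
        for cur, nxt in zip(seq[:-1], seq[1:]):
            counts[cur, nxt] += 1.0
    return counts / counts.sum(axis=1, keepdims=True)


def anomaly_score(seq, probs):
    """Mean negative log-likelihood per predicted token (higher = more anomalous)."""
    nll = [-np.log(probs[cur, nxt]) for cur, nxt in zip(seq[:-1], seq[1:])]
    return float(np.mean(nll))


# Made-up toy events, for illustration only (not Dark Machines data).
background = [
    [("jet", 120.0), ("jet", 80.0), ("muon", 40.0)],
    [("jet", 150.0), ("jet", 60.0), ("electron", 45.0)],
    [("jet", 110.0), ("jet", 90.0), ("muon", 30.0)],
]
probs = fit_bigram([tokenize_event(event) for event in background])

typical = tokenize_event([("jet", 130.0), ("jet", 70.0), ("muon", 35.0)])
unusual = tokenize_event(
    [("bjet", 500.0), ("bjet", 450.0), ("bjet", 300.0), ("bjet", 250.0)]
)
print("typical:", anomaly_score(typical, probs))
print("unusual:", anomaly_score(unusual, probs))
```

In this sketch the event with four hard b-jets scores higher than the background-like event because none of its transitions were seen in training. A realistic study would replace the bigram table with a sequence model, encode more features per object (for example η, φ, and missing transverse energy), and choose the score threshold from the background score distribution.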

Authors

Ambre Visive (Nikhef)
Clara Nellist (LAL, Orsay)
Polina Moskvitina (Nikhef)
Roberto Ruiz de Austri (IFIC)
Sascha Caron (High-Energy Physics, Radboud University, The Netherlands and National Institute for Subatomic Physics (Nikhef), The Netherlands)
