Description
The Vera C. Rubin Observatory’s Legacy Survey of Space and Time (LSST) will detect millions of transient candidates through difference image analysis (DIA), issuing real-time alerts to the community. While DIA is highly sensitive, it is prone to spurious detections caused by noise, artifacts, or imperfect subtractions. Filtering these out, known as the "Real/Bogus" problem, typically relies on supervised machine learning trained on large, manually labeled datasets, which are costly to build and hard to scale.
We present a novel method for training classifiers without human labels, using synthetic source injection on real survey images to generate reliable training sets. Crucially, we address contamination from real, undetected astrophysical events in the negative class, enabling robust learning despite the absence of direct ground truth.
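To make the idea of synthetic source injection concrete, the following is a minimal sketch, not the method presented in the talk. It assumes square floating-point image cutouts, a circular Gaussian point-spread function, and uniformly drawn fluxes. The function names (`gaussian_psf`, `inject_source`, `build_training_set`) and all parameters are hypothetical. The abstract does not say how contamination of the negative class is handled, so that step is deliberately left out.

```python
import numpy as np


def gaussian_psf(size, fwhm):
    """Return a unit-flux 2D Gaussian stamp of shape (size, size).

    Assumption: the PSF is a circular Gaussian; a real survey PSF is more complex.
    """
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    half = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size]
    stamp = np.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
    return stamp / stamp.sum()


def inject_source(cutout, flux, fwhm, rng):
    """Add a synthetic point source, with Poisson noise, at the cutout centre."""
    expected = flux * gaussian_psf(cutout.shape[0], fwhm)
    return cutout + rng.poisson(expected).astype(cutout.dtype)


def build_training_set(cutouts, flux_range, fwhm, seed=0):
    """Build (X, y) from real cutouts without any human labels.

    Label 1: cutouts with an injected synthetic source.
    Label 0: the untouched cutouts. These may still contain real, undetected
    transients, so the negative class is contaminated; correcting for that
    is not shown here.
    """
    rng = np.random.default_rng(seed)
    fluxes = rng.uniform(*flux_range, size=len(cutouts))
    positives = np.stack(
        [inject_source(c, f, fwhm, rng) for c, f in zip(cutouts, fluxes)]
    )
    X = np.concatenate([positives, cutouts])
    y = np.concatenate(
        [np.ones(len(cutouts), dtype=int), np.zeros(len(cutouts), dtype=int)]
    )
    return X, y
```

With an array of cutouts of shape `(n, s, s)`, calling `build_training_set(cutouts, (500.0, 2000.0), fwhm=3.0)` returns `2n` examples, half of them injected. In practice, injection would be applied to the science image before subtraction rather than to a finished cutout, so that the synthetic sources pass through the same DIA pipeline as real ones.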
Our approach eliminates the need for extensive human annotation or unrealistic simulations, and additionally enables the discovery of missed transients in archival data. This opens the door to scalable, label-efficient classification in LSST and other large time-domain surveys.