IN2P3 Machine Learning workshop

Europe/Paris
Amphi (CC-IN2P3)

21 avenue Pierre de Coubertin CS70202 69627 Villeurbanne cedex
Description

This workshop will cover current developments in machine learning at IN2P3.

Attendance in person at CC-IN2P3 Lyon, from 10:30 to 16:30, is recommended.

A Renavisio video conference has been booked (see the attached details and the Renavisio user guide).

 

Registration is closed

Please make sure you are subscribed to MACHINE-LEARNING-L@in2p3.fr on IN2P3 listserv to keep up to date with ML at IN2P3. 

CC-IN2P3 is just a 15-minute tram ride from the main train station, Lyon Part-Dieu (map).

 

Do not forget to bring an ID to enter CC-IN2P3!

 

Information resources mentioned during the workshop: AstroLearning, School of Statistics, IML resources

    • 1
      Introduction
      Speakers: Balázs Kégl (LAL/CNRS), David Rousseau (LAL-Orsay)
    • 2
      Neural Network Tracking for LHCb Vertex Detector

      The LHCb experiment is scheduled for an upgrade at the end of 2018. In 2021, it will have to collect data at a collision rate of 40 MHz with an average of 5-6 primary vertices (PVs) per event, up from 1 MHz and 1-2 PVs per event today. We present an approach that uses deep learning to reconstruct particle tracks in the vertex subdetector of LHCb, the Vertex Locator (VELO).

      Speaker: Da Yu Tou (Centre National de la Recherche Scientifique (FR))
    • 3
      TrackML tracking challenge for LHC tracking
      Speaker: David Rousseau (LAL-Orsay)
    • 4
      Active learning for "intelligent" simulation
      Speaker: Vladimir Gligorov (LPNHE)
    • 5
      Generative models for ATLAS calorimetry
      Speaker: Aishik Ghosh (LAL)
    • 6
      Documentation and tutorials discussion
    • 12:35
      Lunch at the Domus
    • 7
      CTA Cerenkov telescope reconstruction
      Speaker: Mikaël Jacquemont
    • 8
      Transient photometric classification: an astronomical data challenge

      Among the many challenges posed by the next generation of large-scale astronomical surveys, the classification of transient sources is arguably one of the biggest obstacles to be overcome before we can exploit the full potential of these new instruments. Although most standard astrophysical transient studies rely on high-resolution spectroscopic observations, the new surveys will mostly deliver low-resolution photometric measurements. Machine learning methods are therefore expected to overcome this sample selection bias and provide reliable photometric classifications. To obtain an up-to-date picture of how different methods behave in this scenario, a new simulated data set is being developed, which will allow machine learning methods to be tested in a controlled environment. Moreover, PLAsTiCC (Photometric LSST Astronomical Time-series Classification Challenge) also aims to be fertile ground for the development of new approaches based on LSST requirements. In this talk I will discuss the motivations and goals behind this data challenge and give details on how the broader community can take part.

      Speaker: Emille Ishida (LPC-UBP)
    • 9
      NN for image reconstruction for medical application
      Speaker: Francoise Bouvet-Lefebvre (IMNC)
    • 10
      Log management at CCIN2P3 and scaling up: ML to the rescue?

      Nearly 100 million logs and 1 billion metrics are collected per day in the two CCIN2P3 data centres. They are processed by a platform implementing the so-called "lambda" model: events flow through two pipelines in parallel. The first, low-latency pipeline allows synchronous, near-real-time processing. The second allows asynchronous batch processing of all or part of the past events.
      Both pipelines notify service managers of whether their services are working properly, using predefined algorithms and rules that can be modified at will.
      The problem with this approach is that it does not scale. The volume of logs keeps growing, and so does the human time needed to maintain the set of rules and algorithms that detect problems.
      A different strategy is needed, and ML or even DL techniques look promising as an aid, for example through outlier detection techniques.

      Speaker: Fabien Wernli (Sysadmin)
    • 11
      Automated training of RAMP on CC GPU cluster

      I will present work carried out in collaboration with Bertrand Rigaud (CCIN2P3) on a pipeline that allows ML submissions from RAMP challenges to be trained automatically on the CC infrastructure. This will be used in the future to organise IN2P3-backed challenges (LSST, Euclid) and to cope with short-term peaks in demand for computing power.

      Speaker: Alexandre Boucaud (Paris-Saclay Center for Data Science / LAL)
    • 12
      Experience with GPU platform, CC and elsewhere
    • 13
      Conclusion
      Speakers: Balázs Kégl (LAL/CNRS), David Rousseau (LAL-Orsay)