MINUTES CAF MEETING 16/09/2021
                                    https://indico.in2p3.fr/event/24974


Remote (morning)        Aresh, Catherine, David B., David C., Edith, Eric, Fred,
                                      Laurent, Pierre-Antoine,
Remote (afternoon)      Aresh, David B., Fred, Laurent, Pierre-Antoine
Apologies :  Andrea, Jean-Pierre, Arnaud, Stéphane, Catherine (afternoon), David C. (afternoon)

Morning session:

1) Intro (Fred)
   - ATLAS resource usage since previous CAF (3 months) similar to usual :
     500k running slots, dominated by MC (~79%), on grid (above pledge).
   - High disk usage as usual, 270 TB pledged
   - Network (Tape) challenge : to be done 1st (2nd) week of October
         -> network challenge only for Nucleus ?
   - AMI should switch to CERN installation asap
   - C-RSG requests for 2023 : cpu +21%, disk +18%, tape +34%
           -> as usual, pressure on storage is higher than on cpu
   - LHCC review in November -> see links to different presentations
   - CAF-user workshop on 9th Dec.
          For afternoon session :
                - ~1h of discussion with S&C management
                - ~30 min on software by Pierre-Antoine (exact subject to be refined,
                  but covering e.g. Component Accumulator)
                - ~30 min by Aresh on Data Carousel + Tape Challenge results
            from afternoon discussion (Eric) : would be nice to have more feedback
                  from users on whether resource performance fits their requirements
                  (not only on whether resources are available ...)
                     -> we could add specific questions on template for group presentation
                     -> also including use of "new" tools such as jupyter notebooks etc ...

   - pledges for 2022 : to be filled in CRIC by end of September
          -> see slides for preliminary pledges for grid
          -> for CC-IN2P3
                  requests for sps, LGD, LGT, GPU remain at same level as in 2021

2) FR-T2-cloud (Fred)
  - regular/monthly reports available on
    https://cernbox.cern.ch/index.php/s/vrq0bs2qJGY72NV
  - FR-cloud = 15% of T2s on this period
   - normal/usual profile of jobs received by activities
   - by country in FR-cloud : France=51%, Japan=38%, Romania=7%,
                                             China=2%, Hong-Kong=1%
        - CPU vs pledge for different sites
            CPPM too low (-> check with ADC resp.
            to change some PandaQueue parameters in CRIC)
        - Storage vs pledge : ok
        - some sites have a large amount of dark data on their SCRATCHDISK
              -> procedure to be checked
       - ggus tickets: normal traffic, mostly for transfer/deletion errors

3) Reports
 3a) LCG-FR (Laurent):
      - Network / Tape challenges : see Introduction
      - FR-grid certificate: will no longer be handled by Renater but by the same provider
        as for the CNRS one -> no news


  3b) DOMA-FR
            -> see introduction slides for link to DOMA documentation in preparation
                 for LHCC review in November

 4) Tour des labos
   CC : hardware/pledges have been bought early
           as a consequence, delay for the deployment of the new tape library for next year


   CPPM : hardware bought in July, mostly to renew, no/small increase

   IJCLab: hardware to be bought on MatInfo5
                         ANR post-doc starting on ACTS with David Rousseau (same ANR as LAPP)

   IRFU: need replacement for CAF for Jean-Pierre

   LPNHE: hardware bought on MatInfo4 renewal + ~10% increase
                 also new machines for cloud computing
   LAPP: (usual) network instabilities                       
              ANR post-doc starting on ACTS with Jessica Leveque (same ANR as IJCLab)
              Arturo Sanchez Pineda, ATLAS member but paid by ESCAPE, is working on
              setting up an OpenData analysis chain
   LPC:  nothing to report
   LPSC : reduction of pledges for future stop of T2 in 2023
   L2IT : fine with usage of CC resources,
             plenary talk at vCHEP
             ongoing collab with Exa.TrkX US project on tracking GNN,
             now using detailed simulation

Afternoon:
  1) T1 cloud report (Aresh)
      - good availability/reliability
      - FR T1 represents 12% of all T1
      - cpu usage above pledge, with few (3) drops, understood, in last three months
      - Frontier R&D
            - Performance degradation over July due to service overload
            - for next week : tests of direct connection between CC Frontier and CERN DB
     - Preparation of Tape Challenge -> several links given in slides


   2) CC (Fred):
       - sps has 360 TB, >100 TB free
       - LOCALGROUPDISK, TAPE
            ==> few users have a lot of data
            ==> excel file provided with list of main users / per lab
            ==> request for 2022 remains the same as for 2021
       - local batch : runs smoothly, up to 3500 jobs pending, then run
                ~3% of ATLAS cpu is done by local users -> mostly by few users
            ==> request for 2022 remains the same as for 2021
       - gpu usage : ~60% of pledge used
            ==> request for 2022 remains the same as for 2021
       - hpc farm
            ==> no formal request : if users come, a ticket will be opened to access the farm

 3) Software (Arnaud/Fred)
      -> few slides shown on S&C plenary
      -> one slide from Arnaud about recent news
      -> see presentations/developments by Arnaud on b-tag

  4) ADAM

  3) HPC (Fred/Laurent)
     - pie-chart with HPC contribution by country -> no France !
                               HPC represents 34% of ATLAS cpu
     - access to Exascale machine : ongoing effort/collab between
       CC and IDRIS (project FITS)
 

5) AI / Machine Learning (all)
         - a few links to workshops