WP2 fortnightly meeting

Europe/Paris
Description
Join Zoom Meeting
https://cern.zoom.us/j/92926380866?pwd=YkNNb0lPM0RCbjJMajh0SmJBUUV2QT09
Meeting ID: 929 2638 0866
Passcode: 373008
 
    • 11:00 AM 11:10 AM
      News 10m
      Speakers: Rosie Bolton (Square Kilometre Array Organisation), Xavier Espinal (CERN)
      • The mid-term review will take place in the second half of November 2020.
        • The EC reviewer has been appointed.
        • Preparations: ~30 min per WP plus ~20 min discussion; draft to be circulated to project management by Friday 23 October.
        • E-EB meeting in the first week of November (2-6) to discuss and review the slides of each WP.
      • Milestones and deliverables:
        • M2.3 (M22): Second WP2/DIOS Workshop (virtual event) 
          • Proposal: 9th of December
        • M2.4 (M24): Expanded prototype. Verify experiment data access from compute platforms (including commercial clouds)
          • Heads-up: review the storage resources deployed. The aim is to evolve the data lake infrastructure towards a meaningful size, so that the data ingestion and throughput challenges can be scaled up.
      • FDR exercise dates proposal
        • 1st Dress Rehearsal:  approx. one week before the workshop (24th Nov)
        • 2nd Dress Rehearsal: approx. two weeks after the workshop (15th Dec)
      • Misc:
    • 11:10 AM 11:15 AM
      Pilot data lake assessment: EOS EULAKE update 5m
      Speaker: Xavier Espinal (CERN)
      • The EOS instance at CERN (aka EULAKE) is functional; transfer tests are working reasonably well overall.
      • Several issues were fixed:
        • Some of them came from legacy usage (first prototype for early data lake tests):
          • Directory permissions
          • Legacy disk layouts, legacy configurations coming from Wigner times.
          • Remote FSTs running old FST daemons (or left unattended) have been set to read-only; let me know if you want them back (IIRC this is tracked by the DepOps team).
        • Some syntax problems in the CRIC configuration were spotted and have been fixed.
      • Two more disk servers have been added to allow setting up different QoS and in preparation for the prototype phase: 163 FS and 460 TB in total.
      • xrdcp, GFAL, FTS and Rucio are OK. WebDAV, gsiftp and root protocols are enabled.
        • A few issues remain (being followed up by the DepOps team):
          • xroot TPC when EOS is the source and dCache is the destination
            • When the TPC sockets are opened, the EOS FSTs only accept UNIX connections; if the dCache node is configured for GSI-only outgoing connections, the two sides cannot agree on an authentication method.
              • The fix is in the dCache mover configuration: pool.mover.xrootd.tpc-authn-plugins=gsi,unix
          • GSI TPC: the FQDN of the client as seen at the destination differs from the FQDN that the client sees when connecting to the source (TPC origin mismatch).
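To illustrate why adding unix to the dCache setting above helps, here is a minimal Python sketch. It assumes a deliberately simplified model in which a TPC connection can be authenticated only if the connecting side offers at least one protocol that the accepting side also accepts. The function name common_authn and the lists are hypothetical, and this is not how EOS or dCache actually negotiate authentication.

```python
# Simplified model (an assumption for illustration only): authentication can
# succeed only if the connecting side offers a protocol the other side accepts.

def common_authn(offered, accepted):
    """Return the protocols in `offered` (order kept) that are also in `accepted`."""
    accepted_set = set(accepted)
    return [proto for proto in offered if proto in accepted_set]


eos_accepts = ["unix"]          # per the notes: EOS FSTs only accept UNIX
dcache_before = ["gsi"]         # dCache mover offering GSI only
dcache_after = ["gsi", "unix"]  # pool.mover.xrootd.tpc-authn-plugins=gsi,unix

print(common_authn(dcache_before, eos_accepts))  # nothing in common
print(common_authn(dcache_after, eos_accepts))   # unix is shared
```

Under this model the first call finds no shared protocol, while the second finds unix, which matches the behaviour described in the notes.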
       
    • 11:15 AM 11:20 AM
      Pilot data lake assessment: Experiment data injection update 5m
      Speakers: Andrea Ceccanti (INFN), Riccardo Di Maria (CERN), Yan Grange (ASTRON, the Netherlands Institute for Radio Astronomy)
    • 11:20 AM 11:25 AM
      Pilot data lake assessment: QoS/data lifecycles update 5m
      Speakers: Mr Muhammad Aleem Sarwar (ESCAPE Project), Paul Millar (DESY)
    • 11:25 AM 11:30 AM
      Pilot data lake assessment: datalake automated tests and monitoring 5m
      Speakers: Rizart Dona (CERN), Rosie Bolton (Square Kilometre Array Organisation)
      • Dashboard development is ongoing.
      • GFAL tests are running and are currently successful for all sites (https://monit-grafana.cern.ch/d/TMScKNjWk/gfal-testing?orgId=51).
      • FTS tests are running and are currently ~90% successful (https://monit-grafana.cern.ch/d/000000420/fts-transfers?orgId=51); details of the failures are being followed up in the DepOps meeting as well as in the JIRA tickets.
      • Rucio tests are running.
      • The Rucio hermes2 deployment effort is ongoing. hermes1 is not running at the moment, so no Rucio events are visible on the dashboard; this will be restored by the end of the week.
      • Development of the testing code continues; token-based authZ integration is yet to be implemented, among other things (proper error handling/metadata support for FTS transfers, etc.).
      • We need to start coordinating how to perform the periodic data lake debugging process, that is: inspect the monitoring, identify the current issues at sites/endpoints, and take action to solve them.
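As a starting point for discussing the periodic debugging process, the "identify the current issues" step could be sketched roughly as below. This is only an illustration: the record format (endpoint name, success flag), the function name failing_endpoints, the 0.9 threshold and the sample data are all assumptions, not the format or logic of the actual monitoring.

```python
from collections import defaultdict


def failing_endpoints(transfers, threshold=0.9):
    """Return {endpoint: success_rate} for endpoints whose rate is below `threshold`.

    `transfers` is an iterable of (endpoint, succeeded) pairs (hypothetical format).
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for endpoint, succeeded in transfers:
        totals[endpoint] += 1
        if succeeded:
            successes[endpoint] += 1
    rates = {ep: successes[ep] / totals[ep] for ep in totals}
    return {ep: rate for ep, rate in rates.items() if rate < threshold}


# Made-up sample records, for illustration only.
sample = [
    ("SITE-A", True), ("SITE-A", True),
    ("SITE-B", True), ("SITE-B", False),
]
print(failing_endpoints(sample))
```

The endpoints returned would then be the ones to inspect and follow up with the site contacts.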
    • 11:35 AM 11:55 AM
      AOB/Shadow round table 20m

      Please chime in in case you have something to report:
      - Sites: CERN, INFN, DESY, GSI, Nikhef, RUG, SURFsara, CC-IN2P3, IFAE-PIC, LAPP, INAF, AARNet
      - Experiments: HL-LHC, FAIR, KM3NeT, SKA, CTA