Bi-Weekly Datalake DepOps meeting (Rizart Chairing)

Europe/Paris
Rizart Dona (CERN)
Description

Weekly meeting to discuss progress on EDLK JIRA issues: https://jira.skatelescope.org/issues/?filter=15115

Zoom room: https://skatelescope.zoom.us/j/97713259777?pwd=Q2EwSWZ3NkRaazFRSy9YT3Y5UmdJZz09

    • 11:00 11:30
      Hot topics

      Deletion Issues:

      ## RUCIO Scope deletion campaign

      * In progress

      ## RUCIO RSE deletion campaign

      * ALPAMED-DPM still not completely drained

           - Frederic: Tried to contact the person responsible, but the issue is not solved yet

           - Riccardo: 27 RSEs currently (26 "actual" ones, plus JUPYTERLAB-SCRATCH)

      ## RUCIO Reaper issues

      * 4 reaper instances with 6 threads each, so 24 RSEs should be handled in parallel

      * Reaper chunk size (how many deletions are done in bulk on one RSE): changed from 100 to 1000; there might be a deployment issue with the Helm charts

      * Greedy mode will be ON during DAC21, which means all data not protected by a rule will be deleted ASAP
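
      The reaper numbers above can be checked with a short back-of-the-envelope sketch. It assumes one reaper thread works on one RSE at a time (consistent with the "24 RSEs in parallel" estimate) and uses a purely hypothetical backlog of 50,000 pending deletions on one RSE to show the effect of the chunk size change. The function names are illustrative and are not Rucio APIs.

```python
import math

REAPER_INSTANCES = 4       # from the minutes
THREADS_PER_INSTANCE = 6   # from the minutes
RSE_COUNT = 27             # Riccardo's count, including JUPYTERLAB-SCRATCH
HYPOTHETICAL_BACKLOG = 50_000  # assumption: pending deletions on one RSE


def parallel_slots(instances: int, threads: int) -> int:
    """Upper bound on RSEs served at once, assuming one RSE per thread."""
    return instances * threads


def bulk_batches(pending_deletions: int, chunk_size: int) -> int:
    """Number of bulk deletion batches needed to clear one RSE's backlog."""
    return math.ceil(pending_deletions / chunk_size)


slots = parallel_slots(REAPER_INSTANCES, THREADS_PER_INSTANCE)
print(f"parallel slots: {slots}")
print(f"RSEs waiting for a free thread: {max(RSE_COUNT - slots, 0)}")
print(f"batches at chunk size 100:  {bulk_batches(HYPOTHETICAL_BACKLOG, 100)}")
print(f"batches at chunk size 1000: {bulk_batches(HYPOTHETICAL_BACKLOG, 1000)}")
```

      With 27 RSEs and 24 slots, three RSEs would wait for a free thread at any given moment; the larger chunk size cuts the number of batches per RSE by a factor of ten.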

      Other:

      ## MaxSpace configuration in CRIC (only relevant when the non-greedy policy applies)

      Here's the list of RSEs that need to be checked:

      ALPAMED-DPM - DONE (today)
      AWS_WEBDAV - TODO
      CNAF-STORM - DONE (+ TAPE)
      INFN-ROMA1 - TODO (?)
      PIC-DCACHE - DONE (today)
      PIC-DCACHE-TAPE - DONE (today)
      PIC-INJECT - DONE (today)
      SARA-DCACHE - TODO

      *ACTION on Rizart*: Create tickets for the TODO items
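
      As a small aid for the action above, the status list can be filtered for open items. This is only a sketch: the ticket summary wording is an assumption, and no JIRA or CRIC API is called.

```python
import re

# Status list copied from the minutes.
STATUS_LIST = """\
ALPAMED-DPM - DONE(today)
AWS_WEBDAV - TODO
CNAF-STORM - DONE (+ TAPE)
INFN-ROMA1 - TODO(?)
PIC-DCACHE - DONE(today)
PIC-DCACHE-TAPE - DONE(today)
PIC-INJECT - DONE(today)
SARA-DCACHE - TODO
"""

# "<RSE> - <TODO|DONE><optional note>"
LINE_RE = re.compile(r"^(?P<rse>\S+)\s+-\s+(?P<status>TODO|DONE)\s*(?P<note>.*)$")


def pending_rses(text: str) -> list[tuple[str, str]]:
    """Return (rse, note) pairs for every line still marked TODO."""
    pending = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("status") == "TODO":
            pending.append((match.group("rse"), match.group("note")))
    return pending


def ticket_summary(rse: str) -> str:
    """Hypothetical ticket title; adjust to the EDLK project's conventions."""
    return f"Check MaxSpace configuration in CRIC for {rse}"


for rse, note in pending_rses(STATUS_LIST):
    print(ticket_summary(rse), note)
```

      The "(?)" note on INFN-ROMA1 is kept next to the summary so the uncertainty recorded in the minutes is not lost when the ticket is written.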

      ## OIDC support in ESCAPE Rucio:

      Missing templates issue fixed; now looking to consolidate the configuration. We need to clarify whether any experiment will try the token flow during DAC21.

          - Paul: Maybe somebody mentioned it, but not sure

          - Will iterate again tomorrow during the WP2 meeting

      ## AWS_WEBDAV -> no longer on AWS; hosted at Openstack@CERN, speaks S3

      ## CTA -> 2 use cases

      * Long haul -> PIC (by Agustin)

      * Reprocessing, which is done once a year but will not be done during DAC21

    • 11:30 11:40
      Datalake health

      Dashboards:

      RSEs:

      ### LAPP-WebDAV

      AuthN/Z status:
      - X.509/VOMS RED (not going to be solved any time soon, maybe never)

      ### INFN-ROMA1

      * A ticket/email to Alessandro is needed

      * Everything fails

      ### FAIR-ROOT

      * Some issues, Paul-Niklas Kramp & team are looking into it

      ### ALPAMED-DPM

      * Some issues, not sure when they will be fixed; Frederique Chollet is working on it

      * Info: It is a federation of storages

      * Paul M: Maybe try adding storages one at a time to figure out where the federation fails
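
      Paul's suggestion amounts to an incremental check, sketched below. The storage names and the `federation_ok` callable are hypothetical placeholders; in practice the check would be whatever test currently fails against ALPAMED-DPM.

```python
from typing import Callable, Optional, Sequence


def first_failing_member(
    members: Sequence[str],
    federation_ok: Callable[[Sequence[str]], bool],
) -> Optional[str]:
    """Enable storages one at a time and return the first one whose
    addition makes the federation check fail, or None if all pass."""
    active: list[str] = []
    for member in members:
        active.append(member)
        if not federation_ok(active):
            return member
    return None


# Stand-in check for illustration only: pretend "storage-c" breaks the federation.
def fake_check(active: Sequence[str]) -> bool:
    return "storage-c" not in active


storages = ["storage-a", "storage-b", "storage-c", "storage-d"]
print(first_failing_member(storages, fake_check))
```

      This only identifies the first storage whose addition causes a failure; if several members are broken, the procedure would have to be repeated with the faulty one left out.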

      AuthN/Z Tests:

      * Missing RSEs are now included

      * TODO: Create tickets for red ones

      Red:

      * EULAKE-* (TODO-Rizart)

      * ALPAMED-DPM (TODO?-Chollet)

         - Andrea: DPMs will probably never be able to support OIDC functionality (DPM expertise is needed to clarify)

         - Frederic: I think the latest DPM supports OIDC

         - Andrea: There are two levels of support: OIDC in general, and OIDC based on the WLCG Profile

      * DESY-DCACHE (TODO-Paul Millar)

          - Paul Millar: Config is in place, but something in it could be wrong

      * INFN-NA-DPM* (TODO?)

      * INFN-ROMA1 (TODO-Alessandro)

      Feedback from Alessandro:
      Indeed, we have a whole SAN down (network down), we're working on it since this morning when it dropped the communication.

      More news later... thanks,

      * SARA-DCACHE (TODO-Alexander)

      Other:

      * Action (Aleem/Rizart): understand the SARA-SWIFT RUCIO issues:

          - Upload works with rclone but not with rucio-upload

          - Gfal metalink issue: the server does not support metalink functionality

           - Aleem: Will test the new AWS_WEBDAV config; if that works, we can replicate it for SARA-SWIFT

           - Aleem: We should contact the gfal developers/support and ask

    • 11:40 12:00
      AOB