Bi-Weekly Datalake DepOps meeting (Rohini chairing)

Europe/Paris
Rohini Joshi (SKA Organisation)
Description

Bi-weekly meeting to discuss progress on EDLK JIRA issues: https://jira.skatelescope.org/issues/?filter=15115

Zoom room: https://skatelescope.zoom.us/j/97713259777?pwd=Q2EwSWZ3NkRaazFRSy9YT3Y5UmdJZz09

    • 10:00 - 10:30
      Hot topics
      • Yan Grange: Discussion of timeout issues and their impact on workflows https://jira.skatelescope.org/browse/EDLK-105
        • Issue still present for file sizes >= 60 GB
        • Might be related to the GridFTP protocol; need to test alternative protocols to isolate the issue.
        • Possible solution: track logs live with Xavi; better synchronisation needed. Riccardo to provide relevant contact information to Yan.
      • New AWS RSE
        • In progress; documentation to follow
        • Google Cloud RSE up next
      • Recent Rucio upgrade
        • Upgrade to 1.25 went well
        • The next intermediate release will contain the metadata fix for this issue: https://github.com/rucio/rucio/issues/4360
      • SKA Rucio prototype instance
        • https://gitlab.com/ska-telescope/src/ska-rucio-prototype 
    • 10:30 - 10:40
      Datalake health
      • FAIR-ROOT struggles
        • One of the largest outages in a while occurred last week; it was fixed the same week
        • Issue reappeared last night, possibly due to an xrootd bug. Plan to move to xrootd 5.1.1 going forward
        • xrootd4 support is being dropped in the near future (announced last week)
        • JIRA ticket will be opened if the issue occurs again
      • The issue of a large number of streamed transfers is being monitored here: https://jira.skatelescope.org/browse/EDLK-117
        • A subset of these will be fixed via OIDC support, once it is available
      • OIDC support (one of our oldest tickets) is in progress!
        • https://jira.skatelescope.org/browse/EDLK-5
    • 10:40 - 11:00
      AOB