Pilot data lake assessment: datalake automated tests and monitoring (5m)
(CERN), Rosie Bolton (Square Kilometre Array Organisation)
Dashboard development is ongoing.
Gfal tests are running and are successful for all sites at the moment (https://monit-grafana.cern.ch/d/TMScKNjWk/gfal-testing?orgId=51).
FTS tests are running and are ~90% successful at the moment (https://monit-grafana.cern.ch/d/000000420/fts-transfers?orgId=51). Details of the failures are being followed up in the DepOps meeting as well as in the JIRA tickets.
Rucio tests are running.
Rucio hermes2 deployment is ongoing. hermes1 is not running at the moment, so no Rucio events are visible on the dashboard; this will be restored by the end of the week.
Development of the testing code continues. Token-based authZ integration is yet to be implemented, among other things (proper error handling and metadata support for FTS transfers, etc.).
We need to start coordinating how to perform the periodic Datalake debugging process, that is: inspect the monitoring, identify the current issues at sites/endpoints, and take action to solve them.
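As a starting point for that discussion, the inspect/identify/act loop could be sketched roughly as below. This is only an illustration: the record layout, the endpoint names, and the 90% threshold are assumptions made for the example, not the format of the actual monitoring data or an agreed policy.

```python
from collections import defaultdict

# Hypothetical test records: (endpoint, test_type, succeeded).
# Endpoint names and outcomes are placeholders, not real monitoring data.
RESULTS = [
    ("site-a", "gfal", True),
    ("site-a", "fts", True),
    ("site-b", "gfal", True),
    ("site-b", "fts", False),
    ("site-b", "fts", True),
    ("site-c", "rucio", False),
]


def success_rates(results):
    """Return {endpoint: fraction of successful tests}."""
    totals = defaultdict(int)
    passed = defaultdict(int)
    for endpoint, _test_type, ok in results:
        totals[endpoint] += 1
        if ok:
            passed[endpoint] += 1
    return {ep: passed[ep] / totals[ep] for ep in totals}


def endpoints_needing_action(results, threshold=0.9):
    """List (endpoint, rate) pairs below the assumed threshold, worst first."""
    rates = success_rates(results)
    flagged = [(ep, rate) for ep, rate in rates.items() if rate < threshold]
    return sorted(flagged, key=lambda item: item[1])


if __name__ == "__main__":
    for endpoint, rate in endpoints_needing_action(RESULTS):
        print(f"{endpoint}: {rate:.0%} success, needs follow-up")
```

Sorting the flagged endpoints worst first gives a simple ordering for the follow-up in the meeting or in tickets; the threshold would need to be agreed per test type.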
AOB/Shadow round table (20m)
Please chime in if you have something to report:
- Sites: CERN, INFN, DESY, GSI, Nikhef, RUG, SURFsara, CC-IN2P3, IFAE-PIC, LAPP, INAF, AARNet
- Experiments: HL-LHC, FAIR, KM3NeT, SKA, CTA