The DataLake DepOps team is launching! All DepOps team members, please fill in this Doodle poll to find a regular (weekly) 30-minute slot: https://doodle.com/poll/gkkiqc386brf3xfn
Speakers:
Rosie Bolton (Square Kilometre Array Organisation), Xavier Espinal (CERN)
ESCAPE EB last Thursday, 8th of June: summary from the different WP leads. The main topic was INFRAEOSC03 participation.
Milestones and deliverables:
M2.2 (M18): Initial datalake pilot with at least 3 sites
Means of verification: Progress report, Active monitoring of the activity website.
M2.3 (M22): Second WP2 workshop to analyse performance of the datalake pilot
Means of verification: Workshop summary report.
M2.4 (M24): Expanded prototype: RI data access from compute platforms (incl. commercial clouds)
M2.5 (M30): Datalake extension, serve data to external compute providers
M2.6 (M32): ISO16363 certification process underway
D2.2 (M24): Assessment and analysis of performance of the pilot datalake
Lead: SKAO, deliverable type: Report.
D2.3 (M40): Final assessment and analysis of the full prototypes
Lead: CERN, deliverable type: Report.
Deployment and Operations team update (Rosie)
11:10 → 11:30 Task updates: activities and short-term plans (20m)
Speakers:
Andrea Ceccanti (INFN), Paul Millar (DESY), Rosie Bolton (Square Kilometre Array Organisation), Xavier Espinal (CERN), Yan Grange (ASTRON, the Netherlands Institute for Radio Astronomy)
Task-1 Datalake
The ESCAPE datalake pilot is going through its consolidation phase, and datalake activities are generally progressing well.
The storage and orchestration layer is stable, and the datalake is currently populated with real data from several experiments: LOFAR, LSST, ATLAS and CMS (moderate volumes). Implementation of workflows and pipelines to test data access is progressing, and the first ones have started exercising data access.
Monitoring and dashboards are being set up: transfer quality matrix, storage QoS tests, network performance monitoring (perfSONAR), job benchmarking, etc.
Live "view" of datalake activity: transfers in flight, data volume, throughput, etc.
Green/yellow/red grid for easy spotting of data replication issues. This is the main source of information for the operations and deployment team.
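As a rough illustration of how a green/yellow/red grid of this kind could be derived from transfer records, here is a minimal Python sketch. It is not the dashboard's actual implementation: the thresholds (`GREEN_MIN`, `YELLOW_MIN`), the function names and the site names are hypothetical, chosen only for this example.

```python
from collections import defaultdict

# Hypothetical thresholds; the real dashboard values are not stated in these notes.
GREEN_MIN = 0.9   # success rate at or above this is "green"
YELLOW_MIN = 0.5  # success rate at or above this (but below GREEN_MIN) is "yellow"


def classify(success_rate):
    """Map a transfer success rate in [0, 1] to a traffic-light colour."""
    if success_rate >= GREEN_MIN:
        return "green"
    if success_rate >= YELLOW_MIN:
        return "yellow"
    return "red"


def quality_matrix(transfers):
    """Build a {(source, destination): colour} grid.

    transfers: iterable of (source, destination, succeeded) tuples,
    where succeeded is a bool.
    """
    counts = defaultdict(lambda: [0, 0])  # (src, dst) -> [successes, total]
    for src, dst, ok in transfers:
        entry = counts[(src, dst)]
        entry[0] += int(ok)
        entry[1] += 1
    return {link: classify(ok / total) for link, (ok, total) in counts.items()}


# Made-up sample records for three placeholder sites.
sample = [
    ("SITE_A", "SITE_B", True),
    ("SITE_A", "SITE_B", True),
    ("SITE_A", "SITE_C", True),
    ("SITE_A", "SITE_C", False),
    ("SITE_B", "SITE_C", False),
]
matrix = quality_matrix(sample)
for link, colour in sorted(matrix.items()):
    print(link, colour)
```

Each cell aggregates all transfers on one source/destination link, so a single red cell points the operations team at a specific pair of sites.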
Substantial contributions to the RUCIO development team and to the XCache (XRootD) core team at SLAC: QoS, token integration, RUCIO API, XCache authentication, etc.
Consolidating synergies between ESCAPE and WLCG, e.g. the first implementation of storage QoS endpoints in the ESCAPE datalake.
Content delivery and caching are taking off outside the CERN testbed. A coordinated initiative for content delivery and caching was proposed and started two weeks ago: https://indico.in2p3.fr/event/21381/
EC projects
Initiative to understand possible synergies with the EC-funded ARCHIVER project to address the ISO16363 self-assessment (trustworthy digital repositories).
Involvement of the ESCAPE datalake in the MECHANICS project (ICT-40-2020 call)
Software accessibility/distribution is starting to be discussed, as we are trying to understand how to run benchmark workflows for different sciences (in collaboration with OSSR). The main aim from the WP2 perspective is to have "standard candles" to assess datalake performance on a regular basis (HammerCloud machinery).
We are investing some effort in containerizing some of the services (XCache, RUCIO) for easy future deployment on e.g. K8s. Having this centralised would be beneficial (in collaboration with OSSR).
Initiatives to onboard sites in Australia and South Africa have slowed down considerably due to COVID-19. Still, I think this should be revived after the summer.