Activities
T2.1 Plans for DAC21
----------------------------------------------------------------------------------------------------------------------------------------------
Proposal to re-gather in the Data Injector Demonstrators forum with focus on:
Experiments' needs and interests (always open to proposals from sciences/partners)
Data registration and/or buffering injection
Data life-cycle aka QoS (cross-task activity with T2.2)
Deterministic vs. non-deterministic RSEs
Data/dataset size and volume (effect on namespace, tape-disk moving behaviour, etc.)
Exploring “not-Rucio-aware site” use case
Data registration and/or buffering injection from non-Rucio sites (e.g. FTS)
Data handling and management for telescope and astronomy facilities, also addressing scenarios from photon and neutron sciences
HTTP-based storage solution → easy for communities to use and adopt
Data Preparation
Calibration, reprocessing, formatting, etc.
Usually drives a significant part of the computing model and data management
(e.g. Monte Carlo campaigns, RAW → Analysis Object Data → user-friendly format, etc.)
Experiment perspective drives the activity
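To make the "Deterministic vs. non-deterministic RSEs" item above concrete, here is a minimal sketch. It assumes a hash-based layout modelled on the convention commonly used for deterministic RSEs; the actual algorithm is configurable per deployment, so the layout shown is an assumption, not the ESCAPE configuration. `deterministic_path` and `NonDeterministicCatalogue` are hypothetical names used for illustration only and are not part of the Rucio API.

```python
import hashlib


def deterministic_path(scope: str, name: str) -> str:
    """Derive the storage path from scope and name alone.

    Assumed layout: <scope>/<h[0:2]>/<h[2:4]>/<name>, where h is the
    MD5 hex digest of "scope:name". No catalogue lookup is needed.
    """
    digest = hashlib.md5(f"{scope}:{name}".encode("utf-8")).hexdigest()
    return f"{scope}/{digest[0:2]}/{digest[2:4]}/{name}"


class NonDeterministicCatalogue:
    """Hypothetical stand-in for a catalogue that stores explicit paths.

    On a non-deterministic RSE the path cannot be computed; it must be
    supplied when the file is registered and looked up afterwards.
    """

    def __init__(self) -> None:
        self._paths: dict[tuple[str, str], str] = {}

    def register(self, scope: str, name: str, path: str) -> None:
        # Record the path chosen by the site (e.g. an existing archive layout).
        self._paths[(scope, name)] = path

    def lookup(self, scope: str, name: str) -> str:
        return self._paths[(scope, name)]


if __name__ == "__main__":
    # Deterministic: the same inputs always yield the same path.
    print(deterministic_path("test_scope", "file.raw"))

    # Non-deterministic: the path is whatever was registered.
    catalogue = NonDeterministicCatalogue()
    catalogue.register("test_scope", "file.raw", "/data/telescope/run1/file.raw")
    print(catalogue.lookup("test_scope", "file.raw"))
```

The second pattern is the one most relevant to the "not-Rucio-aware site" use case: data that already sits in a site-defined directory layout can be registered in place, at the cost of the catalogue having to store a path for every file.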
Data Injector+Access Demonstrators
Proposal for a cross-task and cross-WP forum → Data Injector+Access Demonstrators
Expanding the existing fortnightly forum (every other Wednesday at 11:00 CET)
Joint effort with T2.3 and WP5
Data Access and User Analysis Platform (e.g. JupyterLab-Rucio integration)
Caching layer and integration
… (T2.3 leading)
Leveraging effort in EU-funded projects → ESCAPE+CS3MESH4EOSC
LOFAR and CERN already have activities and effort in both projects
Rucio + CERN + CS3MESH4EOSC = GSoC2021
Data Lake Activities
The outcome of the FDR shows room for improvement ahead of DAC21
Several partners started to deploy parallel Rucio instances
Perfect opportunity to jump on challenging activities and improvements
Motto should be: Try, Test, Assess, and Report
Experiment interests/needs as keystones
Explore and discover new orchestration tools and phase-spaces
“metadata”, “multi-VO”, “Rucio Auth schema/policies for ESCAPE”, “automatix”, “bb8”, etc.
Activities are already ongoing
----------------------------------------------------------------------------------------------------------------------------------------------
Task 2.3
----------------------------------------------------------------------------------------------------------------------------------------------
The most important task from the T2.3 point of view is the actual integration. It overlaps with many of the activities in the other tasks.
Define end-to-end use cases, including those covered by the “not-Rucio-aware site” use case mentioned in Task 2.1
Use those to define the ways in which data is to be accessed by software running on central processing facilities.
AAI implementation is relevant here
Access to Rucio from within JupyterLab containers
External data access (e.g. VO → WP4, or external archives)
----------------------------------------------------------------------------------------------------------------------------------------------