WP2 fortnightly meeting

Europe/Paris
Xavier Espinal (CERN)

ESCAPE WP2 fortnightly meeting

Participants: Aleem, Andrea, Aris, Bastien, Bruzzese, Daniele Cesini, Daniele Spiga, Diego, Enrico, Ghita, Gonzalo, Ian, Maisam, Marcelo, Neyroud, Paul Millar, Paul Musset, Raymond, Rizart, Rohini, Rosie, James, Ron, Stephane, Xavier, Yan, Zheng

 

Introduction (Xavier)

  • Aris from WP2 gave a talk to the Rucio community workshop in Fermilab.
  • Brussels ESCAPE Workshop
    • WP2 is going well
      • Storage, data movement, AAI are well advanced
      • QoS is defined, and a minimal deployment with some endpoints is close: EC + RAID-ed + tape
      • Data access is now one of the major focuses; we are actively building jobs/workflows as demonstrators of how to get, process and store data on the datalake.
    • ESCAPE technical coordinator was appointed, to start end of May. Will take care of WP cross coordination.
    • Test Science Projects (TSP): emphasis on the variety of projects/experiments participating in the datalake activities (same approach we are taking with LOFAR data).
  • New ESCAPE recruitment arrived at PIC

Task leaders round table

Task leaders went through a round table to report on the status and plans of the different tasks (detailed text in the agenda)

LOFAR data status report has been postponed

AOB

  • Xavier proposed introducing a periodic quick round-table of sites and experiments: we will go quickly through the list of sites and experiments. Please chime in if you have something to report.
    • Sites: CERN, INFN, DESY, GSI, Nikhef, RUG, SURFsara, CC-IN2P3, IFAE-PIC, LAPP, INAF, AARNet
    • Experiments: HL-LHC, FAIR, KM3NeT, SKA, CTA

Action list:

  • Setup automated storage endpoint smoke tests with a couple of QoS endpoints (PaulMillar/Aleem)
  • Setup a Twiki page about the different RIs needs and workloads characteristics regarding compute and data processing (Yan)
  • Present a first version of the ESCAPE Datalake Dashboard at the next WP2 meeting on the 25th of Mar (Rizart)
  • (TBC) Present the LOFAR data status with data flow, QoS workflow and possible data access (Aleem & co.)
 
 
 
 
    • 11:00 AM 11:10 AM
      News 10m
      Speakers: Rosie Bolton (Square Kilometre Array Organisation) , Simone Campana (CERN) , Xavier Espinal (CERN)
      • Aris gave a talk to the Rucio workshop
      • Brussels ESCAPE Workshop
        • WP2 is going well
          • Storage, data movement, AAI well advanced
          • QoS close to having a pilot
        • WP cross coordination : technical coordinator was appointed.
        • Emphasis on variety of project/experiment participating
      • Testing :
        • The most important tests to do are on caching
        • For CTA, collaboration between LAPP, CC-IN2P3 and maybe PIC to do XCache testing
      • New ESCAPE recruitment arrived at PIC

       

    • 11:10 AM 11:40 AM
      Task updates: current activities and short-term plans 30m
      • Datalake (Xavi)
      • QoS and orchestration (Paul)
      • Integration with compute (Yan)
      • Network (Rosie)
      • AAI (Andrea)
      Speakers: Andrea Ceccanti (INFN) , Paul Millar (DESY) , Rosie Bolton (Square Kilometre Array Organisation) , Xavier Espinal (CERN) , Mr Yan Grange (ASTRON, the Netherlands Institute for Radio Astronomy)

      2.1 Datalake (Xavi)

      • LOFAR data injection and distribution
      • HTTP-TPC enabled on EOS instance for the Datalake.
      • Start testing token based access to the Datalake from CERN HTCondor 
      • Activity on caching infrastructures progressing.
      • Prototyping a Datalake dashboard, goal is to give an overall perspective of the datalake in action: in-flight transfers, files, sites, perfsonar, etc. and the transfer matrix for all sites.
      • TPC in EOS
        • Elvin wrote some documentation on XRootD and HTTP TPC
        • For now only in EOSPPS; expected on the production instances before June

      2.2 QoS and orchestration (Paul)

      • QoS mini-workshop at CERN 2020-02-24 / 2020-02-25 with Martin, Mario, Aris, Aleem and Paul.  Outcome was a workshop report, containing ATLAS QoS use-case capture, architecture discussion, and next steps.
      • Very preliminary discussion with SKA at the ESCAPE progress meeting about how to go about capturing the SKA QoS use-case.
      • Aleem working to drive LOFAR workflow.  The goal is to use this to drive QoS activity: staging data to the target location, with desired QoS.

      2.3 Integration with compute (Yan)

       

      • Computing: We have a list of questions, checked with CMS and LOFAR, with input from SURF. The next step is to share it with all the ESFRI partners and ask what their use cases would require.
      • LOFAR data and code: The LOFAR use case has been prepared. It ships with minimal data to make sure the readme and compilation are OK. We are trying to get access to the data lake data, and also want to see if we can put our use case data in there so that we can process it. The data we want to put in is public data.
        • Configuration of LOFAR test:
          • Should work easily.
          • Few parameters to change to adapt to a site
          • Dependency only on Singularity and Go
      • Xavi requested a placeholder in the wiki for the computing needs of the different use cases/experiments

       

      2.4 Network (Rosie)

      Meeting held 09/03/2020

      Still waiting on the PerfSONAR machines at GSI to be set up (ACTION: Rizart to check up on this, Paul to help as needed)

      Rizart has been in touch with Alex Dodson at AARNet: the Perth PerfSONAR box should be ready by now (due last week) but no confirmation has been received. (ACTION: Rizart to chase up on this)

      Rizart explained how to get lightweight CERN accounts so that we can get access to the Grafana instance for the development of monitoring dashboards. This is work in progress, but Rizart will create a "how to" guide so new members can get accounts, contribute to the dashboard development and, ultimately, use the dashboards.

      Dashboards will need to be developed and ready to use by the summer, but we anticipate being early in this.

      Ron, Raymond and Yan had done some data transfer tests with SARAO (Simon Ratcliffe) - some trouble with Ilifu cloud (storage was down) - we agreed an action for SKA to take responsibility for the interaction with South African colleagues to get an RSE in ESCAPE, but to include Yan in email traffic. 

      EOS storage at AARNET (Perth and Melbourne) has successfully been used within the SKA RAL rucio instance to transfer data from Australia to UK. Most work done by Crystal Chua, Rohini supporting with testing. Now that this works it should be much easier to also include storage in the ESCAPE rucio too. Authentication was a major challenge.

      New SKA team member Jimmy Cullen will start at SKA on March 16th. We will try to schedule an intensive few days so Rizart can get Jimmy up to speed. However, travel restrictions may mean that this has to be done via videocon in the first instance, with a larger team meeting happening later once travel restrictions ease.

      Rosie will doodle for a regular meeting slot once Jimmy is onboard.

      2.5 AAI (Andrea)

      I worked closely with Riccardo di Maria, Diego Ciangottini and Daniele Spiga
      on fine-grained authorization in support of data caching and access with
      XRootD, for both GSI/VOMS-based and token-based authz. XRootD VOMS
      support seems fairly limited for group-based authorization. The limitations were
      discussed and reported by Riccardo and Diego to the XRootD developers, who have
      suggested an alternative version of the VOMS plugin which fulfills the
      requirements for VOMS group and role-based authz on a multi-tenant XRootD
      instance.

      On the token-based authz side, we exercised the multi-tenant scope-based authz
      scenario using the WLCG profile against an XRootD instance and found problems
      there as well. Brian Bockelman was given access to the VM to troubleshoot the
      problem in the SciTokens library code (details not disclosed here, ask me in
      chat). Brian confirmed the problem and proposed a patch, which Diego tested
      and confirmed fixes the problem.

      So now we know how to enforce tenant separation on XRootD with both VOMS/GSI
      and token-based authz. This is interesting as it enables support for protecting
      embargoed data for CMS (Diego is already doing tests on real data) and other
      interested ESCAPE experiments.
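
      For readers unfamiliar with scope-based authz, the idea behind tenant
      separation can be sketched as follows. This is an illustration only, not the
      XRootD or SciTokens implementation discussed above: the function names and the
      example paths (/cms, /atlas, /cta) are hypothetical, and the sketch assumes
      WLCG-profile style scopes of the form storage.<operation>:<path>, where a scope
      grants the operation on that path and everything below it. Scopes without a
      path are rejected here for simplicity.

```python
from posixpath import normpath


def scope_allows(scope: str, operation: str, path: str) -> bool:
    """Check one scope string, e.g. 'storage.read:/cms', against a request."""
    name, sep, prefix = scope.partition(":")
    # Assumption of this sketch: a scope with no ':<path>' part grants nothing.
    if name != f"storage.{operation}" or not sep:
        return False
    # Normalise both sides so '/cms/../atlas' cannot escape the '/cms' prefix.
    prefix = normpath(prefix)
    target = normpath(path)
    if prefix == "/":
        return target.startswith("/")
    # Match the prefix itself or anything strictly below it ('/cmsx' must fail).
    return target == prefix or target.startswith(prefix + "/")


def token_allows(scope_claim: str, operation: str, path: str) -> bool:
    """Check a space-separated scope claim, as carried in a token's 'scope' field."""
    return any(scope_allows(s, operation, path) for s in scope_claim.split())


if __name__ == "__main__":
    cms_token = "openid storage.read:/cms"
    print(token_allows(cms_token, "read", "/cms/embargoed/file.root"))
    print(token_allows(cms_token, "read", "/atlas/file.root"))
```

      With such a check, a token issued to one tenant only matches paths under that
      tenant's prefix, which is what protects embargoed data from other tenants on
      a shared instance.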

      As announced at the progress meeting, I will organize a webinar on ESCAPE AAI to
      support integration activities in other technical WPs.

      I will circulate a doodle by the end of the week to propose a date. It would be
      good to record the webinar so it can be used as a reference in the future.

    • 11:40 AM 11:50 AM
      LOFAR data injection, distribution and access: status update 10m
      Speakers: Aristeidis Fkiaras (CERN) , Riccardo Di Maria (CERN) , Ron Trompert (SURFsara) , Yan Grange (ASTRON, the Netherlands Institute for Radio Astronomy)
    • 11:50 AM 12:00 PM
      AOB 10m

      Introducing a periodic sites and experiments quick round-table:
      The aim is to have a quick round-table: we will go through the list of sites and experiments quickly. Please chime in if you have something to report.