Batch Processing in ESAP

Europe/Paris
https://meet.google.com/ehs-bvfh-sxm

Description

Discussion about batch processing in ESAP.

Expected user experience:
Authentication and authorization
Finding/identifying data
Selecting data
Finding/identifying software
Selecting software
Finding/identifying compute resources
Selecting compute resources
-- All points are similar to the interactive processing. Understand differences and specific needs.
Asynchronous job run mechanisms
Monitoring solutions

Discussion:

Sara: gives a brief description of the discussion points on the agenda.
Displays the Indico page of "WP5 ESAP Focus Months"
https://indico.in2p3.fr/event/21650/
and questions the concrete distinction between "Interactive analysis" and "batch processing".
Stelios's analysis at the last meeting included batch processing.
It is not yet clearly defined what is meant by software in ESAP.
Introduces Rosetta as a possible tool to run software and exploit computing resources. Data centers could install Rosetta to provide access to their computing resources (Rosetta used as a sort of plug-in).

Gareth: asks for a brief summary of what Rosetta is and does.

Stefano: provides a brief Rosetta demo:
- Idea behind Rosetta: an aggregator of computing resources
- in Rosetta's definition, each piece of software is a software container
- it has a container entry point
- command-line access is also supported
- shows a software container run and web access
- a file manager is supported; tests can be done with datasets that are not massive

Sara: Rosetta and DIRAC are not mutually exclusive. The computing resources can be described as "supporting DIRAC access" or "supporting Rosetta access" and listed in ESAP.
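
As a minimal sketch of such a listing, each compute resource could carry the set of access methods it supports, so that ESAP can filter on them. The class, field and site names below are hypothetical and are not taken from ESAP, Rosetta or DIRAC.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ComputeResource:
    """Hypothetical catalogue entry for a compute resource listed in ESAP."""
    name: str
    access_methods: FrozenSet[str]  # e.g. {"DIRAC"}, {"Rosetta"}, or both


def supporting(resources: List[ComputeResource], method: str) -> List[str]:
    """Return the names of the resources that support the given access method."""
    return [r.name for r in resources if method in r.access_methods]


# Illustrative entries only; a resource may support both access methods.
resources = [
    ComputeResource("site-a", frozenset({"DIRAC"})),
    ComputeResource("site-b", frozenset({"Rosetta"})),
    ComputeResource("site-c", frozenset({"DIRAC", "Rosetta"})),
]
```

Calling `supporting(resources, "DIRAC")` would then list the sites a DIRAC-based job could target, without excluding the same sites from a Rosetta-based selection.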

Matthias: what is the interface between these different tools: Rosetta vs ESAP, and Rosetta vs DIRAC?

Stefano: they should be at the same level, so that someone can choose to add support for one or the other. How to integrate them in ESAP is an open point. At the moment computing resources are listed and accessed with a redirect URL. Only one branch has an integration of DIRAC resources. An explanation of how Gareth did that integration would be really useful: not how to run jobs, but how to select DIRAC resources in ESAP, because the same could be done for Rosetta.

Gareth: DIRAC has a server managing the jobs on the data center, and the user has a DIRAC client. A container packaging both the DIRAC client and the ESAP back-end has been run. Following the example done for Rucio, this has basically created an archive on the ESAP page.

John: the work done in Rosetta must not be duplicated in ESAP

Matthias: ESAP is proposed as a toolkit. Which Rosetta functionalities have to be included, from the point of view of CTAO, to build the platform? Rosetta has to be available as a toolkit through ESAP. There is the idea/need to be able to build a science platform for CTAO through the ESAP toolkit during 2022.

Stefano: in Rosetta 20% of the work is front-end and 80% is back-end, and the back-end can easily be used as another service in ESAP. There are 3 ways to integrate Rosetta in ESAP:
1- a redirect link, as is done for Binder
2- Stefano writes a few wrappers for API calls to Rosetta in the ESAP back-end
3- the same way used now for DIRAC: Stefano writes a thin Rosetta client
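
The difference between a redirect link and a back-end wrapper can be sketched as follows. This is only an illustration under stated assumptions: the endpoint path, payload fields, function names and the injected transport are placeholders, not Rosetta's or ESAP's real API.

```python
from typing import Any, Callable, Dict

# Hypothetical transport signature: (method, url, payload) -> response dict.
Transport = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def redirect_link(rosetta_url: str) -> str:
    """Redirect option: ESAP only stores the URL and sends the user there."""
    return rosetta_url


class RosettaWrapper:
    """Wrapper option: the ESAP back-end forwards calls to a Rosetta API.

    The path and payload below are placeholders (assumptions), chosen only
    to show where the wrapper would sit.
    """

    def __init__(self, base_url: str, send: Transport) -> None:
        self.base_url = base_url.rstrip("/")
        self.send = send

    def start_container(self, container: str, resource: str) -> Dict[str, Any]:
        payload = {"container": container, "resource": resource}
        return self.send("POST", f"{self.base_url}/hypothetical/tasks", payload)


def fake_send(method: str, url: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in transport so the sketch runs without any network access."""
    return {"method": method, "url": url, "echo": payload, "state": "queued"}
```

Injecting the transport keeps the wrapper thin and testable; a real integration would replace `fake_send` with an authenticated HTTP call.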
  
Gareth: displays a video demo: https://www.youtube.com/watch?v=WyOs2mwPh2w

Stefano: the DIRAC integration now is bound to a specific set of computing resources. It is not a way to choose data and then software and then run. We are not producing the expected behavior.

Gareth: there are use cases where data is not needed (simulations).
The integration with the shopping basket is a huge thing. The authentication issue should be solved in order to have fine-grained authorization, giving users access to different kinds of software based on group membership.
Three or four concrete use cases could be useful.
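
Group-based access to software could look roughly like the sketch below. The catalogue contents and group names are invented for illustration, and how group membership would actually be obtained from the authentication layer is left open.

```python
from typing import Dict, List, Set

# Hypothetical catalogue: software name -> groups allowed to run it.
# An empty set is taken to mean "open to every authenticated user".
SOFTWARE_GROUPS: Dict[str, Set[str]] = {
    "simulation-container": {"group-a"},
    "analysis-container": {"group-a", "group-b"},
    "public-notebook": set(),
}


def allowed_software(user_groups: Set[str],
                     catalogue: Dict[str, Set[str]]) -> List[str]:
    """Return the software a user may select, given their group membership."""
    return sorted(
        name
        for name, required in catalogue.items()
        if not required or required & user_groups
    )
```

A member of `group-b` alone would see only the analysis container and the public notebook, which is the kind of behaviour a concrete use case could pin down.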

John: has met difficulties trying to involve ESFRIs other than CTAO.

Gareth: we need to clarify what we mean by batch processing.

Stefano: asks for clarification about data selection in the current DIRAC integration.

Gareth:  ConCORDIA is a set of containers for CORSIKA simulations on DIRAC. The archive is not a data archive, but a software archive. 

Stefano: we should not distinguish between interactive and batch processing. Batch should just have a connect button. 

~all: General agreement on that. Software metadata description is needed.

Matthias: what are the next steps about metadata? CTAO is in the software metadata definition phase and it is really interested in this part.

Workspace/resource pasted in chat:
https://git.astron.nl/astron-sdc/escape-wp5/workflows/metadata-examples
https://git.astron.nl/groups/astron-sdc/escape-wp5/-/wikis/ESAP/Compute-resource-metadata
https://git.astron.nl/groups/astron-sdc/escape-wp5/-/wikis/ESAP/Data-resource-metadata

John: the next step from the ESAP perspective: Dave is preparing some proposal posts on GitHub and will present them at the next Monday meeting.
Three things to push forward:
1_ work on metadata, we have an immediate plan
2_ continue this meeting series, with the next meeting on use cases.
3_ have a discussion with Nico and Stelios added to this group.

Dave: loves the idea of a Python client. Is it possible to avoid the user having to choose a different client for each piece of software, by working together to make the clients look similar?

Gareth: does not have much control over the DIRAC client, but can check.

Klaas: is planning to have an asynchronous functionality in ESAP, intended for queries, but it can also be used for starting batch software. We could have a single interface for starting a batch job, and then have workers which implement Python APIs that do the work for DIRAC, Rosetta or others.
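
The single-interface idea can be sketched with an abstract worker class and one entry point that dispatches to it. Everything here is an assumption for illustration: the class and function names are made up, and the in-memory worker merely stands in for real DIRAC or Rosetta calls.

```python
import itertools
from abc import ABC, abstractmethod
from typing import Dict


class BatchWorker(ABC):
    """Common interface each back-end worker would implement (hypothetical)."""

    @abstractmethod
    def submit(self, job: dict) -> str:
        """Start the job and return an identifier for later monitoring."""

    @abstractmethod
    def status(self, job_id: str) -> str:
        """Return the current state of a previously submitted job."""


class InMemoryWorker(BatchWorker):
    """Stand-in worker: records jobs locally instead of calling a back-end."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self._counter = itertools.count(1)
        self._jobs: Dict[str, str] = {}

    def submit(self, job: dict) -> str:
        job_id = f"{self.prefix}-{next(self._counter)}"
        self._jobs[job_id] = "QUEUED"
        return job_id

    def status(self, job_id: str) -> str:
        return self._jobs[job_id]


# Registry of available back-ends; real workers would wrap DIRAC or Rosetta.
WORKERS: Dict[str, BatchWorker] = {
    "dirac": InMemoryWorker("dirac"),
    "rosetta": InMemoryWorker("rosetta"),
}


def start_batch_job(backend: str, job: dict) -> str:
    """Single entry point: the caller never talks to a back-end directly."""
    return WORKERS[backend].submit(job)
```

With this shape, adding another back-end means registering one more worker, while the code that starts and monitors jobs stays unchanged.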

Dave: we should do the same work for BinderHub.

John: original design:
https://git.astron.nl/groups/astron-sdc/escape-wp5/-/wikis/ESAP/Asynchronous-ESAP
Probably it's what Rosetta already does

Stefano: we need to make the resources look the same.

John: we have to deliver something realistic.

A restricted meeting about technical details should be organized, including Nico, Stelios, Gareth and Stefano.
Sara will organize this meeting.

John will organize the next meeting about batch processing. Target: to review use cases.

Matthias: KM3NeT can be a candidate to give some input on use cases.

Sara: suggests that, to work around the need for X.509 authentication, a user without a proper X.509 certificate could be provided with a "fake certificate" created behind the scenes and suitably linked to the user identity verified using tokens.

Useful comments collected from the ESCAPE Rocketchat:

  • grange

    Thanks Sara! Good to read. I have one minor comment: working around the X509: within WP2 there was some discussion on this last year. If I remember correctly (but I can't find it back) there is a CA for more ad-hoc X509 authentication that may make life easier than self-signed. I'll try to dig up the details (or ask around).

    Just for completeness: the comment I made on ad-hoc certificates was about RCauth on which Micha also sent a message on the mailing list.

  • rdimaria

    Hello, on this

    ```
    Sara: suggests that to work around the need of x509 authentication, the user not having a proper x509 certificate could be provided with one "fake certificate" created behind the scenes and suitably linked to the user identity verified using tokens.

    ```
    be careful. Fake certs will not work at storage level, hence in the DataLake

 

Next steps:

  • Dave will present his ideas on metadata at our Monday 2022-01-24 WP5 technical meeting. The CTA team are interested in engaging with this explicitly (Matthias can't be there, but Gareth will be).
  • Sara will convene a meeting of Stefano, Nico, Gareth, Stelios to focus on technical solutions for integrating Rosetta and ESAP. The idea is that this focused technical group will make a proposal that we can then discuss with a range of stakeholders.
  • We agreed that pushing forward with batch processing specifically should be driven by use cases. To date, CTAO has been the most active in suggesting these. John will:
    • Solicit batch processing use cases in a call to WP5.
    • Invite developers and use-case owners to a meeting to work through all use cases and flow them to an expected model for batch  processing in ESAP.
  • John will ask Ian Bird what support is available for AAI/IAM given the recent departure of Andrea Ceccanti.