Goals
- Produce a design, and ultimately an implementation, for the interaction between Jupyter notebooks and ESAP.
- Bootstrap (part of) a concrete ESAP development roadmap.
Expected Workflow
Roughly speaking, we expect the workflow to look something like:
- Using ESAP, the user identifies data to process (catalog entries and/or files);
- Using ESAP, the user identifies a software payload appropriate to their use case (a pre-configured notebook, or just the libraries, etc., that they need);
- Using ESAP, the user identifies a JupyterHub system that is appropriately local to the data and capable of executing the payload;
- The user is launched into a notebook environment;
- Results can be collected from the notebook in some meaningful way (back to ESAP? to the data lake?).
We also discussed an alternative model, where the user starts in the notebook and uses an ESAP plugin for Jupyter to search for data and software and bring them into their notebook. I don't think we were motivated to pursue this further, though.
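To make the workflow above a little more concrete, here is a minimal sketch of how the user's three selections (data, software payload, JupyterHub) might be bundled into one object and serialised for hand-off. Everything named here is an assumption for illustration only: the `SessionRequest` class, its field names, and the example identifiers and URL are hypothetical and do not come from ESAP or JupyterHub.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class SessionRequest:
    """Hypothetical bundle of the choices a user makes in ESAP."""
    data_items: list        # identifiers of catalog entries and/or files
    software_payload: str   # reference to a notebook or software environment
    jupyterhub_url: str     # the JupyterHub chosen to execute the payload

    def to_json(self) -> str:
        # Serialise to JSON text so it can be handed to another service.
        return json.dumps(asdict(self), sort_keys=True)


def from_json(text: str) -> SessionRequest:
    # Rebuild the request on the receiving side.
    return SessionRequest(**json.loads(text))


# Placeholder values only; none of these identifiers are real.
request = SessionRequest(
    data_items=["example-file-1", "example-catalog-entry-2"],
    software_payload="example-notebook",
    jupyterhub_url="https://jupyterhub.example.org",
)
encoded = request.to_json()
restored = from_json(encoded)
```

How such a request would actually reach the notebook environment is exactly the open question recorded in these notes; the sketch only shows what might need to travel.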
Technical Discussion
(In brief; we covered a lot of ground).
- Stelios and James have done some previous work on trying to pass information to Jupyter through the spawner API, but met with limited success. Alternative models might be possible.
- Presumably files identified by the user in ESAP would be retrieved from the data lake (or other bulk storage) directly by Jupyter, rather than passing through ESAP.
- Replicating complex catalog queries which were constructed on ESAP in the Jupyter environment might be challenging, so the results of queries (rather than the queries themselves) should probably be passed to Jupyter. (Detailed mechanisms for this were discussed, but inconclusively.)
- User queries are (probably?) ephemeral as far as ESAP is concerned, and are not stored in an ESAP database.
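As a rough illustration of the points above (not an agreed mechanism, since none was settled), the hand-off could treat files and query results differently: files passed by reference, so that the notebook side fetches them from bulk storage itself, and query results passed by value, so that the query never has to be re-run in Jupyter. The manifest layout, the function names, and the example values below are all hypothetical.

```python
import json


def build_manifest(file_refs, query_rows):
    """Build a hypothetical hand-off manifest as JSON text.

    Files are included by reference only; query results are included
    by value as a list of row dictionaries.
    """
    return json.dumps({
        "files": list(file_refs),
        "query_results": list(query_rows),
    })


def load_manifest(text):
    """Notebook-side counterpart: recover file references and result rows."""
    manifest = json.loads(text)
    return manifest["files"], manifest["query_results"]


# Placeholder values only; the identifiers and columns are invented.
manifest_text = build_manifest(
    file_refs=["example-scope:example-file"],
    query_rows=[{"name": "example-source", "ra": 10.5, "dec": -3.25}],
)
files, rows = load_manifest(manifest_text)
```

Because the rows travel with the manifest, this would still work if the queries are ephemeral on the ESAP side, although very large result sets would presumably need a different approach.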
Next steps
We agreed to make this a regular meeting with a high cadence (weekly, at least at first) in an attempt to get real technical traction.
Actions
- John to talk to Stelios about which other IDA WG members (if any) should be involved, and whether we can re-use the old IDA WG meeting slot or should find another; then set up a recurring event for this meeting.
- ASTRON (Nico, Yan, John) to produce an ESAP architecture document (or, at least, the start of one).
- Rohini to start the process of producing some diagrams of how this work might fit together.