Summary for a discussion item for the next WP5 meeting, regarding JHub & computing cluster integrations.
Several WP5 teams are using similar tools and technologies to set up parallel computing platforms, with JupyterHub as the user interface and Spark, Condor and other tools for running batch jobs in a distributed manner. It is worth discussing and planning how we can collaborate and potentially integrate this work. One possible integration would be to describe whether, and how, users of a single JupyterHub deployment could access Spark or other clusters deployed at a remote data processing center. This could initially be built on prototype services as a proof of concept, but a production service will have to deal with authentication and authorization, including how to federate authentication and user information between the JupyterHub deployment and the remote cluster.
In addition, I described example notebook containers that can serve as a base for custom user JupyterHub images, as well as a notebook with examples of how to use VO tools for discovery, access and visualization.
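As a rough illustration of the integration described above, the following is a minimal sketch of how a notebook running under JupyterHub might point a Spark session at a remote cluster. It is not taken from any of the WP5 deployments: the master URL `spark://spark.example.org:7077`, the `jhub-` application name prefix and the helper names `remote_spark_conf` and `connect` are all hypothetical, and the sketch assumes a standalone Spark master that the notebook container can reach over the network. It does not address authentication or authorization, which would still need to be solved for a production service.

```python
import os


def remote_spark_conf(master_url, app_name=None):
    """Build Spark settings for a notebook session on a remote cluster."""
    # JupyterHub sets JUPYTERHUB_USER in each single-user notebook server.
    user = os.environ.get("JUPYTERHUB_USER", "anonymous")
    return {
        "spark.master": master_url,
        # Hypothetical naming scheme, so jobs can be traced back to a hub user.
        "spark.app.name": app_name or f"jhub-{user}",
        # Client mode: the driver runs inside the notebook container, so the
        # remote executors must be able to connect back to it.
        "spark.submit.deployMode": "client",
    }


def connect(conf):
    """Open a SparkSession from the settings above (requires pyspark)."""
    from pyspark.sql import SparkSession  # imported lazily, only when connecting

    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Hypothetical remote master URL; replace with the real endpoint.
conf = remote_spark_conf("spark://spark.example.org:7077")
```

Calling `connect(conf)` from a notebook cell would then attempt to open the session. Keeping the settings in a plain dictionary means the same values could be reused for other submission routes if a direct connection turns out not to be possible across sites.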
The Dockerfile can be found here:
https://github.com/stvoutsin/escape-wp5-dev/blob/master/docker/vo/Dockerfile
And the example notebook here:
https://github.com/stvoutsin/jhub-notebooks/blob/master/notebooks/escape/escape_demo.ipynb
Finally, the presentation from this talk has been uploaded here:
https://github.com/stvoutsin/escape-wp5-dev/blob/master/doc/20200608-ESCAPEWP5.pdf