JupyterLab with Apptainer
Running JupyterLab on a Compute-Cluster
JupyterLab is a web-based interactive environment for working with code and data. It is widely used in scientific computing and is well suited to run on a compute-cluster. This article illustrates how to add JupyterLab to an Apptainer container image, and how to start a container instance on HPC infrastructure.
Why Use JupyterLab?
JupyterLab 1 is an excellent choice for scientific computing due to its flexibility, customization options, and ability to support a wide range of programming languages, libraries, and tools. Here are some reasons why you might choose to use JupyterLab and Jupyter Notebooks 2 in scientific computing:
- Ease of Use: JupyterLab is designed to be user-friendly, with features like auto-completion, syntax highlighting, and debugging capabilities that can help reduce the learning curve for new users.
- Reproducibility: Jupyter Notebooks (.ipynb files) are self-contained documents that include code, output, and visualizations, making it easy to reproduce results and share them with others.
- Easy Data Exploration: You can load and visualize various data formats, such as CSV, Excel, JSON, and more, using popular libraries like Pandas, NumPy, and Matplotlib.
- Interactive Visualizations: Jupyter notebooks integrate with popular visualization libraries like Plotly, Bokeh, and Altair, enabling you to create interactive, web-based visualizations of your data.
- Flexibility: JupyterLab supports a wide range of programming languages, including Python, R, Julia, and MATLAB, allowing you to use the language that best suits your needs.
- Version Control: You can use version control systems like Git to manage changes to your notebooks and collaborate with others.
- Extensions for Specific Domains: There are many extensions available for JupyterLab that can help with specific tasks in scientific computing.
Jupyter Containers
The Jupyter team maintains the Jupyter Docker Stacks 3 with corresponding public container images on DockerHub 4 and Quay 5. These container images can be used on any compute-cluster.
Log in to a cluster submit node and start a JupyterLab container:
apptainer run docker://jupyter/base-notebook
The JupyterLab service daemon will print log information to the terminal, including the connection address and a so-called access token, described in the next section.
Stop JupyterLab by pressing CTRL-C once you have finished working.
Access Token
stdout
# Information from the logs of the JupyterLab service
Or copy and paste one of these URLs:
http://lxbk0725:12345/lab?token=598fb99446ce46996e9a74ea28bea3d2c7fd2591be4431f6
http://127.0.0.1:12345/lab?token=598fb99446ce46996e9a74ea28bea3d2c7fd2591be4431f6
A Jupyter Access Token 6 like the example above is a secure, randomly generated string that authenticates and authorizes users to access JupyterLab notebooks and resources, providing an additional layer of security and control over notebook sharing and collaboration.
The URL that carries the access token also includes the host name and port number required to connect.
In the simplest case you can use the HTTP address which includes the host name of the executing node (lxbk0725 in the example above) to connect to JupyterLab with your web browser. If you accessed the cluster from an external network, you will need to set up SSH port forwarding as explained in the next section.
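The parts of such a URL can be separated with plain shell parameter expansion, which is handy when the host name and port are needed again for port forwarding. The following is a minimal sketch using the example URL from the log output; the variable names are chosen for illustration only.

```shell
# Example URL copied from the JupyterLab log output shown above
url='http://lxbk0725:12345/lab?token=598fb99446ce46996e9a74ea28bea3d2c7fd2591be4431f6'

# Strip the scheme and everything after the first slash to keep "host:port"
hostport=${url#http://}
hostport=${hostport%%/*}

host=${hostport%%:*}     # text before the colon
port=${hostport##*:}     # text after the colon
token=${url##*token=}    # text after "token="

echo "host=$host port=$port token=$token"
```

The sketch assumes an `http://` URL with an explicit port, as printed by JupyterLab in the example above.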
Remote Access
Access to JupyterLab from outside networks requires an additional step. SSH port forwarding creates a secure, encrypted tunnel between your local machine and a remote server. By forwarding a specific port from the remote server to your local machine, it lets you access the remote service as if it were running locally.
Use SSH to configure port forwarding to Jupyter running on the cluster:
ssh -vv -N -L localhost:12345:lxbk0725.gsi.de:12345 virgo.hpc.gsi.de
# …logs will be printed to the terminal …use ctrl-c to close
Once port forwarding has been started, use the HTTP address with the IP address 127.0.0.1 (localhost) to connect to JupyterLab.
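The argument to `-L` has the form `[bind_address:]port:host:hostport`, and `-N` tells SSH not to run a remote command. Because the node name and port change with every JupyterLab session, it can help to assemble the command from its parts. The helper below is a hypothetical sketch (the function name is not part of any tool) that only prints the command, so it can be inspected before it is run; the host names are the ones from the example above.

```shell
# Hypothetical helper: print the ssh command that forwards a local port
# to the JupyterLab port on a compute node via a login (submit) node.
jupyter_tunnel_cmd() {
    local node=$1    # node running JupyterLab, e.g. lxbk0725.gsi.de
    local port=$2    # JupyterLab port on that node
    local login=$3   # SSH login node, e.g. virgo.hpc.gsi.de
    printf 'ssh -N -L localhost:%s:%s:%s %s\n' "$port" "$node" "$port" "$login"
}

jupyter_tunnel_cmd lxbk0725.gsi.de 12345 virgo.hpc.gsi.de
```

Using the same port number on both ends keeps the URL from the log usable once the host name is replaced with 127.0.0.1.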
Build a Container
The following section assumes that you want to add Jupyter to an existing container. Alternatively, you could use the Jupyter Docker Stacks as a foundation and add your components on top.
To integrate JupyterLab into your Apptainer container, you’ll typically extend an existing configuration that includes your working environment. The following assumes you have Python 3 available in your container and uses a Python virtual environment 7 to install JupyterLab.
Adding JupyterLab to an Apptainer Container Definition File 8:
apptainer.def
%post
    # …add in an appropriate place in the post-section
    mkdir /app
    cd /app
    python3 -m venv venv
    . venv/bin/activate
    # install JupyterLab including some extensions
    pip3 install \
        jupyter \
        jupyterlab-spellchecker \
        jupyterlab-git \
        jupyterlab-lsp 'python-lsp-server[all]'
%environment
    export JUPYTERLAB_PORT=54321
%runscript
    . /app/venv/bin/activate
    exec jupyter lab --no-browser --ip 0.0.0.0 --port "$JUPYTERLAB_PORT"
%startscript
    . /app/venv/bin/activate
    exec jupyter lab --no-browser --ip 0.0.0.0 --port "$JUPYTERLAB_PORT"
The example above prepares the use of apptainer instance 9 to manage your JupyterLab installation:
Apptainer also allows you to run containers in a “detached” or “daemon” mode where the container runs a service. A “service” is essentially a process running in the background…
The following environment variable is used to set the service port:
| Variable | Description |
|---|---|
| JUPYTERLAB_PORT | This defines the default port used when a JupyterLab instance is started. The configuration above sets the port to 54321. Note that in case you share a host with other users you may be required to adjust the port number to avoid collisions. |
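One simple way to reduce the chance of port collisions on a shared host is to derive the port from the numeric user ID. The helper below is a hypothetical sketch, not part of Apptainer or JupyterLab: it maps a UID into the dynamic port range 49152 to 65535. Two users can still end up with the same port, and the port may already be in use by another service, so treat the result as a starting point.

```shell
# Hypothetical helper: map a numeric user ID to a port in 49152-65535
user_port() {
    local uid=$1
    echo $(( 49152 + uid % 16384 ))
}

# Port suggestion for the current user
user_port "$(id -u)"
```

For example, `user_port 1000` prints 50152.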
Build the container and copy the image to shared storage:
apptainer build apptainer.sif apptainer.def  # (1)
# Make sure to store the container image on persistent storage, for example Lustre
export LUSTRE_HOME=/lustre/$(id -ng)/$USER
export APPTAINER_CONTAINERS=$LUSTRE_HOME/containers
mkdir -p $APPTAINER_CONTAINERS
cp apptainer.sif $APPTAINER_CONTAINERS/jupyterlab.sif  # (2)
- (1) Build the Apptainer container image from the definition file 10
- (2) Copy the JupyterLab container image to your Lustre directory
Start the container instance on the current host:
- (1) Overwrite the default port configuration with an environment variable
- (2) Start your JupyterLab instance on the local node
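The two steps listed above could look like the following sketch. It relies on Apptainer's documented behaviour that host variables prefixed with `APPTAINERENV_` are passed into the container without the prefix and, to the best of our understanding, take precedence over values set in `%environment`; verify this against your Apptainer version. The function name, the image path, and the port 54400 are assumptions for illustration.

```shell
# Hypothetical helper wrapping the two annotated steps
start_jupyterlab() {
    local image=$1 port=$2
    # (1) Overwrite the default port: the APPTAINERENV_ prefix passes
    #     JUPYTERLAB_PORT into the container environment
    # (2) Start a named instance, which runs the %startscript
    APPTAINERENV_JUPYTERLAB_PORT="$port" \
        apptainer instance start "$image" jupyterlab
}

# Usage (image path assumed from the build step above):
#   start_jupyterlab "$APPTAINER_CONTAINERS/jupyterlab.sif" 54400
#   apptainer instance list              # check that the instance is running
#   apptainer instance stop jupyterlab   # stop it once you have finished
```

Wrapping the call in a function keeps the port override local to that one command instead of exporting it into the whole shell session.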
Start a JupyterLab instance on a cluster node:
export LUSTRE_HOME=/lustre/$(id -ng)/$USER ; cd $LUSTRE_HOME  # (1)
srun --nodes=1 --ntasks-per-node=1 --time=01:00:00 --pty bash -i  # (2)
# …once your allocation has been granted
export APPTAINER_CONFIGDIR=$LUSTRE_HOME/.apptainer  # (3)
apptainer instance start $LUSTRE_HOME/containers/jupyterlab.sif jupyterlab  # (4)
- (1) Configure your working directory, typically on shared storage
- (2) Allocate an interactive session on a compute node with srun
- (3) Make sure the Apptainer configuration path is located on writable storage
- (4) Start your JupyterLab instance on the compute node
Read the access token from the log files of your JupyterLab instance:
>>> cat $APPTAINER_CONFIGDIR/instances/logs/$(hostname)/$USER/jupyterlab.err \
| grep -o 'http.*lab?token.*' | sort | uniq
http://127.0.0.1:54321/lab?token=041ded5856f8bfe5202c5118027016f990390729ff7d0433
http://lxbk0724:54321/lab?token=041ded5856f8bfe5202c5118027016f990390729ff7d0433
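If you look up the URL often, the pipeline above can be wrapped in a small shell function. This is a sketch with a hypothetical function name; it uses the same `grep` pattern as above and `sort -u` in place of `sort | uniq`.

```shell
# Hypothetical helper: print the unique JupyterLab URLs found in a log file
jupyter_urls() {
    grep -o 'http.*lab?token.*' "$1" | LC_ALL=C sort -u
}

# Usage with the log path from the example above:
#   jupyter_urls "$APPTAINER_CONFIGDIR/instances/logs/$(hostname)/$USER/jupyterlab.err"
```

Setting `LC_ALL=C` for `sort` keeps the output order independent of the locale configured on the node.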
Footnotes
1. Project Jupyter Documentation, https://docs.jupyter.org/en/latest
2. Jupyter Notebook Documentation, https://jupyter-notebook.readthedocs.io
3. Jupyter Docker Stacks, https://jupyter-docker-stacks.readthedocs.io, https://github.com/jupyter/docker-stacks
4. Jupyter Project, DockerHub, https://hub.docker.com/u/jupyter
5. Project Jupyter, Quay.io, https://quay.io/organization/jupyter
6. Security in the Jupyter Server, https://jupyter-server.readthedocs.io/en/stable/operators/security.html
7. Virtual Environments, Python Documentation, https://docs.python.org/3/library/venv.html
8. Apptainer Definition Files, Apptainer User Manual, https://apptainer.org/docs/user/main/definition_files.html
9. Instances - Running Service, Apptainer User Manual, https://apptainer.org/docs/user/main/running_services.html
10. Build a Container, Apptainer User Manual, https://apptainer.org/docs/user/main/build_a_container.html