Apptainer Containers Overview

An Introduction to Linux Containers for HPC

Container
Author

Victor Penso

Published

November 29, 2023

Abstract

Apptainer is a container run-time platform for HPC compute clusters that favors integration over isolation. This integrative approach makes it possible to support resources specific to HPC infrastructure, such as InfiniBand network fabrics, efficient parallel computing on many-core CPUs and dedicated accelerators like GPUs. It is intentionally designed to facilitate research computing and scientific applications.

Questions

  • Why use containers?
  • What are container images?

Objectives

  • Learn how to use existing containers
  • Understand container definition files

Using Apptainer

Apptainer 1 enables users to interact with containers transparently: execute programs inside a container as if running on the host, redirect I/O, use command pipes, pass arguments, and access files, sockets and ports. Apptainer is updated regularly 2 and most Linux distributions have packages available in their repositories. On Fedora, for example, install Apptainer using the official apptainer package 3:

# install apptainer
sudo dnf install -y apptainer

# verify functionality
apptainer run docker://alpine

Users of macOS and Windows can execute Linux containers in a virtual environment on their computer. Having Apptainer installed on your own machine is not strictly required: many HPC systems have the apptainer command pre-installed, including the capability to build container images on the cluster nodes. However, it is often preferable to use container images in the local environment as well, and copy them on demand to a target HPC system. This approach ensures that you work with the exact same environment under all circumstances.
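Since the apptainer command is not available everywhere, a script may check for it before doing anything else. The following is a minimal sketch; the helper name have_apptainer is hypothetical and only used for illustration.

```shell
# Hypothetical helper: succeeds if the apptainer command is found in PATH
have_apptainer() {
    command -v apptainer >/dev/null 2>&1
}

if have_apptainer
then
    # print the installed version
    apptainer --version
else
    echo "apptainer not found on this host" >&2
fi
```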

Why use containers?

  1. Many applications in the HPC environment have become very complex with regard to software dependencies and configuration details. For many years HPC infrastructure providers have struggled to build a common environment that meets the requirements of all user communities on a single host platform. Containers enable complete decoupling of application environments from the host platform and other users. In addition, containerization prevents interference between user environments, an issue that created a lot of friction for cluster operations in the past.
  2. Containers turn the user environment into a swappable component, allowing users to have a custom application environment “packaged” into a container image. This container image can be executed on any host that provides container support, which includes their own computer and a growing number of HPC systems.
  3. The most important benefit for users is the complete independence from the HPC host environment. This allows the freedom of choice to select any Linux distribution as a foundation for the container image. Furthermore, users can install any combination of compilers, libraries and other software components in the versions they require.
  4. Most container images are built programmatically using definition files. This ties into the concepts and goals of reproducible science. Images can not only be preserved without any ties to the host infrastructure used for execution, but are also based on a recipe detailing how the image was built in the first place.
  5. Updates to both containerized applications and the host infrastructure are not interrelated. This benefits HPC administrators by enabling them to select the most suitable host environment to support the growing complexity of hardware. At the same time users can make changes to container images aligned with their scientific schedule and other boundary conditions.

Container Images

Sub-Command    Description
pull           Download a container image from a remote resource

What are container images?

  • A container image stores a file-system tree containing a software environment (compilers, frameworks, libraries, etc.) for a given application.
  • Additionally container images store configuration metadata used during container launch to create a desired application run-time environment.
  • Container images are typically read-only (immutable) snapshots, which makes them host independent and portable. They can be moved freely between different infrastructures, and are published in container registries.
  • New images are often derived from an existing image. Usually the container definition is maintained with a version control system.
  • It is easy to switch between multiple versions of container images simplifying the roll-out of updates and roll-back in case of problems.
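One way to make such a switch explicit is a symbolic link that points to the image version currently in use. This is a sketch and not a feature of Apptainer itself; the file names app-1.0.sif and app-1.1.sif, the link name app.sif and the helper use_version are assumptions for illustration, and empty placeholder files stand in for real images.

```shell
# Work in a scratch directory with placeholder files instead of real images
images=$(mktemp -d)
touch "$images/app-1.0.sif" "$images/app-1.1.sif"

# Hypothetical helper: point the stable name app.sif at a given version
use_version() {
    ln -sfn "app-$1.sif" "$images/app.sif"
}

use_version 1.1              # roll out the update
readlink "$images/app.sif"   # prints app-1.1.sif

use_version 1.0              # roll back in case of problems
readlink "$images/app.sif"   # prints app-1.0.sif
```

Job scripts would then always reference the stable name app.sif, so a roll-back does not require any change to the scripts.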

The default container image format for Apptainer is called SIF (Singularity Image Format), a compressed read-only container file, by convention suffixed with .sif. Download container images 4 from a remote location, for example a container registry like DockerHub, with apptainer pull. This downloads all required container layers and combines those layers into an image file. Downloaded data is kept in the container cache, located in ~/.apptainer/cache by default.

Many use cases can build on existing container images. Pre-built images are published by many organizations in public container registries. A registry is a centralized storage for container images. All container run-time systems (like Apptainer) have a standard way to download images from a registry. The most popular image registry is DockerHub 5, the origin of a common protocol used to communicate with a container registry.

Environment Variable    Description
APPTAINER_CONTAINER     Absolute path to storage location for container images

# Set the path to a directory storing container images...
export APPTAINER_CONTAINER=$LUSTRE_HOME/containers

# ...download a container image from a container registry
apptainer pull $APPTAINER_CONTAINER/jupyter.sif docker://quay.io/jupyter/datascience-notebook:latest

The example above shows the download of a container image provided by Project Jupyter 6, pulled from a container registry called Quay (by Red Hat) using the standard docker:// protocol. It is up to users to select an appropriate image for their application. The recommendation is to use container images provided by the developer community of a given software ecosystem.
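Pulling a large image repeatedly wastes time and bandwidth, so a script may download it only when the file is missing. The following is a minimal sketch that reuses the APPTAINER_CONTAINER variable from the example; the fallback directory in $HOME and the helper name pull_once are assumptions for illustration.

```shell
# Fall back to a directory in $HOME if the variable is not set (assumption)
: "${APPTAINER_CONTAINER:=$HOME/containers}"

# Hypothetical helper: download an image unless the file already exists
# usage: pull_once <file-name> <source-uri>
pull_once() {
    target="$APPTAINER_CONTAINER/$1"
    if [ -f "$target" ]
    then
        echo "$target already present, skipping download"
    else
        mkdir -p "$APPTAINER_CONTAINER"
        apptainer pull "$target" "$2"
    fi
}
```

A call such as pull_once jupyter.sif docker://quay.io/jupyter/datascience-notebook:latest then only contacts the registry on the first run.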

Interact with Containers

Sub-Command    Description
exec           Execute a command within a containerized environment
shell          Start an interactive shell session in a containerized environment

The Jupyter datascience-notebook container image downloaded in the previous section includes a Python environment supporting the NumPy package 7 for scientific computing. Users interact with container images 8 using a selection of Apptainer sub-commands. The sub-command exec executes a specified command within a container. Below is a very simple Python program utilizing the NumPy package:

#!/usr/bin/env python

import numpy as np

array = np.array([
    [3, 7, 1],
    [10, 3, 2],
    [5, 6, 7]
])

print(np.sort(array, axis=1))

# Implement the NumPy program above in an example file...
$EDITOR example.py

# ...make the example file an executable program
chmod +x example.py

# Execute the Python program in the downloaded container image
apptainer exec $APPTAINER_CONTAINER/jupyter.sif ./example.py
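
Because standard input and output of the container are connected to the host, a containerized command can also be part of a regular pipeline. The sketch below feeds numbers from the host into Python running inside the container; it assumes the image path used above (with a hypothetical fallback directory in $HOME) and skips the call when Apptainer or the image is missing.

```shell
# Image location from the previous examples (fallback path is an assumption)
image="${APPTAINER_CONTAINER:-$HOME/containers}/jupyter.sif"

if command -v apptainer >/dev/null 2>&1 && [ -f "$image" ]
then
    # the host shell produces the data, the container sorts it with NumPy
    printf '3 7 1\n' | apptainer exec "$image" \
        python -c 'import sys, numpy as np; print(np.sort(np.loadtxt(sys.stdin)))'
else
    echo "skipping: apptainer or $image not available" >&2
fi
```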

The shell sub-command starts an interactive shell within a container:

>>> apptainer shell $APPTAINER_CONTAINER/jupyter.sif         
Apptainer> ipython
Python 3.11.6 | packaged by conda-forge | (main, Oct  3 2023, 10:40:35) [GCC 12.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.16.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: 

Build Containers

Sub-Command    Description
build          Build a new container image from a definition file

Create a new container image 9 with the help of a container definition file (or “def file” for short). The following example builds a container image for the GNU Hello 10 software, based on a definition file called hello.def:

BootStrap: docker
From: quay.io/fedora/fedora:latest

%labels
Maintainer John Snow <j.snow@example.com>

%post
version=2.12.1
archive=hello-$version.tar.gz

dnf install -y wget tar gzip gcc make
dnf clean all

wget https://ftp.gnu.org/gnu/hello/$archive
tar xvzf $archive -C /opt
rm $archive
cd /opt/hello-$version

./configure
make
make install

%runscript
/usr/local/bin/hello

Run the following command providing the container image name and the definition file as arguments to build a container image:

apptainer build $APPTAINER_CONTAINER/hello.sif hello.def

Test the functionality of the container image by executing the hello application in the container:

>>> apptainer run $APPTAINER_CONTAINER/hello.sif                
Hello, world!

>>> apptainer exec $APPTAINER_CONTAINER/hello.sif hello --version | head -n1
hello (GNU Hello) 2.12.1

Definition Files

Container images are built from a definition file 11, basically a blueprint listing each step to install software components within the container. The file is divided into two parts:

  1. The header defines the Linux distribution used as foundation to build the container image (root) file-system. We recommend selecting the distribution best supported by the software you want to use.
  2. The rest of the definition consists of sections to install additional software and configure the container run-time environment.

Sections

The second part of the definition file is broken into sections 13. Each section adds different content or executes commands at different times during the container image build (note that multiple sections of the same name can be included). The build option --section makes it possible to limit execution to a specific section or sections. The table below presents a brief overview of all possible sections:

Section         Description
%setup          …executed on the host…before container build
%files          …copy files into the container before %post
%post           …install software…create configurations
%test           …validate the container
%environment    …define environment variables 14 (not made available at build time)
%startscript    …executed at instance start command
%runscript      …executed at run command
%labels         …add metadata to the file /.singularity.d/labels.json
%help           …help text…

The definition file below illustrates many sections as examples:

%setup
    touch /file1
    touch ${APPTAINER_ROOTFS}/file2

%files
    /file1
    /file1 /opt

%environment
    export LISTEN_PORT=12345
    export LC_ALL=C

%post
    apt-get update && apt-get install -y netcat
    NOW=`date`
    echo "export NOW=\"${NOW}\"" >> $APPTAINER_ENVIRONMENT

%runscript
    echo "Container was created $NOW"
    echo "Arguments received: $*"
    exec echo "$@"

%startscript
    nc -lp $LISTEN_PORT

%test
    grep -q NAME=\"Ubuntu\" /etc/os-release
    if [ $? -eq 0 ]; then
        echo "Container base is Ubuntu as expected."
    else
        echo "Container base is not Ubuntu."
        exit 1
    fi

%labels
    Author alice
    Version v0.0.1

%help
    This is a demo container used to illustrate a def file that uses all
    supported sections.
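
Each run-time section is triggered by its own sub-command. The sketch below assumes an image called demo.sif built from a definition like the one above, and an instance called demo; both names are hypothetical. The commands are skipped when Apptainer or the image is missing.

```shell
image=demo.sif   # hypothetical image built from a definition like the one above

if command -v apptainer >/dev/null 2>&1 && [ -f "$image" ]
then
    apptainer run "$image" one two           # %runscript, with two arguments
    apptainer test "$image"                  # %test
    apptainer run-help "$image"              # %help
    apptainer inspect --labels "$image"      # %labels
    apptainer instance start "$image" demo   # %startscript, in the background
    apptainer instance stop demo
else
    echo "skipping: apptainer or $image not available" >&2
fi
```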

Local Images

Repeatedly building containers can become a time-consuming process. Apptainer supports building containers from local images. This makes it possible to maintain a set of base containers as a foundation for later reuse. The following definition builds a base container from the latest Fedora release including developer tools, compilers and a selection of additional useful packages.

BootStrap: docker
From: quay.io/fedora/fedora:latest

%post

dnf install -y @development-tools \
      gcc-c++ gcc gcc-gfortran git git-delta gnupg2 \
      python3 python3-pip python3-setuptools python3-boto3 \
      psmisc rsync tmux tree wget unzip \
      bat curl fd-find fzf findutils \
      hostname iproute netcat neovim make patch
dnf clean all

# ...build the container image
apptainer build fedora-base.sif fedora-base.def

Another container definition file can then reuse a local image with the following bootstrap configuration 15:

Bootstrap: localimage
From: fedora-base.sif

Note that Apptainer supports another mechanism called multi-stage build 16 in a single container definition file.
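
A multi-stage build might look like the following sketch for GNU Hello: the first stage compiles the program, and the second stage copies only the resulting binary into a fresh base image. The stage names build and final, the file name hello-multistage.def and the scratch directory are assumptions for illustration. The definition is written with a here-document so that the example stays self-contained.

```shell
# Write a hypothetical two-stage definition file to a scratch directory
workdir=$(mktemp -d)
cat > "$workdir/hello-multistage.def" <<'EOF'
Bootstrap: docker
From: quay.io/fedora/fedora:latest
Stage: build

%post
    version=2.12.1
    dnf install -y wget tar gzip gcc make
    wget https://ftp.gnu.org/gnu/hello/hello-$version.tar.gz
    tar xzf hello-$version.tar.gz -C /opt
    cd /opt/hello-$version
    ./configure && make && make install

Bootstrap: docker
From: quay.io/fedora/fedora:latest
Stage: final

%files from build
    /usr/local/bin/hello /usr/local/bin/hello

%runscript
    /usr/local/bin/hello
EOF

echo "definition written to $workdir/hello-multistage.def"
```

The file can then be passed to apptainer build like any other definition file. The compiler and build tools stay in the first stage and do not end up in the final image.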

Container Cache

Environment Variable    Description
APPTAINER_CACHEDIR      Cache folder for images from a container registry.
APPTAINER_TMPDIR        Temporary directory to build container file-systems.

By default the container build cache is located in the directory $HOME/.apptainer/cache. Details about the container build environment are available in the Apptainer User Guide 17. Pay attention to the storage consumed by the container cache, since the amount of available space may be limited depending on your environment.

# show storage capacity used by the cache
apptainer cache list

# ...detailed view
apptainer cache list -v

# clean up everything
apptainer cache clean

The environment variables listed in the table above are used to configure the paths to the container caches. The following example stores all container artifacts on volatile storage in the /var/tmp directory by setting both variables:

# create a user working directory
pushd $(mktemp -d /var/tmp/$USER-apptainer-XXXXXX)

# locate all artifacts within this directory
export APPTAINER_TMPDIR=$PWD
export APPTAINER_CACHEDIR=$PWD

This is useful to make sure that build artifacts don’t linger in your environment.
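
To remove the scratch directory automatically, the setup can be wrapped in a function that runs a single command and cleans up afterwards. This is a sketch under assumptions: the helper name with_scratch_cache is hypothetical, and the function body runs in a sub-shell so that the exported variables do not leak into the calling shell.

```shell
# Hypothetical helper: run a command with cache and temporary files in a
# scratch directory that is deleted once the command has finished
with_scratch_cache() (
    scratch=$(mktemp -d "${TMPDIR:-/var/tmp}/${USER:-user}-apptainer-XXXXXX") || exit 1
    export APPTAINER_TMPDIR="$scratch" APPTAINER_CACHEDIR="$scratch"
    "$@"
    status=$?
    rm -rf "$scratch"
    exit "$status"
)

# Demonstration with a harmless command that prints the cache location
with_scratch_cache sh -c 'echo "cache in $APPTAINER_CACHEDIR"'
```

A real build would be started the same way, for example with_scratch_cache apptainer build hello.sif hello.def.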

Test Sandboxes

Option       Description
--sandbox    Work with a container image (root)-filesystem

A sandbox provides a writable (ch)root directory to interactively work with a container image. This is obviously not a reproducible method to build an image, but it helps with testing during the implementation of definition files:

# ...create a container within a writable directory
apptainer build --fix-perms --sandbox rootfs/ docker://quay.io/rockylinux/rockylinux:8

# ...make changes within the container
apptainer shell --writable --fakeroot --home $PWD rootfs/

# ...build a new container from the sandbox
apptainer build apptainer.sif rootfs/

Do not forget to delete the rootfs directory afterwards, since it can consume multiple gigabytes of storage.

Docker Compatibility

Container build and runtime tools have a complex interrelationship and overlap of functionality. The most widely used container tools outside of HPC are Docker 18 and Podman 19, which are mostly compatible with each other. Chances are high that you will encounter projects providing a Dockerfile 20 as container definition. Consider the example below for GNU Hello introduced previously:

FROM quay.io/fedora/fedora:latest

ARG version=2.12.1
ARG archive=hello-$version.tar.gz

RUN dnf install -y wget tar gzip gcc make
RUN dnf clean all

RUN wget https://ftp.gnu.org/gnu/hello/$archive
RUN tar xvzf $archive -C /opt
RUN rm $archive

WORKDIR /opt/hello-$version

RUN ./configure
RUN make
RUN make install

ENTRYPOINT /usr/local/bin/hello

The notation above differs significantly from an Apptainer definition file. In order to build a container image from a Dockerfile, use the build sub-command 21 of Podman. Please consult the corresponding documentation for more details.

# Build a container image using a `Dockerfile` in the working directory...
podman build -t hello .

# ...and execute the container image to test its functionality
podman run --rm -it localhost/hello:latest

For the purpose of this article it is interesting to understand the compatibility of Apptainer and Docker container images. Apptainer uses a different container format, SIF, as discussed above. Fortunately it is easy to convert between both formats:

# Create an OCI container archive...
podman push localhost/hello:latest oci-archive:/tmp/hello-latest.tar

# ...and use this archive to create a SIF container image
apptainer build hello.sif oci-archive:/tmp/hello-latest.tar

The two commands above use a standardized container image format called OCI Image Layout 22 as an intermediate step during conversion. There are many more details to the conversion than described here. We leave it to the reader to investigate this further.
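
To get a feeling for the intermediate format, one can look inside the archive. According to the OCI Image Layout specification, the top level contains a file oci-layout, a file index.json and a directory blobs. The sketch below checks a tar archive for the two files; the helper name is_oci_archive is hypothetical, and the archive path is the one used in the example.

```shell
# Hypothetical helper: succeeds if a tar archive contains the two files
# that the OCI Image Layout requires at its top level
is_oci_archive() {
    listing=$(tar -tf "$1") || return 1
    printf '%s\n' "$listing" | grep -qx -e 'oci-layout' -e './oci-layout' &&
    printf '%s\n' "$listing" | grep -qx -e 'index.json' -e './index.json'
}

# Check the archive created by podman push, if it exists
archive=/tmp/hello-latest.tar
if [ -f "$archive" ]
then
    if is_oci_archive "$archive"
    then
        echo "$archive looks like an OCI archive"
    else
        echo "$archive does not look like an OCI archive" >&2
    fi
fi
```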

Footnotes

  1. Apptainer
    https://apptainer.org

  2. List of Apptainer Releases, GitHub
    https://github.com/apptainer/apptainer/releases

  3. Fedora Apptainer Package
    https://src.fedoraproject.org/rpms/apptainer

  4. Downloading Images, Apptainer User Guide
    https://apptainer.org/docs/user/latest/quick_start.html#downloading-images

  5. Container Registry, Docker Manual
    https://docs.docker.com/registry/

  6. Project Jupyter, Quay.io
    https://quay.io/organization/jupyter

  7. NumPy Package
    https://numpy.org/

  8. Interacting with Images, Apptainer User Guide
    https://apptainer.org/docs/user/main/quick_start.html#interacting-with-images

  9. Build a Container, Apptainer User Guide
    https://apptainer.org/docs/user/main/build_a_container.html

  10. GNU Hello
    https://www.gnu.org/software/hello

  11. Definition Files, Apptainer User Guide
    https://apptainer.org/docs/user/main/definition_files.html

  12. Bootstrap Agents, Apptainer User Guide
    https://apptainer.org/docs/user/main/definition_files.html#other-bootstrap-agents

  13. Sections, Apptainer User Guide
    https://apptainer.org/docs/user/main/definition_files.html#sections

  14. Environment and Metadata, Apptainer User Guide
    https://apptainer.org/docs/user/main/environment_and_metadata.html#environment-and-metadata

  15. Build Modules, Apptainer User Guide
    https://apptainer.org/docs/user/main/appendix.html#build-localimage

  16. Multi-Stage Builds, Apptainer User Guide
    https://apptainer.org/docs/user/main/definition_files.html#multi-stage-builds

  17. Build Environment, Apptainer User Guide
    http://apptainer.org/docs/user/main/build_env.html#cache-folders

  18. Docker Documentation
    https://docs.docker.com/

  19. Podman Documentation
    https://podman.io/docs

  20. Dockerfile Reference, Docker Documentation
    https://docs.docker.com/engine/reference/builder/

  21. Podman Build Manual Page
    https://docs.podman.io/en/latest/markdown/podman-build.1.html

  22. OCI Image Layout Specification, Open Containers Initiative
    https://github.com/opencontainers/image-spec/blob/main/image-layout.md