Container exercises

Exercise Container-1: Time-travel with containers

Imagine the following situation: A researcher has written and published their research code, which requires a number of libraries and system dependencies. They ran their code on a Linux computer (Ubuntu). Helpfully, they also published a container image with all dependencies included, as well as the definition file (below) used to create the container image.

Now we travel 3 years into the future and want to reuse their work and adapt it for our data. However, the container registry where they uploaded the container image no longer exists. But luckily (!) we still have the definition file (below). From this we should be able to create a new container image.

  • Can you anticipate problems using the definition file here 3 years after its creation? Which possible problems can you point out?

  • Discuss possible take-aways for creating more reusable containers.

Bootstrap: docker
From: ubuntu:latest

%post
    # Set environment variables
    export VIRTUAL_ENV=/app/venv

    # Install system dependencies and Python 3
    apt-get update && \
    apt-get install -y --no-install-recommends \
        gcc \
        libgomp1 \
        python3 \
        python3-venv \
        python3-distutils \
        python3-pip && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

    # Set up the virtual environment
    python3 -m venv $VIRTUAL_ENV
    . $VIRTUAL_ENV/bin/activate

    # Install Python libraries
    pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir -r /app/requirements.txt

%files
    # Copy project files
    ./requirements.txt /app/requirements.txt
    ./app.py /app/app.py
    # Copy data
    /home/myself/data /app/data
    # Workaround to fix dependency on fancylib
    /home/myself/fancylib /usr/lib/fancylib

%environment
    # Set the environment variables
    export LANG=C.UTF-8 LC_ALL=C.UTF-8
    export VIRTUAL_ENV=/app/venv

%runscript
    # Activate the virtual environment
    . $VIRTUAL_ENV/bin/activate
    # Run the application
    python /app/app.py
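
As a starting point for the discussion, some fragile spots in a definition file can be found mechanically. The following is a minimal sketch, not a complete checker: the function name flag_fragile_lines is hypothetical, and it only looks for two patterns (a base image tagged "latest" and files copied from an absolute path under /home). Other problems still need to be found by reading the file.

```shell
# Hypothetical helper: print (with line numbers) the lines of a definition
# file that match two patterns which tend to break a rebuild years later.
flag_fragile_lines() {
    # a moving tag such as "From: ubuntu:latest" resolves to a different
    # image depending on when the build happens
    grep -n -E '^From:.*:latest[[:space:]]*$' "$1" || true
    # a source path under /home exists only on the original author's machine
    grep -n -E '^[[:space:]]*/home/' "$1" || true
}

# Usage (assuming the definition file is saved as container.def):
#   flag_fragile_lines container.def
```

An empty result does not mean the definition file is reusable; it only means these two patterns were not found.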

Exercise Container-2: Build a container and run it on a cluster

Here we will try to build a container from the definition file of our example project.

Requirements:

  1. Linux (it is possible to build containers on a macOS or Windows computer but it is more complicated).

  2. An installation of Apptainer (e.g. following the quick installation). Alternatively, SingularityCE should also work.

Now you can build the container image from the container definition file. Depending on the configuration you might need to run the command with sudo or with --fakeroot.

Hopefully one of these four will work:

$ sudo apptainer build container.sif container.def
$ apptainer build --fakeroot container.sif container.def

$ sudo singularity build container.sif container.def
$ singularity build --fakeroot container.sif container.def
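
The four commands above differ only in the tool name and in how root privileges are obtained. If you prefer not to try them by hand, the choice can be sketched as a small shell helper. This is only an illustration under assumptions: the function names first_available and build_image are hypothetical, and trying --fakeroot before sudo is a choice made here, not a recommendation from the tools.

```shell
# Hypothetical helper: print the first command from the argument list
# that is found on PATH; return non-zero if none of them is installed.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: build image $1 from definition file $2,
# trying --fakeroot first and falling back to sudo.
build_image() {
    builder=$(first_available apptainer singularity) || {
        echo "neither apptainer nor singularity found" >&2
        return 1
    }
    "$builder" build --fakeroot "$1" "$2" || sudo "$builder" build "$1" "$2"
}

# Usage (not executed here):
#   build_image container.sif container.def
```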

Once you have the container.sif, copy it to a cluster and try to run it there.

Here is an example job script:

#!/usr/bin/env bash

# the SBATCH directives and the module load below are only relevant for the
# Dardel cluster and the PDC Summer School; adapt them for your cluster

#SBATCH --account=edu24.summer
#SBATCH --job-name='container'
#SBATCH --time=0-00:05:00

#SBATCH --partition=shared

#SBATCH --nodes=1
#SBATCH --tasks-per-node=1
#SBATCH --cpus-per-task=16


module load PDC singularity


# catch common shell script errors
set -euf -o pipefail


echo
echo "what is the operating system on the host?"
cat /etc/os-release


echo
echo "what is the operating system in the container?"
singularity exec container.sif cat /etc/os-release


# 1000 planets, 20 steps
time ./container.sif 1000 20 ${SLURM_CPUS_PER_TASK} results
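
The last line of the job script reads ${SLURM_CPUS_PER_TASK}. Slurm sets this variable only when --cpus-per-task is requested, and under "set -u" an unset variable aborts the script. If you want to reuse the script outside a Slurm job or without that directive, a fallback value can help. A minimal sketch, where the helper name cores_or_default and the fallback of 1 core are assumptions:

```shell
# Hypothetical helper: number of cores to pass to the container.
# Uses the Slurm value when it is set and falls back to 1 otherwise.
cores_or_default() {
    printf '%s\n' "${SLURM_CPUS_PER_TASK:-1}"
}

ncores=$(cores_or_default)
echo "running with ${ncores} core(s)"

# The run line of the job script would then become (not executed here):
#   time ./container.sif 1000 20 "${ncores}" results
```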

Exercise Container-3: Building a container on GitHub and running it on a cluster

You can build a container on GitHub (using GitHub Actions) or GitLab (using GitLab CI) and host the image on GitHub/GitLab. This has the following advantages:

  • You don’t need to host it yourself.

  • The image stays close to its sources and is not on a different service.

  • Anybody can inspect the recipe and how it was built.

  • Every time you make a change to the recipe, it builds a new image.

If you want to try this out:

  • Take this repository as a starting point and inspiration.

  • Don’t focus too much on what this container does, but rather how it is built.

  • To build a new version, one needs to send a pull request which updates VERSION and modifies the definition file (in this case conda.def).

Exercise Reproducibility-5: Building a container on a cluster

This may not be easy, and you will probably need help from a TA or the instructor, but it is a great exercise and we can try to do this together.

A good starting point is the Apptainer User Guide, particularly the documentation about definition files.

A good test is to build the container on one computer and try to run it on another one. A big benefit of this exercise is that it will clarify which dependencies your code really has, because you have to document them; there are no shortcuts.