...
Jacamar CI is an HPC-focused CI/CD driver that allows CI/CD jobs to be scheduled as jobs within a workload manager such as Slurm. The advantage of running the CI/CD jobs directly on the cluster is that all software and hardware of the systems (e.g. FPGAs) is available to these jobs.
For the setup, we need to install a GitLab runner that will continuously run in the user space. This runner will be registered in a GitLab project and execute incoming CI/CD jobs using Jacamar CI as a custom executor.
Log in to one of the cluster frontends and create a new folder for the CI (preferably on the parallel file system). Change into the created directory. This folder will - at the end of this guide - contain all required data and executables.
1. Installation of Jacamar CI Custom Executor
Go to Jacamar CI releases and download the most recent RPM package. The variant labelled w/Capabilities is not necessary, because everything that follows is done in user space.
For version 0.12.1 you can use the following link:
Code Block |
---|
curl -L --output jacamar.rpm "https://gitlab.com/api/v4/projects/13829536/packages/generic/jacamar-ci/0.12.1/jacamar-ci-0.12.1-1.el7.x86_64.rpm" |
Put the RPM file into your working directory and rename it to jacamar.rpm. Use the following command to extract the Jacamar CI binary:
Code Block |
---|
rpm2cpio jacamar.rpm | cpio -idmv > jacamar |
The binary can now be found at ./opt/jacamar/bin/jacamar. The jacamar-auth binary is not needed. Beware that the folder jacamar is not writable by default.
You can delete the RPM package now.
Now we need to create a configuration for Jacamar CI. Create a new file named jacamar-config.toml and insert the following content:
Code Block |
---|
[general]
executor = "slurm"
data_dir = "/PATH/TO/WORK/DIR/data" |
You may adjust the path to the data directory as needed. Note that this data is only temporary and exists for the duration of a job. However, keep in mind that the specified path must be accessible from all compute nodes.
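As a minimal sketch, the configuration file described above could also be generated by a small shell function. The function name write_jacamar_config and its two arguments are assumptions made for illustration and are not part of Jacamar CI; the keys written are exactly the two shown above. Note that the target file is overwritten.

```shell
# Hypothetical helper (not part of Jacamar CI): write a minimal configuration.
# $1 = working directory, assumed to be on a file system visible to all compute nodes
# $2 = path of the configuration file to create (overwritten if it exists)
write_jacamar_config() (
  work_dir=$1
  config_file=$2
  # The data directory must exist before jobs are scheduled.
  mkdir -p "$work_dir/data" || exit 1
  cat > "$config_file" <<EOF
[general]
executor = "slurm"
data_dir = "$work_dir/data"
EOF
)

# Example call (placeholder path as used in this guide):
# write_jacamar_config /PATH/TO/WORK/DIR /PATH/TO/WORK/DIR/jacamar-config.toml
```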
2. Installation of GitLab Runner
Download the most recent version of the official GitLab runner and make it executable:
...
configuration files.
A good location is a subdirectory in the project path assigned to your project in /scratch/...
. For example:
Path | Comment |
---|---|
| CI Configuration files |
| CI Data files |
1. Setup Environment with Modules
Jacamar CI and GitLab-Runner are available within our module system. To load the modules, use:
Code Block |
---|
module reset
module load tools
module load gitlab-runner
module load jacamar |
This loads the latest versions of the gitlab-runner and jacamar binaries into your environment. The following setup requires the absolute paths of the binaries. These are:
Path to Latest GitLab Runner Binary
Code Block |
---|
/opt/software/pc2/EB-SW/software/gitlab-runner/latest/bin/gitlab-runner |
Path to Latest Jacamar
Code Block |
---|
/opt/software/pc2/EB-SW/software/jacamar/latest/bin/jacamar |
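Before continuing, it can help to confirm that both absolute paths really point to executables. The following is a sketch under the assumption that a simple file test is sufficient; check_binaries is a hypothetical helper name, not a command provided by the module system.

```shell
# Hypothetical helper: succeed only if every given path is an executable regular file.
check_binaries() (
  status=0
  for bin in "$@"; do
    if [ -f "$bin" ] && [ -x "$bin" ]; then
      echo "ok: $bin"
    else
      echo "missing or not executable: $bin" >&2
      status=1
    fi
  done
  exit "$status"
)

# Usage on the cluster, with the two paths listed above:
# check_binaries \
#   /opt/software/pc2/EB-SW/software/gitlab-runner/latest/bin/gitlab-runner \
#   /opt/software/pc2/EB-SW/software/jacamar/latest/bin/jacamar
```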
2. Registration of GitLab Runner
Now we need to set up CI/CD in your GitLab project.
Go to the settings of your project on GitLab and enable the feature CI/CD in the General section under Visibility, project features, permissions.
In the CI/CD section that now appears (under Settings), go to Runners. There you will find a two-step setup guide to connect a new runner to your project under the Specific Runner heading, by clicking on New project runner.
To register the new runner and generate a new configuration file, execute the GitLab runner on Noctua inside your CI Configuration directory:
Code Block |
---|
./usr/bin/gitlab-runner register --config=jacamar-config.toml |
Follow the steps. Enter the instance URL and registration token from the GitLab page. If you are asked for the executor type, choose custom.
Afterwards, the file jacamar-config.toml will have been created.
3. Make GitLab Runner use Jacamar CI
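Now we need to configure the GitLab runner to use the custom executor jacamar from step 1.
To do so, edit the configuration file jacamar-config.toml.
First, add the following lines near the top of the file, below the global keys such as concurrent and check_interval (in TOML, keys that follow a [general] header would otherwise become part of that table):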
Code Block |
---|
[general]
executor = "slurm"
data_dir = "/scratch/PATH/TO/WORK/DIR/.../data" |
You may adjust the path to the data directory as needed. Note that this data is only temporary and exists for the duration of a job. However, keep in mind that the specified path must be accessible from all compute nodes.
Inside your runners definition, add the following line:
Code Block |
---|
pre_get_sources_script = "module reset" |
This will load the default modules of Noctua including Slurm, which is required for the custom executor.
Jacamar CI also requires git version 2.9 or newer. The pre-installed system version of git is sufficient.
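The version requirement can be checked with a short script. This is only a sketch: version_at_least is a hypothetical helper, and it assumes that git --version prints the version number as its third word (as in "git version 2.39.3").

```shell
# Hypothetical helper: succeed if version string $1 (e.g. "2.39.3")
# is at least major version $2 and minor version $3.
version_at_least() (
  ver=$1
  major=${ver%%.*}      # text before the first dot
  rest=${ver#*.}        # text after the first dot
  minor=${rest%%.*}     # second component
  [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
)

# Guarded usage, so the snippet does nothing on machines without git.
if command -v git >/dev/null 2>&1; then
  installed=$(git --version | awk '{print $3}')
  if version_at_least "$installed" 2 9; then
    echo "git $installed is new enough for Jacamar CI"
  else
    echo "git $installed is older than 2.9" >&2
  fi
fi
```

Comparing the major and minor components numerically avoids the pitfall of a plain string comparison, where "2.10" would sort before "2.9".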
To also allow artifact uploads to the GitLab server, gitlab-runner must be in the PATH variable. This requires setting the PATH variable in the environment settings of the configuration. Wherever /PATH/TO/WORK/DIR appears in the following snippets, replace it with the path to your working directory.
Code Block |
---|
environment = ["PATH=/opt/software/pc2/EB-SW/software/gitlab-runner/latest/bin:/opt/software/pc2/EB-SW/software/jacamar/latest/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin"] |
...
Code Block |
---|
[runners.custom]
  config_exec_timeout = 3600
  config_exec = "/opt/software/pc2/EB-SW/software/jacamar/latest/bin/jacamar"
  config_args = ["--no-auth", "config", "--configuration", "/scratch/PATH/TO/WORK/DIR/jacamar-config.toml"]
  prepare_exec = "/opt/software/pc2/EB-SW/software/jacamar/latest/bin/jacamar"
  prepare_args = ["--no-auth", "prepare"]
  run_exec = "/opt/software/pc2/EB-SW/software/jacamar/latest/bin/jacamar"
  run_args = ["--no-auth", "run"]
  cleanup_exec = "/opt/software/pc2/EB-SW/software/jacamar/latest/bin/jacamar"
  cleanup_args = ["--no-auth", "cleanup", "--configuration", "/scratch/PATH/TO/WORK/DIR/jacamar-config.toml"] |
The config should look similar to this when you are done:
Code Block |
---|
concurrent = 1
check_interval = 0
shutdown_timeout = 0
[general]
  executor = "slurm"
  data_dir = "/scratch/PATH/TO/WORK/DIR/.../data"
[session_server]
  session_timeout = 1800
[[runners]]
  name = "Jacamar Test Runner"
  url = "https://git.uni-paderborn.de/"
  token = "TOKEN"
  executor = "custom"
  limit = 0
  request_concurrency = 1
  environment = ["PATH=/opt/software/pc2/EB-SW/software/gitlab-runner/latest/bin:/opt/software/pc2/EB-SW/software/jacamar/latest/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin"]
  pre_get_sources_script = "module reset"
  [runners.cache]
    MaxUploadedArchiveSize = 0
  [runners.custom]
    config_exec_timeout = 3600
    config_exec = "/opt/software/pc2/EB-SW/software/jacamar/latest/bin/jacamar"
    config_args = ["--no-auth", "config", "--configuration", "/scratch/PATH/TO/WORK/DIR/jacamar-config.toml"]
    prepare_exec = "/opt/software/pc2/EB-SW/software/jacamar/latest/bin/jacamar"
    prepare_args = ["--no-auth", "prepare"]
    run_exec = "/opt/software/pc2/EB-SW/software/jacamar/latest/bin/jacamar"
    run_args = ["--no-auth", "run"]
    cleanup_exec = "/opt/software/pc2/EB-SW/software/jacamar/latest/bin/jacamar"
    cleanup_args = ["--no-auth", "cleanup", "--configuration", "/scratch/PATH/TO/WORK/DIR/jacamar-config.toml"] |
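A common source of failures is an executor path in the [runners.custom] section that does not exist on the login node. As a sketch, the following hypothetical helper check_exec_paths extracts the config_exec, prepare_exec, run_exec and cleanup_exec values from a configuration file and tests each one. It assumes one key per line with a double-quoted value; it is a plain text scan, not a TOML parser.

```shell
# Hypothetical helper: verify that each *_exec entry in a runner
# configuration points to an executable file.
# $1 = path to the configuration file (one key per line, double-quoted values)
check_exec_paths() (
  grep -E '^[[:space:]]*(config|prepare|run|cleanup)_exec[[:space:]]*=' "$1" \
    | sed -E 's/^[^"]*"([^"]*)".*$/\1/' \
    | {
        status=0
        while IFS= read -r path; do
          if [ -x "$path" ]; then
            echo "ok: $path"
          else
            echo "not executable: $path" >&2
            status=1
          fi
        done
        # The exit status of this group becomes the status of the function.
        exit "$status"
      }
)

# Usage:
# check_exec_paths /scratch/PATH/TO/WORK/DIR/jacamar-config.toml
```

The pattern requires whitespace or an equals sign directly after the key name, so config_exec_timeout is not mistaken for a path entry.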
You may want to increase the value of concurrent to allow the GitLab runner to schedule multiple jobs at once. To test the GitLab runner, execute it with the Jacamar configuration:
Code Block |
---|
gitlab-runner run --config=jacamar-config.toml |
In the runners list of your repository (Settings -> CI/CD -> Runners) you will find the status of your runner under Assigned project runners.
As long as your GitLab runner is executed on Noctua, it will process the CI jobs from your project.
Concurrent CI job execution
There are multiple layers of concurrency in the runner configuration. You can define multiple [[runners]] sections to let Jacamar create multiple runners with different configurations, but this is not required for concurrency.
The concurrent variable at the top defines the total limit for all runners combined. Further, each runner has two variables in its section: limit and request_concurrency. Both limit the number of concurrent CI jobs a runner will execute.
For example, if you have one runner, set concurrent to 5, set limit to 0 to deactivate the limit variable, and set request_concurrency to 5 to execute at most 5 CI jobs concurrently.
You can find more information in the respective documentation of Jacamar variables and GitLab Runner variables.
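The interaction between concurrent and limit can be sketched as a small calculation. This is an illustration under the assumption of a single runner; max_parallel_jobs is a hypothetical helper, and request_concurrency is deliberately left out of the sketch.

```shell
# Hypothetical helper: upper bound on the jobs one runner executes in parallel.
# $1 = global "concurrent" value
# $2 = the runner's "limit" value (0 means no per-runner limit)
max_parallel_jobs() (
  concurrent=$1
  limit=$2
  if [ "$limit" -gt 0 ] && [ "$limit" -lt "$concurrent" ]; then
    echo "$limit"
  else
    echo "$concurrent"
  fi
)

max_parallel_jobs 5 0   # the example above: prints 5
max_parallel_jobs 5 2   # a per-runner limit of 2 wins: prints 2
```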
4. Add CI file to repository
To run a CI job, you only need to create a .gitlab-ci.yml file in your project and commit it to GitLab. GitLab will use your new runner for the jobs.
In the .gitlab-ci.yml you will need to specify the variable SCHEDULER_PARAMETERS to make it work with our Slurm installation. In this variable, you should specify your project account and the partition where the jobs should be executed.
The id_tokens variable also has to be specified, as demonstrated in the example (context).
Example
Code Block |
---|
test:
  stage: build
  id_tokens:
    CI_JOB_JWT:
      aud: https://git.uni-paderborn.de
  variables:
    SCHEDULER_PARAMETERS: "-A PROJECT_ACCOUNT -p normal -t 0:05:00"
  script:
    - echo "Hello from " $(cat /etc/hostname) |
Change PROJECT_ACCOUNT to the name of your project (the name that you usually pass to sbatch via the -A option).
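Before committing, a quick check can confirm that the CI file contains the two entries this setup depends on. The sketch below uses a hypothetical helper check_ci_file; it only searches for the key names at the start of a line and does not validate the YAML itself.

```shell
# Hypothetical helper: check that a CI file mentions the keys required here.
# $1 = path to the .gitlab-ci.yml file
check_ci_file() (
  ci_file=$1
  status=0
  for key in SCHEDULER_PARAMETERS id_tokens; do
    # Match the key at the start of a line, allowing YAML indentation.
    if ! grep -q "^[[:space:]]*$key:" "$ci_file"; then
      echo "$ci_file: missing $key" >&2
      status=1
    fi
  done
  exit "$status"
)

# Usage inside the repository:
# check_ci_file .gitlab-ci.yml
```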
...
5. Execute GitLab Runner
The GitLab runner needs to be running to fetch new CI jobs from GitLab. The best way is to use a systemd service, which can restart the runner after a reboot of the frontend nodes.
Create a systemd user service file at ~/.config/systemd/user/name.service which looks like the following example.
Code Block |
---|
[Unit]
Description=Jacamar User Service
After=network.target
[Service]
ExecStart=/PATH/TO/WORK/DIR/gitlab-runner run --config=/PATH/TO/WORK/DIR/jacamar-config.toml
WorkingDirectory=/PATH/TO/WORK/DIR/
[Install]
WantedBy=default.target |
The service can be enabled with systemctl --user enable name.service and started with systemctl --user start name.service. Use systemctl --user daemon-reload after changing the service file.
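If several repositories each need their own runner, the unit file can be generated instead of written by hand. The following is a sketch only: print_runner_unit is a hypothetical helper, and it assumes the configuration file is named jacamar-config.toml inside the working directory, as in the example above.

```shell
# Hypothetical helper: print a systemd user unit for one runner.
# $1 = absolute path to the gitlab-runner binary
# $2 = working directory containing jacamar-config.toml
print_runner_unit() (
  runner_bin=$1
  work_dir=$2
  cat <<EOF
[Unit]
Description=Jacamar User Service
After=network.target
[Service]
ExecStart=$runner_bin run --config=$work_dir/jacamar-config.toml
WorkingDirectory=$work_dir
[Install]
WantedBy=default.target
EOF
)

# Usage (name.service is a placeholder, as above):
# mkdir -p ~/.config/systemd/user
# print_runner_unit /PATH/TO/WORK/DIR/gitlab-runner /PATH/TO/WORK/DIR \
#   > ~/.config/systemd/user/name.service
# systemctl --user daemon-reload
```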
To make the gitlab-runner continue executing after closing your SSH session, you can use screen or tmux.
The login nodes may be rebooted, e.g. during maintenance or an update. Afterwards, you need to log in and start the gitlab-runner anew.
Now, you can schedule a pipeline in your GitLab project. The runner will fetch the created jobs and execute them on the specified partition.
Shared runners can only be created for the whole GitLab instance. Group runners can be created for groups in which you have owner status. In every other case, register a runner and create a service file for each repository.
Troubleshooting
...