Using NVIDIA GPUs with Julia


Basic Tutorial

Step 1: Get an interactive session on a GPU node

srun -A <YOUR_PROJECT> -N 1 -n 128 --exclusive --gres=gpu:a100:4 -p gpu -t 1:00:00 --pty bash

Step 2: Load the JuliaHPC module

ml lang
ml JuliaHPC

Step 3: Install CUDA.jl in a local Julia project environment

mkdir jlcuda
cd jlcuda
julia --project=.

Once the Julia REPL pops up:
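
In the REPL, a minimal sketch of this step is to install CUDA.jl into the project and query the GPU setup (here via Pkg.add and CUDA.versioninfo()):

julia> using Pkg; Pkg.add("CUDA")    # or press ] and run: add CUDA

julia> using CUDA

julia> CUDA.versioninfo()            # reports the CUDA toolkit in use and the detected GPUs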

As of April 2022, the output shows that CUDA.jl automatically uses the local CUDA installation (because the JuliaHPC module exports the necessary environment variables) and that all four NVIDIA A100 GPUs are detected.

Step 4: Run a matrix multiplication on one of the NVIDIA A100 GPUs
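
A minimal sketch of such a comparison (the matrix size and the use of @time are illustrative choices):

using CUDA

N = 8192
A = rand(Float32, N, N)
B = rand(Float32, N, N)
A_d = CuArray(A)                    # move the inputs to the GPU
B_d = CuArray(B)

# warm-up runs so that the timings below exclude compilation
A * B
CUDA.@sync A_d * B_d

@time A * B                         # matrix multiplication on the CPU
@time CUDA.@sync A_d * B_d          # matrix multiplication on one of the A100 GPUs

For more reliable numbers, @btime from BenchmarkTools.jl can be used instead of @time.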

Notice that the multiplication is much faster on the GPU.

CUDA-aware OpenMPI

CUDA-aware MPI allows you to send GPU arrays (i.e. CuArrays from CUDA.jl) directly via point-to-point and collective MPI operations.

Example

Proceed as in the basic CUDA tutorial above, but also run ] add MPI, i.e. install MPI.jl next to CUDA.jl. After using MPI, you can call MPI.has_cuda() to check whether the underlying MPI library has been compiled with CUDA support. (The easiest way to get a CUDA-aware MPI that works with Julia is to use the JuliaHPC module and export OMPI_MCA_opal_cuda_support=1.)
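
For example:

julia> using MPI

julia> MPI.has_cuda()    # true if the underlying MPI library was built with CUDA support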

You should now be able to run the following code (if stored in a file cuda_mpi_test.jl) from the shell via mpirun -n 5 julia --project cuda_mpi_test.jl.

Code
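
A minimal sketch of such a test script, passing CuArrays directly to an MPI collective (the reduction via Allreduce! and the array length are illustrative choices):

# cuda_mpi_test.jl
using MPI
using CUDA

MPI.Init()

comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
nranks = MPI.Comm_size(comm)

# Each rank contributes a CuArray filled with its rank number; the GPU buffers
# are handed to MPI directly, without staging them through host memory.
send = CUDA.fill(Float32(rank), 4)
recv = CUDA.zeros(Float32, 4)

MPI.Allreduce!(send, recv, +, comm)

# With mpirun -n 5, every entry of recv should be 0+1+2+3+4 = 10.
println("rank $rank of $nranks: received $(Array(recv))")

MPI.Finalize()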

Output

Useful References