OpenMPI issues

Noctua 1

  • When using ls1 Mardyn compiled with GCC 11.2.0 and OpenMPI 4.1.1, a memory leak can occur in a few MPI ranks that eventually kills the simulation because it runs out of memory. Other combinations of GCC and OpenMPI might be affected as well.

    • A suitable workaround is to use the Intel compilers and Intel MPI.
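      • For example, a minimal sketch of switching to the Intel toolchain (the module names and the CC/CXX setup are placeholders; check "module avail" for the exact modules installed on Noctua 1):

          module load intel              # Intel compiler suite (placeholder module name)
          module load impi               # Intel MPI (placeholder module name)
          export CC=mpiicc CXX=mpiicpc   # build with the Intel MPI compiler wrappers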

Noctua 2

  • Errors/warnings when using /tmp, for example from OpenMPI:

    • Background: To guarantee sufficient isolation between users, jobs, and nodes, the /tmp directory is redirected to an isolated directory on the parallel file system that only exists for the duration of the compute job. Some programs expect /tmp to reside in main memory and might issue a warning or an error if it does not.

    • Workaround:

      • for OpenMPI:

        • If you compile OpenMPI yourself and run programs with mpirun, please include "orte_tmpdir_base = /dev/shm" in your openmpi-mca-params.conf (see the sketch after this list).

        • If you compile OpenMPI yourself and run programs with srun, please make sure that you have compiled OpenMPI with PMIx support. We have set SLURM_PMIX_TMPDIR="/dev/shm" globally, which will then take effect.
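        • A minimal sketch for both cases (my_app and the PMIx path are placeholders; adjust them to your installation):

            # mpirun case: append the setting to the per-user OpenMPI MCA parameter file
            echo "orte_tmpdir_base = /dev/shm" >> $HOME/.openmpi/mca-params.conf

            # srun case: build OpenMPI with PMIx support, e.g. via the configure option
            #   ./configure --with-pmix=<path-to-pmix> ...
            # and launch with srun (the --mpi=pmix flag may be unnecessary if PMIx is the default)
            srun --mpi=pmix ./my_app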

  • OpenMPI on gpu partition:

    • Occasionally, MPI jobs on the gpu partition may fail with UCX error messages such as the following (we are investigating the issue):

      [1648816196.947405] [n2gpu1201:476591:0] mm_posix.c:206 UCX ERROR open(file_name=/proc/476593/fd/44 flags=0x0) failed: No such file or directory
      [1648816196.947422] [n2gpu1201:476591:0] mm_ep.c:158 UCX ERROR mm ep failed to connect to remote FIFO id 0xc000000b000745b1: Shared memory error
      ...
  • Some applications might not flush their output to stdout/stderr. In these cases, please use the srun option -u or --unbuffered in your job scripts.
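    For example (my_app is a placeholder for your executable):

      srun -u ./my_app        # equivalent to: srun --unbuffered ./my_app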

  • Warning regarding mpirun usage:
    When using mpirun as the MPI launcher, you may encounter warning messages similar to the following:

    ... Failed to modify UD QP to INIT on mlx5_0: Operation not permitted

    To avoid this issue, we recommend using srun as the MPI launcher on our HPC clusters. However, if you have to use mpirun to launch your MPI application, you can resolve this problem by either:

    1. Setting the following environment variables in your Slurm jobscript:

       export OMPI_MCA_btl='^uct,ofi'
       export OMPI_MCA_pml='ucx'
       export OMPI_MCA_mtl='^ofi'

    2. Or running mpirun with the following options:

       mpirun --mca btl ^uct,ofi --mca pml ucx --mca mtl ^ofi
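
    For reference, a combined minimal jobscript sketch for the mpirun case could look as follows (the resource requests, partition name, and my_app are placeholders; adjust them to your job):

       #!/bin/bash
       # the resource requests and partition name below are placeholders
       #SBATCH --nodes=2
       #SBATCH --ntasks-per-node=64
       #SBATCH --partition=normal
       #SBATCH --time=01:00:00

       # work around the "Failed to modify UD QP to INIT" warning when launching with mpirun
       export OMPI_MCA_btl='^uct,ofi'
       export OMPI_MCA_pml='ucx'
       export OMPI_MCA_mtl='^ofi'

       mpirun ./my_app        # my_app is a placeholder for your MPI executable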