Python Package Management

Recommendations

  1. Use environments (e.g. conda)

  2. Use the parallel file system (i.e. /scratch/<group>/<username>/)

Note that HOME is not a good choice for storing Python packages since

  1. it is not designed for parallel use,

  2. it will very likely lead to quota issues (HOME is limited to 20 GB), and

  3. it causes unnecessary backups of reproducible data.
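
To see how close package directories already bring you to the HOME quota, the following minimal sketch sums file sizes below a few directories. The directory names (`.conda`, `.local`, `.cache/pip`, `.cache/pypoetry`) are typical default locations and are an assumption here, as is reading the 20 GB limit as GiB; `dir_size_bytes` and `report` are illustrative helper names.

```python
import os
from pathlib import Path

# HOME quota from the text above, interpreted as GiB (assumption).
HOME_QUOTA_BYTES = 20 * 1024**3


def dir_size_bytes(root):
    """Sum the sizes of all regular files below root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.lstat(path).st_size
            except OSError:
                # File vanished or is unreadable; ignore it.
                continue
    return total


def report(candidates, home=None):
    """Return (path, size in bytes) for every candidate directory that exists."""
    home = Path(home) if home is not None else Path.home()
    rows = []
    for rel in candidates:
        path = home / rel
        if path.is_dir():
            rows.append((str(path), dir_size_bytes(path)))
    return rows
```

Calling `report([".conda", ".local", ".cache/pip", ".cache/pypoetry"])` and comparing the sizes against `HOME_QUOTA_BYTES` gives a rough picture; the quota tools of the cluster remain the authoritative source.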

Self-Learning Mini-Tutorials

Conda

    which python

    # load module
    ml lang
    ml Miniforge3
    which python

    conda create --name myenv python=3.9.5
    conda activate myenv   # won't work if shell integration hasn't been set up already

    # shell integration
    conda init
    . $HOME/.bashrc   # update shell
    # or `source $(conda info --base)/etc/profile.d/conda.sh` for immediate + non-permanent

    conda activate myenv   # works now
    which python
    conda list
    conda install numpy

    # so far so good, but where is our environment (and the packages etc.) located?
    conda env list

    # it shouldn't live in $HOME but on the parallel FS
    # how to resolve this?
    #   1) provide `--prefix <path>` to `conda create` (and then full path to `conda activate`)
    #   2) conda config --add envs_dirs /scratch/<group>/<username>/.conda/envs
    #      conda config --add pkgs_dirs /scratch/<group>/<username>/.conda/pkgs

    # let's recreate it properly...
    conda deactivate   # takes no environment name
    # conda env remove --name myenv .... but let's just
    rm -rf $HOME/.conda

    conda config --add envs_dirs /scratch/pc2-mitarbeiter/bauerc/.conda/envs
    conda config --add pkgs_dirs /scratch/pc2-mitarbeiter/bauerc/.conda/pkgs
    cat $HOME/.condarc

    # (potentially) redo the above....
    # then `which python`
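
As a quick check from inside Python, the sketch below reports whether the active environment lives under HOME. It relies on the `CONDA_PREFIX` variable that `conda activate` sets and falls back to `sys.prefix`; `is_inside` and `environment_location` are illustrative helper names, not part of conda.

```python
import os
import sys
from pathlib import Path


def is_inside(path, parent):
    """True if path is parent itself or lies below it (symlinks resolved)."""
    path = Path(path).resolve()
    parent = Path(parent).resolve()
    return path == parent or parent in path.parents


def environment_location(home=None, prefix=None):
    """Return (environment prefix, True if that prefix is located under HOME)."""
    home = home if home is not None else Path.home()
    # CONDA_PREFIX is set by `conda activate`; sys.prefix is the fallback.
    prefix = prefix or os.environ.get("CONDA_PREFIX") or sys.prefix
    return str(prefix), is_inside(prefix, home)
```

With an activated environment, `environment_location()` should return `False` as its second element once `envs_dirs` points to the parallel file system.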

Pip (global)

    which python
    pip   # command not found

    # load module
    ml lang
    ml Python/3.9.5
    which python
    which pip

    # note that the default path for global packages etc. is $HOME/.local/lib/python<ver>/site-packages
    # to change it we can put the following into our .bashrc
    # export PYTHONUSERBASE=/scratch/<group>/<username>/.local
    export PYTHONUSERBASE=/scratch/pc2-mitarbeiter/bauerc/.local
    # you might also want to add
    export PATH=/scratch/pc2-mitarbeiter/bauerc/.local/bin:$PATH

    pip install --user numpy   # --user makes the user-level install explicit
    ls /scratch/pc2-mitarbeiter/bauerc/.local/lib/python3.9/site-packages
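
To confirm that `PYTHONUSERBASE` is actually picked up, one option is to ask a fresh interpreter where its user base and user site-packages are. The sketch below does this with the standard `site` module; `user_site_for` is an illustrative helper name, and the path passed to it is whatever you exported above.

```python
import os
import subprocess
import sys


def user_site_for(userbase):
    """Return (user base, user site-packages) as seen by a fresh interpreter
    started with PYTHONUSERBASE set to userbase."""
    env = dict(os.environ, PYTHONUSERBASE=userbase)
    code = "import site; print(site.getuserbase()); print(site.getusersitepackages())"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    base, site_packages = result.stdout.splitlines()
    return base, site_packages
```

The second element is the directory that `pip install --user` writes to for this interpreter, so it should sit below the scratch path rather than below `$HOME/.local`.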

If you want or need to use `pip`, we generally recommend using it within a conda environment.

Poetry

    export POETRY_CACHE_DIR=/scratch/pc2-mitarbeiter/bauerc/.pypoetry
    # or `poetry config cache-dir /scratch/pc2-mitarbeiter/bauerc/.pypoetry`
    # (config file is here: `$HOME/.config/pypoetry/config.toml`)

    poetry new myproject
    cd myproject
    poetry show   # list installed packages
    poetry add numpy
    poetry show

    # how to run something in the environment or use it interactively?
    which python   # still points to the general one.... WHY?!
    python -c "import numpy"   # fails.... WHY?!

    poetry run which python
    poetry shell
    which python
    python -c "import numpy"   # works within the poetry shell
    exit   # leave the shell

    # where is the actual virtualenv: `poetry env list --full-path`
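
One way to investigate the "WHY?!" questions above is to let the interpreter describe itself, once via plain `python` and once via `poetry run python`. The sketch below uses the common `sys.prefix != sys.base_prefix` idiom to detect a virtual environment; `describe_interpreter` is an illustrative helper name. Note that a conda environment is not a virtual environment in this sense, so the idiom reports `False` there.

```python
import importlib.util
import sys


def in_virtualenv():
    """True if this interpreter runs inside a venv/virtualenv-style environment."""
    return sys.prefix != sys.base_prefix


def describe_interpreter():
    """Collect the facts that explain which packages this interpreter can see."""
    return {
        "executable": sys.executable,
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        "in_virtualenv": in_virtualenv(),
        # find_spec checks importability without importing the package.
        "numpy_available": importlib.util.find_spec("numpy") is not None,
    }
```

Saving this as a script that prints `describe_interpreter()` and running it both ways shows two different executables: only the one started through Poetry points into the project's virtualenv, which is where `poetry add numpy` installed the package.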

Virtualenv (this is not `python -m venv`!)
