Python Package Management
Recommendations
Use environments (e.g. conda)
Use the parallel file system (i.e. /scratch/<group>/<username>/)
Note that HOME is not a good choice for storing Python packages since
it is not designed for parallel use,
it will very likely lead to quota issues (HOME is limited to 20 GB), and
it causes unnecessary backups of reproducible data.
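To see whether Python packages have already accumulated in HOME, a quick check like the following can help. This is only a sketch: `report_python_dirs` is a hypothetical helper name, and the directory list assumes the default Linux locations used by conda (`.conda`), pip (`.local`, `.cache/pip`) and Poetry (`.cache/pypoetry`); your setup may use others.

```shell
# Hypothetical helper: print the disk usage of common Python package
# locations below a base directory (defaults to $HOME).
report_python_dirs() {
    base="${1:-$HOME}"
    for sub in .conda .local .cache/pip .cache/pypoetry; do
        if [ -d "$base/$sub" ]; then
            du -sh "$base/$sub"
        else
            echo "not present: $base/$sub"
        fi
    done
}

report_python_dirs "$HOME"
```

Anything large that shows up here is a candidate for moving to the parallel file system as described in the mini-tutorials.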
Self-Learning Mini-Tutorials
Conda
which python
# load module
ml lang
ml Miniforge3
which python
conda create --name myenv python=3.9.5
conda activate myenv # won't work if shell integration hasn't been set up already
# shell integration
conda init
. $HOME/.bashrc # update shell
# or `source $(conda info --base)/etc/profile.d/conda.sh` for an immediate but non-permanent setup (current shell only)
conda activate myenv # works now
which python
conda list
conda install numpy
# so far so good, but where is our environment (and the packages etc.) located?
conda env list
# it shouldn't live in $HOME but on the parallel FS
# how to resolve this?
# 1) provide `--prefix <path>` to `conda create` (and then full path to `conda activate`)
# 2) conda config --add envs_dirs /scratch/<group>/<username>/.conda/envs
# conda config --add pkgs_dirs /scratch/<group>/<username>/.conda/pkgs
# let's recreate it properly...
conda deactivate # takes no environment name
# the clean way would be `conda env remove --name myenv`, but here we simply run `rm -rf $HOME/.conda`
conda config --add envs_dirs /scratch/pc2-mitarbeiter/bauerc/.conda/envs
conda config --add pkgs_dirs /scratch/pc2-mitarbeiter/bauerc/.conda/pkgs
cat $HOME/.condarc
# now recreate the environment as above (conda create, conda activate)
# then check `which python`: it should point into /scratch/... instead of $HOME
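Option 1 from the comments above (`--prefix`) is not demonstrated in the walkthrough. A minimal sketch of it follows, assuming the placeholder path is replaced with your own scratch directory; `ENV_PREFIX` is a hypothetical variable name introduced for this example.

```shell
# Hypothetical variable; the path is a placeholder, substitute your group and username.
ENV_PREFIX="${ENV_PREFIX:-/scratch/<group>/<username>/.conda/envs/myenv}"

if command -v conda >/dev/null 2>&1; then
    # Make `conda activate` available in this shell without running `conda init`.
    . "$(conda info --base)/etc/profile.d/conda.sh"
    # Create the environment at an explicit location (--yes skips the prompt).
    conda create --yes --prefix "$ENV_PREFIX" python=3.9.5
    # Environments created with --prefix are activated by path, not by name.
    conda activate "$ENV_PREFIX"
    which python   # expected to point into $ENV_PREFIX/bin
else
    echo "conda not found; load the Miniforge3 module first"
fi
```

Note that `--prefix` only controls where the environment itself lives. The package cache is not affected, so `pkgs_dirs` may still need to be moved as in option 2.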
Pip (global)
which python
pip # command not found
# load module
ml lang
ml Python/3.9.5
which python
which pip
# note that the default location for user-level ("global") packages is $HOME/.local/lib/python<ver>/site-packages
# to change it we can put the following into our .bashrc
# export PYTHONUSERBASE=/scratch/<group>/<username>/.local
export PYTHONUSERBASE=/scratch/pc2-mitarbeiter/bauerc/.local
# you might also want to add
export PATH=/scratch/pc2-mitarbeiter/bauerc/.local/bin:$PATH
pip install numpy
ls /scratch/pc2-mitarbeiter/bauerc/.local/lib/python3.9/site-packages
If you want or need to use `pip`, we generally recommend using it within a conda environment rather than installing packages globally.
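For the global pip setup, the two exports from the walkthrough could be kept together in `.bashrc` as sketched below. `SCRATCH_HOME` is a hypothetical variable name introduced only for this example, and the path is a placeholder. The `case` guard avoids prepending the same directory again each time `.bashrc` is sourced.

```shell
# Hypothetical variable; placeholder path, substitute your group and username.
SCRATCH_HOME="${SCRATCH_HOME:-/scratch/<group>/<username>}"

# Point Python's user base (used by pip's user installs) at scratch
# instead of $HOME/.local.
export PYTHONUSERBASE="$SCRATCH_HOME/.local"

# Prepend the user bin directory to PATH only if it is not already there.
case ":$PATH:" in
    *":$PYTHONUSERBASE/bin:"*) ;;
    *) export PATH="$PYTHONUSERBASE/bin:$PATH" ;;
esac

# Optional check: print the user site-packages directory Python will use.
if command -v python3 >/dev/null 2>&1; then
    python3 -m site --user-site || true
fi
```

If the last command prints a path below the scratch directory, user-level installs should end up there rather than in HOME.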
Poetry
export POETRY_CACHE_DIR=/scratch/pc2-mitarbeiter/bauerc/.pypoetry
# or `poetry config cache-dir /scratch/pc2-mitarbeiter/bauerc/.pypoetry` (config file is here: `$HOME/.config/pypoetry/config.toml`)
poetry new myproject
cd myproject
poetry show # list installed packages
poetry add numpy
poetry show
# how to run something in the environment or use it interactively?
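which python # still points to the general one, because Poetry's virtualenv is not activated in this shell
python
import numpy # fails: numpy was installed into Poetry's virtualenv, not into this interpreter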
poetry run which python
poetry shell # note: from Poetry 2.0 on, this command is provided by the separate poetry-plugin-shell plugin
which python
python
import numpy # works within the poetry shell
exit # leave the shell
# where is the actual virtualenv: `poetry env list --full-path`
Virtualenv (this is not `python -m venv`!)