Python Virtual Environments

There is a range of virtual environment tools for Python; this page gives only a basic introduction to virtualenv and conda environments.

Virtualenv

The default Python version when you first log in to the HPC is 3.10.12. Other versions are available to load as modules. At the time of writing they are:

  • python/3.10.13
  • python/3.11.6

virtualenv, a tool for creating isolated Python environments, is installed as a default package: https://virtualenv.pypa.io/en/latest/.

Getting started and creating your first virtual environment is as simple as:

$ virtualenv tutorial
created virtual environment CPython3.8.10.final.0-64 in 4045ms
  creator CPython3Posix(dest=/users/k1234567/builds/tutorial, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/users/k1234567/.local/share/virtualenv)
    added seed packages: pip==21.3.1, setuptools==60.5.0, wheel==0.37.1
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator

In this example the current working directory is builds, in the home directory of user k1234567. The command creates a Python virtual environment with the same basic properties as the existing Python environment, but isolated from it. Virtualenv works in two phases. The first phase, called Python discovery, sets the version of Python to be used in the virtual environment. You can choose an alternative version of Python with the -p flag; the format is explained fully in the virtualenv documentation. The second phase creates the environment at the chosen destination. You will see confirmation that virtualenv has created the environment, with some information about the creator, seeder and activators used.
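As a hedged illustration of the -p flag, the sketch below asks virtualenv to build an environment from a named interpreter. The interpreter spec python3 and the directory name tutorial-py3 are assumptions chosen so that the sketch runs on most systems; on the HPC you might first load one of the Python modules listed above and pass a spec such as python3.11 instead.

```shell
# Hypothetical values: change py_spec to an interpreter that exists on your system.
py_spec="python3"
env_dir="$(mktemp -d)/tutorial-py3"   # throwaway location for this sketch
result="skipped"

if command -v virtualenv >/dev/null 2>&1 && command -v "$py_spec" >/dev/null 2>&1; then
    # -p (long form: --python) selects the interpreter used for the new environment.
    if virtualenv -p "$py_spec" "$env_dir"; then
        # The environment's own interpreter reports the version that was chosen.
        "$env_dir/bin/python" --version
        result="created"
    fi
else
    echo "virtualenv or $py_spec not found; nothing was created"
fi
```

If the requested interpreter cannot be found, virtualenv exits with an error rather than silently falling back, so the inner check leaves result as skipped.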

To activate the newly created environment:

$ . tutorial/bin/activate
(tutorial) k1234567@erc-hpc-login1:~/builds$

Activating the environment prepends the virtual environment's bin directory to the PATH environment variable and modifies the shell prompt to indicate the active environment. While the virtual environment is active, any packages installed with pip are added to the virtual environment rather than the default environment. Even if the default environment's version of Python were changed, the virtual environment would keep the version it was built with. To leave the virtual environment, type deactivate, which restores the original PATH.
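The effect of activation on command lookup can be imitated without virtualenv at all. The sketch below is a simplified stand-in, not the real activate script (which also sets VIRTUAL_ENV and changes the prompt): demo_env is a hypothetical directory containing a fake python executable, used only to show how prepending to PATH changes which program the shell finds.

```shell
# Build a throwaway directory that looks like an environment's bin/ folder.
demo_env="$(mktemp -d)"
mkdir -p "$demo_env/bin"
printf '#!/bin/sh\necho "python from demo_env"\n' > "$demo_env/bin/python"
chmod +x "$demo_env/bin/python"

old_path="$PATH"                 # remember the original PATH
PATH="$demo_env/bin:$PATH"       # "activate": the environment's bin comes first
hash -r                          # forget cached command locations
resolved_active="$(command -v python)"

PATH="$old_path"                 # "deactivate": restore the original PATH
hash -r
resolved_after="$(command -v python || true)"

echo "while active:     $resolved_active"
echo "after deactivate: ${resolved_after:-<no python on PATH>}"
rm -rf "$demo_env"
```

While the directory is first on the PATH, python resolves to the copy inside it; once the PATH is restored, the shell finds whatever it found before.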

Conda environments

Anaconda is a distribution of Python and R with 1500 pre-installed packages for data science. It is available as a module, which can be loaded using module load anaconda3/2022.10-gcc-13.2.0. Conda is the package and environment manager that comes with the Anaconda distribution.

Creating a conda environment:

$ conda create --name conda_env
Collecting package metadata (current_repodata.json): done
Solving environment: done


==> WARNING: A newer version of conda exists. <==
  current version: 22.9.0
  latest version: 24.3.0

Please update conda by running

    $ conda update -n base -c defaults conda



## Package Plan ##

  environment location: /users/k1234567/.conda/envs/conda_env



Proceed ([y]/n)?

Conda provides output as it builds the environment. On CREATE, messages about updating conda can be ignored, as conda needs to be managed by a user with administrative privileges. The path to the files that will be used to manage the environment is displayed and you are asked for confirmation to proceed. Following confirmation, conda reports that the environment has been built as expected and displays information on how to activate and deactivate it.

Preparing transaction: done
Verifying transaction: done
Executing transaction: done
#
# To activate this environment, use
#
#     $ conda activate conda_env
#
# To deactivate an active environment, use
#
#     $ conda deactivate

When an environment is activated, the shell prompt is modified to display the name of the environment. Once you have deactivated a conda environment (and whenever you subsequently log in), you will see that a (base) conda environment is activated by default.
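To check which environment is active without relying on the prompt, one option is to read the CONDA_DEFAULT_ENV variable, which conda sets on activation, and to list the environments conda knows about. This is a minimal sketch that only prints information; the variable name active_env is an illustrative choice.

```shell
# CONDA_DEFAULT_ENV holds the name (or path) of the active conda environment, if any.
active_env="${CONDA_DEFAULT_ENV:-none}"
echo "active conda environment: $active_env"

# List all environments conda knows about; the active one is marked with '*'.
if command -v conda >/dev/null 2>&1; then
    conda env list
else
    echo "conda is not on the PATH; load the anaconda3 module first"
fi
```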

Packages can be installed using conda install. You are advised to install packages into an activated environment and not the (base) environment.
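One way to follow this advice in scripts is a small guard that refuses to install anything unless a non-base environment is active. The function name install_into_active_env is hypothetical and not part of conda; the sketch assumes that CONDA_DEFAULT_ENV reflects the active environment, and numpy appears only as an example package.

```shell
# Hypothetical helper: install packages only when a non-base environment is active.
install_into_active_env() {
    if [ -z "${CONDA_DEFAULT_ENV:-}" ] || [ "${CONDA_DEFAULT_ENV}" = "base" ]; then
        echo "No non-base conda environment is active; run 'conda activate <name>' first." >&2
        return 1
    fi
    # --yes answers conda's confirmation prompt automatically.
    conda install --yes "$@"
}

# Example use, after 'conda activate conda_env':
# install_into_active_env numpy
```

The guard returns a non-zero status instead of installing, so a calling script can stop before the (base) environment is modified.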

The conda documentation is well maintained and has a helpful cheat sheet covering the most important information about using conda.

Using a conda environment in a slurm job script

If you have set up the conda environment in your normal shell, you can source your .bashrc in your Slurm job script and then activate the environment:

source /users/${USER}/.bashrc
source activate environmentname
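For context, the two lines above might sit in a complete job script roughly as follows. This is a sketch under stated assumptions: the job name, output file pattern, time limit and the workload my_analysis.py are all hypothetical placeholders, and the script is written to a temporary directory and only syntax-checked here, not submitted.

```shell
# Write a hypothetical job script to a scratch directory and syntax-check it.
work_dir="$(mktemp -d)"
job_script="$work_dir/conda_job.sh"

cat > "$job_script" <<'EOF'
#!/bin/bash
#SBATCH --job-name=conda-demo
#SBATCH --output=conda-demo-%j.out
#SBATCH --time=00:10:00

# Pick up the conda setup from your login shell configuration.
source /users/${USER}/.bashrc
# Activate the environment created earlier (placeholder name).
source activate environmentname

# Hypothetical workload; replace with your own script.
python my_analysis.py
EOF

# bash -n parses the script without running it.
if bash -n "$job_script"; then
    echo "syntax OK: $job_script"
fi

# Submit on the cluster with: sbatch "$job_script"
```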

Creating a new conda packages cache location

On occasion the conda environments and package cache are too large for the default location. To deal with this, an alternative location for new environments and packages can be specified:

mkdir -p /scratch/users/k1234567/conda/pkgs
conda config --add pkgs_dirs /scratch/users/k1234567/conda/pkgs
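To confirm that the setting took effect, conda can print the package cache directories it is configured to use. A minimal sketch, assuming conda may or may not be on the PATH; the variable checked is illustrative only.

```shell
if command -v conda >/dev/null 2>&1; then
    # Show the configured package cache directories; the directory added
    # above should appear in this list.
    conda config --show pkgs_dirs
    checked="yes"
else
    echo "conda is not on the PATH; load the anaconda3 module first"
    checked="no"
fi
```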

You should then be able to create conda environments in non-standard locations that do not take up storage in your default location:

cd /scratch/users/k1234567/conda
conda create --prefix ./new-environment
conda activate /scratch/users/k1234567/conda/new-environment

Mamba

Mamba is a reimplementation of conda written in C++ that offers faster dependency resolution for complex environments, and therefore shorter installation times: https://mamba.readthedocs.io/en/latest/. Continuing with conda as your standard approach is fine, as mamba is fairly new relative to conda. However, where conda gets stuck or takes a noticeably long time to solve a new environment, mamba is an available alternative. Both mamba and conda are available through the following Miniforge3 module:

module load miniforge3/24.1.2-0-gcc-13.2.0
mamba --version
mamba init
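If the same script has to work whether or not mamba is available, one possible pattern is to pick the tool at run time. The variable name env_tool is an illustrative choice, and the commented create line reuses the placeholder syntax from this page rather than a real channel or package list.

```shell
# Prefer mamba when present, otherwise fall back to conda.
if command -v mamba >/dev/null 2>&1; then
    env_tool="mamba"
elif command -v conda >/dev/null 2>&1; then
    env_tool="conda"
else
    env_tool=""
fi

if [ -n "$env_tool" ]; then
    "$env_tool" --version
    # Both tools accept the same 'create' syntax, for example:
    # "$env_tool" create -n new-environment -c <channel> <packages-to-install>
else
    echo "neither mamba nor conda found; load the miniforge3 module first"
fi
```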

Although Miniforge3 is the recommended installer for Mamba, it can still be installed through the anaconda3/2022.10-gcc-13.2.0 module using conda:

conda create -n mamba_environment -c conda-forge mamba
conda activate mamba_environment
(mamba_environment) $ mamba create -n new-environment -c <channel> <packages-to-install>

# Run 'mamba init' to be able to run mamba activate/deactivate
# and start a new shell session. Or use conda to activate/deactivate.

# (mamba_environment) $ mamba init
# (mamba_environment) $ source ~/.bashrc

(mamba_environment) $ mamba activate new-environment