Python Virtual Environments
There is a range of virtual environment tools for Python; this page provides only a basic introduction to virtualenv and conda environments.
The default Python version when first logging in to the HPC is Python 3.8.10. Other versions are available to load as modules. At the time of writing they are:
virtualenv is installed as a default package. It is a tool for creating isolated Python environments (https://virtualenv.pypa.io/en/latest/).
Getting started and creating your first virtual environment is as simple as:
In this example the current working directory is builds, in the home directory of a user k1234567.
This will create a Python virtual environment with the same basic properties as the existing Python environment, but isolated from it.
virtualenv works in two phases. The first phase, called Python discovery, determines the version of Python to be used in the virtual environment.
An alternative version of Python can be chosen with the -p flag; the format is explained fully in the virtualenv documentation.
You will see a confirmation that virtualenv has created the environment, with some information about the creator, seeder and activators used.
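As a rough sketch of the creation step described above: the directory ~/builds matches the example, but the environment name my_env is a placeholder chosen here for illustration, and the commented -p line assumes an interpreter called python3.9 is available.

```shell
# Placeholder names: adjust both to suit your own layout.
ENV_PARENT="$HOME/builds"
ENV_NAME="my_env"

mkdir -p "$ENV_PARENT"
if cd "$ENV_PARENT" && command -v virtualenv >/dev/null 2>&1; then
    # Create the environment with the Python that virtualenv discovers by default.
    virtualenv "$ENV_NAME"
    # To request a specific interpreter instead, pass -p, for example:
    #   virtualenv -p python3.9 "$ENV_NAME"
else
    echo "could not enter $ENV_PARENT, or virtualenv is not on the PATH" >&2
fi
```

If the command succeeds, a my_env directory containing bin/activate should appear under ~/builds.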
To activate the newly created environment:
Activating the environment simply prepends the virtual environment's directory of installed binaries to the PATH environment variable and modifies the shell prompt to indicate the active environment.
While the virtual environment is active, any packages installed with pip will be added to the virtual environment rather than the default environment.
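A minimal sketch of activating an environment and checking where python and pip now resolve. It assumes an environment named my_env (a placeholder) exists in the current directory; requests is only an example package name.

```shell
# Assumes an environment named my_env (placeholder) in the current directory.
ENV_DIR="${ENV_DIR:-$PWD/my_env}"

if [ -f "$ENV_DIR/bin/activate" ]; then
    source "$ENV_DIR/bin/activate"
    # Both commands should now resolve to paths under $ENV_DIR/bin.
    command -v python
    command -v pip
    # Anything installed from here on goes into my_env, for example:
    #   pip install requests
else
    echo "No environment found at $ENV_DIR; create it with virtualenv first." >&2
fi
```

The shell prompt should also gain a (my_env) prefix while the environment is active.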
Even if the default environment's version of Python were to be changed, the virtual environment would remain in a steady state with the version it was built with.
As the installed binaries have been prepended to the PATH, you leave the virtual environment by typing deactivate.
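The PATH handling can be illustrated without a real environment. The snippet below only mimics the mechanism: the actual activate script does more (it also sets VIRTUAL_ENV and adjusts the prompt), and ENV_DIR is a placeholder path that does not need to exist.

```shell
# Illustration only: mimics how activation and deactivation treat PATH.
ENV_DIR="$HOME/builds/my_env"     # placeholder path

SAVED_PATH="$PATH"                # activation remembers the original PATH
PATH="$ENV_DIR/bin:$PATH"         # and puts the environment's bin directory first
ACTIVE_FIRST="${PATH%%:*}"        # first PATH entry while "active"
echo "first PATH entry while active: $ACTIVE_FIRST"

PATH="$SAVED_PATH"                # deactivate restores the saved value
echo "first PATH entry afterwards:   ${PATH%%:*}"
```

Because the shell searches PATH from left to right, the environment's python and pip are found before the default ones while the environment is active.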
Anaconda is a distribution of Python and R with 1500 pre-installed packages for data science. It is available as a module, which can be loaded using module load anaconda3/2021.05-gcc-9.4.0. Conda is the package and environment manager that comes with the Anaconda distribution.
Creating a conda environment:
Conda will provide output as it builds the environment. On CREATE, messages about updating conda can be ignored, as conda needs to be managed by a user with administrative privileges. The path to the files that will be used to manage the environment will be displayed, and you will be asked for confirmation to proceed. Following confirmation, conda will confirm that the environment has been built as expected and display information on how to activate and deactivate the environment.
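A sketch of the creation step, using the module name given above. The environment name my_conda_env and the Python version 3.9 are placeholders chosen for illustration.

```shell
ENV_NAME="my_conda_env"    # placeholder environment name

# Load the Anaconda module if the module command is available in this shell.
if command -v module >/dev/null 2>&1; then
    module load anaconda3/2021.05-gcc-9.4.0
fi

if command -v conda >/dev/null 2>&1; then
    # conda lists what it will install and asks for confirmation before proceeding.
    conda create --name "$ENV_NAME" python=3.9
else
    echo "conda not found; load the anaconda3 module first" >&2
fi
```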
When an environment is activated, the shell prompt will be modified to display the name of the environment. Once you have deactivated a conda environment (and whenever you subsequently log in), you will see that a (base) conda environment is activated by default.
Packages can be installed using conda install. You are advised to install packages into an activated environment and not the (base) environment.
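A minimal sketch of the activate, install, deactivate cycle. It assumes an environment called my_conda_env already exists and that conda has been initialised for the current shell; numpy is only an example package.

```shell
ENV_NAME="my_conda_env"    # placeholder environment name

if command -v conda >/dev/null 2>&1; then
    conda activate "$ENV_NAME"    # prompt changes to show (my_conda_env)
    conda install numpy           # installs into the active environment, not (base)
    conda deactivate              # prompt returns to (base)
else
    echo "conda not found; load the anaconda3 module first" >&2
fi
```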
The conda documentation is well maintained and has a helpful cheat sheet with the most important information about using conda.
Using a conda environment in a Slurm job script
If you have set up the conda environment in your normal shell, then in your Slurm job script you can 'source' your bash script and then activate the environment:
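A possible job script along these lines is sketched below. It assumes the bash script in question is ~/.bashrc and that conda was initialised there (for example by conda init bash); the job name, time limit and environment name are all placeholders.

```shell
#!/bin/bash
# Placeholder job name and time limit; adjust to your own job.
#SBATCH --job-name=conda_example
#SBATCH --time=00:10:00

# Assumption: conda is initialised in ~/.bashrc, so sourcing it makes
# "conda activate" available in the batch shell.
if [ -f "$HOME/.bashrc" ]; then
    source "$HOME/.bashrc"
fi

if command -v conda >/dev/null 2>&1; then
    conda activate my_conda_env    # placeholder environment name
    python --version               # should report the environment's Python
else
    echo "conda is not available in this shell" >&2
fi
```

Saved as, say, conda_job.sh (a hypothetical filename), this would be submitted with sbatch conda_job.sh. Some .bashrc files return early in non-interactive shells, in which case the conda initialisation lines need to sit above that check.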
Creating a new conda packages cache location
On occasion the conda environments and package cache are too large for the default location. To deal with this, an alternative location for new environments and packages can be specified:
You should then be able to create conda environments in non-standard locations that do not take up storage in your default location:
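One possible sequence uses conda config, which writes the envs_dirs and pkgs_dirs settings to your .condarc. The location under $HOME/scratch is purely a placeholder; substitute a directory where you have sufficient space.

```shell
# Placeholder location: replace with a directory that has enough space.
CONDA_ROOT="${CONDA_ROOT:-$HOME/scratch/conda}"
mkdir -p "$CONDA_ROOT/envs" "$CONDA_ROOT/pkgs"

if command -v conda >/dev/null 2>&1; then
    conda config --add envs_dirs "$CONDA_ROOT/envs"    # where new environments go
    conda config --add pkgs_dirs "$CONDA_ROOT/pkgs"    # where downloaded packages are cached
    conda config --show envs_dirs                      # check the new settings
    conda config --show pkgs_dirs
    # New environments are then created under $CONDA_ROOT/envs, for example:
    #   conda create --name my_conda_env python=3.9
else
    echo "conda not found; load the anaconda3 module first" >&2
fi
```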
Mamba is a reimplementation of conda written in C++, intended to speed up dependency resolution and installation: https://mamba.readthedocs.io/en/latest/. Continuing with conda as your standard approach is fine, as mamba is fairly new relative to conda. However, where conda gets stuck or takes a noticeably long time to solve a new environment, mamba is an available alternative.
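Mamba accepts largely the same subcommands as conda, so usage might look like the sketch below. It assumes a mamba executable is already on your PATH, which this page does not confirm; the environment name, Python version and package are placeholders.

```shell
ENV_NAME="my_mamba_env"    # placeholder environment name

if command -v mamba >/dev/null 2>&1; then
    # Same arguments as the equivalent conda commands.
    mamba create --name "$ENV_NAME" python=3.9
    mamba install --name "$ENV_NAME" numpy
else
    echo "mamba not found on PATH" >&2
fi
```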