
TensorFlow

TensorFlow is an end-to-end open source platform for machine learning.

Installation

Several installation options are available for TensorFlow on CREATE.

Note

To avoid memory limitations on the login nodes, request an interactive session to complete your installation process. Please see the documentation on how to request more resources when using CREATE HPC.

Tip

To save space in your home directory, the following examples assume you have created a non-standard conda package cache location. This is not a requirement, and the standard method works just as well.

Using a Virtual Environment

If you require the latest stable version of TensorFlow, pip is recommended, as TensorFlow is only officially released to PyPI. Make sure to set up your virtual environment first:

/scratch/users/k1234567$ module load python/3.11.6-gcc-13.2.0
/scratch/users/k1234567$ virtualenv tensorflow-venv -p `which python`
created virtual environment CPython3.11.6.final.0-64 in 1326ms
  creator CPython3Posix(dest=/scratch/users/k1234567/tensorflow-venv, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/users/k1234567/.local/share/virtualenv)
    added seed packages: pip==23.1, setuptools==67.6.1, wheel==0.40.0
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator
/scratch/users/k1234567$ source tensorflow-venv/bin/activate
(tensorflow-venv) /scratch/users/k1234567$ pip install tensorflow
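Once the install finishes, a short check along the following lines can confirm that the active interpreter belongs to the virtual environment and that pip registered the package. This is a minimal sketch rather than part of the official procedure; the helper names `in_virtual_environment` and `installed_version` are hypothetical and only use the Python standard library.

```python
import importlib.metadata
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv and recent virtualenv releases set sys.base_prefix to the parent
    # interpreter; older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


def installed_version(package: str):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return importlib.metadata.version(package)
    except importlib.metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_environment())
    print("tensorflow version:", installed_version("tensorflow") or "not installed")
```

If the interpreter path does not point into tensorflow-venv, the environment was probably not activated in the current shell.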

Alternatively, you can use conda. Although it may not offer the latest version, it is still a great option for repeatable analysis and makes dependency management much easier:

/scratch/users/k1234567/conda$ module load anaconda3/2023.09-0-gcc-13.2.0
/scratch/users/k1234567/conda$ conda create --prefix ./tensorflow-conda -c conda-forge tensorflow=2.10 python=3.9.13
/scratch/users/k1234567/conda$ conda activate /scratch/users/k1234567/conda/tensorflow-conda

Using Modules

TensorFlow is also available as a module on CREATE: py-tensorflow/2.14.0-gcc-11.4.0-cuda-python-3.11.6.

Note that running module load py-tensorflow without a version loads our latest version. Once the module is loaded, you can test it from Python:

import tensorflow as tf

print(tf.reduce_sum(tf.random.normal([1000, 1000])))

Testing TensorFlow on the GPU

Install the GPU-enabled version of TensorFlow:

(tensorflow-venv) /scratch/users/k1234567$ pip install 'tensorflow[and-cuda]'

With the following commands you can test your TensorFlow installation on a GPU:

import tensorflow as tf

print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
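If the count above is zero, it helps to know whether TensorFlow was built with CUDA at all or whether the job simply has no GPU allocated. The following is a hedged sketch of a slightly more verbose check; `describe_gpus` is a hypothetical helper name, and the function returns an empty list instead of failing when TensorFlow is not installed.

```python
def describe_gpus():
    """Return the names of the GPUs TensorFlow can see ([] if none are visible)."""
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
        return []

    # Distinguishes a CPU-only build from a CUDA build that cannot see a device.
    print("Built with CUDA:", tf.test.is_built_with_cuda())

    gpus = tf.config.list_physical_devices("GPU")
    if not gpus:
        print("No GPU visible; check that the job requested one (e.g. --gres=gpu)")
    return [gpu.name for gpu in gpus]


if __name__ == "__main__":
    for name in describe_gpus():
        print("Found", name)
```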

Using Singularity

TensorFlow is also available through the Singularity containerisation tool:

singularity pull docker://tensorflow/tensorflow:2.11.0-gpu

With NVIDIA support enabled via the --nv flag, the following example command can be used to test your TensorFlow container:

singularity exec --nv tensorflow_2.11.0-gpu.sif python -c 'import tensorflow as tf; print(tf.sysconfig.get_build_info()["cuda_version"])'
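The one-liner above indexes the build information directly, which raises a KeyError if the "cuda_version" key is absent (as may be the case for a CPU-only image). A more defensive variant is sketched below; `cuda_build_summary` is a hypothetical helper name, and the keys read here are looked up with `.get` so that missing entries come back as None.

```python
def cuda_build_summary():
    """Return a dict of CUDA-related build details, or None without TensorFlow."""
    try:
        import tensorflow as tf
    except ImportError:
        return None

    info = tf.sysconfig.get_build_info()
    # .get avoids a KeyError when a key is missing from the build information.
    return {
        "is_cuda_build": info.get("is_cuda_build", False),
        "cuda_version": info.get("cuda_version"),
        "cudnn_version": info.get("cudnn_version"),
    }


if __name__ == "__main__":
    print(cuda_build_summary())
```

Saved under a file name of your choice (for example a hypothetical check_build.py), the script can be run inside the container in the same way as the command above, replacing the -c argument with the file name.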

Using TensorFlow in a Jupyter notebook

For a complete guide on how to launch Jupyter on CREATE HPC, please refer to our guide document here. The following example uses the virtual environment created above and installs jupyterlab directly into it:

/scratch/users/k1234567$ source tensorflow-venv/bin/activate
(tensorflow-venv) /scratch/users/k1234567$ pip install jupyterlab

However, when combining CREATE modules with self-installed software, please make a note of which Python version is in use to avoid potential incompatibility issues.
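One way to spot a mismatch early is to compare the running interpreter against the version the environment was built with. The sketch below assumes the environment was created from the python/3.11.6 module shown earlier, so `EXPECTED` is set to (3, 11); both that constant and the `python_matches` helper are illustrative names, not part of any CREATE tooling.

```python
import sys

# Assumption: tensorflow-venv was created with the python/3.11.6 module.
EXPECTED = (3, 11)


def python_matches(expected=EXPECTED, actual=None) -> bool:
    """Return True if the major.minor version of `actual` equals `expected`."""
    if actual is None:
        actual = sys.version_info
    return tuple(actual[:2]) == tuple(expected[:2])


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Version:", ".".join(str(part) for part in sys.version_info[:3]))
    if not python_matches():
        print(f"Warning: expected Python {EXPECTED[0]}.{EXPECTED[1]}")
```

Running this in the first cell of a notebook shows which interpreter the Jupyter kernel is actually using.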

Create a batch script for TensorFlow and Jupyter

Take note

Due to the resource overhead of both Jupyter and TensorFlow, please make sure you request a sufficient amount of compute resources via sbatch to avoid potential kernel instability issues when using your Jupyter notebook.

#!/bin/bash -l

#SBATCH --job-name=ops-tensorflow
#SBATCH --partition=gpu
#SBATCH --gres=gpu
#SBATCH --mem=4GB
#SBATCH --time=01:00:00
#SBATCH --signal=USR2

# get unused socket per https://unix.stackexchange.com/a/132524
readonly IPADDRESS=$(hostname -I | tr ' ' '\n' | grep '10.211.4.')
readonly PORT=$(python -c 'import socket; s=socket.socket(); s.bind(("", 0)); print(s.getsockname()[1]); s.close()')
cat 1>&2 <<END
1. SSH tunnel from your workstation using the following command:

ssh -NL 8888:${HOSTNAME}:${PORT} ${USER}@hpc.create.kcl.ac.uk

and point your web browser to http://localhost:8888/lab?token=<add the token from the jupyter output below>

When done using the notebook, terminate the job by
issuing the following command on the login node:

    scancel -f ${SLURM_JOB_ID}

END

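# activate the virtual environment; the relative path assumes the job is
# submitted from the directory that contains tensorflow-venv
source tensorflow-venv/bin/activate
jupyter-lab --port=${PORT} --ip=${IPADDRESS} --no-browser

printf 'notebook exited\n' 1>&2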

The equivalent script for the cpu partition omits the GPU request:

#!/bin/bash -l

#SBATCH --job-name=ops-tensorflow
#SBATCH --partition=cpu
#SBATCH --mem=4GB
#SBATCH --time=01:00:00
#SBATCH --signal=USR2

# get unused socket per https://unix.stackexchange.com/a/132524
readonly IPADDRESS=$(hostname -I | tr ' ' '\n' | grep '10.211.4.')
readonly PORT=$(python -c 'import socket; s=socket.socket(); s.bind(("", 0)); print(s.getsockname()[1]); s.close()')
cat 1>&2 <<END
1. SSH tunnel from your workstation using the following command:

ssh -NL 8888:${HOSTNAME}:${PORT} ${USER}@hpc.create.kcl.ac.uk

and point your web browser to http://localhost:8888/lab?token=<add the token from the jupyter output below>

When done using the notebook, terminate the job by
issuing the following command on the login node:

    scancel -f ${SLURM_JOB_ID}

END

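# activate the virtual environment; the relative path assumes the job is
# submitted from the directory that contains tensorflow-venv
source tensorflow-venv/bin/activate
jupyter-lab --port=${PORT} --ip=${IPADDRESS} --no-browser

printf 'notebook exited\n' 1>&2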
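
The batch scripts choose the Jupyter port with a Python one-liner. Expanded for readability, the same technique looks roughly like the sketch below; `find_free_port` is a hypothetical name for illustration. Binding to port 0 asks the operating system to assign any unused port, which is then read back from the socket.

```python
import socket


def find_free_port() -> int:
    """Ask the operating system for a currently unused TCP port number."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # Port 0 means "let the OS choose any free port".
        s.bind(("", 0))
        # getsockname() returns (address, port) for an AF_INET socket.
        return s.getsockname()[1]


if __name__ == "__main__":
    print(find_free_port())
```

The port is released as soon as the socket closes, so there is a small window in which another process could claim it before Jupyter binds to it. In practice this is rare, and resubmitting the job resolves it.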

Once you have submitted your batch script via sbatch, follow the instructions printed in the Slurm output to open JupyterLab in your browser and test your TensorFlow installation:

[1]: import tensorflow as tf
     tf.debugging.set_log_device_placement(True)
     a = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
     b = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
     c = tf.matmul(a, b)
     print(c)
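
To check the values TensorFlow prints for c, the same 2x3 by 3x2 product can be computed without TensorFlow. The following is a plain-Python sketch for cross-checking only; the `matmul` helper is illustrative and not intended for real workloads.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    # zip(*b) iterates over the columns of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
c = matmul(a, b)
print(c)  # [[22.0, 28.0], [49.0, 64.0]]
```

The tensor printed in the notebook should contain the same four values.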