Python on the HPCC
Python 2.7 and several 3.x versions are available on the HPCC compute servers. By default, the python command invokes version 2.7. To use a Python 3 version on a compute server, first load it with the module load python/VERSION command, for example ‘module load python/3.10.4’ to load Python 3.10.4 (currently the latest version on the Wharton HPCC systems). We highly recommend using Python 3.8+, falling back to 2.7 only if a required module does not support Python 3 (very rare at this point), as both Python 2.7 and Python 3.6 are now officially ‘deprecated’ and considered unsupported.
NOTE: All examples below assume that you will be using Python 3.
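Because the default python command is still 2.7, a script can check the interpreter version itself before doing any work. The following is a minimal sketch, not an official HPCC tool: the MIN_VERSION value and the version_ok helper are illustrative names, with the minimum taken from the 3.8+ recommendation above. It deliberately avoids f-strings so that Python 2.7 can still parse it and print the message.

```python
import sys

# Assumed minimum, taken from the "Python 3.8+" recommendation above.
MIN_VERSION = (3, 8)


def version_ok(version_info=sys.version_info, minimum=MIN_VERSION):
    """Return True if the interpreter is at least the minimum version."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    found = sys.version.split()[0]
    if not version_ok():
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(
            "Python {}.{}+ required, found {}; "
            "did you run 'module load python/VERSION'?".format(
                MIN_VERSION[0], MIN_VERSION[1], found
            )
        )
    print("Running Python {}".format(found))
```

Placing a check like this at the top of a job's main script makes a forgotten module load fail immediately with a readable message rather than with a syntax error somewhere later.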
Submitting a Basic Python Job
NOTE: You’ll need to qlogin to a compute node first, whether to install new modules or to run Python. Please do not run code on the login nodes.
Create a .py file
Create a myfile.py file with program content, for example:
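import random

# Draw 10 samples from a standard normal distribution
rlist = []
for i in range(10):
    rlist.append(random.gauss(0, 1))

# Write the list to a file; 'with' closes the file when done
with open('mylist.txt', 'w') as outfile:
    outfile.write(str(rlist) + "\n")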
Create a job script
Create a script file (myfile.sh in this example), for example:
#!/bin/bash
#$ -N jobname
#$ -j y # join output and error
module load python/3.10.4 # <- load a recent version of Python 3
python myfile.py
Submitting the job script
Submit the job with:
qsub myfile.sh
Setting up your own Python environment
While the above may be okay for extremely simple code, generally you’ll need to install some Python modules to do your Python work. Here’s how to set things up properly in the HPCC for your Python projects.
NOTE: Remember that with each qlogin or job script, Python 3 (if used) must be re-enabled and your virtualenv (if used) re-activated. These module and source commands can be added to your job script or to your ~/.bashrc file, in the MPI SELECTION section, between the if and fi lines. A ‘complete’ example:
## START MPI SELECTION
if [[ $(hostname -s | grep "^hpcc[0-9]*$") ]]; then
    module load mpi/openmpi-x86_64
    module load python/3.10.4
    module load gcc/11.3.0 # <- also a more recent compiler!
fi
## END MPI SELECTION
Python provides functions and features via what are called modules. The recommended way to install Python modules is with the pip command inside a virtual environment (virtualenv). A virtualenv is a self-contained directory that holds a set of Python modules. In this way, you can keep a different set of modules, possibly with different versions, for each project.
Log on to a Compute Node
Please do all Python work on a compute node:
qlogin -now no
Enable Python 3
If you didn’t set up Python 3 in your ~/.bashrc file as described above (highly recommended!), load it now:
module load python/3.10.4
We also recommend that you load a more recent compiler, as many Python modules will not install properly without one:
module load gcc/11.3.0
Create a Project Folder
If you don’t already have a directory specifically for your project, create one. The command below creates a ‘projectA’ directory in your home directory; substitute your own project name. (~ is shorthand for ‘my home directory’; you can also use $HOME.)
mkdir ~/projectA
Create a Project Virtual Environment
Change your directory to the project directory, and create a virtual environment. I like ‘venv310’, where ‘venv’ means ‘virtual environment’, and ‘310’ tells me the Python version I’m running under:
cd ~/projectA
python -m venv venv310
Activate the virtualenv
Now you’re ready to ‘activate’ the virtual environment, and ‘do stuff’, whether that’s run code, or install modules.
source venv310/bin/activate
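To confirm from inside Python that the activation took effect, you can compare sys.prefix with sys.base_prefix. This is a small sketch rather than part of the HPCC setup, and in_virtualenv is an illustrative helper name. It relies on documented behaviour of environments created with python -m venv: sys.prefix points at the environment directory, while sys.base_prefix points at the Python installation the environment was created from.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Outside a virtual environment the two prefixes are identical.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
```

With venv310 activated, the prefix printed should be the venv310 directory inside your project folder.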
Update the virtualenv
We highly recommend that you initially update a few packages when you create a new virtual environment.
python -m pip install -U pip
python -m pip install -U setuptools wheel
Install the modules that you need into the active virtualenv
python -m pip install pandas matplotlib
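After installing, it can be useful to check that the modules are visible to the interpreter you are actually running. The sketch below uses importlib.util.find_spec from the standard library. The REQUIRED list and the missing_modules helper are illustrative names, and the check is meant for top-level module names only.

```python
import importlib.util

# Module names are illustrative; pandas and matplotlib match the pip command above.
REQUIRED = ["pandas", "matplotlib"]


def missing_modules(names):
    """Return the top-level module names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED)
if missing:
    print("missing:", ", ".join(missing))
else:
    print("all required modules are available")
```

If a module is reported missing even though pip installed it, the usual cause is that the virtual environment was not activated in the current session.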
Now you can work interactively in Python (generally just for simple testing), or log out of the compute host, so you can run batch jobs with qsub:
logout
Submitting a Virtual Environment Python Job (qsub)
Create or Modify a job script
Create a script file (myfile.sh in this example) in the projectA directory. Note that the only difference between this script and the basic script at the top is the virtual environment activation (source) line:
#!/bin/bash
#$ -N jobname
#$ -j y # join output and error
module load python/3.10.4
source venv310/bin/activate
python myfile.py
Run it like any normal qsub job:
qsub myfile.sh
A note about standard output aka print
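Output from a job running under Grid Engine may be buffered, so print statements can show up in the output file well after they were executed. If you would like to see print statements from a running job as they occur, say for debug or progress messages, do one of the following. Either pass flush=True to print: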
print(f"print some text here with a {variable}", flush=True)
OR add the ‘-u’ option to your python command in your job script:
python -u myfile.py
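If a script prints progress in many places, wrapping the flush in one small helper keeps it from being forgotten. This is a minimal sketch; the log function and its timestamp format are illustrative choices, not something the HPCC requires.

```python
import sys
import time


def log(message, stream=None):
    """Write a timestamped progress message and flush it right away."""
    stream = sys.stdout if stream is None else stream
    stamp = time.strftime("%H:%M:%S")
    # flush=True pushes the line out immediately instead of waiting for the buffer to fill.
    print(f"[{stamp}] {message}", file=stream, flush=True)


for step in range(3):
    log(f"finished step {step}")
```

Each call writes one line straight away, so following the job's output file while the job runs shows progress as it happens.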