Job Management

Resource Defaults

The HPC3 is designed to prevent any job or user from interfering with the resources (slots, cores, and memory) of other jobs and users. The system is not foolproof, so be aware of the following limits, which apply across the cluster. If a workflow runs out of resources, one possible fix is to split the work into a larger number of smaller jobs.

  • Slots/Cores: maximum of 64 running slots per team* (128 in the short.q queue; 32 in the mem.q queue, where RAM requested per core >= 16GB); 1 core per slot, 2 hyperthreads per core, 2GB RAM assigned (default) per slot
  • Queued jobs: maximum of 200 total running and queued jobs per team**
  • RAM: 1024GB of total RAM (all jobs combined) per team* in the default (hpc3.q) queue, 2048GB in the mem.q queue, maximum of 50GB for any single interactive (qrsh or qlogin) job. See Using More RAM (below) for details.
  • GPUs: maximum of 4 concurrently-running GPU queue (gpu.q) slots per team. Each task has a 4 hour time limit. See Using GPUs (below) for details.

* a team is faculty + all RAs & Co-Authors combined
** See Array Jobs for how to queue many more jobs than this limit!

While these defaults are generous, we understand that sometimes they might not fit your needs. Please contact research-computing@wharton.upenn.edu to discuss your requirements, and we will work with you to get the job done.

Submitting Jobs

The scheduler (we use Univa Grid Engine) controls all user access to the HPC3’s compute nodes. All jobs must be started with the qrsh command, or with the qsub command and a job script; both submit jobs to the HPC3 in an orderly fashion and allocate available resources. You’ll want to visit each specific software page (Mathematica, Matlab, Stata, etc.) for more information about creating the job script and submitting the job to a queue.

Basic job submission: qsub your-job-script.sh

qsub is one of a set of related job submission and monitoring commands for working with the scheduler. For more information on the other commands and their many options, see the man pages available on the system (for example, man qsub).

THIS SYSTEM IS NOT FOOLPROOF. If it is determined that you seem to be ‘gaming’ the system in any way at the expense of other users, your access to these systems may be revoked without warning.

Python Example

  • Create a software script file (called demo.py here), either on the cluster, or on your laptop/desktop and upload to the cluster:
import time
mytime = str(time.time())
print('Hello World!\nIt has been {0} seconds since the epoch!'.format(mytime))
  • Create a job script file (called demo.sh here) containing one simple line:
python demo.py
  • Submit to the cluster with qsub demo.sh
  • Output will be written in the same directory, to the files demo.sh.oXXXXXX and demo.sh.eXXXXXX, where XXXXXX is the job ID
  • You can watch the output while it’s running with tail -f demo.sh.oXXXXXX (quit this with Ctrl-C)
  • See “More qsub Options” below for more details on options to qsub that you can add on the command line, or in your script

SAS Example

  • Create a software script file (called demo.sas here):
proc iml;
a={ 1 2 3 };
run;
print a;
run;

endsas;
  • Create a job script file (called demo.sh here) containing one simple line:
sas -nodms demo.sas
  • Submit to the cluster with qsub demo.sh
  • Output will be written in the same directory, to the files demo.sh.oXXXXXX and demo.sh.eXXXXXX, where XXXXXX is the job ID
  • You can watch the output while it’s running with tail -f demo.sh.oXXXXXX (quit this with Ctrl-C)

Using More RAM

To use more than the default 2GB of RAM for a job, modify the m_mem_free option (default 2G) of the job, either in your job script:

#$ -l m_mem_free=12G

python myPythonCode.py

Or as an option to your qsub command: qsub -l m_mem_free=12G job_script.sh

That option will be passed to all slots used for a job, and each task in a task array job, so if you’re doing a parallel job (except MATLAB, see below), each worker will have the adjusted value.

Keep in mind the RAM limits:

  • RAM: 1024GB of RAM total (all jobs combined) per project team*
  • RAM: 240GB of RAM per compute host / any single job (for a limited number of hosts)

* For example: if you request 50GB of RAM (-l m_mem_free=50G) for a 100-task Array Job, you will have 20 running tasks (1024/50=20.48, rounded down), and 80 queued tasks.
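
The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration: the function name running_and_queued and its parameters are hypothetical, and it models the team RAM limit alone, so the scheduler's real behavior (slot limits, host availability) may differ.

```python
def running_and_queued(total_tasks, ram_per_task_gb, team_ram_limit_gb=1024):
    """Estimate how many array tasks can run at once under a team RAM cap.

    Hypothetical helper for illustration; only the RAM limit is modeled,
    not slot limits or host availability.
    """
    if ram_per_task_gb <= 0:
        raise ValueError("ram_per_task_gb must be positive")
    # Whole tasks that fit under the cap (floor division rounds down).
    fit = int(team_ram_limit_gb // ram_per_task_gb)
    running = min(total_tasks, fit)
    queued = total_tasks - running
    return running, queued


# 100 tasks at 50GB each under the 1024GB limit
print(running_and_queued(100, 50))  # (20, 80)
```

Lowering the per-task request lets more tasks run at once, which is one reason to request only the RAM a task actually needs.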

MATLAB Parallel (parpool) Worker RAM

Since MATLAB parallel workers are started from within your master MATLAB job via a separate, internal architecture, requesting more RAM in your master job will not be passed along to the workers. To request more RAM for each of your MATLAB parallel workers, modify your UGE job (.sh) script as below, making sure the export MATLAB_WORKER_RAM=XG line is above the line where you call MATLAB:

export MATLAB_WORKER_RAM=XG
matlab < mymatlabcode.m

That will give you X GB of RAM per worker when your parallel job starts. Keep the RAM totals available to you in mind when setting this number: X * #workers needs to be < total RAM per user.
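
The constraint X * #workers < total RAM can be checked before you submit. A minimal sketch in Python, assuming the 1024GB default-queue total from Resource Defaults is the limit that applies to you; the function name max_worker_ram_gb is hypothetical.

```python
def max_worker_ram_gb(workers, ram_limit_gb=1024):
    """Largest whole-GB X such that X * workers stays strictly below the limit.

    Hypothetical helper; ram_limit_gb defaults to the assumed 1024GB total.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Strict inequality: X * workers < ram_limit_gb
    return (ram_limit_gb - 1) // workers


# With 16 parallel workers, the largest X that keeps the total under 1024GB
print(max_worker_ram_gb(16))  # 63
```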

Using GPUs

Please only use the GPU queue for work that requires a GPU. To run your job on a server with a GPU, use the -q gpu.q option, either in your job script:

#$ -q gpu.q

python -u myPythonCode.py

Or as an option to your qsub, qrsh, or qlogin command: qsub -q gpu.q job_script.sh

Keep in mind the GPU limits:

  • Time limit: 4 hours
  • GPU Memory: 16GB (NVIDIA V100 GPU)
  • RAM: 50GB of RAM per GPU server, which is therefore the limit for any single gpu.q task or job

More qsub Options

There are a number of very useful options to the qsub command. You can add these options to the qsub command itself, OR in your job script file, preceded by ‘#$’. Here is a short list of a few of our favorites:

Option                               Description
-q queuename                         NO LONGER RECOMMENDED! See Using More RAM (above), or -l m_mem_free=XG (below). Run this job in a specific ‘queue’ (often ‘bigram’, or ‘vbigram’)
-l m_mem_free=##G                    alter the amount of RAM your job requires (replace ## with a number)
-j y                                 ‘join’ the output and error files (one output file only)
-o fileORdirname                     force the output file to be named something, OR write to a specified directory
-m e -M username@wharton.upenn.edu   send an e-mail at job completion (DON’T use this in a task array (-t) job!!)
-N jobname                           give the job a specific name (to differentiate between many running jobs)
-sync y                              wait until this job is complete before ‘moving on’ to the next qsub job (for example, in a loop)
-t x-y                               run an ‘array’ of jobs, starting with job x, ending with job y (a Task Array)
-pe mpi #                            run this job as an MPI job, with a # of workers (your software must support MPI)
-pe openmp #                         run this job as an OpenMP (SMP) job, with a # of workers (your software must support OpenMP threading)

Here is an example demo.sh with a few options added:

#$ -N series003
#$ -j y
#$ -o myoutputdir
#$ -t 1-1600

matlab -nodisplay -r "mycode($SGE_TASK_ID)"

These options will: name the job ‘series003’, join the output and error files, write the job output to the myoutputdir directory (that directory must exist! Otherwise it will write all output to a myoutputdir file instead), and run 1600 copies of the job (tasks: a 1600-task job), passing the $SGE_TASK_ID environment variable to the MATLAB code as the only argument.
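
For comparison, a Python task script can read the same task ID from its environment instead of receiving it as an argument. A minimal sketch: SGE_TASK_ID is the variable used in the example above, while the function name input_for_task and the input_0001.csv naming scheme are hypothetical.

```python
import os


def input_for_task(environ=os.environ, default_task_id=1):
    """Return (task_id, input filename) for the current array task.

    The "input_0001.csv" naming scheme is an assumption for illustration.
    """
    raw = environ.get("SGE_TASK_ID", "")
    # Outside an array job the variable may be unset or hold a non-numeric
    # placeholder, so fall back to a default to allow local testing.
    task_id = int(raw) if raw.isdigit() else default_task_id
    return task_id, "input_{:04d}.csv".format(task_id)


# Simulate task 37 of the array without touching the real environment
print(input_for_task({"SGE_TASK_ID": "37"}))  # (37, 'input_0037.csv')
```

Passing the environment in as a parameter keeps the function easy to test on a laptop before submitting a 1600-task job.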

Job Output

Output will be written to the directory where you started your job, in the following files (where ***** is the scheduler job ID):

  • Command Output: your-job-script.sh.o*****
  • Error Output: your-job-script.sh.e*****

You can specify file redirection in your job script with command < your-commands-file > your-output-file

You can watch the output while it’s running with tail -f your-job-script.sh.o***** and quit this with Ctrl-C

Job Messaging

It’s easy to get notified via email or e-mail-to-SMS when your job is completed. Submit your job with flags to email when completed:

qsub -m e your-job-script.sh

By default, the notification will go to the address in your ~/.forward file, which is pre-populated with your Wharton e-mail address: username@wharton.upenn.edu. You can send these messages to a different e-mail address — for example your SMS e-mail address — by adding the -M option:

qsub -m e -M #########@vtext.com your-job-script.sh

That would e-mail #########@vtext.com at the end of the job. You can even e-mail multiple addresses, by separating them with a comma in the -M option:

qsub -m e -M #########@vtext.com,username@wharton.upenn.edu your-job-script.sh

Be careful: don’t use this option with task array jobs (well, unless they are small, say < 100), as you will get one e-mail for each completed task.
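
If you generate your qsub commands from a script, the rule above can be built in. A minimal sketch in Python, assuming a threshold of 100 tasks as suggested above; the function name qsub_mail_args and its parameters are hypothetical, and the function only assembles the command without running it.

```python
def qsub_mail_args(script, addresses=(), task_range=None, max_mail_tasks=100):
    """Build a qsub argument list that e-mails at job end.

    Hypothetical helper: it only assembles the command, it does not run it.
    task_range is an optional (first, last) tuple for a -t array job.
    """
    args = ["qsub"]
    n_tasks = 1
    if task_range is not None:
        first, last = task_range
        n_tasks = last - first + 1
        args += ["-t", "{}-{}".format(first, last)]
    # One e-mail is sent per completed task, so skip mail for large arrays.
    if n_tasks < max_mail_tasks:
        args += ["-m", "e"]
        if addresses:
            args += ["-M", ",".join(addresses)]
    args.append(script)
    return args


print(qsub_mail_args("your-job-script.sh", ["username@wharton.upenn.edu"]))
```

The returned list can be passed to subprocess.run on the cluster; building a list rather than a single string avoids shell quoting problems.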

Monitoring Jobs and Hosts

There are some useful commands for job and host monitoring from the command line:

  • qstat: displays the status of the HPC3 queues, by default displaying your job information. It includes running and queued jobs
  • qram: displays the RAM usage of each of your running jobs/tasks
  • hpccstatus: displays further cluster job details, including available slots
  • qhost: displays the status of all the nodes of the HPC3
  • qacct -j: displays accounting info about completed jobs

Run man command (for example, man qstat) for more usage information.

Altering Jobs

The qalter command is used to alter queue options for queued and running jobs. Be sure to check out more usage information with man qalter.

Deleting Jobs

The qdel command is used to kill queued and running jobs. Be sure to check out more usage information with man qdel.

Array Jobs

Submitting Multiple Runs of a Job with one command!
If you plan on running many similar jobs (for example: MCMC on different data, optimization on a different set of inputs, etc.), try an Array Job rather than submitting dozens or thousands of individual qsub commands. Take a look at our Array Jobs page for details. You’ll be glad you did!

Chaining Jobs

You may need to chain jobs to perform operations on output files of previous jobs or to ensure you only use certain resources. To run jobs that are chained together, add the -hold_jid JOBID option to your second (or third, or fourth) job, where JOBID is the job ID number (or, preferably, the job name) of the prerequisite job. The second job will queue, but not start until the first job completes. You can also chain tasks, so that each task in a second array job starts after the corresponding task in the first job completes. Use -hold_jid_ad JOBID for that.
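
A two-step chain can be sketched as a pair of qsub commands, built here in Python. This is an illustration under stated assumptions: the function name chained_qsub_args, the script names, and the job name step1 are hypothetical, and the function only builds the argument lists without submitting anything.

```python
def chained_qsub_args(first_script, second_script, first_name="step1", per_task=False):
    """Return two qsub argument lists where the second job waits on the first.

    Hypothetical helper: it builds the commands but does not submit them.
    """
    # Name the prerequisite job so the hold can refer to it by name.
    first = ["qsub", "-N", first_name, first_script]
    # -hold_jid waits for the whole job; -hold_jid_ad waits task by task.
    hold_flag = "-hold_jid_ad" if per_task else "-hold_jid"
    second = ["qsub", hold_flag, first_name, second_script]
    return first, second


first, second = chained_qsub_args("prepare.sh", "analyze.sh")
print(" ".join(first))   # qsub -N step1 prepare.sh
print(" ".join(second))  # qsub -hold_jid step1 analyze.sh
```

Holding on a job name means the second command does not need to capture the numeric job ID printed by the first.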