Array Jobs

Automated Multiple Runs of a Job
If you plan to run many similar jobs (for example, MCMC on different data, or optimization on different sets of inputs), try an Array Job rather than submitting dozens or thousands of individual qsub commands.

Univa Grid Engine’s ‘qsub -t’ Method

Univa Grid Engine’s ‘qsub’ command, combined with the -t option, allows the submission of an ‘array’ of jobs. When a job is launched via ‘qsub -t n[-m[:s]]’, each task has the environment variables SGE_TASK_ID and SGE_TASK_FIRST; if you add m you will also have SGE_TASK_LAST, and if you add s you will also have SGE_TASK_STEPSIZE. The -t option sets the ‘index numbers’ associated with the job, like so:

  • n is the first index number
  • m is the last index number (optional)
  • s is the step size (optional, defaults to 1)
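
To make the index numbers concrete, the sketch below expands a -t range into the task IDs it covers. It is only a local illustration: expand_task_range is a hypothetical helper, not part of Grid Engine, and it assumes that a bare n denotes the single task n.

```shell
#!/bin/bash
# Illustration only: expand a "-t n[-m[:s]]" range into its task IDs.
# expand_task_range is a hypothetical helper, not a Grid Engine command.
# The body runs in a subshell ( ... ) so its variables do not leak out.
expand_task_range() (
    spec=$1
    step=1                      # s defaults to 1 when omitted
    case $spec in
        *:*) step=${spec##*:}; spec=${spec%%:*} ;;
    esac
    case $spec in
        *-*) first=${spec%%-*}; last=${spec##*-} ;;
        *)   first=$spec; last=$spec ;;   # assumption: bare n is one task
    esac
    seq "$first" "$step" "$last"
)

expand_task_range 1-5      # prints 1 2 3 4 5, one per line
expand_task_range 2-10:4   # prints 2 6 10, one per line
```

Each printed number corresponds to one task, and is the value that task would see in SGE_TASK_ID.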

Example 1: Hello World

For example, let’s run 5 ‘hostname’ jobs (not strictly “Hello World”, but more demonstrative) by submitting an array job with -t 1-5, then see what qstat reports and what we get in our output.
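
A minimal sketch of such a submission might look like the following. The file name hostname_job.sh is hypothetical, and the qsub line is left as a comment because it only works on a Grid Engine submit host. The script prints "unset" for the task ID when it is run outside an array job.

```shell
# Write a small job script (hostname_job.sh is a hypothetical name).
cat > hostname_job.sh <<'EOF'
#!/bin/bash
#$ -cwd
# Report which task this is and which host it landed on.
echo "task ${SGE_TASK_ID:-unset} ran on $(hostname)"
EOF

# Submit it as tasks 1 through 5 (requires a Grid Engine submit host):
# qsub -t 1-5 hostname_job.sh
```

To try the script without the scheduler, set the variable by hand, for example: SGE_TASK_ID=3 bash hostname_job.sh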

That launches the job. The confirmation message from qsub should show the task range written out in full, as 1-5:1.

Notice how it defaulted the s (step) to :1. Now take a look at qstat to watch the tasks of the array.

Great! The job is running. When it completes, look at the output, which Grid Engine writes to a separate file for each task.

So we had five jobs on five different hosts, and each had a different SGE_TASK_ID.

To continue with the next ‘set of 5’ jobs, submit the same job again with -t changed to 6-10.
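
If you work through a long series in fixed-size sets, the range for each set can be computed rather than typed by hand. The sketch below assumes sets are numbered from 1. Both batch_range and myjob.sh are hypothetical names.

```shell
#!/bin/bash
# batch_range K SIZE: print the -t range for the K-th set of SIZE tasks.
# Hypothetical helper for illustration; sets are numbered from 1.
batch_range() {
    echo "$(( ($1 - 1) * $2 + 1 ))-$(( $1 * $2 ))"
}

batch_range 1 5   # prints 1-5
batch_range 2 5   # prints 6-10

# On a submit host, the second set could then be launched with:
# qsub -t "$(batch_range 2 5)" myjob.sh   # myjob.sh: hypothetical job script
```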

Example 2: 10 scripts

So how can we use this in our code? Consider the following task: run 10 R script files named mycode-1.R (or mycode-1.m, etc.) through mycode-10.R. Create those 10 scripts (the painful part), then submit one array job with -t 1-10 whose job script runs mycode-$SGE_TASK_ID.R.
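
One possible job script for this task is sketched below. The file name run_mycode.sh is hypothetical, and the use of Rscript is an assumption, since your site may prefer another way of running R in batch. The existence check is there so that a task with no matching script fails loudly instead of silently doing nothing.

```shell
# Write the job script (run_mycode.sh is a hypothetical name).
cat > run_mycode.sh <<'EOF'
#!/bin/bash
#$ -cwd
# Each task runs the script whose number matches its task index.
script="mycode-${SGE_TASK_ID}.R"
if [ ! -f "$script" ]; then
    echo "missing $script" >&2
    exit 1
fi
# Assumption: Rscript is on the PATH of the execution hosts.
Rscript "$script"
EOF

# Submit one task per script (requires a Grid Engine submit host):
# qsub -t 1-10 run_mycode.sh
```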

Example 3: 10 data files

While that’s moderately useful, let’s say you have 10 tab-separated text data files to evaluate with the same code. Name the data files mydata-1.txt through mydata-10.txt, and write a Matlab script file called ‘mydataread.m’ that uses SGE_TASK_ID to pick the data file to read.

Then run it as an array job with -t 1-10.
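
A sketch of a job script for this case follows. It assumes that mydataread.m selects its own input, for instance by reading the SGE_TASK_ID environment variable with getenv. The file name mydataread_job.sh is hypothetical, and the matlab command-line flags shown may need adjusting for your installation.

```shell
# Write the job script (mydataread_job.sh is a hypothetical name).
cat > mydataread_job.sh <<'EOF'
#!/bin/bash
#$ -cwd
# Each task handles the data file whose number matches its task index.
datafile="mydata-${SGE_TASK_ID}.txt"
if [ ! -f "$datafile" ]; then
    echo "missing $datafile" >&2
    exit 1
fi
# Assumption: mydataread.m picks the file itself, e.g. via
# getenv('SGE_TASK_ID'); SGE_TASK_ID is inherited by the matlab process.
matlab -nodisplay -nosplash -r "mydataread; exit"
EOF

# Submit one task per data file (requires a Grid Engine submit host):
# qsub -t 1-10 mydataread_job.sh
```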

The output from the number 3 data file will be different from that of the number 6 data file.

Pretty useful!

Shell Loop Method

Shell languages like Bash (the default in the examples in this documentation) provide looping constructs such as while and for; csh and tcsh also provide foreach. A full discussion is outside the scope of this document, but one simple approach (of many) is to modify your job script so that a loop repeats the command once per run.
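
As a minimal sketch of the loop approach, the job script below repeats a placeholder command ten times inside a single job. The file name loop_job.sh is hypothetical, and the echo line stands in for whatever each run should really do. The loop is bounded by an explicit list of indices, so it always terminates.

```shell
# Write the job script (loop_job.sh is a hypothetical name).
cat > loop_job.sh <<'EOF'
#!/bin/bash
#$ -cwd
# One job, many runs: repeat the command once per index, sequentially.
for i in $(seq 1 10); do
    # Hypothetical stand-in for the real work, e.g. Rscript "mycode-$i.R"
    echo "run $i on $(hostname)"
done
EOF

# Submit as a single, non-array job (requires a Grid Engine submit host):
# qsub loop_job.sh
```

Unlike the tasks of an array job, these runs execute one after another on a single host, so this method suits short runs better than long ones.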

LOOP Warning: Make sure that you do not create an infinite loop by using while [[ 1 ]], or something that will always evaluate to true.