Here are some tips if you’re running into trouble on the Wharton HPCC.
Interactive (qrsh, qlogin) Errors
If you try to qlogin or qrsh and receive the following error:
[Image: qlogin error]
This generally means that the queue is busy, a common occurrence. Please add the '-now no' option to your qlogin or qrsh command, like:
$ qlogin -now no
$ qrsh -now no stata
Note that with qrsh, the '-now no' option is an option to 'qrsh', not to the command you’re running (‘stata’ in this example).
Investigating Failed Jobs
If a job or jobs have failed, you can explore why in a couple of ways.
Log Files
Take a look at your output files, which are named by default JOBNAME + .o + JOBID + . + TASKID. Look for typos, missing packages or libraries, etc.
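As a rough sketch of the log check described above, the helpers below build the default output file name from its parts and scan the file for a few common failure words. The function names (job_log_name, scan_job_log), the search pattern, and the job name and IDs in the example are all hypothetical, chosen only for illustration; adjust the pattern to the software you are running.

```shell
#!/bin/sh
# Build the default output file name: JOBNAME + .o + JOBID + . + TASKID
# (hypothetical helper, not part of the HPCC tools)
job_log_name() {
    # $1 = job name, $2 = job ID, $3 = task ID
    printf '%s.o%s.%s\n' "$1" "$2" "$3"
}

# Print lines (with line numbers) that mention common failure words.
# The pattern is an assumption and only a starting point.
scan_job_log() {
    # $1 = path to the output file
    grep -n -i -E 'error|not found|no such file|killed' "$1"
}

# Example with a made-up job name and IDs:
#   scan_job_log "$(job_log_name myjob 12345 3)"
```

If the scan prints nothing, it is still worth reading the end of the file by hand, since not every failure uses one of these words.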
qacct
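Examine the output from qacct -j JOBID + . + TASKID. Look for ru_maxrss of > 5242880 (kilobytes, i.e. 5 x 1024 x 1024) for a default RAM job (5GB), or N x 1024 x 1024 (where N is the number of GB you requested) if you’ve requested more than the default job RAM. A value above that threshold suggests the job exceeded its memory allowance.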
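The threshold arithmetic above can be sketched in a few lines of shell. The function names (ram_limit, exceeds_ram) are hypothetical, and the sketch assumes that the ru_maxrss value you pass in is a plain integer in the same unit as the N x 1024 x 1024 threshold; if your qacct prints the value with a unit suffix, convert it first.

```shell
#!/bin/sh
# Threshold for a job that requested N GB of RAM: N x 1024 x 1024
# (hypothetical helper, not part of the HPCC tools)
ram_limit() {
    # $1 = GB requested
    echo $(( $1 * 1024 * 1024 ))
}

# Succeed (exit status 0) if the given ru_maxrss is above the threshold.
exceeds_ram() {
    # $1 = ru_maxrss value from qacct, $2 = GB requested (defaults to 5)
    maxrss=$1
    gb=${2:-5}
    [ "$maxrss" -gt "$(ram_limit "$gb")" ]
}

# Example with made-up numbers:
#   ram_limit 5                 prints 5242880
#   exceeds_ram 6000000         succeeds (above the 5GB default)
#   exceeds_ram 6000000 8       fails (below the 8GB threshold)
```

One way to pull the value out of the qacct output, assuming it appears as a line of the form "ru_maxrss VALUE", is to pipe the output through awk '$1 == "ru_maxrss" {print $2}'.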
Reporting Trouble
If you’ve read through these tips and the Tools Page for the particular software package that you are running, and still have an issue, please send an e-mail to research-computing@wharton.upenn.edu, and include as many details as you can think of, particularly:
- an example JOB ID (the best detail for us!) and TASK ID (if an array job)
- the exact commands you were running when you saw the trouble
- any errors (feel free to copy/paste) that you received