Stata is a complete, integrated statistical package that provides everything you need for data analysis, data management, and graphics. More information can be found at: http://www.stata.com/
Stata on the HPC3
Interactive Stata Sessions
- Graphical: log in to hpc3-desktop.wharton.upenn.edu. Once you have started a MATE desktop, open a Terminal and run:
qlogin
Then run the following command to start the GUI:
xstata-se
- Textual:
qrsh stata-se
Submitting a Stata Job
Create Stata Commands File
Create a .do file with your commands, for example:
display "Hello, world"
exit
Create Stata Job Script
Create a .sh file with at least the following contents:
#!/bin/bash
stata-se do your-commands-file.do > filename_${JOB_ID}.log 2>&1
Notice the '>' and '2>&1': together they send both regular output and error messages to a log file whose name includes the job's JOB_ID, so each run of the script gets its own log. This matters even more for array jobs, where you will also want to add SGE_TASK_ID to the log name:
#!/bin/bash
stata-se do your-commands-file.do > filename_${JOB_ID}_${SGE_TASK_ID}.log 2>&1
A fleshed out script (/usr/local/demo/Stata/demo.sh) that works well for most Stata jobs:
#!/bin/bash
#$ -N stata_demo
#$ -j y
# set up the Stata job
DOFILE=$1
LOG=$DOFILE.${JOB_ID}$(if [[ ! ${SGE_TASK_ID} == 'undefined' ]]; then echo ".${SGE_TASK_ID}"; fi).log
shift 1
echo $@
# say what you're doing
echo running stata dofile: ${DOFILE}, logging to: ${LOG}$(if [[ -n $@ ]]; then echo ", with args: $@"; fi)
# RUN THE JOB
stata-se do $DOFILE > $LOG 2>&1 $@
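The LOG line is the densest part of the script above: it appends the task ID to the log name only for array jobs, relying on SGE_TASK_ID holding the literal string "undefined" in a non-array job. A minimal sketch of the same logic as a standalone function, for illustration only (the function name log_name is hypothetical and is not part of demo.sh):

```shell
#!/bin/bash
# Hypothetical helper mirroring the LOG line in demo.sh.
# Usage: log_name DOFILE JOB_ID [SGE_TASK_ID]
log_name() {
    local dofile=$1 job_id=$2 task_id=${3:-undefined}
    local name="${dofile}.${job_id}"
    # A non-array job has the task ID "undefined", so nothing is appended
    if [[ "${task_id}" != "undefined" ]]; then
        name="${name}.${task_id}"
    fi
    echo "${name}.log"
}

log_name myDOfile.do 4242      # non-array job: myDOfile.do.4242.log
log_name myDOfile.do 4242 7    # array task 7:  myDOfile.do.4242.7.log
```

Spelling the condition out as an if block makes it easier to adapt the naming scheme, for example to put logs in a subdirectory.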
Submit Stata Job
$ qsub demo.sh myDOfile.do arg1 arg2 arg3
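For array jobs, each task usually needs its own input. One common pattern, sketched here under assumptions, is to keep one argument per line in a text file and have each task read the line that matches its SGE_TASK_ID. The file names params.txt and array.sh and the function name param_for_task are hypothetical; qsub -t is the standard Grid Engine option for submitting an array job.

```shell
#!/bin/bash
# Hypothetical helper: print line N of a parameter file,
# where N is the array task ID.
# Usage: param_for_task FILE TASK_ID
param_for_task() {
    local file=$1 task_id=$2
    sed -n "${task_id}p" "${file}"
}

# Inside a job script submitted with, for example:
#   qsub -t 1-3 array.sh myDOfile.do
# the selected value could be handed to the do-file as an argument:
#   ARG=$(param_for_task params.txt "${SGE_TASK_ID}")
#   stata-se do "$1" "${ARG}" > "$1.${JOB_ID}.${SGE_TASK_ID}.log" 2>&1
```

With a three-line params.txt, task 2 would receive the second line as its argument.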
See Job Management for more information on job submission, monitoring, and control.
Install Personal Packages to PERSONAL Directory
Wharton Computing will generally not install packages that were not developed by StataCorp (user-written packages). You are welcome to install them to your own PERSONAL directory.
There are a tremendous number of user-written programs for Stata available which, once installed, act just like official Stata commands. Some are conveniences, like outreg for formatting regression output. Others calculate results Stata itself does not, such as polychoric for polychoric correlations. A few represent major extensions of Stata’s capabilities, such as ice and mim for multiple imputation or gllamm for mixed models. Most of these programs are stored at Boston College’s Statistical Software Components archive (or SSC).
You are welcome to install any user-written commands you want to use on Wharton's HPC systems; they are installed into your PERSONAL directory (one of Stata's system directories). To see which directories Stata is currently using, run the sysdir command in Stata. Because the programs go into your own PERSONAL directory, you don't need to worry about programs you install causing problems for others. On the other hand, it means you must install the user-written programs you want yourself. Because our user base is extremely diverse, Wharton Computing does not try to identify a useful set of user-written programs and make them available to everyone.
Finding User-Written Programs
If you know the name of the program you want to use, you can go directly to Installing User-Written Programs. However, it’s much more common to know what you want to do without knowing what program (if any) can do it. This is a job for Stata’s findit command.
For example, suppose you want to do something with Heckman selection models but don't know what command to use. Type:
. findit heckman
The result is a tremendous amount of information. The findit command first searches Stata’s official help files and notes that there is an official heckman command and several other related commands (this makes findit a powerful tool for figuring out how to do things in Stata in general, not just for finding user-written programs). It then searches Stata’s web site and locates several FAQ entries, plus an example on UCLA’s large statistics web site. It then begins to list relevant user-written programs, organized into “packages.” Programs that were described in the Stata Journal or the older Stata Technical Bulletin are listed first.
Installing User-Written Programs
If you know the name of the package you want to install, you can install it by typing in Stata:
. ssc install package
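If you need several packages, it may be convenient to install them in one batch run rather than typing each command. A minimal sketch, assuming the machine that runs the do-file can reach the SSC archive; the function name install_dofile_lines and the file name install_pkgs.do are hypothetical. It prints one ssc install line per package, using the replace option of ssc install so that re-running it does not stop on packages that are already present:

```shell
#!/bin/bash
# Hypothetical helper: print the lines of a do-file that installs
# each named SSC package.
# Usage: install_dofile_lines PACKAGE...
install_dofile_lines() {
    local pkg
    for pkg in "$@"; do
        echo "ssc install ${pkg}, replace"
    done
    echo "exit"
}

install_dofile_lines outreg gllamm
```

Redirect the output to a file (install_dofile_lines outreg gllamm > install_pkgs.do) and run it like any other do-file, for example with qsub demo.sh install_pkgs.do.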
Updating User-Written Programs
It can be important to maintain the latest versions of any user-written programs you install. Sometimes updates will include important bug fixes, though the SSC archive has quality control measures in place to try to catch bugs before the program is distributed.
The easiest way to check that your user-written programs are up-to-date is to type:
. adoupdate
The adoupdate command notes where each package was downloaded from and goes back to that location to see if a more recent version is available. If there is, you can install the latest version by typing:
. adoupdate, update
You can get a list of the packages you’ve installed by typing:
. ado dir
This can be very helpful for catching things like having downloaded st0067_3 rather than ice. You can remove a package by typing:
. ado uninstall package
where package should be replaced by either the name of the package you want to remove or the number it is given by ado dir, including the brackets around it.
For example, suppose you downloaded some earlier version of ice that was associated with a Stata Journal article. Just typing:
. ssc install ice
will fail because you already have a copy of ice.ado and all the other related files, and the installer refuses to overwrite them. Thus you need to identify and remove the older version. To do so, type:
. ado dir
If the results included the entry:
[6] package st0067_3 from http://www.stata-journal.com/software/sj7-4
SJ7-4 st0067_3. Update: Multiple imputation of missing...
you could remove it by typing either:
. ado uninstall st0067_3
or
. ado uninstall [6]
Then
. ssc install ice
will successfully install the latest version. You should then type:
. adoupdate
periodically to ensure that ice stays up-to-date.