Stata

Stata is a complete, integrated statistical package that provides everything you need for data analysis, data management, and graphics. More information can be found at: http://www.stata.com/

Stata on the HPC3

Interactive Stata Sessions

  • Graphical: log in to hpc3-desktop.wharton.upenn.edu. Once you have started a MATE desktop, open a Terminal and run the qlogin command; xstata-se will then start the GUI.
  • Textual: run qrsh stata-se

Submitting a Stata Job

Create Stata Commands File

Create a .do file with your commands, for example:

display "Hello, world"
exit

Create Stata Job Script

Create a .sh file with at least the following contents:

#!/bin/bash

stata-se do your-commands-file.do > filename_${JOB_ID}.log 2>&1

Notice the ‘>’ and ‘2>&1’: together they send both standard output and error messages to a log file whose name includes the JOB_ID number, so that if you run the code multiple times, you will have separate logs. This matters even more for Array Jobs, where you will also want to add SGE_TASK_ID to the log name:

#!/bin/bash

stata-se do your-commands-file.do > filename_${JOB_ID}_${SGE_TASK_ID}.log 2>&1
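
The log-naming pattern above can be wrapped in a small helper so that one job script serves both ordinary and array jobs. The following is a minimal sketch, not part of the cluster's tooling: the function name stata_log_name is hypothetical, and it assumes Grid Engine sets SGE_TASK_ID to the literal string 'undefined' for non-array jobs (the same assumption the demo script on this page makes).

```shell
#!/bin/bash
# Hypothetical helper: build a per-job (and, for array jobs, per-task)
# log file name from the variables Grid Engine sets for a running job.
# Assumption: SGE_TASK_ID is "undefined" when the job is not an array job.
stata_log_name() {
    local prefix=$1
    # Fall back to "nojob" when run outside the scheduler (e.g. for testing).
    local name="${prefix}_${JOB_ID:-nojob}"
    if [[ -n ${SGE_TASK_ID:-} && ${SGE_TASK_ID} != 'undefined' ]]; then
        name+="_${SGE_TASK_ID}"
    fi
    printf '%s.log\n' "$name"
}
```

In a job script, the redirect would then read: > "$(stata_log_name filename)" 2>&1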

A more complete script (/usr/local/demo/Stata/demo.sh) that works well for most Stata jobs:

#!/bin/bash
#$ -N stata_demo
#$ -j y

# set up the Stata job
DOFILE=$1
LOG=$DOFILE.${JOB_ID}$(if [[ ! ${SGE_TASK_ID} == 'undefined' ]]; then echo ".${SGE_TASK_ID}"; fi).log
shift 1
echo $@

# say what you're doing
echo running stata dofile: ${DOFILE}, logging to: ${LOG}$(if [[ -n $@ ]]; then echo ", with args: $@"; fi)

# RUN THE JOB
stata-se do $DOFILE > $LOG 2>&1 $@

Submit Stata Job

$ qsub demo.sh myDOfile.do arg1 arg2 arg3
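
The extra arguments on the qsub line are passed through demo.sh to Stata, where the do-file can read them as positional macros. The following is a minimal sketch, assuming a hypothetical do-file named args_demo.do (the macro names first, second, and third are also illustrative); Stata's args command gives the positional macros readable names.

```shell
#!/bin/bash
# Write a small, hypothetical do-file (args_demo.do) that reads the
# arguments given after the do-file name on the qsub line.
# The quoted 'EOF' stops the shell from interpreting Stata's `macro' quotes.
cat > args_demo.do <<'EOF'
* args assigns the positional macros `1', `2', `3' to named local macros
args first second third
display "first  = `first'"
display "second = `second'"
display "third  = `third'"
exit
EOF

echo "wrote args_demo.do; submit with: qsub demo.sh args_demo.do arg1 arg2 arg3"
```

Any argument that is not supplied simply leaves its macro empty, so the do-file should check for empty values if an argument is required.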

More information on job submission, monitoring, and control is available at Job Management.

Install Personal Packages to PERSONAL Directory

Wharton Computing will generally not install packages that were not developed by StataCorp (user-written packages). You are welcome to install them to your own PERSONAL directory.

A tremendous number of user-written programs are available for Stata; once installed, they act just like official Stata commands. Some are conveniences, like outreg for formatting regression output. Others calculate results Stata itself does not, such as polychoric for polychoric correlations. A few represent major extensions of Stata’s capabilities, such as ice and mim for multiple imputation or gllamm for mixed models. Most of these programs are stored at Boston College’s Statistical Software Components (SSC) archive.

You are welcome to install any user-written commands you want to use on Wharton’s HPC systems; they will be installed into your PERSONAL directory (one of Stata’s system directories). To see which directories Stata is currently using, run the ‘sysdir’ command in Stata. Because these programs go into your PERSONAL directory, you don’t need to worry about programs you install causing problems for others. On the other hand, it means you need to install the user-written programs you want yourself. Wharton Computing does not try to identify a useful set of user-written programs and make them available to everyone, because our user base is extremely diverse.

Finding User-Written Programs

If you know the name of the program you want to use, you can go directly to Installing User-Written Programs. However, it’s much more common to know what you want to do without knowing what program (if any) can do it. This is a job for Stata’s findit command.

For example, suppose you wanted to do something with Heckman selection models but don’t know what command to use. Type:

. findit heckman

The result is a tremendous amount of information. The findit command first searches Stata’s official help files and notes that there is an official heckman command and several other related commands (this makes findit a powerful tool for figuring out how to do things in Stata in general, not just for finding user-written programs). It then searches Stata’s web site and locates several FAQ entries, plus an example on UCLA’s large statistics web site. It then begins to list relevant user-written programs, organized into “packages.” Programs that were described in the Stata Journal or the older Stata Technical Bulletin are listed first.

Installing User-Written Programs

If you know the name of the package you want to install, you can install it by typing in Stata:

. ssc install package
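
Installs can also be scripted when several packages are needed. The following is a minimal sketch, not an official procedure: the function name write_install_dofile and the file name install_packages.do are hypothetical, the package names are the examples mentioned earlier on this page, and it assumes the machine running Stata can reach the SSC archive. The generated do-file runs sysdir first so the log records which directories Stata is using.

```shell
#!/bin/bash
# Hypothetical helper: write a do-file that installs each named SSC package.
# Usage: write_install_dofile <do-file> <package> [<package> ...]
write_install_dofile() {
    local dofile=$1
    shift
    local pkg
    {
        echo "* show the directories Stata is using before installing"
        echo "sysdir"
        for pkg in "$@"; do
            echo "ssc install ${pkg}"
        done
        echo "exit"
    } > "$dofile"
}

# Package names taken from the examples earlier on this page.
write_install_dofile install_packages.do outreg polychoric
cat install_packages.do
```

The resulting do-file can be run in an interactive session or submitted like any other Stata job. If an install fails, Stata stops the do-file at that line, so check the log before relying on the packages.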

Updating User-Written Programs

It can be important to maintain the latest versions of any user-written programs you install. Sometimes updates will include important bug fixes, though the SSC archive has quality control measures in place to try to catch bugs before the program is distributed.

The easiest way to check that your user-written programs are up-to-date is to type:

. adoupdate

The adoupdate command notes where each package was downloaded from and goes back to that location to see if a more recent version is available. If there is, you can install the latest version by typing:

. adoupdate, update

You can get a list of the packages you’ve installed by typing:

. ado dir

This can be very helpful for catching things like having downloaded st0067_3 rather than ice. You can remove a package by typing:

. ado uninstall package

where package should be replaced by either the name of the package you want to remove or the number it is given by ado dir, including the brackets around it.

For example, suppose you downloaded some earlier version of ice that was associated with a Stata Journal article. Just typing:

. ssc install ice

will fail because you already have a copy of ice.ado and all the other related files, and the installer refuses to overwrite them. Thus you need to identify and remove the older version. To do so, type:

. ado dir

If the results included the entry:

[6] package st0067_3 from http://www.stata-journal.com/software/sj7-4
       SJ7-4 st0067_3.  Update: Multiple imputation of missing...

you could remove it by typing either:

. ado uninstall st0067_3

or

. ado uninstall [6]

Then:

. ssc install ice

will successfully install the latest version. You should then type:

. adoupdate

periodically to ensure that ice stays up-to-date.
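
The periodic check can also be run as a batch job. The following is a minimal sketch, assuming a hypothetical do-file named update_ado.do and the demo script path given earlier on this page; it only submits the job when qsub and that script are actually present.

```shell
#!/bin/bash
# Write a hypothetical do-file (update_ado.do) that updates every
# installed user-written package and then lists what is installed.
cat > update_ado.do <<'EOF'
* install any newer versions that adoupdate finds
adoupdate, update
* list installed packages so the log records the result
ado dir
exit
EOF

# Submit only where Grid Engine's qsub and the demo script are available.
if command -v qsub >/dev/null 2>&1 && [[ -f /usr/local/demo/Stata/demo.sh ]]; then
    qsub /usr/local/demo/Stata/demo.sh update_ado.do
else
    echo "qsub or demo.sh not found; update_ado.do written but not submitted"
fi
```

Reviewing the job's log afterwards shows which packages, if any, were updated.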