Stata

Stata is a complete, integrated statistical package that provides everything you need for data analysis, data management, and graphics. More information can be found at: http://www.stata.com/

Stata on the HPCC

Interactive Stata Sessions

  • Graphical (requires X Forwarding): qrsh xstata-se
  • Textual: qrsh stata-se

Submitting a Stata Job

Create Stata Commands File

Create a .do file with your commands, for example:
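As a minimal sketch, the shell snippet below writes a small do-file using a heredoc. The filename demo.do and the analysis itself (Stata's bundled auto dataset with a simple regression) are illustrative assumptions, not part of the HPCC setup; substitute your own commands.

```shell
# Write a hypothetical do-file named demo.do (name and contents are examples only).
cat > demo.do <<'EOF'
* demo.do: load Stata's bundled example data and fit a simple regression
sysuse auto, clear
summarize price mpg weight
regress price mpg weight
EOF

# Show what was written.
cat demo.do
```

Any text editor works just as well; the heredoc is only a convenient way to create the file from the shell prompt.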

Create Stata Job Script

Create a .sh file with at least the following contents:
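A minimal sketch of such a script, written here with a heredoc. The names demo.sh and demo.do are assumptions for illustration; stata-se is the text-mode executable mentioned above, and JOB_ID is set by the scheduler when the job runs.

```shell
# Write a hypothetical job script named demo.sh. The quoted 'EOF' keeps
# ${JOB_ID} literal so it is expanded when the job runs, not now.
cat > demo.sh <<'EOF'
#!/bin/bash
# Run demo.do in text-mode Stata; stdout and stderr both go to a per-job log.
stata-se < demo.do > demo.${JOB_ID}.log 2>&1
EOF

# Check the script for shell syntax errors without running it.
bash -n demo.sh && echo "demo.sh: syntax OK"
```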

Note the ‘>’ and ‘2>&1’: together they send both standard output and standard error to a log file whose name includes the JOB_ID number, so each run of the job gets its own log. This matters even more for Array Jobs, where you will also want to add SGE_TASK_ID to the log name so that each task writes a separate log.
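For an array job, the same pattern might look like the sketch below. The script name demo_array.sh, the do-file name demo.do, and the task range 1-10 are assumptions; the ‘#$ -t’ line is the Grid Engine directive that requests an array of tasks, and SGE_TASK_ID is set separately for each task.

```shell
# Write a hypothetical array-job script named demo_array.sh.
cat > demo_array.sh <<'EOF'
#!/bin/bash
#$ -t 1-10
# One log per task: job number plus task number.
stata-se < demo.do > demo.${JOB_ID}.${SGE_TASK_ID}.log 2>&1
EOF

# Preview the log name one task would produce (both values are made up).
JOB_ID=12345 SGE_TASK_ID=7 bash -c 'echo "demo.${JOB_ID}.${SGE_TASK_ID}.log"'
# prints demo.12345.7.log
```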

A more complete script that works well for most Stata jobs is available at /usr/local/demo/Stata/demo.sh.

Submit Stata Job
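Assuming the job script is named demo.sh (a hypothetical name) and that the scheduler's qsub command is on your path, submission might look like the sketch below. The guard only keeps the snippet from failing on a machine without the scheduler.

```shell
# Submit the hypothetical job script demo.sh to the scheduler.
job_script="demo.sh"
if command -v qsub >/dev/null 2>&1; then
    qsub "$job_script"
else
    echo "qsub not found; run this from a machine that has the scheduler" >&2
fi
```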

More information on job submission, monitoring, and control is available under Job Management.

Install Personal Packages to PERSONAL Directory

Wharton Computing generally does not install user-written packages (those not developed by StataCorp). You are welcome to install them in your own PERSONAL directory.

A tremendous number of user-written programs are available for Stata; once installed, they act just like official Stata commands. Some are conveniences, like outreg for formatting regression output. Others calculate results Stata itself does not, such as polychoric for polychoric correlations. A few represent major extensions of Stata’s capabilities, such as ice and mim for multiple imputation or gllamm for mixed models. Most of these programs are stored at Boston College’s Statistical Software Components (SSC) archive.

You are welcome to install any user-written commands you want to use on Wharton’s HPC systems. They will be installed into your PERSONAL directory (PERSONAL is one of Stata’s system directories). To see which directories Stata is currently using, run the ‘sysdir’ command in Stata. Because packages go into your PERSONAL directory, programs you install cannot cause problems for other users. On the other hand, it means you must install the user-written programs you want yourself. Wharton Computing does not try to identify a useful set of user-written programs and make them available to everyone, as our user base is extremely diverse.

Finding User-Written Programs

If you know the name of the program you want to use, you can go directly to Installing User-Written Programs. However, it’s much more common to know what you want to do without knowing what program (if any) can do it. This is a job for Stata’s findit command.

For example, suppose you wanted to do something with Heckman selection models but don’t know what command to use. Type ‘findit heckman’ in Stata.

The result is a tremendous amount of information. The findit command first searches Stata’s official help files and notes that there is an official heckman command and several other related commands (this makes findit a powerful tool for figuring out how to do things in Stata in general, not just for finding user-written programs). It then searches Stata’s web site and locates several FAQ entries, plus an example on UCLA’s large statistics web site. It then begins to list relevant user-written programs, organized into “packages.” Programs that were described in the Stata Journal or the older Stata Technical Bulletin are listed first.

Installing User-Written Programs

If you know the name of the package you want to install, you can install it from SSC by typing ‘ssc install package’ in Stata, where package is replaced by the name of the package.

Updating User-Written Programs

It can be important to maintain the latest versions of any user-written programs you install. Sometimes updates will include important bug fixes, though the SSC archive has quality control measures in place to try to catch bugs before the program is distributed.

The easiest way to check that your user-written programs are up-to-date is to type ‘adoupdate’.

The adoupdate command notes where each package was downloaded from and goes back to that location to see if a more recent version is available. If there is, you can install the latest version by typing ‘adoupdate, update’.

You can get a list of the packages you’ve installed by typing ‘ado dir’.

This can be very helpful for catching things like having downloaded stb0067_3 rather than ice. You can remove a package by typing ‘ado uninstall package’, where package should be replaced by either the name of the package you want to remove or the number it is given by ado dir, including the brackets around it.

For example, suppose you downloaded some earlier version of ice that was associated with a Stata Journal article. Just typing ‘ssc install ice’ will fail because you already have a copy of ice.ado and all the other related files, and the installer refuses to overwrite them. Thus you need to identify and remove the older version. To do so, type ‘ado dir’.

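If the results included an entry for the older package, for instance stb0067_3 listed as [1], you could remove it by typing either ‘ado uninstall stb0067_3’ or ‘ado uninstall [1]’.

Then ‘ssc install ice’ will successfully install the latest version. You should then type ‘adoupdate’ periodically to ensure that ice stays up-to-date.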