Running Research Code in the Cloud

Part of the recent Wharton High-Performance Computing Cluster (HPCC) upgrade was augmentation with Amazon EC2 cloud resources. Wharton Research Computing recently piloted the deployment of a cloud-bursting solution with Univa UniCloud. After much help from friendly initial users, we have worked out most of the kinks. Now we have a workflow that allows any Wharton researcher to use as much compute capacity, on an as-needed basis, as a given project's scale or deadline dictates.

The ability to go beyond the local supercomputer fits nicely into a graduated support model. A research programmer may start on a laptop or departmental desktop, coding data analysis or tweaking a simulation algorithm. But once that code has been written and debugged, there may be a need to run the analysis on a thousand different data segments, or to run ten thousand iterations with a parameter sweep of initial seed values. This kind of scaling requires a dedicated resource, such as the Wharton HPCC system.

Even though HPCC allows each user 100 simultaneously running jobs and up to 100 GB of memory per run, it is still a shared resource with finite capacity. Special dispensations can be accommodated, but only on a temporary basis and only when there is capacity to spare. What then, you ask? To paraphrase Casey Kasem, “keep your head in the clouds and your feet on the ground.”

Feet on the ground is the local, always-available Wharton HPCC system. HPCC is a cluster of Linux compute nodes – dedicated server hardware for running research code – and it uses Univa Grid Engine as its job scheduler. In short, anything that can be run at a Linux command prompt can be put into a script and run at scale, which is how most research code is launched. Whether the code is written in MATLAB, Python, Fortran, or C++, the command line goes into a shell script, and that script is placed into the Grid Engine job queue to await execution based on fair-share policies and available computing resources.
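
For example, a command that already works at the prompt can be wrapped in a minimal job script and handed to the scheduler with qsub. The script and file names below are placeholders, not a real Wharton example:

#!/bin/bash
#$ -N myanalysis              # job name shown in the queue
#$ -cwd                       # run from the current working directory
python analyze.py input.csv   # the same command you would run interactively

Submitted with qsub myanalysis.sh, the script simply waits in the queue until a CPU core and the requested memory are free.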

A full-capacity scenario would be one such job running per CPU core of the entire cluster, with the most recently submitted jobs waiting for their turn. As running jobs are completed, the waiting jobs are launched, giving all users dedicated CPU and memory per job for maximum efficiency.

Head in the clouds is Amazon EC2, with effectively unlimited capacity. If running 100 jobs at a time is not enough with a looming deadline, sending jobs to additional CPUs in the cloud may be the right option. If we can run 1,000 jobs at a time, weeks or even months of research time can be saved. Utilizing a cloud node in this way is very similar to the workflow already exercised on HPCC, with the main difference being that jobs are submitted to a specific cloud queue.
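
The exact queue name is site-specific, but conceptually the only change from a normal submission is requesting the cloud queue. The aws.q name below is hypothetical; on HPCC a qsub-aws wrapper, shown later in this post, takes care of this for you:

# Same job script, different destination: request the cloud queue explicitly.
qsub -q aws.q myjob.sh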

On HPCC, there are a few policies in effect that poll the cloud queues and act accordingly. Let’s take a look at the XML of one of these policies:

<rule applicationName="default-bug" name="burst-bug">
    <xPathVariable name="__neededNodes__" xPath="number(resourceData[@resource='aws_bug']/neededNode)"/>
    <applicationMonitor type="receive">
        <actionCommand>/opt/unicloud/hpccrules/activateOneNode.sh AWS-bug</actionCommand>
    </applicationMonitor>
    <condition metricXPath="__neededNodes__" evaluationOperator=">=" triggerValue="1">
        <description>Number of pending jobs in the queue</description>
    </condition>
    <description>Burst condition will be met if 1 or more jobs are pending in the queue</description>
</rule>

This policy checks whether my user, “bug,” has any needed nodes – that is, whether there is a job waiting in bug’s cloud queue that needs a CPU. By default, no cloud nodes are running, but as soon as there is a need we “burst,” and one additional node is launched with the activateOneNode.sh command. My AWS API keys (Amazon credentials), AMI machine image (OS and software), instance type (CPU and RAM sizing), and so on are defined within UniCloud under the AWS-bug profile. Every user of HPCC can therefore utilize a separate account with customized compute node options.
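
The activateOneNode.sh script itself is not shown here; roughly speaking, it is a thin wrapper around UniCloud's node-management CLI. A sketch of the idea, with the add-nodes options assumed from the open-source Tortuga tooling that UniCloud is built on (the real production script may differ):

#!/bin/bash
# activateOneNode.sh (sketch) -- launch one cloud execution host for the
# profile named by the policy (e.g. "AWS-bug"). The add-nodes options are
# assumptions based on the open-source Tortuga CLI, not the exact script.
PROFILE="$1"

add-nodes --count 1 \
    --software-profile "$PROFILE" \
    --hardware-profile "$PROFILE"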

UniCloud bundles Puppet configuration management to customize the cloud nodes. We use a combination of baking our preferred software into the initial CentOS 7 Linux image ahead of time, and configuring user options and launching Grid Engine services with the Puppet agent at boot time.

Let’s take a look at a bit of the Puppet configuration:

class tortuga::compute_aws_bug {
  # Recreate the researcher's account on the cloud node with the same
  # UID/GID used on the local HPCC cluster, so NFS file ownership matches.
  $piuser = 'bug'
  $piuid  = 123456
  $pigid  = 123456

  group { $piuser:
    ensure => present,
    gid    => $pigid,
  }
  -> user { $piuser:
    ensure => present,
    uid    => $piuid,
    gid    => $piuser,
  }

  # NFS-mount the researcher's HPCC home directory on the cloud node.
  $pihomedir = "/opt/home/${piuser}"

  file { ['/opt/home', $pihomedir]:
    ensure => directory,
  }
  -> mount { $pihomedir:
    ensure  => mounted,
    atboot  => true,
    device  => "10.10.10.1:${pihomedir}",
    dump    => 0,
    fstype  => 'nfs',
    pass    => 0,
    options => 'defaults',
  }

  # Point /home/bug at the NFS-mounted directory.
  $pihomelink = "/home/${piuser}"

  file { $pihomelink:
    ensure => link,
    target => $pihomedir,
  }
}

This small bit of Puppet configuration ensures that the user exists and has a home directory mounted within the cloud compute node. Home directories are securely shared via the NFS protocol over a Virtual Private Cloud (VPC) network. With a few UniCloud policies and some Puppet config, boom – we have a customized cloud computing environment that seamlessly extends the HPCC environment!
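
In practice, the mount resource above amounts to an ordinary NFS mount of the HPCC file server over the VPC's private network. The parameters below mirror the Puppet resource; the exact fstab formatting is illustrative:

# Equivalent manual mount on the cloud compute node:
mount -t nfs -o defaults 10.10.10.1:/opt/home/bug /opt/home/bug

# ...which Puppet persists in /etc/fstab as roughly:
#   10.10.10.1:/opt/home/bug  /opt/home/bug  nfs  defaults  0 0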

Note that there are quite a few moving pieces behind the scenes that make this seem transparent, not to mention the 15,000+ lines of code that define the HPCC system configuration. For instance, say we have a job script that looks something like this:

#!/bin/bash
#$ -N sweep1
#$ -t 1-1000
#$ -cwd
#$ -j y
#$ -m e -M bug@wharton.upenn.edu
python sweep1.py $SGE_TASK_ID

This job script defines how to run our research code, saying:

  • the job name is “sweep1”
  • run 1,000 iterations (an array job with task IDs 1 through 1000)
  • from within the current working directory
  • merge the standard output and standard error streams into a single output file
  • email me when the job ends
  • run the Python script sweep1.py, passing the current task number as an argument (see the sketch after this list)
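
Conceptually, the array directive means Grid Engine runs the same script once per task, each time with a different task ID set in the environment – something like:

# What the -t 1-1000 array job boils down to (conceptual, not literal):
SGE_TASK_ID=1    bash sweep1.sh   # runs: python sweep1.py 1
SGE_TASK_ID=2    bash sweep1.sh   # runs: python sweep1.py 2
# ...
SGE_TASK_ID=1000 bash sweep1.sh   # runs: python sweep1.py 1000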

A complete run of this script might look like this (a command-line sketch of the same steps follows the list):

  • by default, no cloud nodes are running
  • copy sweep1 code, data and job script to the /opt/home/bug cloud home directory
  • submit the job script with: qsub-aws sweep1.sh
  • the UniCloud policy polls the cloud queue every ten minutes
  • a node will boot if there is a cloud job waiting (qw)
  • our custom CentOS 7 image starts and is configured with Puppet
  • once up, nodes stay up for running jobs (r) and for the billable hour (about 50 minutes)
  • the jobs run and save results to the cloud home directory
  • when there are no waiting or running jobs, a node is terminated
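
Put together as shell commands, a run might look roughly like this. The project paths are illustrative; qsub-aws and qstat are the actual submission and status commands:

# Stage code and data in the cloud-shared home directory (paths illustrative):
mkdir -p /opt/home/bug/sweep1
cp sweep1.py data.csv sweep1.sh /opt/home/bug/sweep1/
cd /opt/home/bug/sweep1

# Submit to the cloud queue, then watch tasks move from qw (waiting) to r (running):
qsub-aws sweep1.sh
qstat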

As long as there are jobs waiting, UniCloud will keep “bursting,” launching AWS compute nodes up to the maximum configured limit. Billing is kept in check with a CloudWatch billing alarm, so that if the projected cost for the month exceeds the budget, we are notified via email.
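
One way to set up such an alarm with the AWS CLI – the alarm name, dollar threshold, and SNS topic below are placeholders, and AWS billing metrics must be enabled and queried in the us-east-1 region:

# Alarm on AWS's EstimatedCharges billing metric (placeholders throughout):
aws cloudwatch put-metric-alarm \
    --region us-east-1 \
    --alarm-name "hpcc-cloud-monthly-budget" \
    --namespace "AWS/Billing" \
    --metric-name EstimatedCharges \
    --dimensions Name=Currency,Value=USD \
    --statistic Maximum \
    --period 21600 \
    --evaluation-periods 1 \
    --threshold 1000 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:billing-alerts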

This workflow is easily enabled after a one-time account and billing setup by our Research Computing staff. After that, the primary researcher may share cloud resources with designated research assistants.

If you have questions, or would like to set up your own cloud bursting on HPCC, please let us know at: research-computing@wharton.upenn.edu

As a specialist in Linux and high-performance computing, Burris enjoys enabling faculty within The Wharton School of the University of Pennsylvania by providing effective research computing resources. Burris has been involved in research computing since 2001. Current projects find Burris working with HPC, big data, cloud computing and grid technologies. His favorite languages are Python and BASH. In his free time, he enjoys bad cinema, video editing, synthesizers and bicycling.