Parallel IPython with Univa Grid Engine (SGE)

I frequently sing the praises of Python as the go-to language for research programming. Python is extremely fast in terms of human development time, and developer time is usually the scarce resource: there are typically far more CPU cores available than research assistants with specialized programming skills. Python is also fast in terms of compute time, thanks to the optimized, compiled C/C++ code behind modules such as numpy and scipy.

Enter IPython (soon to be renamed Jupyter). IPython provides a number of improvements over vanilla Python. My favorites are the improved interactive shell for the terminal and the web-based “notebooks” that make it easy to develop code quickly.

IPython also makes it easy to scale code up in parallel. When working with IPython, one can start a “cluster” of “engines,” which are managed by a “controller.” Typically, one launches one engine per CPU on a system. The engines then accept work from the “client” IPython session. On a personal system, this is all automated with the IPython ipcluster command. But how do we take advantage of this feature in an HPC environment that has many more CPUs sitting behind a job scheduler? When submitting non-interactive jobs to a job queue, we cannot predict when, or on which compute nodes, our work will run. Furthermore, we cannot have different jobs stomping on the setup and tear-down of one another’s clusters of IPython engines.
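On a personal machine the workflow is short. After `ipcluster start -n 4` has launched a controller and four engines in a terminal, a client session can farm work out to them. A minimal sketch (the engine count is just an example; in IPython 4+ the module lives in the separate ipyparallel package):

```python
from IPython.parallel import Client  # "from ipyparallel import Client" in IPython 4+

rc = Client()      # connect to the running controller
view = rc[:]       # a DirectView spanning every engine
# Square some numbers in parallel across the engines
print(view.map_sync(lambda x: x ** 2, range(8)))
```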

So, we need a scriptable setup that allows for dynamic resources and simultaneous, non-conflicting clusters of engines. IPython does provide “profile” support for PBS and SGE job schedulers. I have adapted an SGE profile setup and a virtualenv to facilitate a single-script solution on the Wharton HPCC system.

Let’s first configure the profile and virtualenv that make this possible, using the following terminal commands.
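The setup looks roughly like this; the virtualenv path (~/ipyvenv) and the profile name (sge) are illustrative choices, so adjust them for your own account:

```bash
# Create and activate a virtualenv that holds IPython and its parallel dependencies
virtualenv ~/ipyvenv
source ~/ipyvenv/bin/activate
pip install "ipython[parallel]"

# Create an IPython profile pre-populated with parallel configuration files
ipython profile create --parallel --profile=sge

# Tell ipcluster to launch the controller and engines through Grid Engine
cat >> ~/.ipython/profile_sge/ipcluster_config.py <<'EOF'
c.IPClusterStart.controller_launcher_class = 'SGE'
c.IPClusterEngines.engine_launcher_class = 'SGE'
EOF
```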

Once the profile and virtualenv are set up in our HPCC account, we get to the meat of the Python script. Most of this code is a wrapper for starting and stopping the IPython cluster engines in a fail-safe way.
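A sketch of what ipcode.py can look like, assuming the “sge” profile created above; the engine count comes from the command line, and the try/finally block guarantees the cluster is torn down even if the work fails:

```python
#!/usr/bin/env python
"""ipcode.py -- start an IPython cluster under SGE, do some work, shut it down."""
import subprocess
import sys
import time

from IPython.parallel import Client  # "from ipyparallel import Client" in IPython 4+

n_engines = int(sys.argv[1])          # e.g. "8" from the qsub command line
profile = 'sge'                       # the profile configured earlier

# Start the controller and engines; ipcluster submits them to Grid Engine for us.
subprocess.check_call(['ipcluster', 'start', '--daemonize',
                       '--profile=%s' % profile, '-n', str(n_engines)])
try:
    # Wait (with a timeout) until the controller is up and all engines have registered.
    rc = None
    for _ in range(60):
        try:
            rc = Client(profile=profile)
            if len(rc.ids) >= n_engines:
                break
        except Exception:
            pass
        time.sleep(5)
    if rc is None or len(rc.ids) < n_engines:
        raise RuntimeError('engines failed to start')

    # --- do some work ---
    dview = rc[:]                                  # DirectView over all engines
    func = lambda x: x ** 10
    local = map(func, range(32))
    parallel = dview.map_sync(func, range(32))
    print('results match:', list(local) == list(parallel))
finally:
    # Always tear the cluster down so engines do not linger in the queue.
    subprocess.call(['ipcluster', 'stop', '--profile=%s' % profile])
```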

Once the above is saved to the ipcode.py script, we can submit it to Grid Engine. This script is a template for whatever work is required in the “do some work” section above. In this example, we simply compare the results returned from a local run and a parallel run of a lambda function that raises x to the tenth power. Please see the IPython Parallel Multiengine documentation for other methods of dividing work across engines, including farming out calls to regular Python functions.
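For instance, the same DirectView supports other patterns from the Multiengine documentation, such as calling an ordinary Python function on every engine or scattering data out and gathering the results back. A brief sketch, reusing the dview object from the script above:

```python
def tenth_power(x):
    return x ** 10

# Run a regular function once on every engine
print(dview.apply_sync(tenth_power, 3))

# Scatter a list across the engines, operate on the pieces, and gather them back
dview.scatter('chunk', range(32))
dview.execute('result = [x ** 10 for x in chunk]')
print(dview.gather('result', block=True))
```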

This single script can then be submitted to the job queue with qsub. Note the “8” on the command line: it tells our script to run eight engines. The one ipcode.py client script, one ipcontroller, and eight ipengines occupy a total of ten slots in the queue.
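The exact qsub invocation depends on local queue and environment settings; in its simplest form it might look like this, with the virtualenv path from earlier and -V carrying the activated environment into the job:

```bash
# Activate the virtualenv, then submit the client script from the current directory.
# The trailing "8" is the engine count read by ipcode.py.
source ~/ipyvenv/bin/activate
qsub -V -cwd -b y python ipcode.py 8
```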

Please see /usr/local/demo/Python/IPython/ on HPCC for the demo code from this post.

As a specialist in Linux and high-performance computing, Burris enjoys enabling faculty within The Wharton School of the University of Pennsylvania by providing effective research computing resources. Burris has been involved in research computing since 2001. Current projects find Burris working with HPC, big data, cloud computing and grid technologies. His favorite languages are Python and BASH. In his free time, he enjoys bad cinema, video editing, synthesizers and bicycling.