Julia is Here!

Welcome to Julia

Julia is now installed across all compute servers of Wharton’s HPC systems. From the Julia Website:

Julia is a high-level, high-performance dynamic programming language for technical computing, with syntax that is familiar to users of other technical computing environments. It provides a sophisticated compiler, distributed parallel execution, numerical accuracy, and an extensive mathematical function library. Julia’s Base library, largely written in Julia itself, also integrates mature, best-of-breed open source C and Fortran libraries for linear algebra, random number generation, signal processing, and string processing. In addition, the Julia developer community is contributing a number of external packages through Julia’s built-in package manager at a rapid pace. IJulia, a collaboration between the Jupyter and Julia communities, provides a powerful browser-based graphical notebook interface to Julia.

To build a bit of documentation, I went through some of the excellent Quantitative Economics Julia lectures and augmented them for our environment (HPCC), below.

Setting Up Your Jupyter Notebook for Julia (GUI interactive Julia)

On hpcc.wharton.upenn.edu:

(login-server) $ setup-jupyter-notebook.sh

Creating Wharton HPCC Jupyter notebook environment...
PLEASE WAIT...

… the above can take a long time, DON’T INTERRUPT, output chopped for brevity …

(login-server) $ qlogin
(compute-server) $ source /opt/rh/rh-python35/enable
(compute-server) $ source ~/.virtualenvs/jupyter-notebook-py35/bin/activate
(compute-server) $ julia
               _
   _       _ _(_)_    | A fresh approach to technical computing
  (_)     | (_) (_)   | Documentation: https://docs.julialang.org
   _ _   _| |_  __ _  | Type "?help" for help.
  | | | | | | |/ _` | |
  | | |_| | | | (_| | | Version 0.6.0 (2017-06-19 13:05 UTC)
 _/ |\__'_|_|_|\__'_| |
|__/                  | x86_64-redhat-linux

julia> Pkg.add("IJulia")
INFO: Initializing package repository /home/wcit/hughmac/.julia/v0.6
... takes quite a long time, output chopped for brevity ...
julia> quit()
(compute-server) $ exit
(login-server) $ notebook-py35

… carefully follow the instructions presented …
NOTE: the ‘ssh hpcc …’ port-forward command runs in a separate Terminal window (macOS) or MobaXterm window (Windows) on your local (desktop or laptop) system
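The notebook-py35 instructions include the exact port-forward command to run; in general, an SSH local port forward looks roughly like the sketch below (the port number and user name here are placeholders of mine, so use the values from the printed instructions):

(your laptop) $ ssh -L 8888:localhost:8888 yourusername@hpcc.wharton.upenn.edu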

Once your Jupyter notebook webpage is up, you can choose New (pulldown menu) > Julia to start a Julia notebook.

If you’re starting up on a new day, just:

  • log on to HPCC
  • $ notebook-py35
  • follow the instructions
  • In browser: New (pulldown menu) > Julia

So that’s the whirlwind GUI Notebook interactive setup, which should get you through “Setting up Your Julia Environment” in the Quantitative Economics Julia lecture.

Using Julia Interactively from the Command Line

You can run Julia interactively from the command line. It’s simple! To start, from hpcc.wharton.upenn.edu:

(login-server) $ qlogin
(compute-server) $ julia
               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: http://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?help" for help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.6.0 (2017-06-19 13:05 UTC)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |  x86_64-redhat-linux

julia>

Now that you’ve got Julia started, I recommend going through the comprehensive Quantitative Economics Julia lecture. You can skip downloading and installing …
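If you just want a quick sanity check at the REPL before diving in, try something trivial (an example of my own, not from the lecture):

julia> x = rand(10^6);    # one million uniform random draws

julia> mean(x)            # should come out close to 0.5 (mean is in Base in Julia 0.6)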

Using Julia with Scripts (non-interactive Julia)

Take a look at the Julia demo files in /usr/local/demo/Julia on the HPCC. Julia works similarly to other research software in the HPCC environment:

  • create a Julia script file (.jl)
  • create a job script file (.sh) that calls the Julia script file (.jl)
  • submit the job script file with ‘qsub’ (a minimal sketch of all three steps follows this list)
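Here is a minimal sketch of those three steps. The file names are hypothetical examples of my own; the real demo files live in /usr/local/demo/Julia.

# filename: hello.jl
# A tiny Julia script: report which compute server ran it.
println("Hello from Julia on $(gethostname())")

#!/bin/bash
# filename: hello_job.sh
#$ -N Julia_Hello
#$ -j y
julia hello.jl

(login-server) $ qsub hello_job.sh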

Using Multiple Cores with Julia

You can use ‘-pe openmp #’ as an option to qsub (or ‘#$ -pe openmp #’ in your bash job script), where ‘#’ is the number of cores, up to 16 (the number of cores on a box). Keep in mind that the more cores you request, the longer it may take for the job to start, since that many cores must become available on the same box in the cluster. It might be ‘worth the wait’ for a longer-running job, but likely not for short-running jobs.

Then you would read the environment variable ‘NSLOTS’ using some Julia code like:

#!/usr/bin/env julia
# Read the number of cores the scheduler granted us (NSLOTS) and start that many workers.
slots = parse(Int, ENV["NSLOTS"])
addprocs(slots)
# Flip 20 billion coins in parallel and sum up the heads.
nheads = @parallel (+) for i = 1:20000000000
    Int(rand(Bool))
end
println(nheads)

Again, you would launch that with something like:

qsub -pe openmp 8 -N parjulia -j y -b y myparcode.jl
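Equivalently, as mentioned above, you could put those options in a bash job script rather than on the qsub command line (a sketch, with a hypothetical file name):

#!/bin/bash
# filename: parjulia_job.sh
#$ -N parjulia
#$ -j y
#$ -pe openmp 8
julia myparcode.jl

and submit it with ‘qsub parjulia_job.sh’.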

As a test, I ran that with 2, 4, and 8 cores, and got times of:

Cores    Time
    2    40.889s
    4    21.828s
    8    13.980s

So, pretty good scaling. There is also a way to use multiple cores across multiple boxes, which would let you scale even larger (at some performance cost from network latency); I will document that in a future post.

Multiple Tasks from One Job: Julia Array Jobs

Very often your best bet is to run many tasks as an array job. You get linear scaling, and the tasks don’t have to wait until all the cores are available at once. That’s the ‘-t 1-#’ option to qsub, where ‘#’ is the number of tasks to launch; each task gets the environment variable ‘SGE_TASK_ID’ (similar to ‘NSLOTS’, above) set to its task number, which you can then use in Julia:

#!/usr/bin/env julia
# filename: array_julia.jl (called by the job script below)
# Print each command-line argument passed in by the job script.
for x in ARGS
    println(x)
end

The magic is in the variables.txt file; each task uses its SGE_TASK_ID to pull one line of it as its arguments:

A B C
1 2 3
0.11 Acme IBM
Gold Silver Bronze

And the job script:

#!/bin/bash
# filename: array_job.sh
#$ -j y
#$ -N Julia_Array_Job
# run with 'qsub -t 1-$(wc -l <variables.txt) array_job.sh'
# Pull the line of variables.txt that matches this task's number
VARS=$(sed -n ${SGE_TASK_ID}p variables.txt)
echo "Launching Julia with VARS = $VARS"
julia array_julia.jl $VARS

Launch with:

qsub -t 1-$(wc -l <variables.txt) array_job.sh

See the files in /usr/local/demo/Julia/array_job.
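The demo passes each task’s variables to Julia as command-line arguments, but since SGE sets the environment variable for every task, you could also read the task number directly inside Julia (a small sketch of my own, not part of the demo):

#!/usr/bin/env julia
# Read the array task number that SGE sets for this task.
task_id = parse(Int, ENV["SGE_TASK_ID"])
println("This is task number $task_id")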

With two decades of experience supporting research and more than a decade at The Wharton School, Hugh enjoys the challenges and rewards of working with world-class researchers doing Amazing Things with research computing. Robust and scalable computational solutions (both on premise and in The Cloud), custom research programming solutions (clever ideas, simple code), and holistic, results-focused approaches to projects are the places where Hugh lives these days. On weekends you're likely to find him running through the woods with a topo map and compass, orienteering.