Rclone

rclone is a command-line program for copying files and directories to and from a large number of cloud storage services, including Box, Dropbox, S3, OneDrive, Google Drive, and many others. It is a straightforward, solid, yet powerful tool for our clustered environment, letting you copy files up to and down from your favorite storage service.

rclone is installed across Wharton’s HPC systems, and can be run from any login or compute node. For larger copy operations, we highly recommend running the job on a compute node.

Configuration

Before you begin copying files, you will need to configure rclone. It’s easy! The only tricky bit is that you need to do this config step on our graphical HPCC Desktop system, so that when rclone wants to open a web browser, there is one for it to use. So, the step-by-step:

  1. browse to the HPCC Desktop system and log on with your Wharton credentials. For further details on our FastX service, see our Access page
  2. once you have a new desktop open in your browser or FastX client, open a command line terminal by right-clicking the desktop and choosing ‘Open in Terminal’
  3. start the rclone configuration process by typing: 'rclone config'

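A browser will open and connect to Box (or Dropbox, if you chose 7 instead of 5 at the storage-type prompt; check the numbers in the menu rclone shows you, as they can differ between rclone versions). Authenticate, and then ‘Authorize’ rclone to access your Box / Dropbox files.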
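Once the authorization step has finished, a quick check from the same terminal can confirm that the new remote exists. This is a minimal sketch, assuming the remote was named Box during 'rclone config'; substitute whatever name you chose.

```shell
# Assumption: the remote was named "Box" during 'rclone config'.
REMOTE_NAME="Box"

# Show every remote rclone has configured; the new one should appear as "Box:".
rclone listremotes || echo "could not list remotes; is rclone on your PATH?" >&2

# List the top-level folders in the remote to confirm the authorization worked.
rclone lsd "${REMOTE_NAME}:" || echo "could not reach ${REMOTE_NAME}:; try re-running 'rclone config'" >&2
```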

That’s it for setup! Here’s a short (2:48) video to walk you through the above process:

Syncing Files

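While I say ‘syncing’ and ‘sync’, please use the ‘copy’ command for best results! rclone’s ‘sync’ command makes the destination match the source, which includes deleting destination files that are not in the source, while ‘copy’ never deletes anything. The best documentation on usage is rclone’s own documentation. The above video demos a few commands, as well.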

I recommend that you have a single folder within your Box / Dropbox dedicated to your HPCC research files. For example, I have an HPCC folder in my Box account. On the HPCC, I want to locate it at ~/Box/HPCC, so I copy it down from Box:
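The command itself is not reproduced on this page. A minimal sketch of what it might look like, assuming the remote was named Box during 'rclone config' (the folder names come from the example above; the remote name is an assumption):

```shell
# Assumption: "Box" is the remote name chosen during 'rclone config'.
REMOTE="Box:HPCC"            # the HPCC folder in the Box account
LOCAL_DIR="$HOME/Box/HPCC"   # where it should live on the HPCC

# Make sure the local target exists, then copy down from Box.
mkdir -p "$LOCAL_DIR"
rclone copy "$REMOTE" "$LOCAL_DIR" || echo "copy failed; check 'rclone listremotes'" >&2
```

Adding --dry-run to the copy command shows what would be transferred without actually copying anything, which is a cheap way to check the paths first.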

I have created command line ‘aliases’ to assist with this process. In my ~/.bashrc file, I put:
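The alias definitions are not shown on this page. Something along these lines would give the 'boxup' and 'boxdn' behavior described here, again assuming a remote named Box:

```shell
# Assumption: "Box" is the remote name chosen during 'rclone config'.
# boxup: copy the local HPCC folder up to the HPCC folder in Box.
alias boxup='rclone copy ~/Box/HPCC Box:HPCC'
# boxdn: copy the HPCC folder in Box down to the local folder.
alias boxdn='rclone copy Box:HPCC ~/Box/HPCC'
```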

I logged out and back in. Now, to copy my ~/Box/HPCC directory up to the HPCC directory in my Box cloud account, I just type 'boxup'; to copy in the other direction, I type 'boxdn'. And when I run a job script, I can just add 'boxup' on the line after I’ve done the work and written the output, and rclone will copy my files to my Box account!

For example:
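The original job script is not reproduced on this page, so what follows is a hypothetical sketch rather than the author's script. Scheduler directives are left out, the "work" is a placeholder, and the remote name Box is an assumption. Because bash expands aliases only in interactive shells by default, the sketch spells out the rclone command; if 'boxup' works in your job scripts, use it instead.

```shell
#!/bin/bash
# Hypothetical job script: do the work, then copy the results up to Box.
LOCAL_DIR="$HOME/Box/HPCC"   # local folder that mirrors the Box folder
REMOTE="Box:HPCC"            # assumption: remote named "Box" in 'rclone config'

mkdir -p "$LOCAL_DIR"

# 1. Do the work and write the output under the folder that gets copied.
#    (Placeholder: a real job would run Stata, R, Python, etc. here.)
echo "job finished on $(hostname)" > "$LOCAL_DIR/output.txt"

# 2. Copy the results up to Box once the output has been written.
rclone copy "$LOCAL_DIR" "$REMOTE" || echo "upload to $REMOTE failed" >&2
```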

TIP: if you’re running in interactive mode, most software products have a way to run a system or OS command. Take Stata, for example, which uses ‘shell’ or ‘!’, like so:

Unfortunately, aliases aren’t ‘active’ in Stata’s ‘shell’ command; I don’t know why (let me know if you do!). You could write a do-file instead, like:

Stash it somewhere (like '~/ado/rclone.do'), and call it from Stata like:

For more advanced use, I recommend looking through rclone’s own documentation.