Data Protection

Wharton’s user data space (/home) is built on redundant, network-attached storage, making it a reliable, safe place for you to store your data and code.

Snapshots

Our networked file systems include automated daily and weekly ‘snapshots’ of all user data space, providing our users with multiple user-accessible copies of all of their data. We snapshot:

  • daily and always have the last seven (7) daily snapshots available
  • weekly and always have the last five (5) weekly snapshots available

Reserve Snapshots

In addition to the user-accessible snapshots, we sync daily to a secondary filesystem, which is also snapshotted, but on a less frequent schedule with longer retention. There we snapshot:

  • weekly and always have the last five (5) weekly snapshots available
  • monthly and always have the last six (6) monthly snapshots available

This is for disaster recovery (if our primary network storage has problems that we cannot resolve), and for assisting you with recovering something older than our on-HPCC snapshots contain.

Recovering Files

Depending on what you need to recover, there are two basic paths to file recovery in our environment.

Self Service

Most file recovery can be accomplished by you, the user, allowing for the fastest and most selective recovery.

Your snapshots are available in the .snapshots directory. Because of the ‘.’ (dot), this directory is hidden when you do an ‘ls’. Trust us: it is there. Or look for yourself.

To explore and recover files, log on to the HPC3 with your ssh client, and ‘cd’ into the .snapshots directory:

$ cd .snapshots
$ ls -1
username-snap-202309150509
username-snap-202309210509
username-snap-202309220508
username-snap-202309230509
username-snap-202309240509
username-snap-202309250508
username-snap-202309260509
username-snap-202309270508
username-snap-202309280509
$ cd username-snap-202309210509
$ ls *.sh
script1.sh    script2.sh   script3.sh
$ cp script2.sh $HOME

That’s it! You can also use an SFTP client or the MobaXterm SSH Browser to do recovery. Remember that the ‘.’ (dot) makes the .snapshots directory invisible, so you may need to type the path yourself. Trust that it’s there!
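
If you are not sure which snapshot still holds a file, the manual steps above can be wrapped in a small shell function. The following is only a sketch, not part of our tooling: the function name newest_snapshot_copy is hypothetical, and it assumes that snapshot names end in a fixed-width timestamp (as in the listing above), so that they sort from oldest to newest.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not an official HPCC tool).
# Prints the path of the newest snapshot copy of a file.
#   $1 = snapshot directory, for example "$HOME/.snapshots"
#   $2 = file path relative to the top of a snapshot, for example "script2.sh"
newest_snapshot_copy() {
    snap_root=$1
    rel_path=$2
    newest=""
    # Assumption: names such as username-snap-202309280509 end in a
    # fixed-width timestamp, so the sorted glob visits them oldest first
    # and the last match is the newest copy.
    for snap in "$snap_root"/*/; do
        if [ -f "${snap}${rel_path}" ]; then
            newest="${snap}${rel_path}"
        fi
    done
    # Fail (exit status 1) if no snapshot contains the file.
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}

# Example use: restore under a new name so the current file is not overwritten.
#   copy=$(newest_snapshot_copy "$HOME/.snapshots" script2.sh) &&
#       cp -p "$copy" "$HOME/script2.sh.restored"
```

Copying to a new name first lets you compare the recovered version with your current file (for example with ‘diff’) before replacing anything.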

NOTE: .snapshots directories are not available from an SMB (Windows File System) share (mapped network drive).

NOTE: Permissions in .snapshots directories are identical to those in your ‘normal’ directories. Only those with proper permissions can browse and restore your files.

Assisted Recovery

If the Self Service method (above) isn’t adequate, generally because the files have been out of your user space for more than 6 weeks, please contact research-computing@wharton.upenn.edu with as much detail as you can provide. The most important details are the path and name of the files, and when the files were last in your user space.

This can take up to 2 business days to complete. We thank you for your patience!

Other Best Practices

We recommend that you also use a Repository Service (Version Control) for code management, and Dropbox syncing for your data. Both of these methods provide the ability to recover multiple file versions, along with other valuable features.