


Data Archiving

You might have data that is rarely, if ever, needed, but that you can't delete. You may want to remove it from the cluster storage to save on disk usage fees. Below are two approaches we suggest to achieve this.

We recommend you have two high-quality copies of all original data and difficult-to-reproduce data, and that they reside in different physical locations.

A USB hard drive you bought on Amazon is NOT a high-quality copy.

We suggest you implement both approaches described below, or something similar.
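Whichever two locations you choose, it helps to confirm periodically that the copies still match. The following is a minimal sketch, not part of any PMACS tooling: it assumes both copies are mounted as ordinary directories, and the function names (`sha256_of`, `manifest`, `differing_files`) are hypothetical helpers introduced here for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def differing_files(root_a, root_b):
    """List relative paths that are missing from one copy or differ in content."""
    a, b = manifest(root_a), manifest(root_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty list from `differing_files(copy_a, copy_b)` suggests the two copies hold the same files with the same contents. Any path it returns is worth inspecting before you rely on either copy.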

Desktop RAID

Purchase a small, good-quality desktop RAID system to store your data. Typically this will be sold as a NAS (Network-Attached Storage) unit, and you can configure it with as many drives as you need. Buy 3.5" enterprise-class (aka server-class) drives and set them up in a redundant RAID configuration. This means that if one of the disks in the system fails, the others will maintain the data and you can replace the bad drive without losing any data. However, you must have someone check the system's condition periodically, and set up email and other alerts so that it tells you when there's an issue. Virtually all hard drives fail within 5 years of production.

PMACS HPC:Archive System

This is a service that provides very easy access to a modern, robot-controlled, high-availability tape archiving system. It provides a simple filesystem-view interface with simple file retrieval. Custom Linux commands are provided for users to make their archive copies. Note that this is an archiving service and is not meant to be a regular backup service. You are able to retrieve files, but such retrievals are expected to be rare.

Pricing is $0.015/GB/mo = $0.18/GB/year = $180/TB/year. This is a great price!
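To estimate what a particular data set would cost, the arithmetic above can be sketched as follows. This is only an illustration: it uses the $0.015/GB/month rate quoted above, assumes 1 TB = 1000 GB (which is what makes $0.18/GB/year equal $180/TB/year), and the function names are hypothetical.

```python
PRICE_PER_GB_MONTH = 0.015  # USD, the rate quoted above
GB_PER_TB = 1000            # assumption: decimal terabytes


def annual_cost_usd(size_gb, price_per_gb_month=PRICE_PER_GB_MONTH):
    """Yearly archive cost in USD for size_gb gigabytes."""
    return size_gb * price_per_gb_month * 12


def annual_cost_usd_for_tb(size_tb, price_per_gb_month=PRICE_PER_GB_MONTH):
    """Yearly archive cost in USD for size_tb terabytes."""
    return annual_cost_usd(size_tb * GB_PER_TB, price_per_gb_month)
```

For example, `annual_cost_usd_for_tb(5)` comes to roughly $900 per year for a 5 TB archive at this rate.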

Your data is stored on mirrored tapes, meaning there is a redundant copy on a different set of tapes. However, both copies reside in the same physical system, so a catastrophic event that destroys the system or the data center will wipe out all your data stored there.

HIPAA-protected data: The system is not yet HIPAA-compliant, but the approval is “well underway and nearing completion” as of Fall 2015.

For more information, see PMACS HPC Services and the HPC:Archive System Wiki.

dataarchving.1446223356.txt.gz · Last modified: 2015/10/30 16:42 by mgstauff