This shows you the differences between two versions of the page.
dataarchving [2015/11/12 21:01] mgstauff [PMACS HPC:Archive System] |
dataarchving [2017/08/07 15:59] (current) mgstauff [PMACS HPC:Archive System] |
||
---|---|---|---|
Line 7: | Line 7: | ||
** A regular USB hard drive you bought on Amazon does //NOT// count as a high-quality copy. ** | ** A regular USB hard drive you bought on Amazon does //NOT// count as a high-quality copy. ** | ||
- | We suggest you implement | + | We suggest you implement |
==== Desktop RAID ==== | ==== Desktop RAID ==== | ||
- | Purchase a small good-quality desktop RAID system to store your data. Typically this will be called NAS [[https:// | + | Purchase a small good-quality desktop RAID system to store your data. Typically this will be called NAS [[https:// |
+ | |||
+ | ===Recommendation=== | ||
+ | We've had good experiences with the Synology DS (DiskStation) series of RAID systems, for example the DS416. These products have a good user interface and support connection over the network via NFS (linux/ | ||
+ | |||
+ | ===Mini FAQ=== | ||
+ | |||
+ | ==Q: We want to back up our MRI data and are expecting to collect multiple terabytes of imaging data over the next few years. Do you have a specific suggestion for us? I was looking at the DS416 option from the wiki and also saw a 2-bay system== | ||
+ | |||
+ | A: The key is to use RAID level 1 or higher, so that you have redundancy if one of the drives fails. See here: https:// | ||
+ | |||
+ | A two-bay system will work depending on what " | ||
+ | |||
+ | ==Q: Is it possible to purchase just 1 internal hard drive for now - or would you not recommend this - and if so do you have any good brands or suggestions? | ||
+ | |||
+ | A: No, you want at least two drives so that you can run RAID 1. You can start with two drives, then add more and expand the RAID volume later. At least with the Synology systems, you can start with RAID 1 and later switch to RAID 5 or 6. Also, each drive contributes only as much capacity as the smallest drive in the array, so if you start with 2TB drives, you'll want to expand with 2TB drives in the future (larger drives will work, but only 2TB of each one will get used). | ||
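The smallest-drive rule above can be turned into a quick capacity estimate. The sketch below is a minimal illustration, assuming classic RAID 1/5/6 layouts (a full mirror for RAID 1, one drive's worth of parity for RAID 5, two for RAID 6). The function name `usable_capacity_tb` is hypothetical, and vendor-specific hybrid modes may report different numbers.

```python
def usable_capacity_tb(drive_sizes_tb, raid_level):
    """Estimate usable capacity in TB for a classic RAID 1, 5, or 6 array.

    Assumption: every drive contributes only as much space as the
    smallest drive in the array, as described in the answer above.
    """
    # Minimum number of drives each classic RAID level needs.
    min_drives = {1: 2, 5: 3, 6: 4}
    if raid_level not in min_drives:
        raise ValueError("this sketch only covers RAID 1, 5, and 6")

    n = len(drive_sizes_tb)
    if n < min_drives[raid_level]:
        raise ValueError(
            f"RAID {raid_level} needs at least {min_drives[raid_level]} drives"
        )

    smallest = min(drive_sizes_tb)
    if raid_level == 1:
        # Every drive holds a full copy, so capacity is one drive's worth.
        return smallest

    # RAID 5 gives up one drive's worth of space to parity, RAID 6 two.
    parity_drives = 1 if raid_level == 5 else 2
    return (n - parity_drives) * smallest


if __name__ == "__main__":
    print(usable_capacity_tb([2, 2], 1))        # two 2TB drives, mirrored
    print(usable_capacity_tb([2, 2, 4], 5))     # only 2TB of the 4TB drive is used
    print(usable_capacity_tb([2, 2, 2, 2], 6))  # four 2TB drives, double parity
```

Mixing a 4TB drive into an array of 2TB drives adds no more space than another 2TB drive would, which is why expanding with matching drive sizes is usually the cheaper choice.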
+ | |||
+ | Some real-world hard drive reliability stats: | ||
+ | https:// | ||
+ | |||
+ | ==Q: How technologically savvy do we need to be to maintain this system? How often would we need to check our backup system? We do not plan on accessing it very often - just using it purely as a backup kept off site.== | ||
+ | |||
+ | A: A typical undergrad/ | ||
+ | |||
+ | You can set up most (or maybe all) systems to send you email alerts (and maybe text alerts) when there' | ||
+ | ==== Archive-Quality Blu-ray Discs ==== | ||
+ | |||
+ | If you have less than a few hundred gigabytes of data, you might want to use archive-quality Blu-ray discs. This is a somewhat newer option. Writable Blu-ray discs (BD-R) come in 25GB, 50GB and 100GB sizes. | ||
+ | |||
+ | We suggest you make two copies of critical data and store them in separate buildings. | ||
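To get a rough idea of how many discs to buy, the disc sizes and the two-copy suggestion above can be combined into a small calculation. This is only a sketch: the function name `discs_needed` is hypothetical, and it assumes the full nominal capacity of each disc is usable and that data can be split freely across discs. Real discs hold slightly less after filesystem overhead, so leave some margin.

```python
import math


def discs_needed(data_gb, disc_gb, copies=2):
    """Estimate how many BD-R discs are needed to archive a dataset.

    Assumptions: nominal disc capacity is fully usable, and the data
    can be divided across discs without wasted space.
    """
    if data_gb <= 0 or disc_gb <= 0 or copies < 1:
        raise ValueError("sizes must be positive and copies at least 1")
    # Round up so that a partially filled disc still counts as a disc.
    discs_per_copy = math.ceil(data_gb / disc_gb)
    return discs_per_copy * copies


if __name__ == "__main__":
    # 300GB of data, two copies, for each available disc size.
    for size_gb in (25, 50, 100):
        print(size_gb, "GB discs:", discs_needed(300, size_gb))
```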
+ | |||
+ | === M-Disc === | ||
+ | Be sure to get " | ||
==== PMACS HPC:Archive System ==== | ==== PMACS HPC:Archive System ==== | ||
+ | |||
+ | NOTE 8/2017 | ||
+ | | ||
+ | PMACS has new options for storage that may be of use. | ||
+ | In particular the " | ||
+ | stated ability to conform to HIPAA compliance needs. We have not had time to investigate | ||
+ | this ourselves. You are welcome to contact PMACS about this and to ask for our help in figuring | ||
+ | out if the new services are usable by cluster users. | ||
+ | | ||
+ | http:// | ||
+ | |||
This is a service that provides very easy access to a modern robot-controlled high-availability tape archiving system. It provides a simple filesystem-view interface with simple file retrieval. Custom Linux commands are provided for the user to make their archiving copies. Note that this is an **archiving** service, and is not meant to be a regular backup service. You are able to retrieve files, but such retrievals are expected to be rare. | This is a service that provides very easy access to a modern robot-controlled high-availability tape archiving system. It provides a simple filesystem-view interface with simple file retrieval. Custom Linux commands are provided for the user to make their archiving copies. Note that this is an **archiving** service, and is not meant to be a regular backup service. You are able to retrieve files, but such retrievals are expected to be rare. | ||
Line 19: | Line 62: | ||
Your data is stored on mirrored tapes, meaning there is a redundant copy on a different set of tapes. **However** both copies reside in the same physical system, so a catastrophic event that destroys the system or the data center will wipe out all your data stored there. | Your data is stored on mirrored tapes, meaning there is a redundant copy on a different set of tapes. **However** both copies reside in the same physical system, so a catastrophic event that destroys the system or the data center will wipe out all your data stored there. | ||
- | **HIPAA-protected data:** The system is not yet HIPAA-compliant, | + | **HIPAA-protected data:** The system is not yet HIPAA-compliant. |
+ | |||
+ | STATUS UPDATE 3/2/2017: PMACS has had to change systems because of a loss of vendor support. The newer system is expected to be ready in a month or two, but HIPAA-compliance | ||
=== Creating an Account === | === Creating an Account === | ||
Line 30: | Line 76: | ||
*User' | *User' | ||
*User' | *User' | ||
- | *Data stored on the cluster requires HIPPA/other compliance?: | + | |
PI Info: | PI Info: | ||
*PI's Full Name: | *PI's Full Name: | ||
Line 44: | Line 90: | ||
*BA associating the 26-digit SAM account with the User’s account in the TRC/HPC billing system?: Yes/No | *BA associating the 26-digit SAM account with the User’s account in the TRC/HPC billing system?: Yes/No | ||
- | Contact: pmacshpc@med.upenn.edu | + | Contact: |
For more information, | For more information, |