Office of Research Computing

Storage and Data Transfer

Storage
Executive Summary
  • Storage System: MEMORI
  • Cost: $60 per Terabyte per year
  • Usable Capacity: ~5 Petabytes (PB)
  • Backup: Researchers’ responsibility
  • Access: Via ORC clusters or network share
  • Sharing: Via Globus
  • End of Life: October 2028

The Office of Research Computing (ORC) offers space on its MEMORI storage system to Mason researchers for storing research data. This storage is for open and public data only and cannot be used to store sensitive or private data. The installed raw capacity is currently 10 PB, and the system reaches end of life in October 2028. Because data is stored using erasure coding and replication, MEMORI is highly redundant, and the usable capacity is half the raw capacity, or approximately 5 PB. Please note that the data is not backed up; backups are the researchers’ responsibility. Storage is provided on a cost-recovery basis at $60 per terabyte per year. Allocations smaller than one terabyte are not provided.

The storage will be accessible on the head/login and compute nodes of the ORC high performance computing (HPC) clusters.  It can also be configured to be accessible as a network drive on researchers’ workstations/laptops using Samba/SMB from the GMU network and VPN (Virtual Private Network).  The storage can also be accessed using Globus through ORC’s data transfer nodes (DTNs).  Data can be shared with external collaborators using the Globus data transfer tool.

New Storage Costs – Effective 7/1/2024
  • The new rate for data storage is $60/TB/year and is valid for 5 years. Proposals submitted in FY24 and beyond with award dates after 7/1/2024 should include the new rate in the proposal budgets.
  • Researchers must determine how much storage is needed and budget for the amount required.
  • The storage space is to be used to store and share research data only.
  • The $60/TB rate is charged every year, so please budget for each year in which storage is needed.
  • Storage cost is not prorated and is charged for the entire fiscal year regardless of when it is purchased.
  • Researchers must complete and sign a storage Service Level Agreement (SLA) outlining the terms of the storage purchase and provide an account against which the storage charges will be made. If the storage payment is not made when it is due, any data will be retained in place for 3 months, then moved to our cold archives for 3 months before it is permanently deleted.
  • F&A charges are not levied if “Indirect” funds are used to pay the storage cost.
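
The budgeting rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not an ORC tool: the function names are hypothetical, fractional needs are assumed to round up to whole terabytes, and the non-payment clock is assumed to start on the payment due date and to count calendar months.

```python
import calendar
import math
from datetime import date

RATE_USD_PER_TB_YEAR = 60  # rate stated above, effective 7/1/2024


def storage_budget(terabytes_needed, fiscal_years):
    """Total storage cost in USD across the given number of fiscal years.

    Assumption: storage is purchased in whole terabytes, so fractions round up.
    Every fiscal year is charged in full because the cost is not prorated.
    """
    if terabytes_needed <= 0 or fiscal_years <= 0:
        raise ValueError("terabytes_needed and fiscal_years must be positive")
    whole_tb = math.ceil(terabytes_needed)
    return whole_tb * RATE_USD_PER_TB_YEAR * fiscal_years


def add_months(d, months):
    """Return d shifted forward by whole calendar months, clamping the day."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def nonpayment_timeline(due_date):
    """Return (moved_to_cold_archive, permanently_deleted) dates.

    Assumption: data stays in place for 3 months after the due date,
    then sits in cold archive for 3 more months before deletion.
    """
    archived = add_months(due_date, 3)
    deleted = add_months(archived, 3)
    return archived, deleted


if __name__ == "__main__":
    # 12.5 TB rounds up to 13 TB; 13 TB x $60 x 5 years
    print(storage_budget(12.5, 5))
    print(nonpayment_timeline(date(2025, 7, 1)))
```

For example, a project needing 12.5 TB for five fiscal years would budget 13 TB at $60 for each of the five years. The actual terms are set by the storage SLA, which takes precedence over this sketch.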

If you are interested in purchasing storage, please submit a ticket by emailing orchelp@gmu.edu to schedule a consultation.

GitLab

The ORC operates a GitLab server that is available to all researchers at Mason at no cost.  Please submit a help request for assistance accessing this service.

Data Transfer and Sharing
GUI and Command Line File Transfer

File transfers between ORC systems, and between ORC systems and end-user systems, are generally performed using command-line tools such as scp or sftp, or graphical clients such as FileZilla or Cyberduck that support the SCP and SFTP protocols. If the storage has been provisioned as an SMB/CIFS share, normal Windows file sharing may be used to transfer files. Data may also be shared with non-Mason collaborators and transferred to and from external repositories using Globus. We recommend Globus, especially for large transfers, because it provides a robust transfer method that tolerates interruptions.
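
As a rough illustration of a scripted command-line transfer, the Python sketch below assembles and optionally runs an scp command. The host name, user name, and paths are placeholders rather than real ORC values, the helper names are hypothetical, and the script assumes an OpenSSH scp client is installed on the local machine.

```python
import shlex
import subprocess


def build_scp_command(local_path, user, host, remote_path, recursive=False):
    """Build the argument list for copying local_path to user@host:remote_path."""
    cmd = ["scp"]
    if recursive:
        cmd.append("-r")  # copy directories recursively
    cmd += [local_path, f"{user}@{host}:{remote_path}"]
    return cmd


def upload(local_path, user, host, remote_path, recursive=False, dry_run=True):
    """Print the command when dry_run is True; otherwise execute it."""
    cmd = build_scp_command(local_path, user, host, remote_path, recursive)
    if dry_run:
        print(shlex.join(cmd))
        return 0
    # check=True raises CalledProcessError if scp exits with an error
    return subprocess.run(cmd, check=True).returncode


if __name__ == "__main__":
    # Placeholder host and paths; substitute the values for your own account.
    upload("results", "jdoe", "login.example.edu", "/scratch/jdoe/", recursive=True)
```

The dry-run default prints the command instead of executing it, which makes it easy to check the source and destination before any data moves.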

Globus

Mason subscribes to the Globus research data management service, operated as a non-profit service by the University of Chicago. Globus provides a web-based interface for secure, parallel, load-balanced, fault-tolerant data transfers ranging from megabytes to petabytes. Users may access the Globus Connect portal to transfer data between clusters operated by the Mason ORC and clusters run by other agencies, such as the NSF ACCESS program or other high-performance computing centers worldwide. The portal can also transfer files to and from home, office, or lab-based systems such as laptops, desktops, or scientific instruments. Globus also provides a simple method for sharing data with collaborators, along with a full-featured REST API and a Python-based SDK for creating data portals and automating routine data distribution and sharing tasks.
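
To give a sense of what an automated transfer might look like, the sketch below builds the JSON document that describes a transfer task, using only the Python standard library. The field names reflect the Globus Transfer API task document as we understand it and should be checked against the current Globus documentation; the endpoint IDs, submission ID, and paths are placeholders, and the function name is hypothetical. Authenticating and actually submitting the task (for example through the Globus Python SDK) are not shown.

```python
import json


def build_transfer_document(submission_id, source_endpoint,
                            destination_endpoint, items, label=None):
    """Build a transfer task document as a plain dictionary.

    items is a list of (source_path, destination_path, recursive) tuples.
    Assumption: submission_id has already been obtained from the Transfer
    service; here it is simply passed in as a string.
    """
    doc = {
        "DATA_TYPE": "transfer",
        "submission_id": submission_id,
        "source_endpoint": source_endpoint,
        "destination_endpoint": destination_endpoint,
        "DATA": [
            {
                "DATA_TYPE": "transfer_item",
                "source_path": src,
                "destination_path": dst,
                "recursive": recursive,
            }
            for src, dst, recursive in items
        ],
    }
    if label:
        doc["label"] = label
    return doc


if __name__ == "__main__":
    # All identifiers below are placeholders, not real endpoints.
    doc = build_transfer_document(
        "SUBMISSION-ID-PLACEHOLDER",
        "SOURCE-ENDPOINT-UUID",
        "DESTINATION-ENDPOINT-UUID",
        [("/project/data/", "/archive/data/", True)],
        label="nightly sync",
    )
    print(json.dumps(doc, indent=2))
```

Keeping the task description as plain data makes it easy to log or review before submission. The ORC Wiki and the Globus documentation remain the authoritative sources for the submission steps.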

Globus allows sharing data without requiring accounts for collaborators on the system where the data resides. Any storage system provisioned through Globus can be easily configured to enable secure data sharing. Once configured, select directory paths may be shared with either read-only or read-write access. Collaborators receive an email with a link to the shared directory paths and can then use Globus to transfer data from/to your storage system directly and securely.

Mason has the High Assurance subscription level of Globus. This subscription enables Mason to identify storage systems that contain sensitive data requiring a higher level of protection, including Personally Identifiable Information (PII) and controlled but unclassified data. Globus will ensure that stricter access policies, as required by the institution, are enforced on High Assurance data.

Detailed information on using Globus at Mason can be found on the ORC Wiki.