Categories
Alerts News

Winter Break Support – important information.

Please be advised that George Mason University will be closed for Winter Break from Monday, December 19, 2022 through Monday, January 2, 2023.

During the break, ORC resources such as the Hopper cluster are expected to be up and functioning as normal. ORC will respond to urgent catastrophic events; however, routine questions or tickets filed at orchelp@gmu.edu may not receive a response until after the break. Regular weekly activities such as the ORC New User tutorials will be suspended until university offices reopen on Tuesday, January 3, 2023.

Please continue to check the ORC Website for more updates and other upcoming events. 

The ORC wishes you a safe and happy holiday season.

Categories
Alerts

ORC resources suffering from widespread network disruption

Network access to ORC resources is currently disrupted due to configuration changes made to integrate new network infrastructure hardware. The HOPPER and ARGO clusters, virtual host systems, and network data shares may be inaccessible or only intermittently available until the problem is resolved. Engineers are working with the equipment provider's support team to diagnose and resolve the issue. We appreciate this is very disruptive and apologize for the inconvenience.

When it is available, further information regarding the estimated downtime will be posted here and sent to the ARGO-USERS mailing list.

Update 11/9/2021 22:50. Engineers from Dell believe they have resolved the connectivity issues on the Dell hardware. However, as of now the clusters remain unresponsive. This may be due to storage server problems caused by the network outage, or to problems with campus networking. We will be engaging with GMU IT support in the morning to perform additional analysis of the issue. We hope to have all clusters and systems available before end-of-day Wednesday 11/10/2021.

Update 11/10/2021 12:30. All issues have been resolved and the HOPPER and ARGO clusters are available.

 

Categories
Alerts News

Announcing the Hopper Cluster (Hopper)

The ORC would like to invite you to use Hopper, its new high-performance compute cluster. Hopper is named in honor of the late Rear Admiral Grace Hopper, a computing pioneer and local resident. All new ORC cluster accounts will be created and activated on Hopper by default. Existing Argo cluster account holders, however, should send an email to orchelp@gmu.edu to request activation of their account on Hopper.

Hopper currently has a total of 70 compute nodes, each with 48 cores (Intel Cascade Lake) and 188 GB of available memory. There is also one Nvidia DGX GPU node with 128 CPU cores (AMD EPYC/Milan), 1 TB of memory, and 8 A100 GPUs. Currently, 28 compute nodes and the GPU node are freely available to all users. The remaining nodes may also be used, but jobs on them will be subject to preemption by jobs run by each node’s sponsors.

A large expansion of Hopper is planned for the Fall of 2021, which will add a substantial number of compute and GPU nodes, including very large memory nodes with up to 4 TB of memory. Users who require memory address spaces greater than 180 GB will need to continue to use the Argo cluster until the new large memory nodes become available in Hopper.

The Hopper cluster is configured in a similar but not identical fashion to Argo. The software modules are organized differently and there are differences in the partition names, defaults, and versions of software available. Please review the documentation linked below for more detailed information on the differences.  

You may log in to Hopper using “ssh <UserID>@hopper.orc.gmu.edu”, where “<UserID>” is your GMU NetID. Use your GMU campus password when prompted. Home, scratch, and project directories will be mounted in the same locations as on Argo. Let us know if there are any “groups” directories you need to access, or if there are specific software packages and versions you require that are not available. The partition/queue structure on Hopper is summarized in the table below:

Partition     Time Limit (D-HH:MM)   Description                       ARGO Equivalent
debug         0-01:00                Intended for quick tests          -
interactive   0-12:00                Interactive jobs (Open OnDemand)  -
normal        3-00:00                Default                           all-LoPri, all-HiPri, bigmem-HiPri, bigmem-LoPri, all-long, bigmem-long
contrib*      6-00:00                -                                 CDS_q, COS_q, CS_q, EMH_q
gpuq          1-00:00                GPU node access                   gpuq

*NOTE: Being a contributor on Argo does not automatically grant access to the contrib partition on Hopper. All users may submit jobs to the contrib partition on Hopper; however, their jobs may be preempted and killed by a contributor’s job at any time. We recommend that non-contributor users who submit to the contrib partition ensure their jobs use some form of checkpointing. Contact orchelp@gmu.edu if you need help implementing checkpointing in your jobs.
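As a rough illustration of what checkpointing can look like, the following Python sketch periodically saves its progress to a file and resumes from that file when restarted. It is only one possible approach under stated assumptions, not an ORC-provided tool: the file name checkpoint.json, the function names, and the toy workload (a running sum) are all hypothetical. It also assumes the scheduler sends SIGTERM to a job shortly before killing it, which depends on the cluster's configuration and is not confirmed here; the periodic saves still limit lost work if no such signal arrives.

```python
import json
import os
import signal
import tempfile

CHECKPOINT_PATH = "checkpoint.json"  # hypothetical file name

stop_requested = False


def request_stop(signum, frame):
    """Signal handler: ask the main loop to stop at the next step."""
    global stop_requested
    stop_requested = True


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def save_checkpoint(path, state):
    """Write to a temporary file, then rename it over the old checkpoint,
    so a job killed mid-write does not leave a truncated file behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def run(path=CHECKPOINT_PATH, n_steps=1000, save_every=100):
    """Toy workload: sum the integers 0..n_steps-1, resuming if possible."""
    state = load_checkpoint(path)
    while state["step"] < n_steps and not stop_requested:
        state["total"] += state["step"]  # stand-in for real work
        state["step"] += 1
        if state["step"] % save_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)  # final save, also reached after a stop request
    return state


if __name__ == "__main__":
    # Assumption: the job receives SIGTERM before it is killed.
    signal.signal(signal.SIGTERM, request_stop)
    print(run())
```

If the job is preempted and later resubmitted with the same checkpoint file, it continues from the last saved step instead of starting over. Real applications would save their own state (for example, model weights or simulation time steps) in place of the running sum.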

Open OnDemand 

We would also like users to try our new Open OnDemand (OOD) server, which enables launching interactive apps, including RStudio, Jupyter Lab, MATLAB, and Mathematica, or a Linux graphical desktop through a web interface. These interactive sessions can be used for up to 12 hours. To access the OOD server, log in to https://ondemand.orc.gmu.edu from a web browser using your GMU credentials. Please let us know of any problems you encounter, and any applications you would like to be able to use via Open OnDemand.

Documentation 

Please refer to the following links for current documentation on Hopper: 

If you have any questions about any aspect of the new Hopper cluster, please send an email to orchelp@gmu.edu. 

 

Categories
Alerts News

ARGO Scratch File system migration – Cluster unavailable in the AM 03/13/2021

There is a planned interruption to the availability of the Argo cluster.

On the morning of 03/13/2021, the scratch filesystem will be migrated to new hardware. The file system must be quiescent during the data transfer, so the entire ARGO cluster will be unavailable from 5 am for a few hours. All partitions are being drained. Jobs will start running again by the afternoon.

Categories
Alerts News

Maintenance Scheduled for ORC hosted Virtual Machines and Servers.

A maintenance period has been scheduled for Tuesday 1/19/2021 between 8 am and 11 am, during which all ORC VMs and hosted servers will be patched and rebooted. There will necessarily be short periods, generally no longer than 15 minutes, during which the systems will be unavailable. If this maintenance would disrupt your work, please let us know as soon as possible so that we can make the necessary adjustments.

The Argo Cluster will not be affected by this maintenance.

Categories
Alerts News

Christmas Break 2020 – Support

The University will break for the holidays on December 18, 2020, and normal working hours will resume on January 4, 2021. Account requests made after 12 pm on December 18 will not be processed until January 4. Staff will be monitoring email and help tickets and may be able to respond to urgent inquiries, but please be aware that this will be on a best-effort basis. If you have concerns or questions, please send an email to orchelp@gmu.edu.