Office of Research Computing

Virginia Women in HPC – Inaugural Event

WHEN
October 6th
1:00pm – 2:00pm

WHAT
We are proud to announce the founding of Virginia’s first Women in High-Performance Computing (WHPC) program. Join Virginia WHPC for its inaugural event featuring inspiring lightning talks by female faculty of the Commonwealth sharing and discussing how HPC has facilitated their scientific research and professional careers.

Topic: How does HPC help with your scientific research — Faculty perspectives

Speakers:

  • Julie Quinn – University of Virginia
  • Jenna Cann – George Mason University
  • Grace Chiu – William & Mary’s Virginia Institute of Marine Science

Registration Link: https://tinyurl.com/VA-WHPC-OCT2021

THIS VIRTUAL EVENT IS JOINTLY HOSTED BY VIRGINIA COMMONWEALTH UNIVERSITY, GEORGE MASON UNIVERSITY, VIRGINIA TECH, WILLIAM & MARY, UNIVERSITY OF RICHMOND, AND RESEARCH COMPUTING AT THE UNIVERSITY OF VIRGINIA.

OKLAHOMA SUPERCOMPUTING SYMPOSIUM 2021

The annual Oklahoma Supercomputing Symposium will be held as a free virtual event this year.

The meeting agenda with registration information can be found here: http://www.oscer.ou.edu/Symposium2021/agenda.html

SPEAKERS WILL INCLUDE:
Margaret Martonosi – Assistant Director, Computer and Information Science & Engineering, National Science Foundation
Lynne Parker – Director, National AI Initiative Office; Assistant Director for Artificial Intelligence, Office of Science and Technology Policy (OSTP), The White House
Dan Stanzione – Director, Texas Advanced Computing Center, University of Texas at Austin
Thirumalai (Venky) Venkatesan – Director, Center for Quantum Research and Technology, University of Oklahoma

Announcing the Hopper Cluster (Hopper)

The ORC would like to invite you to use Hopper, its new high-performance compute cluster. Hopper is named in honor of the late Rear Admiral Grace Hopper, a computing pioneer and local resident. All new ORC cluster accounts will be created and activated on Hopper by default. Existing Argo cluster account holders, however, should send an email to [email protected] to request activation of their account on Hopper.

Hopper currently has a total of 70 compute nodes, each with 48 cores (Intel Cascade Lake) and 188 GB of available memory. There is also one Nvidia DGX GPU node with 128 CPU cores (AMD EPYC/Milan), 1 TB of memory, and 8 A100 GPUs. Currently, 28 compute nodes and the GPU node are freely available to all users. The remaining nodes may also be used, but jobs on them will be subject to preemption by jobs run by the node’s sponsors.

A large expansion of Hopper is planned for the Fall of 2021, which will add a substantial number of compute and GPU nodes, including very large memory nodes with up to 4 TB of memory. Users who require memory address spaces greater than 180 GB will need to continue to use the Argo cluster until the new large memory nodes become available in Hopper.

The Hopper cluster is configured in a similar but not identical fashion to Argo. The software modules are organized differently and there are differences in the partition names, defaults, and versions of software available. Please review the documentation linked below for more detailed information on the differences.  

You may log in to Hopper using “ssh <UserID>@hopper.orc.gmu.edu”, where “<UserID>” is your GMU NetID. Use your GMU campus password when prompted. Home, scratch, and project directories will be mounted in the same locations as on Argo. Let us know if there are any “groups” directories you need to access, or if there are specific software packages and versions you require that are not available. The partition/queue structure on Hopper is summarized in the table below:

Partition    Time Limit (D-H:M)  Description                       ARGO Equivalent
debug        0-01:00             Intended for quick tests          -
interactive  0-12:00             Interactive jobs (Open OnDemand)  -
normal       3-00:00             default                           all-LoPri, all-HiPri, bigmem-HiPri, bigmem-LoPri, all-long, bigmem-long
contrib*     6-00:00             -                                 CDS_q, COS_q, CS_q, EMH_q
gpuq         1-00:00             GPU node access                   gpuq

*NOTE: Being a contributor on Argo does not automatically grant access to the contrib partition on Hopper. All users may submit jobs to the contrib partition on Hopper; however, their jobs may be preempted and killed by a contributor’s job at any time. We recommend that non-contributor users who submit to the contrib partition ensure their jobs use some form of checkpointing. Contact [email protected] if you need help implementing checkpointing in your jobs.
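
As a rough illustration of the checkpointing recommended above, the following Python sketch saves a job's progress to disk at regular intervals and resumes from the last saved state when the job is restarted. The file name checkpoint.json, the checkpoint interval, and the placeholder work loop are all hypothetical; a real job would save whatever state its own computation needs, and this is only one of several possible approaches.

```python
import json
import os
import tempfile

# Hypothetical file name; in practice choose a path on scratch or project storage.
CHECKPOINT = "checkpoint.json"


def load_state(path=CHECKPOINT):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    return {"step": 0, "total": 0}


def save_state(state, path=CHECKPOINT):
    """Write the state to a temporary file, then rename it over the old checkpoint.

    os.replace is atomic on POSIX systems, so a job killed in the middle of a
    write leaves the previous checkpoint in place instead of a partial file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp_path, path)


def run(n_steps, every=100, path=CHECKPOINT):
    """Run a placeholder computation, checkpointing every `every` steps."""
    state = load_state(path)  # resumes from the last checkpoint, if any
    while state["step"] < n_steps:
        state["total"] += state["step"]  # placeholder for the real work
        state["step"] += 1
        if state["step"] % every == 0:
            save_state(state, path)
    save_state(state, path)  # final state once all steps are done
    return state
```

If a job using this pattern is preempted and later resubmitted, calling run again with the same path picks up from the most recent checkpoint, so at most `every` steps of work are repeated.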

Open OnDemand 

We would also like users to try our new Open OnDemand (OOD) Server, which enables launching interactive apps, including RStudio, Jupyter Lab, MATLAB, and Mathematica, or a Linux graphical desktop, through a web interface. These interactive sessions can be used for up to 12 hours. From a web browser, log in to https://ondemand.orc.gmu.edu using your GMU username and credentials to access the OOD server. Please let us know of any problems you encounter, and any applications you would like to be able to use via Open OnDemand.

Documentation 

Please refer to the following links for current documentation on Hopper: 

If you have any questions about any aspect of the new Hopper cluster, please send an email to [email protected]. 

 

ARGO Scratch File system migration – Cluster unavailable in the AM 03/13/2021

There is a planned interruption to the availability of the Argo cluster.

On the morning of 03/13/2021, the scratch filesystem will be migrated to new hardware. The file system must be quiescent during the data transfer, so the entire ARGO cluster will be unavailable from 5 am for a few hours. All partitions are being drained. Jobs will start running again by the afternoon.

Maintenance Scheduled for ORC hosted Virtual Machines and Servers.

A maintenance period has been scheduled for Tuesday 1/19/2021 between 8 am and 11 am, during which all ORC VMs and hosted servers will be patched and rebooted. There will necessarily be short periods, generally no longer than 15 minutes, during which the systems will be unavailable. If this maintenance would disrupt your work, please let us know as soon as possible so that we can make the necessary adjustments.

The Argo Cluster will not be affected by this maintenance.

Christmas Break 2020 – Support

The University will break for the holidays on December 18th, 2020, and normal working hours will resume on January 4th, 2021. Account requests made after 12 pm on December 18th will not be processed until January 4th. Staff will be monitoring email and help tickets and may be able to respond to urgent inquiries, but please be aware that this will be on a best-effort basis. If you have concerns or questions, please send an email to [email protected].

Hopper now has power

Equipment Delivery for Hopper has Begun

Five racks of equipment from Dell for the Hopper cluster were delivered today to the Acquia Data Center.  ORC staff will be hard at work in the coming weeks configuring and testing the ORC’s latest cluster.

 

ORC’s Newest Cluster Now Has an Official Name

The ORC’s newest cluster will be called “Hopper” in honor of the computing pioneer Rear Admiral Grace Hopper.

Interruption to the Argo Cluster – scheduled 11th October 2020

Unfortunately, a second maintenance outage will be required to complete essential power infrastructure changes in the Data Center. Currently, this maintenance is scheduled for 8 am to 5 pm on Sunday the 11th of October. The cluster will once again be drained, which means that jobs whose run time would overlap with the maintenance will be held until after the maintenance. You will be able to run jobs right up to the maintenance time by submitting with a time limit such that your job will not run past 8 am on Sunday the 11th of October 2020.
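
To illustrate how such a time limit might be worked out, here is a small Python sketch that computes the longest limit ending before the maintenance window. It assumes the scheduler accepts limits in a days-hours:minutes:seconds form (as Slurm's --time option does; this announcement does not name the scheduler), and the 10-minute safety margin and the function name are hypothetical choices, not ORC recommendations.

```python
from datetime import datetime, timedelta

# Maintenance start taken from the announcement: 8 am, Sunday 11 October 2020.
MAINTENANCE_START = datetime(2020, 10, 11, 8, 0)


def time_limit_before(now, deadline=MAINTENANCE_START, margin=timedelta(minutes=10)):
    """Return the longest time limit, as D-HH:MM:SS, that ends before the deadline.

    `margin` is a hypothetical safety buffer. Returns None when no usable
    time remains before the deadline.
    """
    remaining = deadline - now - margin
    if remaining <= timedelta(0):
        return None
    total_seconds = int(remaining.total_seconds())
    days, rest = divmod(total_seconds, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{days}-{hours:02d}:{minutes:02d}:{seconds:02d}"


# Example: a job that would start at 6 pm on Friday 9 October 2020.
print(time_limit_before(datetime(2020, 10, 9, 18, 0)))  # 1-13:50:00
```

Under Slurm, the resulting string could be passed as, for example, sbatch --time=1-13:50:00 job.sh, where job.sh is a hypothetical job script. The limit is counted from when the job starts rather than when it is submitted, so a job that waits in the queue may still be held; requesting a shorter limit leaves more room.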

Please send an email to [email protected] with any questions or concerns.