The University of Texas at Dallas’ Cyberinfrastructure & Research Services Department was established in 2015 to support research and teaching on campus by offering a suite of cyberinfrastructure services to faculty, staff, and students.
Our mission is to facilitate research and education on campus by providing computing resources, support and training that are synergistic with national resources. The UTD CIRC department provides free services to faculty, staff and students that include:
- Access to High Performance Computing (HPC) resources
- Full system lifecycle support including design, purchase, installation, administration and consulting
- Software consulting including installation and debugging / optimization support
- Workflow optimization and parallelization assistance
- Cloud computing support for commercial cloud services
CIRC’s primary resource is Ganymede, a heterogeneous local High Performance Computing (HPC) cluster with 6,200 cores and 24TB of memory, running CentOS 7.6 and OpenHPC. It has a 10 Gigabit Ethernet network and an FDR (56 Gbps) InfiniBand interconnect configured in a semi-fat-tree topology. It has two distributed file systems: the home directories are served via NFS over the 10 Gigabit Ethernet network, and the work file system is a 1.2PB high-performance parallel file system (GPFS) accessible over the InfiniBand network. The back-end hardware for the GPFS file system is a set of DataDirect Networks (DDN) storage enclosures attached directly over InfiniBand. Compute nodes are all dual-processor, with a variety of Intel architectures including Sandy Bridge, Haswell, Broadwell, and Skylake/Cascade Lake. Additionally, several nodes in the system have Quadro and Tesla GPUs based on the NVIDIA Kepler, Pascal, and Volta architectures. The freely available queues have the following resources:
110 Dell C8220 compute blades, each with:
- 2x Intel Xeon E5-2680 (Sandy Bridge) 8-core 2.7GHz/20M cache processors (16 cores per node)
- 32GB (22GB usable) ECC DDR3 Memory
- 56Gb/s FDR InfiniBand
32 Dell M610 blades, each with:
- 2x Intel Xeon E5540 (Nehalem) 4-core 2.53GHz/8M cache processors
- 24GB (14GB usable) ECC DDR3 Memory
- 40Gb/s QDR InfiniBand
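As a rough tally of what the free queues offer, the short Python sketch below adds up cores and usable memory from the figures listed above. It assumes that every blade is dual-socket, that each E5-2680 has 8 cores, and that each E5540 has 4 cores. The `NodeType` class and `totals` function are illustrative names only, not part of any CIRC tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    """One class of compute blade in the free queues (illustrative only)."""
    name: str
    count: int             # number of blades of this type
    sockets: int           # processors per blade
    cores_per_socket: int  # assumption: 8 for E5-2680, 4 for E5540
    usable_mem_gb: int     # usable memory per blade, as listed above


# Figures taken from the lists above; core counts per socket are assumed.
FREE_QUEUE_NODES = [
    NodeType("Dell C8220 (Sandy Bridge)", 110, 2, 8, 22),
    NodeType("Dell M610 (Nehalem)", 32, 2, 4, 14),
]


def totals(nodes):
    """Return (total cores, total usable memory in GB) across all blades."""
    cores = sum(n.count * n.sockets * n.cores_per_socket for n in nodes)
    mem_gb = sum(n.count * n.usable_mem_gb for n in nodes)
    return cores, mem_gb


if __name__ == "__main__":
    cores, mem_gb = totals(FREE_QUEUE_NODES)
    print(f"Free queues: {cores} cores, {mem_gb} GB usable memory")
```

Under these assumptions the free queues come to 2,016 cores and 2,868GB of usable memory across 142 blades.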
In addition to these resources, researchers may purchase hardware to add to Ganymede and have their own queues, resulting in lower wait times and hardware tailored to their research needs. Counting the free queues and these researcher-owned (condo) queues together, Ganymede has around 350 nodes cluster-wide.
The Texas Advanced Computing Center (TACC) is an NSF-funded center and a partner in XSEDE (Extreme Science and Engineering Discovery Environment) that provides computational resources for scientists and researchers nationwide. Additionally, it provides dedicated resources for University of Texas System schools. Through partnerships with the XSEDE program, and with TACC specifically, CIRC is able to assist our campus researchers in getting access to and compute time on TACC systems.
The first system we suggest to our users at TACC is Lonestar5 (LS5). LS5 is made available to the UT System schools through the University of Texas Research Cyberinfrastructure initiative. It is a Cray XC40 system featuring:
- 1,252 Cray XC40 compute nodes, each with two 12-core Intel® Xeon® processors, for a total of 30,048 compute cores
- 2 large memory compute nodes, each with 1TB of memory
- 8 large memory compute nodes, each with 512GB of memory
- 16 nodes with NVIDIA K40 GPUs
- A 5 petabyte DataDirect Networks storage system
If your workloads are modest in size and use traditional CPUs for computation, LS5 might be the perfect fit for your research.
If, however, your work requires larger computational resources, another option is Stampede2. Stampede2 is an NSF-funded system featuring:
- 4,200 Intel Knights Landing nodes, each with 68 cores, 96GB of DDR RAM, and 16GB of high speed MCDRAM
- 1,736 Intel Xeon Skylake nodes, each with 48 cores and 192GB of RAM
- A 100 Gb/sec Intel Omni-Path network with a fat tree topology employing six core switches
- Two dedicated high performance Lustre file systems with a storage capacity of 31PB
Note that to run large jobs on Stampede2, users will usually be required to submit a scaling study as part of the allocation request. LS5 is therefore a good place to start, since results from it can be used to demonstrate the need for Stampede2.
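A scaling study typically reports how run time changes as the core count grows. The Python sketch below computes strong-scaling speedup and parallel efficiency relative to the smallest run. The timings are made-up placeholders, not measurements from LS5 or Stampede2, and `strong_scaling` is a hypothetical helper; the allocation request may call for a different format, so treat this only as a starting point.

```python
def strong_scaling(timings):
    """Compute speedup and parallel efficiency for a fixed-size problem.

    timings maps core count -> wall-clock time in seconds.
    Returns a list of (cores, speedup, efficiency) tuples, sorted by core
    count and measured relative to the run with the fewest cores.
    """
    base_cores = min(timings)
    base_time = timings[base_cores]
    rows = []
    for cores in sorted(timings):
        speedup = base_time / timings[cores]
        # Ideal speedup equals the ratio of core counts.
        efficiency = speedup / (cores / base_cores)
        rows.append((cores, speedup, efficiency))
    return rows


# Placeholder timings (seconds) for illustration only.
EXAMPLE_TIMINGS = {24: 1000.0, 48: 520.0, 96: 280.0, 192: 160.0}

if __name__ == "__main__":
    print(f"{'cores':>6} {'speedup':>8} {'efficiency':>11}")
    for cores, speedup, efficiency in strong_scaling(EXAMPLE_TIMINGS):
        print(f"{cores:>6} {speedup:>8.2f} {efficiency:>10.1%}")
```

Efficiency that stays high as the core count doubles suggests the code can make good use of a larger system, while a sharp drop suggests the current scale is already near its limit.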
For our deep learning users, TACC has a dedicated resource, Maverick2. Maverick2 is reserved for machine learning workloads and features 23 nodes with 4 x NVIDIA GTX 1080 Ti GPUs, 4 nodes with 2 x NVIDIA V100 cards, and 3 nodes with 2 x NVIDIA P100 cards.
In order to get access to any TACC resources, contact [email protected] to discuss your research and computing needs.