The Aurora, an Exascale supercomputer that was sponsored by the United States Department of Energy (DOE) and designed by Intel and Cray for the Argonne National Laboratory, is currently the fastest AI system in the world, according to a press release that Intel shared today explaining how the supercomputer has broken the exascale barrier at 1.012 exaflops, with 10.6 AI exaflops listed as an additional figure. The system, which was launched in November 2023 and features over 20,000 Intel Xeon CPUs and over 60,000 Intel Data Center GPUs, is estimated to have cost $500 million to develop.
Aurora’s hardware comprises:
- 166 racks, 10,624 compute blades
- 21,248 Intel Xeon CPU Max Series processors
- 63,744 Intel Data Center GPU Max Series units
- 84,992 HPE slingshot fabric endpoints
Recent records attained include:
- Second on the high-performance LINPACK (HPL) benchmark
- Broke the exascale barrier at 1.012 exaflops utilizing 9,234 nodes, only 87% of the system
- Third spot on high-performance conjugate gradient (HPCG) benchmark at 5,612 teraflops per second (TF/s), 39% of the machine
A behind-the-scenes look at the supercomputer:
Intel on how Aurora is optimized for AI:
At the heart of the Aurora supercomputer is the Intel Data Center GPU Max Series. The Intel Xe GPU architecture is foundational to the Max Series, featuring specialized hardware like matrix and vector compute blocks optimized for both AI and HPC tasks. The Intel Xe architecture’s design that delivers unparalleled compute performance is the reason the Aurora supercomputer secured the top spot in the high-performance LINPACK-mixed precision (HPL-MxP) benchmark – which best highlights the importance of AI workloads in HPC.
The Xe architecture’s parallel processing capabilities excel in managing the intricate matrix-vector operations inherent in neural network AI computation. These compute cores are pivotal in accelerating matrix operations crucial for deep learning models. Complemented by Intel’s suite of software tools, including Intel oneAPI DPC++/C++ Compiler, a rich set of performance libraries, and optimized AI frameworks and tools, the Xe architecture fosters an open ecosystem for developers that is characterized by flexibility and scalability across various devices and form factors.