To Exascale and Beyond: 7 Key Takeaways From ISC 2024

The International Supercomputing Conference (ISC) concluded last week in Hamburg, Germany. Several of the biggest names in supercomputing, including Intel, Hewlett Packard Enterprise, Nvidia, and AMD, announced updates at the show alongside up-and-comers such as IQM Quantum, Supermicro, and Spinncloud. The 2024 show took “Reinventing HPC” as its theme, exploring an industry marked by both breakthroughs and blurred lines. Supercomputers are massively parallel systems that use tens of thousands of CPUs, GPUs, TPUs, or other specialized processing engines. They may differ only in software from their close relatives, the hyperscale cloud data centers. With AI taking such a big role in both the data center and high-performance computing (HPC) worlds, the distinction is only becoming less clear. The very definition of supercomputer and HPC may need to change.

In this roundup, we’ll examine ISC’s key announcements and discuss how they point to trends in high-performance computing and supercomputing.

Intel

Intel has teamed with Argonne National Laboratory and Hewlett Packard Enterprise (HPE) to produce the Aurora supercomputer. Aurora delivers conventional supercomputer performance at 1.012 exaflops, placing it in the number two spot in the most recent Top500 supercomputer list. It’s only the second exascale computer ever to power up. Aurora comes in on top of the AI supercomputing list at 10.6 AI exaflops.

Aurora is a massive system consisting of 166 racks, 10,624 compute blades, 21,248 Intel Xeon CPU Max processors, and 63,744 Intel Data Center GPU Max units.
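Those totals imply a regular building block. As a quick sanity check, the short Python sketch below divides the quoted counts to get per-rack and per-blade figures. The variable and function names are illustrative, and the breakdown is derived from the totals above rather than quoted from Intel.

```python
# Aurora component totals as quoted in the article.
RACKS = 166
BLADES = 10_624
CPUS = 21_248
GPUS = 63_744


def per_unit(total: int, units: int) -> int:
    """Return total / units, insisting that the division is exact."""
    quotient, remainder = divmod(total, units)
    if remainder:
        raise ValueError(f"{total} does not divide evenly by {units}")
    return quotient


blades_per_rack = per_unit(BLADES, RACKS)
cpus_per_blade = per_unit(CPUS, BLADES)
gpus_per_blade = per_unit(GPUS, BLADES)

print(f"{blades_per_rack} blades per rack")  # 64
print(f"{cpus_per_blade} CPUs per blade")    # 2
print(f"{gpus_per_blade} GPUs per blade")    # 6
```

The counts divide evenly, which suggests each compute blade pairs two CPUs with six GPUs.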

Hewlett Packard Enterprise

In 2019, Hewlett Packard Enterprise (HPE) joined the supercomputer fray by acquiring Cray, the supercomputer maker that carries the late Seymour Cray’s name. Today, the HPE/Intel/Argonne Aurora supercomputer announced at ISC 2024 continues that legacy by being only the second supercomputer to reach exascale capability.

Aurora uses the HPE Cray EX supercomputer platform, which was purpose-built for exascale computing. A crucial component is HPE Slingshot, an open Ethernet-based supercomputing interconnect; Aurora is its largest deployment. The system was built with AI-driven research in mind. It will be used to map the human brain, study high-energy particle physics, and accelerate AI-driven drug research.

Nvidia

Nvidia has emerged as one of the giants in AI and supercomputing processors. Nvidia announced the installation of its Grace Hopper superchips in nine supercomputer systems worldwide. The superchip combines the Arm-based Nvidia Grace CPU and Nvidia Hopper GPU architectures using Nvidia NVLink-C2C interconnect technology.

The integrated combination is designed to balance high computational speed with power efficiency and easy scaling. The new Nvidia-based supercomputers are online or coming online in France, Poland, Switzerland, Germany, the U.S., Japan, and the U.K.

AMD

AMD showcased its HPC solutions at ISC via the Frontier supercomputer housed at Oak Ridge National Lab. Frontier came out two years ago as the first exascale system and the highest-performing supercomputer, with 1.2 exaflops, according to Top500. Frontier, powered by AMD EPYC CPUs and AMD Instinct GPUs, still holds the title of fastest supercomputer in the world, albeit by a much smaller margin this year than last.

AMD also noted that 157 of the top 500 fastest supercomputers are powered by AMD. That’s a 29% increase since 2023.

IQM Quantum

IQM Quantum partnered with HPE at ISC 2024 to demonstrate a hybrid system that integrates quantum computing and more conventional high-performance computing.

Quantum computers, while still largely in the early research stage, have the potential to radically disrupt HPC. IQM Quantum has taken a novel approach by combining quantum hardware with classical HPC hardware. One of the first deployments of this hardware will be in Germany at the Leibniz Supercomputing Centre (LRZ).

IQM Quantum will use the platform in partnership with Hewlett Packard Labs to continue hybrid advancement and allow users to develop quantum and hybrid computing algorithms and practices.

Supermicro

Supermicro showed off its liquid-cooled AI and HPC systems at the show, which reduce cooling power requirements compared with conventional air-cooled systems. Liquid cooling enables denser AI and HPC computing. By improving heat extraction, Supermicro rack solutions may increase the speed and lower the cost of data centers and supercomputers.

Thermal management is one of the key enabling technologies for high-performance computing, including supercomputers. The CPUs, GPUs, and TPUs, along with DRAM and solid-state storage, create massive amounts of heat. Supermicro liquid-cooled supercomputing rack systems can be adapted for most HPC hardware. Supermicro offers servers based around Nvidia, AMD, and Intel processors.

The liquid cooling system is part of Supermicro’s environmental, social, and governance (ESG) initiative and promises to save 40% of the power used by an equivalent air-cooled system.

Spinncloud

Spinncloud announced the SpiNNaker2 event-based hybrid AI platform. The system’s predecessor, SpiNNaker1, was the brainchild of Steve Furber, one of the original developers of the Arm architecture. SpiNNaker was designed to emulate the human brain. SpiNNaker2 expands upon the prior version, extending traditional AI computing models with new algorithms that adapt dynamically to contextual nuance.

Spinncloud believes that the current computing architecture is woefully inadequate for AI. Even with massively parallel matrix math-optimized computing, existing AI systems use simple pattern recognition, tokenization of patterns, and matching to existing tokenized data. They fall short when tasked with original thinking or contextual nuance.

Spinncloud has developed a low-power architecture that it believes closely models the human brain. One of the key elements is biologically inspired, event-based asynchronous parallel operation: processing units running side by side don’t need to stay in lockstep. SpiNNaker2, based on this new architecture, promises significantly greater computing power per unit of energy consumed.
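To make the event-based idea concrete, here is a toy Python sketch. It is not Spinncloud’s design or API, only an illustration of the general principle under simple assumptions. Hypothetical "units" accumulate incoming events and pass one along to a neighbor once a threshold is reached. The names, the threshold rule, and the ring-shaped wiring are all invented for this example. The point is that work is done only when an event arrives, whereas a clocked design visits every unit on every tick.

```python
import heapq


def event_driven_updates(inputs, num_units, threshold=2, delay=1):
    """Count unit updates when work happens only on event arrival.

    inputs is a list of (time, unit) events. A unit that has received
    `threshold` events resets and sends one event to the next unit.
    """
    if threshold < 2:
        # With threshold 1 every event would spawn another one forever.
        raise ValueError("threshold must be at least 2 to guarantee termination")
    queue = list(inputs)
    heapq.heapify(queue)  # process events in time order
    potential = [0] * num_units
    updates = 0
    while queue:
        time, unit = heapq.heappop(queue)
        updates += 1
        potential[unit] += 1
        if potential[unit] >= threshold:
            potential[unit] = 0
            heapq.heappush(queue, (time + delay, (unit + 1) % num_units))
    return updates


def clocked_updates(num_units, num_ticks):
    """Count unit updates when every unit is visited on every clock tick."""
    return num_units * num_ticks


NUM_UNITS = 1_000
NUM_TICKS = 100
INPUTS = [(0, 0), (3, 0), (5, 7)]  # three sparse input events

sparse = event_driven_updates(INPUTS, NUM_UNITS)
dense = clocked_updates(NUM_UNITS, NUM_TICKS)
print(f"event-driven updates: {sparse}, clocked updates: {dense}")
```

When activity is sparse, the event-driven count stays tiny while the clocked count grows with the size of the system. That is the intuition behind the energy-per-operation claim, though real hardware gains depend on far more than update counts.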

Zetta, Here We Come

When Seymour Cray first used parallelism and other architectural innovations to create the supercomputer sixty years ago, the size and scope of today’s systems were mere science fiction. The June 2024 Top500 list now has two exascale supercomputers, Frontier and Aurora, at 1.206 and 1.012 exaflops, respectively. The next closest, Microsoft’s Eagle, comes in at 561 petaflops. Frontier was the first to reach the exascale, arriving at that point in June 2022, just 14 years after IBM hit the petaflop line (1,000x slower than exascale) with Roadrunner.
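For a sense of what that 14-year, 1,000x jump implies, the Python sketch below works out the average yearly growth and the doubling time. It also shows when zettascale would arrive if the same pace simply continued. That is a naive extrapolation for illustration, not a forecast. The constant and function names are made up for this example.

```python
import math

PETASCALE_YEAR = 2008  # IBM Roadrunner, 14 years before Frontier
EXASCALE_YEAR = 2022   # Frontier
SPEEDUP = 1_000        # one exaflop is 1,000 petaflops


def annual_growth(speedup: float, years: float) -> float:
    """Average year-over-year multiplier that yields `speedup` in `years`."""
    return speedup ** (1 / years)


def doubling_time(speedup: float, years: float) -> float:
    """Years needed to double performance at that average pace."""
    return years * math.log(2) / math.log(speedup)


years = EXASCALE_YEAR - PETASCALE_YEAR
growth = annual_growth(SPEEDUP, years)
doubling = doubling_time(SPEEDUP, years)
# Assumption: the next 1,000x takes exactly as long as the last one.
naive_zettascale_year = EXASCALE_YEAR + years

print(f"average growth: {growth:.2f}x per year")
print(f"doubling time: {doubling * 12:.0f} months")
print(f"same pace reaches zettascale around {naive_zettascale_year}")
```

Under these assumptions, top-end performance grew by roughly 64% per year, doubling about every 17 months.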

The newest HPC chips are built with multiple-core processor chiplets containing different combinations of CPUs, GPUs, TPUs, and stacked memory all on the same substrate connected with high-speed interconnects and intra-chip networking. They are nearly complete computers on their own, yet they are combined in the tens of thousands to create data centers and supercomputers. The scalability of these systems means that the primary limitations are a quadfecta of power consumption, cooling, data transfer, and raw computing power. Innovations in any of these four areas can lead to jumps in supercomputer performance.

With new architectures and quantum-HPC hybrids emerging, like those from Spinncloud and IQM Quantum, the gap between exascale and zettascale may close in less than a decade and a half. Alternatively, breakthroughs in data transfer between dispersed, high-speed cloud computing resources may make the concept of a singular supercomputer meaningless.

A Footnote on Supercomputing Speed

Supercomputer speed is measured in floating-point operations per second (flops). Today, that means petaflops or exaflops. A petaflop is 10^15 floating-point operations per second. An exaflop is 10^18, or one thousand petaflops: a quintillion (billion billion) operations per second. By comparison, the Intel 8087, the first math co-processor I ever coded assembly language for, came in at a not-at-all blazing 50 kiloflops. A supercomputer capable of operating at one exaflop or greater is said to be an exascale supercomputer.
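To put those prefixes in perspective, the Python sketch below converts between units and estimates how long the 8087 would need to match one second of exascale work. It takes the 50-kiloflop figure above at face value and ignores differences in precision and sustained versus peak rates. The helper names are illustrative.

```python
# Powers of ten behind the common flops prefixes.
PREFIXES = {
    "kilo": 10**3,
    "mega": 10**6,
    "giga": 10**9,
    "tera": 10**12,
    "peta": 10**15,
    "exa": 10**18,
}

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def to_flops(value: float, prefix: str) -> float:
    """Convert a prefixed rate (e.g. 50 kiloflops) to plain flops."""
    return value * PREFIXES[prefix]


exaflop = to_flops(1, "exa")
petaflops_per_exaflop = exaflop / to_flops(1, "peta")
i8087 = to_flops(50, "kilo")

# Time for the 8087 to do what an exascale machine does in one second.
seconds_needed = exaflop / i8087
years_needed = seconds_needed / SECONDS_PER_YEAR

print(f"1 exaflop = {petaflops_per_exaflop:.0f} petaflops")
print(f"8087 catch-up time: about {years_needed:,.0f} years")
```

By this rough measure, one second on an exascale system is more than 600,000 years of work for the 8087.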

The “f” in supercomputing flops refers to an IEEE double-precision, 64-bit floating-point number (FP64). Common AI benchmarks use a combination of 32-bit and 64-bit math. While calculating an orbital trajectory for a space probe billions of miles away requires a lot of precision, tokenized patterns in an AI model can be found easily with 32, 16, and even 8-bit precision floating point (FP32, FP16, and FP8) numbers. Additional precision would only slow the operations down. AI calculations often start with FP8 and then bump up to FP32 calculations for final refinement. This generally allows the same machine to deliver a higher AI benchmark than the conventional benchmark.
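The trade-off between precision and speed can be sketched in a few lines of Python. The example uses the standard `struct` module to round a value to FP16, FP32, and FP64. FP8 is left out because `struct` has no 8-bit float format. It then mimics the "start coarse, refine later" pattern with a square root: a rough FP16 estimate is polished with a few Newton steps in full precision. This illustrates the general idea only, not how any particular AI benchmark is implemented. The function names are invented here.

```python
import math
import struct

# struct format codes for IEEE 754 half, single, and double precision.
FORMATS = {"FP16": "<e", "FP32": "<f", "FP64": "<d"}


def round_to(name: str, x: float) -> float:
    """Round a Python float (FP64) to the nearest value in the named format."""
    fmt = FORMATS[name]
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def relative_error(name: str, x: float) -> float:
    """Relative error introduced by storing x in the named format."""
    return abs(round_to(name, x) - x) / abs(x)


def refined_sqrt(a: float, steps: int = 3) -> float:
    """Start from an FP16-quality estimate, then refine in FP64."""
    # Stand-in for a cheap low-precision result: the true root rounded to FP16.
    estimate = round_to("FP16", math.sqrt(a))
    for _ in range(steps):
        # Newton's method for x*x = a, carried out in full FP64.
        estimate = 0.5 * (estimate + a / estimate)
    return estimate


for name in FORMATS:
    print(f"{name}: relative error storing 0.1 = {relative_error(name, 0.1):.2e}")

coarse = round_to("FP16", math.sqrt(2.0))
print(f"FP16 estimate error: {abs(coarse - math.sqrt(2.0)):.2e}")
print(f"refined error:       {abs(refined_sqrt(2.0) - math.sqrt(2.0)):.2e}")
```

The coarse estimate is cheap but visibly off, and a handful of higher-precision steps recovers nearly full accuracy. Mixed-precision approaches exploit that pattern.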
