SUNNYVALE, CALIFORNIA – November 10, 2022 – Cerebras Systems, the pioneer in high performance artificial intelligence (AI) compute, today announced record-breaking performance on the scientific compute workload of forming and solving field equations. In collaboration with the Department of Energy’s National Energy Technology Laboratory (NETL), Cerebras demonstrated its CS-2 system, powered by the Wafer-Scale Engine (WSE), was as much as 470 times faster than NETL’s Joule Supercomputer in field equation modeling, delivering speeds beyond what either CPUs or GPUs are currently able to achieve.
The workload under test was field equation modeling using a simple Python API that enables wafer-scale processing for much of computational science, achieving gains in performance and usability that cannot be obtained on conventional computers and supercomputers. This domain-specific, high-level programmer’s toolset is called the WSE Field-equation API, or WFA. The WFA outperforms OpenFOAM® on NETL’s Joule 2.0 supercomputer by over two orders of magnitude in time to solution. While this performance is consistent with hand-optimized assembly codes, the WFA provides an easy-to-use, high-level Python interface that allows users to form and solve field equations effortlessly. This new WFA toolset has the potential to fundamentally improve the way computers are used in engineering.
This work demonstrates the fastest known time-to-solution for field equations in computing history at scales up to several billion cells. The speed was achievable because the WSE provides memory and point-to-point bandwidths high enough that there are no communication bottlenecks for tensor instructions and computation proceeds at or faster than the clock rate. In the past, field equations have been memory bound, and in distributed systems they have been limited by node-to-node communication bandwidth. These limitations create a need for memory hierarchies and complex programming methods to ensure maximum possible utilization. All of this complexity is eliminated by the exceptionally high bandwidths afforded on the WSE, and tensor instructions can proceed at rates not possible with conventional, distributed von Neumann architectures.
“NETL and Cerebras collaborated to develop a new programming methodology with a team of three people on never-before seen hardware with a unique instruction set, all in less than 18 months. To put that in perspective, efficient distributed computing efforts often take years to decades of work with very large groups of developers,” said Dr. Brian J. Anderson, Lab Director at NETL. “By using innovative new computer architectures, such as the Cerebras WSE, we were able to greatly accelerate speed to solution, while significantly reducing energy to solution on a key workload of field equation modeling. This work combining the power of supercomputing and AI will deepen our understanding of scientific phenomena and greatly accelerate the potential of fast, real-time, or even faster-than-real-time simulation.”
The Joule 2.0 supercomputer is the 139th fastest supercomputer in the world as ranked by TOP500.org, and contains 84,000 CPU cores and 200 GPUs. By bringing together exceptional memory performance with massive bandwidth, low-latency inter-processor communication and an architecture optimized for high-bandwidth computing, the CS-2 was superior to the Joule supercomputer in both time-to-solution and energy consumption when running a standardized multi-dimensional, time-variant field equation test problem.
The CS-2 proved to be as much as 470 times faster in time-to-solution than the largest cluster of CPUs that NETL’s Joule 2.0 supercomputer could allocate to a problem of this size. It was also found to be more than two orders of magnitude more energy efficient than distributed computing.
“Cerebras is proud of our collaboration with NETL. Together we have produced extraordinary results in advancing foundational workloads in scientific compute,” said Andrew Feldman, co-founder and CEO, Cerebras Systems. “Conventional supercomputers consume enormous amounts of energy, are complex to set up, time consuming to program and as demonstrated by our work, slower to produce answers than the Cerebras CS-2. Through our partnership with NETL, the CS-2 proves that wafer-scale integration is a viable solution to many of the leading scientific problems in high performance computing—in fact, we showed that the CS-2 produces results that are hundreds of times faster than the biggest supercomputers, while using hundreds of times less energy.”
The research was led by Dr. Dirk Van Essendelft, Machine Learning and Data Science Engineer at NETL, Robert Schreiber, Distinguished Engineer at Cerebras Systems and Michael James, co-founder and Chief Architect for Advanced Technologies at Cerebras. The results came after months of work and continue the close collaboration between the Department of Energy’s NETL laboratory scientists and Cerebras Systems. In November 2020, Cerebras and NETL announced a new compute milestone on the key scientific workload of Computational Fluid Dynamics (CFD).
With every component optimized for AI work, the CS-2 delivers more compute performance in less space and with less power than any other system. It does this while radically reducing programming complexity, wall-clock compute time, and time to solution. Depending on workload, from AI to HPC, the CS-2 delivers hundreds or thousands of times more performance than legacy alternatives. A single CS-2 replaces clusters of hundreds or thousands of GPUs that consume dozens of racks, use hundreds of kilowatts of power, and take months to configure and program. At only 26 inches tall, the CS-2 fits in one-third of a standard data center rack.
Researchers from NETL and Cerebras will present their findings at the SC22 Conference, and a paper based on the work can be found on arXiv, titled Disruptive Changes in Field Equation Modeling: A Simple Interface for Wafer Scale Engines.
About Cerebras Systems
Cerebras Systems is a team of pioneering computer architects, computer scientists, deep learning researchers, and engineers of all types. We have come together to build a new class of computer system, designed for the singular purpose of accelerating AI and changing the future of AI work forever. Our flagship product, the CS-2 system, is powered by the world’s largest processor, the 850,000-core Cerebras WSE-2, and enables customers to accelerate their deep learning work by orders of magnitude over graphics processing units.
Learn more in this blog by Distinguished Engineer Rob Schreiber.