On October 30th at TensorFlow World, my colleague Manjunath Kudlur and I were honored to speak publicly for the first time about the Cerebras software stack. This stack links deep learning researchers to the massive compute capabilities of the Wafer Scale Engine (WSE).
The WSE is the largest commercial chip ever manufactured, built to solve the problem of deep learning compute. It packs 1.2 trillion transistors onto a single chip with 400,000 AI-optimized cores, connected by a 100 Pbit/s interconnect. The cores are fed by 18 GB of fast on-chip memory, with an unprecedented 9 PB/s of memory bandwidth.
What does this mean for AI researchers? We believe, along with many others, that AI has massive potential — from ads to autonomous vehicles, from commerce to climate. It has transformative potential for the way we live and work.
Researchers continue to see gains with deeper models and larger datasets, but they are compute-limited today. Training commonly takes days, weeks, even months. Not only is this costly, it constrains research and development. We need wall-clock training times on the timescale of experimentation and human innovation — minutes to hours rather than days to weeks — even for large models. This means we need a 100-1000x increase in compute capability, not an incremental 1.5-2x.
We need this performance in an accessible, easy-to-program package.
This talk introduced the Cerebras software stack. Its primary task is to map the computational graphs of researchers' neural networks — from the framework level all the way down to our massive wafer-scale processor.
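To picture what "mapping a graph to the wafer" means, here is a toy sketch in plain Python. It is not the Cerebras compiler — the op names, costs, and greedy placement strategy are all illustrative assumptions — but it captures the core idea: each operation in the graph gets assigned a region of the core fabric, sized in proportion to its compute cost.

```python
# Toy sketch of graph-to-fabric placement. Illustrative only: this is NOT
# the Cerebras compiler. The ops, their costs, and the greedy strategy are
# hypothetical; the point is the shape of the problem, not the solution.

from dataclasses import dataclass, field

@dataclass
class Op:
    name: str
    cost: int                          # hypothetical relative compute cost
    inputs: list = field(default_factory=list)

def place(ops, total_cores):
    """Greedily give each op a contiguous share of cores proportional to cost."""
    total_cost = sum(op.cost for op in ops)
    placement, next_core = {}, 0
    for op in ops:
        n = max(1, round(total_cores * op.cost / total_cost))
        placement[op.name] = range(next_core, next_core + n)
        next_core += n
    return placement

# A three-op graph: matmul -> relu -> matmul, with made-up costs.
graph = [
    Op("matmul1", cost=6),
    Op("relu", cost=1, inputs=["matmul1"]),
    Op("matmul2", cost=5, inputs=["relu"]),
]

plan = place(graph, total_cores=400_000)
for name, cores in plan.items():
    print(f"{name}: {len(cores)} cores")
```

A real compiler must also account for dataflow between neighboring regions, memory per core, and pipelining — which is exactly why the stack, not the researcher, handles this mapping.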
The stack integrates seamlessly with popular machine learning frameworks like TensorFlow and PyTorch, allowing researchers to use familiar and flexible tools that bring their models to the WSE.
A programmable C++ interface allows researchers to extend the platform and develop custom kernels — empowering them to push the limits of ML innovation.
To learn more, check out the slides here, or contact us – we’re a friendly bunch! Review our careers page if you’re interested in joining our incredible team.