Cerebras Inference now powers Perplexity’s Sonar
Get answers at an unprecedented 1,200 tokens/s, 10x faster than comparable models. By combining Cerebras’ breakthrough AI hardware with Perplexity’s optimized search model, Sonar is redefining instant, high-quality AI-driven search.

Genomic Foundation Model
A revolutionary model designed to improve diagnostics and personalize treatment selection.



Latest Announcements
Cerebras Systems and Mayo Clinic Unveil Best in Class Genomic Foundation Model
ROCHESTER, Minn., and SUNNYVALE, Calif. – January 14, 2025 – Today at the JP Morgan Healthcare Conference in San Francisco, Cerebras Systems, in collaboration with Mayo Clinic, announced significant progress in developing artificial intelligence tools to advance patient care. Together, Cerebras and Mayo Clinic have developed a world-class genomic foundation model designed to support physicians and patients.
Cerebras Demonstrates Trillion Parameter Model Training on a Single CS-3 System
SUNNYVALE, CA AND VANCOUVER – December 10, 2024 – Today at NeurIPS 2024, Cerebras Systems, the pioneer in accelerating generative AI, announced a groundbreaking achievement in collaboration with Sandia National Laboratories: the successful demonstration of training a 1 trillion parameter AI model on a single CS-3 system. Trillion parameter models represent the state of the art in today’s LLMs and require thousands of GPUs and dozens of hardware experts to train. By leveraging Cerebras’ Wafer Scale Cluster technology, researchers at Sandia were able to initiate training on a single AI accelerator, a one-of-a-kind achievement for frontier model development.
Cerebras Delivers Record-Breaking Performance with Meta's Llama 3.1 405B Model
Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference, bringing frontier AI to instant speed. Last week we ran a customer workload on Llama 3.1 405B at 969 tokens/s, a new record for Meta’s frontier model. Llama 3.1 405B on Cerebras is by far the fastest frontier model in the world: 12x faster than GPT-4o and 18x faster than Claude 3.5 Sonnet. In addition, we achieved the highest performance at 128K context length and the shortest time-to-first-token latency, as measured by Artificial Analysis.
Award Winning Technology
Cerebras continues to be recognized for pushing the boundaries of AI

TIME

FORBES

FORTUNE
AI Model Services
You bring the data, we'll train the model
Whether you want to build a multilingual chatbot or predict DNA sequences, our team of AI scientists and engineers will work with you and your data to build state-of-the-art models using the latest AI techniques.

