Cerebras announces six new AI Inference Data Centers
New data centers will catapult Cerebras to hyperscale capacity of over 40 million Llama 70B tokens per second. We are creating the largest domestic high-speed inference cloud. Join us.
READ MORE
Cerebras Inference is now on Hugging Face

All Hugging Face developers can now get one-click access to Cerebras Inference. Experience a
10x Faster AI-Powered Insights for Enterprises

With Cerebras Inference, AlphaSense has dramatically increased the speed and accuracy of its AI-driven research tools—delivering financial and business insights that are more accessible, actionable, and real-time than ever before.
LEARN MORE



Data Center Expansion
Cerebras announces six new state-of-the-art AI data centers
Hugging Face
Cerebras is now on Hugging Face
AlphaSense
Cutting-edge insights with Cerebras
Latest Announcements
Cerebras brings instant inference to Mistral Le Chat
We are excited to announce that Cerebras Inference is now powering Mistral’s Le Chat platform. Cerebras powers Le Chat’s new Flash Answers feature, which provides instant responses to user queries. At over 1,100 tokens/s, Le Chat is 10x faster than popular models such as ChatGPT 4o, Sonnet 3.5, and DeepSeek R1, making it the world’s fastest AI assistant.
Cerebras Launches World's Fastest DeepSeek R1 Llama-70B Inference
Today, we’re excited to announce the launch of DeepSeek R1 Llama-70B on Cerebras Inference. We achieve world-record performance of over 1,500 tokens/s on this model, 57x faster than GPU solutions. The model runs on Cerebras AI data centers in the US with no data retention, ensuring the best privacy and security for customer workloads. Users can try out the chat application on our website today. We are also offering a developer preview via API; please reach out if you’re interested.
Cerebras Powers Perplexity Sonar with Industry’s Fastest AI Inference
Sunnyvale, CA — February 11, 2025 – Cerebras Systems, the pioneer in accelerating generative AI, today announced its pivotal role in powering Sonar, a groundbreaking model optimized for Perplexity search. Built on the robust foundation of Llama 3.3 70B, Sonar represents a significant advancement in answer quality, factuality, and readability, setting new standards for user satisfaction in search technology. The new Sonar search experience, powered by Cerebras, is available to Perplexity Pro users starting today.
Award Winning Technology
Cerebras continues to be recognized for pushing the boundaries of AI

TIME

TIME

FORBES

FORTUNE
AI Model Services
You bring the data, we'll train the model
Whether you want to build a multilingual chatbot or predict DNA sequences, our team of AI scientists and engineers will work with you and your data to build state-of-the-art models using the latest AI techniques.

