New Year, New Model:
Fastest DeepSeek R1-70B now available on Cerebras
It was all gas, no brakes for Cerebras in January as we accelerated the AI industry. We shipped the world’s fastest DeepSeek R1-70B at more than 1,500 tokens/sec. That’s 57 times faster than GPU-based solutions. At the JP Morgan Healthcare Conference, we announced the Mayo Genomic Foundation Model built by Mayo Clinic on Cerebras. We participated in AI House at Davos and organized our first Café Compute of 2025!
We’re just getting started. Keep reading!
Instant Reasoning: DeepSeek R1 Llama-70B powered by Cerebras Inference
We’re excited to offer the fastest DeepSeek R1 Llama-70B in the world through Cerebras Inference. At over 1,500 tokens/second, Cerebras is the first provider of real-time reasoning for this world-leading open-source reasoning model. We know that data privacy and security are top priorities for enterprises and developers. That is why Cerebras Inference runs 100% in our U.S.-based AI datacenters with zero data retention or transfer. When you run DeepSeek R1 Llama-70B on Cerebras, you can expect best-in-class security and data ownership for your AI.
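To give a feel for what calling the model looks like, here is a minimal sketch of an OpenAI-style chat-completions request body. The endpoint URL and model identifier below are illustrative assumptions, not taken from this announcement; check the Cerebras Inference documentation for the exact values.

```python
import json

# Assumed values for illustration only -- verify against the official docs.
API_URL = "https://api.cerebras.ai/v1/chat/completions"  # assumed endpoint
MODEL_ID = "deepseek-r1-distill-llama-70b"               # assumed model id

def build_request(prompt: str, max_tokens: int = 1024) -> dict:
    """Assemble the JSON body for an OpenAI-style chat-completion request."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        # Streaming lets the client render tokens as they arrive, which is
        # where a 1,500 tokens/sec decode rate is most visible to the user.
        "stream": True,
    }

payload = build_request("Explain chain-of-thought reasoning in one paragraph.")
print(json.dumps(payload, indent=2))
```

Because reasoning models emit long chains of intermediate tokens before the final answer, a high decode rate matters more here than for ordinary chat models, and streaming the response keeps the interaction feeling instant.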

Introducing The Mayo Genomic Foundation Model for Personalized Treatment
We announced a groundbreaking Genomic Foundation Model with the Mayo Clinic. The model, trained on a vast dataset of patient information, demonstrates impressive accuracy in predicting treatment response and identifying genetic predispositions to various conditions. For researchers, doctors, and providers, this translates to more informed treatment decisions, reduced trial-and-error, and optimized resource allocation. This model has the potential to revolutionize how we diagnose and treat diseases like rheumatoid arthritis.

Powering Digital Twins with Tavus
Tavus.io is an innovative AI video research company that specializes in APIs for building digital twin video experiences. With Cerebras Inference, Tavus reduced its Time to First Token (TTFT) by 66%. Tavus not only improved the speed of response generation but also enhanced the overall user experience by making interactions feel more responsive and authentic.

Building seamless customer service experience with OpenCall.AI
Building real-time AI contact centers is complex, demanding high reliability and speed. OpenCall overcame these challenges with Cerebras, reducing latency by 90% and enabling complex workflows like appointment booking and payment processing in real time. OpenCall’s AI contact center leverages Cerebras’s powerful inference capabilities to deliver instant, reliable support.

Introducing the Cerebras Trust Center
Security and compliance are paramount at Cerebras. We created the Cerebras Trust Center, a new public portal, to demonstrate our unwavering commitment to these principles. The Cerebras Trust Center provides ongoing updates on security processes, a comprehensive overview of compliance efforts, and valuable resources for customers and partners to ensure Cerebras meets their specific security needs.
We encourage you to explore the Trust Center and learn more about how Cerebras prioritizes the security of your data and operations.


Cerebras in the AI Community
Andy Hock represented Cerebras at Davos 2025, joining our partners G42 at the Davos AI House to participate in high-level roundtables on the future of AI and its impact on society and sustainability.
In San Francisco, the Cerebras team celebrated Chinese New Year with a festive Café Compute event, bringing together colleagues and community members to enjoy traditional food and cultural activities. Cerebras also recently hosted a whirlwind 60-second hackathon on Discord, resulting in over 4,000 projects in a single day!