Building a Real-Time Digital Twin with Cerebras at Tavus

Tavus.io is an innovative AI video research company that specializes in APIs for building digital twin video experiences. Their cutting-edge Phoenix-2 model excels in generating lifelike…

CePO: Empowering Llama with Reasoning using Test-Time Compute

Cerebras is proud to introduce CePO (Cerebras Planning and Optimization), a framework that adds sophisticated reasoning capabilities to the Llama family of models. Through test-time computation…
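The excerpt doesn't spell out CePO's method, but in its simplest form test-time compute looks like best-of-N sampling followed by a self-selection pass. Below is a minimal sketch of that idea, not CePO's actual algorithm, assuming the OpenAI-compatible Cerebras Inference endpoint and the `llama3.1-70b` model id (both assumptions to verify against your account):

```python
# Minimal best-of-N test-time-compute sketch (illustrative; not CePO itself).
from openai import OpenAI

client = OpenAI(base_url="https://api.cerebras.ai/v1", api_key="YOUR_KEY")
MODEL = "llama3.1-70b"  # assumed model id

def best_of_n(question: str, n: int = 4) -> str:
    # Sample several candidate answers at non-zero temperature.
    candidates = [
        client.chat.completions.create(
            model=MODEL,
            messages=[{"role": "user", "content": question}],
            temperature=0.8,
        ).choices[0].message.content
        for _ in range(n)
    ]
    # Ask the same model to pick the best candidate (a simple self-verification step).
    numbered = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(candidates))
    verdict = client.chat.completions.create(
        model=MODEL,
        messages=[{
            "role": "user",
            "content": (
                f"Question: {question}\n\nCandidate answers:\n{numbered}\n\n"
                "Reply with only the index of the best answer."
            ),
        }],
        temperature=0.0,
    ).choices[0].message.content
    try:
        return candidates[int(verdict.strip().strip("[]"))]
    except (ValueError, IndexError):
        return candidates[0]

print(best_of_n("A train travels 120 km in 90 minutes. What is its average speed in km/h?"))
```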

Announcing Cerebras Inference Research Grant

At Cerebras, we believe AI is the most transformative technology of our generation. Our mission is to accelerate AI by making it faster, easier to use, and more energy efficient, making AI accessible…

Memo-ry: Simplifying Daily Tasks for People with Memory Loss

The Challenge – Assisting Individuals with Dementia Through Task Management. Millions of individuals with Alzheimer’s and dementia struggle daily to manage tasks, recall essential details,…

AIBI: Revolutionizing Interviews with AI

AIBI (AI Bot Interviewer) is the first end-to-end AI interview bot that delivers a seamless, real-time interview experience. AIBI can conduct a realistic interview, generating high-quality, real-time…

Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference

Frontier AI now runs at instant speed. Last week we ran a customer workload on Llama 3.1 405B at 969 tokens/s – a new record for Meta’s frontier model. Llama 3.1 405B on Cerebras is by far the…
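For a sense of scale, a quick back-of-the-envelope check of what 969 tokens/s means for a long response (the 2,000-token answer length is a hypothetical):

```python
# Rough generation time for a hypothetical 2,000-token answer at the quoted rate.
tokens_per_second = 969
answer_tokens = 2_000
print(f"{answer_tokens / tokens_per_second:.1f} s")  # ~2.1 s of pure generation time
```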

Guii: An Interactive Coding Companion for Creative Frontend Development

Guii offers a development experience similar to drawing on a canvas. By adding Guii Devtools to your codebase, you can interact directly on a webpage—selecting visual elements like boxes,…

Building an AI-Powered Search Assistant for Zoom Team Chat

Imagine a workday where all the answers you need are just a message away. No more switching between apps, no more digging through files and folders, no more endless searches. Just ask, and the…

Cerebras Inference now 3x faster: Llama3.1-70B breaks 2,100 tokens/s

Today we’re announcing the biggest update to Cerebras Inference since launch. Cerebras Inference now runs Llama 3.1-70B at an astounding 2,100 tokens per second – a 3x performance boost over…
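A rough way to check throughput from the client side is to time a single non-streaming request, a sketch assuming the OpenAI-compatible Cerebras Inference endpoint and the `llama3.1-70b` model id (network and queuing overhead mean this understates server-side speed):

```python
# Client-side tokens/s estimate against Cerebras Inference (endpoint URL and
# model id are assumptions; check your account's documentation).
import time
from openai import OpenAI

client = OpenAI(base_url="https://api.cerebras.ai/v1", api_key="YOUR_KEY")

start = time.perf_counter()
resp = client.chat.completions.create(
    model="llama3.1-70b",
    messages=[{"role": "user", "content": "Explain the Fast Fourier Transform in detail."}],
    max_tokens=1024,
)
elapsed = time.perf_counter() - start

generated = resp.usage.completion_tokens
print(f"{generated} tokens in {elapsed:.2f}s -> {generated / elapsed:.0f} tokens/s")
```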

Simulating Human Behavior with Cerebras

LlamaSim is a multi-LLM framework that aims to simulate human behavior at scale. Given a specific environment (e.g., voters in Pennsylvania, students at CMU), we simulate how target…
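LlamaSim's own interface isn't shown in the excerpt; as an illustrative sketch of the underlying idea (condition a model on personas sampled from a target population, then poll each one), assuming the OpenAI-compatible Cerebras endpoint and the `llama3.1-70b` model id:

```python
# Illustrative persona-polling sketch; this is not LlamaSim's actual API.
from openai import OpenAI

client = OpenAI(base_url="https://api.cerebras.ai/v1", api_key="YOUR_KEY")

# Hypothetical personas standing in for a sampled target population.
personas = [
    "a 54-year-old union electrician in Erie, Pennsylvania",
    "a 29-year-old nurse in the Pittsburgh suburbs",
    "a 67-year-old retired teacher in Lancaster County",
]
question = "Which issue matters most to you this election? Answer in one sentence."

for persona in personas:
    reply = client.chat.completions.create(
        model="llama3.1-70b",
        messages=[
            {"role": "system", "content": f"You are {persona}. Stay in character."},
            {"role": "user", "content": question},
        ],
        temperature=0.7,
    ).choices[0].message.content
    print(f"{persona}: {reply}")
```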

The Practitioner’s Guide to the Maximal Update Parameterization

Maximal Update Parameterization (µP) offers significant advantages for neural network training, but its adoption has been limited due to the complexity of the underlying math and the…
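As a taste of what the guide covers, here is a minimal sketch of the width-scaling rules µP is commonly summarized by for Adam on a simple MLP: hidden-layer initialization variance and learning rate shrink like 1/width, and the readout layer is initialized to zero (or with variance shrinking like 1/width²). This is a simplified illustration under those assumptions, not the guide's full recipe.

```python
# Minimal sketch of µP-style width scaling for an MLP trained with Adam
# (simplified illustration; see the guide for the complete rules).
import torch
import torch.nn as nn

d_in, d_out = 512, 10
base_width, width = 256, 1024            # tune hyperparameters at base_width, transfer to width
mult = width / base_width                # width multiplier

model = nn.Sequential(
    nn.Linear(d_in, width), nn.ReLU(),   # input layer
    nn.Linear(width, width), nn.ReLU(),  # hidden layer
    nn.Linear(width, d_out),             # readout layer
)
inp, hidden, readout = model[0], model[2], model[4]

# µP-style initialization.
nn.init.normal_(inp.weight, std=inp.weight.shape[1] ** -0.5)        # var ~ 1/fan_in (fan_in fixed)
nn.init.normal_(hidden.weight, std=hidden.weight.shape[1] ** -0.5)  # var ~ 1/fan_in ~ 1/width
nn.init.zeros_(readout.weight)                                      # zero-init readout
for layer in (inp, hidden, readout):
    nn.init.zeros_(layer.bias)

# µP-style Adam learning rates: hidden and readout lrs shrink as width grows.
base_lr = 1e-3
optimizer = torch.optim.Adam([
    {"params": inp.parameters(), "lr": base_lr},
    {"params": hidden.parameters(), "lr": base_lr / mult},
    {"params": readout.parameters(), "lr": base_lr / mult},
])
```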