Integrating LLMs and Software Engineering for Self-Refining Copy Creation

AI agents are among the most exciting advancements in the field of large language models (LLMs). By integrating agentic workflows, these models can now better handle planning, reasoning, and…

ReadAgent: Bringing Gist Memory to AI

Large Language Models (LLMs) exhibit remarkable abilities in understanding natural language, but they are not without limitations. One area where LLMs can struggle is in processing long text inputs,…

Llama3.1 Model Quality Evaluation: Cerebras, Groq, SambaNova, Together, and Fireworks

At Cerebras, we are redefining AI inference by delivering unparalleled speed, quality, and efficiency. Our new inference solution sets an industry benchmark, delivering 1,800+ tokens per…

Introducing Cerebras Inference: AI at Instant Speed

Today, we are announcing Cerebras inference – the fastest AI inference solution in the world. Cerebras inference delivers 1,800 tokens per second for Llama3.1 8B and 450 tokens per second for…

Introducing DocChat: GPT-4 Level Conversational QA Trained In a Few Hours

We are excited to announce the release of Cerebras DocChat, our first iteration of models designed for document-based conversational question answering. This series includes two models: Cerebras…

Revolutionizing Life Science and Healthcare with Generative AI

Healthcare currently accounts for 17% of GDP in the United States, making it one of the country’s largest economic sectors and an industry with immense potential to transform the human…

Cerebras Wafer-Scale Engine Achieves 210x Speedup Over NVIDIA H100

Researchers from Rice University and TotalEnergies have introduced a dataflow, matrix-free finite volume solver for the Cerebras Wafer-Scale Engine (WSE) that achieves a 210x speedup over an NVIDIA H100 GPU. This type of…

New Tool Generates Stencil Codes Two Orders of Magnitude Faster on Cerebras WSE Than on GPUs

Scientific computing, particularly in fields like seismic imaging, weather forecasting, and computational fluid dynamics, heavily relies on stencil computations. While stencil…
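To make the term concrete: a stencil computation updates each point of a grid from a fixed pattern of its neighbors. The sketch below is a generic illustration of the idea, not the tool or kernels described in the post; it shows the classic 5-point Jacobi update for a 2D Laplace problem, with assumed grid size and iteration count chosen only for the demo.

```python
import numpy as np

def jacobi_step(u):
    """One 5-point stencil sweep: each interior cell becomes the
    average of its four neighbors (classic Laplace/Jacobi update).
    Boundary cells are left unchanged."""
    v = u.copy()
    v[1:-1, 1:-1] = 0.25 * (
        u[:-2, 1:-1] +  # north neighbor
        u[2:, 1:-1] +   # south neighbor
        u[1:-1, :-2] +  # west neighbor
        u[1:-1, 2:]     # east neighbor
    )
    return v

# Small demo grid: hot top edge (fixed at 100), cold elsewhere.
grid = np.zeros((6, 6))
grid[0, :] = 100.0
for _ in range(50):
    grid = jacobi_step(grid)
```

The same neighbor-access pattern, applied over and over across a large grid, is what makes stencil codes memory-bandwidth-bound on GPUs and a natural fit for dataflow architectures.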

US DOE Achieves 88x Performance Speedup with Cerebras CS-2 Over H100 in Materials Modeling

Cerebras Systems has partnered with the National Energy Technology Laboratory (NETL) since 2019 to push the boundaries of physical simulation. NETL has previously published work on the WFA, a…

Cerebras Breaks Exascale Record for Molecular Dynamics Simulations

Overcoming Timescale Limitations: In collaboration with researchers from Sandia, Lawrence Livermore, and Los Alamos National Laboratories, Cerebras established a new benchmark for simulating materials…

Introducing Sparse Llama: 70% Smaller, 3x Faster, Full Accuracy

Cerebras and Neural Magic have achieved a major milestone in the field of large language models (LLMs). By combining state-of-the-art pruning techniques, sparse pretraining, and purpose-built…
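As an aside on what "70% smaller" means mechanically: a common starting point for sparsifying an LLM is unstructured magnitude pruning, which zeroes the smallest-magnitude weights. The sketch below is an illustrative toy version of that general technique, not the Cerebras/Neural Magic pipeline; the function name and parameters are assumptions for the example.

```python
import numpy as np

def magnitude_prune(w, sparsity=0.7):
    """Zero out the smallest-magnitude entries of `w` so that roughly
    `sparsity` fraction of the weights are zero (unstructured pruning)."""
    k = int(round(sparsity * w.size))
    if k == 0:
        return w.copy()
    # k-th smallest absolute value becomes the pruning threshold.
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return w * (np.abs(w) > threshold)

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))       # stand-in for one weight matrix
w_sparse = magnitude_prune(w, 0.7)  # ~70% of entries are now zero
```

In practice, pruning alone degrades accuracy, which is why the post pairs it with sparse pretraining to recover model quality.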

Supercharge your HPC Research with the Cerebras SDK

Cerebras Systems is a team of pioneering engineers of all types, driven by the world’s largest computing challenges. Our newly announced flagship product, the CS-3 system, is…