November 18, 2024
Large Language Model, Deep Learning
Llama 3.1 405B now runs at 969 tokens/s on Cerebras Inference
Frontier AI now runs at instant speed. Last week we ran a customer workload on Llama 3.1 405B at 969 tokens/s – a new record for Meta’s frontier model. Llama 3.1 405B on Cerebras is by far the…
November 12, 2024
Large Language Model, Deep Learning
Guii: An Interactive Coding Companion for Creative Frontend Development
Guii offers a development experience similar to drawing on a canvas. By adding Guii Devtools to your codebase, you can interact directly on a webpage—selecting visual elements like boxes,…
November 8, 2024
Large Language Model, Deep Learning
Building an AI-Powered Search Assistant for Zoom Team Chat
Imagine a workday where all the answers you need are just a message away. No more switching between apps, no more digging through files and folders, no more endless searches. Just ask, and the…
October 24, 2024
Large Language Model, Deep Learning
Cerebras Inference now 3x faster: Llama3.1-70B breaks 2,100 tokens/s
Today we’re announcing the biggest update to Cerebras Inference since launch. Cerebras Inference now runs Llama 3.1-70B at an astounding 2,100 tokens per second – a 3x performance boost over…
October 14, 2024
Large Language Model, Deep Learning
Simulating Human Behavior with Cerebras
LlamaSim is a multi-LLM framework that aims to simulate human behavior at scale. Given a specific environment (e.g., voters in Pennsylvania, students at CMU), we simulate how target…
September 23, 2024
Large Language Model, Deep Learning
The Practitioner’s Guide to the Maximal Update Parameterization
Maximal Update Parameterization (µP) offers significant advantages for neural network training, but its adoption has been limited due to the complexity of the underlying math and the…
August 28, 2024
Large Language Model, Deep Learning
Integrating LLMs and Software Engineering for Self-Refining Copy Creation
AI agents are among the most exciting advancements in the field of large language models (LLMs). By integrating agentic workflows, these models can now better handle planning, reasoning, and…
August 28, 2024
Large Language Model, Deep Learning
ReadAgent: Bringing Gist Memory to AI
Large Language Models (LLMs) exhibit remarkable abilities in understanding natural language, but they are not without limitations. One area where LLMs can struggle is in processing long text inputs,…
August 28, 2024
Large Language Model, Deep Learning
Llama3.1 Model Quality Evaluation: Cerebras, Groq, SambaNova, Together, and Fireworks
At Cerebras, we are redefining AI inference by delivering unparalleled speed, quality, and efficiency. Our new inference solution sets an industry benchmark, delivering 1800+ tokens per…
August 27, 2024
Large Language Model, Deep Learning
Introducing Cerebras Inference: AI at Instant Speed
Today, we are announcing Cerebras Inference – the fastest AI inference solution in the world. Cerebras Inference delivers 1,800 tokens per second for Llama3.1 8B and 450 tokens per second for…
August 21, 2024
Large Language Model, Deep Learning
Introducing DocChat: GPT-4 Level Conversational QA Trained In a Few Hours
We are excited to announce the release of Cerebras DocChat, our first iteration of models designed for document-based conversational question answering. This series includes two models: Cerebras…
August 20, 2024
Large Language Model, Deep Learning
Revolutionizing Life Science and Healthcare with Generative AI
Healthcare currently accounts for 17% of GDP in the United States, making it one of the country’s largest economic sectors and an industry with immense potential to transform the human…