Cerebras

Inference

The world’s fastest inference.
20x faster than GPUs, 1/5 the cost.

Award-Winning Technology

Cerebras continues to be recognized for pushing the boundaries of AI

TIME

FORBES

FORTUNE

Cerebras AI Day

Watch keynotes from our historic event

Opening Keynote by CEO, Andrew Feldman

Hardware Keynote by CTO, Sean Lie

ML and Product Keynote by VP of Product, Jessica Liu 

AI Model Services

You bring the data, we'll train the model

Whether you want to build a multilingual chatbot or predict DNA sequences, our team of AI scientists and engineers will work with you and your data to build state-of-the-art models using the latest AI techniques.

High Performance Computing

The fastest HPC accelerator on earth

With 900,000 cores and 44 GB of on-chip memory, the CS-3 redefines the performance envelope of HPC systems. From Monte Carlo particle transport to seismic processing, the CS-3 routinely outperforms entire supercomputing installations.

Models on Cerebras

The Cerebras platform has trained a wide range of models, from multilingual LLMs to healthcare chatbots. We help customers train their own foundation models or fine-tune open-source models like Llama 2. Best of all, the majority of our work is open source.

Llama 3.1

Foundation language model
8B-70B, 15T tokens
128K context

Llama 2

Foundation language model
7B-70B, 2T tokens
4K context

Mistral

7B Foundation model that leverages
grouped-query attention,
coupled with sliding window attention

JAIS

Bilingual Arabic + English model
13B, 30B Parameters
Available on Azure, G42 Cloud

MED42

Medical Q&A LLM
Fine-tuned from Llama2-70B
Scores 72% on USMLE

BLOOM

Massive multilingual LLM
176B parameters, 366B tokens
2K context

Falcon

Foundation language model
40B, 1T tokens
Uses FlashAttention and multi-query attention

MPT

Foundation model trained
on 1T tokens of English,
using the ALiBi positioning method

StarCoder

Coding LLM
15.5B parameters, 1T tokens
8K context

Diffusion Transformer

Image generation model
33M-2B parameters
Adaptive layer norm

T5

For NLP applications
Encoder-decoder model
60M-11B parameters

CrystalCoder

Trained for English + Code
7B Parameters, 1.3T Tokens
LLM360 Release

Cerebras-GPT

Foundation language model
111M-13B parameters
NLP

BTLM-chat

BTLM-3B-8K fine-tuned for chat
3B parameters, 8K context
Direct Preference Optimization

gigaGPT

Implements nanoGPT on Cerebras
Trains 175B+ models
565 lines of code

Latest blog posts