
The world’s fastest GLM-4.6 is now available on Cerebras at 1,000 TPS

Pricing

Inference API access

Free

The easiest way to get started with Cerebras

  • Access to all Cerebras-powered models
  • The world’s fastest inference – 20x faster than OpenAI and Anthropic
  • Community support via Discord

Developer

Generous rate limits for power users

  • Everything in Free
  • Self-serve payment starting at just $10
  • 10x higher rate limits than the Free tier
  • Higher priority processing

Enterprise

Highest throughput, custom weights, and guaranteed uptime

  • Everything in Free
  • Highest rate limits for production workloads
  • Lowest latency with dedicated queue priority
  • Support for custom model weights
  • Model fine-tuning and training services
  • Dedicated support team with response time guarantees

Cerebras Code

Pro
$50/month

Experience instant code completions with frontier models

  • Top open-source model access with fast, high-context completions.
  • Send up to 24 million tokens/day (worth $48/day)
  • Ideal for indie devs, simple agentic workflows, and weekend projects.
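
The Pro plan figures above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the $2 per million tokens it derives is an implied rate obtained by dividing the quoted $48/day by the 24 million token allowance, not a published price, and the 30-day month is an assumption.

```python
# Figures quoted in the Pro plan description above.
TOKENS_PER_DAY = 24_000_000    # daily token allowance
QUOTED_VALUE_PER_DAY = 48.00   # USD, the quoted value of that allowance
PRICE_PER_MONTH = 50.00        # USD, Pro subscription price

# Assumption: a 30-day month, chosen only for illustration.
DAYS_PER_MONTH = 30


def rate_per_million(cost_usd: float, tokens: int) -> float:
    """Return USD per one million tokens for a given cost and token count."""
    return cost_usd / (tokens / 1_000_000)


# Rate implied by the quoted daily value (derived, not an official price).
implied_rate = rate_per_million(QUOTED_VALUE_PER_DAY, TOKENS_PER_DAY)

# Effective rate if the full daily allowance were used every day of the month.
effective_rate = rate_per_million(PRICE_PER_MONTH, TOKENS_PER_DAY * DAYS_PER_MONTH)

print(f"Implied rate:   ${implied_rate:.2f} per million tokens")
print(f"Effective rate: ${effective_rate:.4f} per million tokens at full usage")
```

Actual savings depend on how much of the daily allowance is used; light usage narrows the gap between the two rates.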

Max
$200/month

Built for teams running demanding workloads at scale

  • Highest rate limits for production workloads
  • Lowest latency with dedicated queue priority
  • Support for custom model weights
  • Model fine-tuning and training services
  • Dedicated support team with response time guarantees

Developer tier pricing

*Preview models are intended for evaluation only, not for use in production environments, and may be discontinued at short notice.

**Models have been scheduled for deprecation as part of ongoing efforts to serve the most up-to-date models.

Partners

Get access to Cerebras Inference through our partner APIs