Cerebras AI Model Studio

The Cerebras AI Model Studio is a pay-by-the-model compute service powered by dedicated clusters of Cerebras CS-3s and hosted by Cirrascale Cloud Services. It is a purpose-built platform, optimized for training and fine-tuning large language models on dedicated clusters with millions of cores. It delivers deterministic performance, eliminates distributed-computing headaches, and is push-button simple to get started.

Contact Sales To Get Started

SOFTWARE PLATFORM

Software that Integrates Seamlessly with your Workflows

The Cerebras Software Platform, CSoft, has two main parts:

The Cerebras ML Software integrates with the popular machine learning framework PyTorch, so researchers can effortlessly bring their models to the CS-3 system.

The Cerebras Software Development Kit allows researchers to extend the platform and develop custom kernels – empowering them to push the limits of AI and HPC innovation.

Programming at Scale White Paper
CEREBRAS ML SOFTWARE

Unmatched Productivity and Performance

Cerebras ML Software makes it simple to get existing PyTorch models running on the CS-3.

Our PyTorch interface library is a lightweight wrapper around a standard PyTorch program, exposed through API calls that can be added to an existing PyTorch implementation with just a few extra lines of code. The integration uses a lazy-tensor backend with XLA to capture the full graph of a model and map it optimally onto our massive Wafer-Scale Engine (WSE-3), as sketched below.
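To give a feel for what "a few extra lines" means, here is a minimal sketch of the pattern. The `cerebras.pytorch` module name and the `backend`/`compile` calls are assumptions for illustration, and the real flow involves additional setup (such as data executors and step tracing) covered in the Cerebras documentation; everything else is unmodified PyTorch.

```python
# Illustrative sketch only: the Cerebras-specific names below are
# assumptions, not the documented API. The point is the shape of the
# change: a standard PyTorch model, wrapped with a few extra lines.
import torch
import torch.nn as nn

import cerebras.pytorch as cstorch  # assumed module name

model = nn.Sequential(
    nn.Linear(1024, 4096),
    nn.GELU(),
    nn.Linear(4096, 1024),
)

# The extra lines: select the Cerebras backend and compile the model.
# The lazy-tensor backend records operations rather than executing them
# eagerly, so the full graph can be captured and mapped onto the WSE-3.
backend = cstorch.backend("CSX")                   # assumed call
compiled_model = cstorch.compile(model, backend)   # assumed call

# The training step itself is unchanged PyTorch.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

x = torch.randn(8, 1024)
y = torch.randn(8, 1024)

optimizer.zero_grad()
loss = loss_fn(compiled_model(x), y)
loss.backward()
optimizer.step()
```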

Learn more about PyTorch integration

SOFTWARE DEVELOPMENT KIT

Designed for flexibility and extensibility

The Cerebras SDK enables developers to extend the platform for their own work, harnessing the power of wafer-scale computing to accelerate their applications. With the SDK and the Cerebras Software Language (CSL), developers can target the WSE’s microarchitecture directly, using a familiar C-like interface to develop custom software kernels.
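As a mental model of what a CSL kernel targets, consider this toy sketch in pure Python (no Cerebras SDK required, and not CSL itself): a row of processing elements, each with private local memory, running the same small kernel and communicating only with its neighbors.

```python
# Conceptual toy model of the programming style CSL targets: a grid of
# processing elements (PEs), each with its own local memory, running a
# per-PE kernel and exchanging data only with adjacent PEs.
from dataclasses import dataclass, field

@dataclass
class PE:
    """One processing element with private local memory."""
    x: int
    y: int
    memory: list = field(default_factory=list)

def kernel(pe: PE, incoming: float) -> float:
    """Per-PE 'kernel': store the incoming value locally and forward a
    transformed value to the next PE (a toy stand-in for the C-like
    kernels a developer would write in CSL)."""
    pe.memory.append(incoming)
    return incoming * 2.0

# Build a small 1 x 4 row of PEs and stream a value through them,
# mimicking neighbor-to-neighbor communication on the fabric.
row = [PE(x=i, y=0) for i in range(4)]
value = 1.0
for pe in row:
    value = kernel(pe, value)

print(value)                      # 16.0 after four doubling hops
print([pe.memory for pe in row])  # each PE kept its own local copy
```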

SDK Whitepaper
CEREBRAS GRAPH COMPILER

Cerebras Graph Compiler Drives Full Hardware Utilization

The Cerebras Graph Compiler (CGC) automatically translates your neural network to an optimized executable program.

Every stage of the process is designed to maximize WSE-3 utilization. Kernels are intelligently sized so that more cores are allocated to more complex work. The Graph Compiler then generates a placement and routing, unique to each neural network, to minimize communication latency between adjacent layers.
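The intuition behind kernel sizing can be shown with a toy model (this is not CGC's actual algorithm, and the per-layer costs below are made up): given a compute estimate for each layer, split a fixed pool of cores roughly in proportion to each layer's share of the total work, so heavier layers get more cores.

```python
# Toy illustration of proportional kernel sizing (not CGC's actual
# algorithm): cores are allocated to layers in proportion to each
# layer's estimated compute cost.

def allocate_cores(layer_flops: dict[str, float], total_cores: int) -> dict[str, int]:
    """Split a fixed core budget across layers proportionally to FLOPs."""
    total_flops = sum(layer_flops.values())
    return {
        name: max(1, round(total_cores * flops / total_flops))
        for name, flops in layer_flops.items()
    }

# Hypothetical per-layer costs for a small transformer block; the WSE-3
# has roughly 900,000 cores, used here as the budget.
layers = {
    "embedding": 1e9,
    "attention": 6e9,
    "mlp": 12e9,
    "lm_head": 1e9,
}

print(allocate_cores(layers, total_cores=900_000))
# {'embedding': 45000, 'attention': 270000, 'mlp': 540000, 'lm_head': 45000}
```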