November 18, 2024
Empirical Upper Bounds for Unstructured Sparsity in Compute-Efficient Language Modeling
Large language models have driven significant progress in…
November 13, 2024
Self-Data Distillation for Recovering Quality in Pruned Large Language Models
Large language models have driven significant progress in…
September 4, 2024
Bilingual Adaptation of Monolingual Foundation Models
We present an efficient method for adapting a monolingual…
July 2, 2024
Bilingual Adaptation of Monolingual Foundation Models
We present an efficient method for adapting a monolingual…
November 13, 2023
Efficient Algorithms for Monte Carlo Particle Transport on AI Accelerator Hardware
The recent trend toward deep learning has led to the…
November 8, 2023
Position Interpolation Improves ALiBi Extrapolation
Linear position interpolation helps pre-trained models…
September 26, 2023
Scaling the “Memory Wall” for Multi-Dimensional Seismic Processing with Algebraic Compression on Cerebras CS-2 Systems
We exploit the high memory bandwidth of AI-customized…
September 22, 2023
BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model
We introduce the Bittensor Language Model, called…
August 31, 2023
Jais and Jais-chat: Arabic-Centric Foundation and Instruction-Tuned Open Generative Large Language Models
We introduce Jais and Jais-chat, new state-of-the-art…
December 14, 2021
A Big Chip for Big Science: Watching the COVID-19 Virus in Action
It’s hard to imagine a better example of “AI for good” than…