Amazon Web Services (AWS) is partnering with Cerebras Systems to offer Cerebras's wafer-scale AI chips to AWS cloud customers. Under the multiyear agreement, Cerebras's WSE-3 chip, which has 900,000 cores, will be deployed in AWS data centers.

The collaboration aims to accelerate AI inference workloads, with an expected five-fold increase in output speed. Cerebras's CS-3 appliance, which houses the WSE-3 chip, will be accessible through Amazon Bedrock, where it is intended to speed up the foundation models served there.

A key innovation is a "disaggregated architecture" developed by AWS and Cerebras. This approach splits inference into two stages and assigns each to different hardware: prefill, which processes the input prompt, runs on AWS's Trainium chips, and decoding, which generates the output tokens, runs on the Cerebras WSE-3. The two sides are connected by AWS's Elastic Fabric Adapter (EFA), which provides high-speed, low-congestion links. The specialization is designed to shorten the time AI models take to generate responses.
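The prefill/decode split described above can be illustrated with a toy sketch. Everything in it is hypothetical: `PrefillWorker`, `DecodeWorker`, `KVCache`, `transfer`, and `generate` are illustrative names, not AWS or Cerebras APIs, and the "model" is a dummy arithmetic rule. It only shows the shape of the hand-off, assuming the prefill side produces some cached state that is shipped to the decode side.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KVCache:
    """Hypothetical stand-in for the state that prefill hands to decode."""
    tokens: List[int]


class PrefillWorker:
    """Plays the role of the prefill-side hardware in this toy model."""

    def prefill(self, prompt_tokens: List[int]) -> KVCache:
        # Prefill consumes the whole prompt in a single pass.
        return KVCache(tokens=list(prompt_tokens))


class DecodeWorker:
    """Plays the role of the decode-side hardware in this toy model."""

    def decode(self, cache: KVCache, max_new_tokens: int) -> List[int]:
        # Decode emits one token per step; each step reads the full cache.
        # The next-token rule below is a dummy, not a real model.
        generated = []
        for _ in range(max_new_tokens):
            next_token = (sum(cache.tokens) + len(cache.tokens)) % 100
            cache.tokens.append(next_token)
            generated.append(next_token)
        return generated


def transfer(cache: KVCache) -> KVCache:
    """Stand-in for moving the cache across the network link (a plain copy)."""
    return KVCache(tokens=list(cache.tokens))


def generate(prompt_tokens: List[int], max_new_tokens: int) -> List[int]:
    """Run prefill on one worker, ship the cache, then decode on another."""
    cache = PrefillWorker().prefill(prompt_tokens)
    cache = transfer(cache)
    return DecodeWorker().decode(cache, max_new_tokens)


if __name__ == "__main__":
    print(generate([1, 2, 3], max_new_tokens=3))
```

The split pays off because the two stages stress hardware differently: prefill is typically compute-heavy, while token-by-token decoding is typically limited by how fast model state can be read, so each stage can run on the device that suits it.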

The WSE-3 chip provides roughly 27 petabytes per second of internal bandwidth, far more than conventional chip-to-chip interconnects can deliver. The partnership follows Cerebras's recent large deal with OpenAI and underscores the company's growing importance in the AI hardware market.