Amazon AWS will deploy the Cerebras wafer-scale CS-3 AI chip, to be used in conjunction with its own Trainium.

IT House, March 16 — Amazon AWS and wafer-scale AI chip company Cerebras announced on March 13 (US local time) that Amazon's Bedrock platform will, in the coming months, deploy a hybrid AI inference system combining chips from both companies, delivering what the companies describe as the fastest workload-processing speeds.

The solution combines Cerebras' CS-3 systems, Amazon AWS's Trainium chips, and AWS's Elastic Fabric Adapter (EFA) networking. Trainium chips will handle inference prefill (prompt processing), while CS-3 will handle decode (output generation); the two are connected via EFA.

IT House understands that inference prefill is a parallel workload that demands high compute but only moderate memory bandwidth, whereas inference decode is essentially serial, with lower compute requirements but much higher memory-bandwidth needs. Pairing Trainium with CS-3 therefore plays to the strengths of both AI chips and aims to deliver the best end-user experience.
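The prefill/decode split above can be illustrated with a toy autoregressive loop (an illustrative sketch only, not AWS or Cerebras code; all function names are hypothetical): prefill consumes the entire known prompt in one parallelizable pass while building a key/value cache, whereas decode must generate tokens one at a time, re-reading the whole growing cache at every step.

```python
# Toy sketch (hypothetical) of why prefill and decode stress hardware
# differently in autoregressive inference.

def attend(query, keys, values):
    """Minimal dot-product attention over cached scalar keys/values."""
    scores = [query * k for k in keys]
    total = sum(scores) or 1.0
    weights = [s / total for s in scores]
    return sum(w * v for w, v in zip(weights, values))

def prefill(prompt_tokens):
    # Prefill: every prompt token is known up front, so all positions can
    # be processed in parallel (compute-bound) while filling the KV cache.
    return [(float(t), float(t)) for t in prompt_tokens]

def decode(kv_cache, steps):
    # Decode: each new token depends on the previous one, so generation is
    # inherently serial, and every step re-reads the entire KV cache
    # (memory-bandwidth-bound).
    out = []
    token = kv_cache[-1][0]
    for _ in range(steps):
        keys = [k for k, _ in kv_cache]
        values = [v for _, v in kv_cache]
        token = attend(token, keys, values)
        kv_cache.append((token, token))
        out.append(token)
    return out

cache = prefill([1, 2, 3])      # one parallel pass over the prompt
generated = decode(cache, steps=4)  # four strictly sequential steps
print(len(generated))  # 4 decoded tokens
```

In a disaggregated deployment, the prefill stage would run on compute-optimized hardware (here, Trainium) and the cache would be handed to bandwidth-optimized hardware (here, CS-3) for the decode loop.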
