Qualcomm’s new AI systems promise 10x bandwidth, lower power use

Source: interestingengineering
Author: @IntEngineering
Published: 10/28/2025
Qualcomm has unveiled its next-generation AI inference accelerators, the AI200 and AI250, designed to significantly enhance data center generative AI performance with improved efficiency and scalability. The AI200 card supports 768 GB of LPDDR memory, enabling large-scale language and multimodal model inference with a focus on lowering total cost of ownership (TCO).

Building on this, the AI250 introduces a near-memory computing architecture that delivers over 10 times higher effective memory bandwidth and substantially reduces power consumption, facilitating more efficient disaggregated AI inferencing. Both solutions feature direct liquid cooling, PCIe for scale-up, Ethernet for scale-out, and a rack-level power consumption of 160 kW, targeting hyperscaler-grade performance with sustainability in mind.
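Why does a 10x jump in effective memory bandwidth matter so much for generative AI? During autoregressive decoding, each generated token requires streaming roughly the entire set of model weights from memory, so throughput is typically bandwidth-bound rather than compute-bound. The sketch below illustrates that relationship with made-up numbers; the model size and bandwidth figures are illustrative assumptions, not Qualcomm specifications.

```python
# Back-of-envelope: why memory bandwidth bounds LLM decode throughput.
# All numbers here are illustrative assumptions, NOT Qualcomm AI200/AI250 specs.

def decode_tokens_per_second(model_bytes: float, bandwidth_gb_s: float) -> float:
    """Rough upper bound on decode throughput for a bandwidth-bound workload:
    each token requires streaming ~all model weights from memory once,
    so tokens/s ~= effective bandwidth / model size."""
    return bandwidth_gb_s * 1e9 / model_bytes

# A hypothetical 70B-parameter model with 8-bit weights occupies ~70 GB.
model_bytes = 70e9

baseline = decode_tokens_per_second(model_bytes, 800)       # assumed 800 GB/s
near_mem = decode_tokens_per_second(model_bytes, 800 * 10)  # 10x effective BW

print(f"baseline accelerator: ~{baseline:.1f} tokens/s")
print(f"10x effective bandwidth: ~{near_mem:.1f} tokens/s")
```

Under this simplified model, a 10x gain in effective bandwidth translates almost directly into a 10x gain in single-stream decode throughput, which is why near-memory architectures target that metric.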
Qualcomm emphasizes seamless integration through a rich software stack and open ecosystem, supporting major AI frameworks and enabling one-click deployment of pre-trained models. Their software tools, including the Efficient Transformers Library and AI Inference Suite, allow developers to operationalize AI models easily
Tags: energy, AI-accelerators, data-centers, power-efficiency, memory-bandwidth, Qualcomm-AI, generative-AI