Nvidia is reportedly developing a dedicated AI inference processor, with OpenAI and other AI firms slated as early adopters. The new chip aims for faster and more efficient AI model development, potentially debuting at Nvidia's GTC developer conference this month.
This move addresses the growing demand for specialized inference hardware, as rivals such as Google and Amazon, along with startups such as Cerebras and SambaNova, offer competing solutions. Nvidia's current GPUs, while dominant, have drawn criticism for high energy consumption in inference tasks.
The upcoming chip is expected to integrate technology acquired from the startup Groq Inc. Groq's "language processing units" are known for a novel architecture that reduces the energy needed for inference.
OpenAI reportedly intends to use the new chip to power its Codex programming tool, aiming to compete with Anthropic's Claude Code. Nvidia is also promoting its Grace CPUs for certain AI workloads, with Meta Platforms already committing to a significant CPU-only deployment for its ad-targeting agents.