Amazon Web Services will deploy approximately 1 million Nvidia GPUs across its global cloud regions by 2027, accelerating AI infrastructure scaling. The expansion includes advanced networking and systems designed for autonomous reasoning and real-time model execution.
The deal reflects a strategic shift toward inference-heavy computing, which now accounts for two-thirds of AI workloads-up from one-third in 2023. Inference chips enable live AI operations without retraining.
Despite AWS’s development of proprietary AI chips, industry experts note that Nvidia dominates critical stack components, creating high switching costs and reinforcing long-term dependency. This ‘infrastructure flip’ embeds Nvidia deeply within cloud provider ecosystems.
U.S. regulators continue to scrutinize Nvidia’s chip exports to China, underscoring geopolitical tensions around AI dominance.