
AMD admits its Instinct MI300X AI accelerator still can't quite beat Nvidia's H100 Hopper


In context: The first official performance benchmarks for AMD's Instinct MI300X accelerator, designed for data center and AI applications, have surfaced. Compared to Nvidia's Hopper, the new chip posted mixed results in MLPerf Inference v4.1, an industry-standard benchmark suite whose workloads evaluate the inference performance of AI accelerators.

On Wednesday, AMD released benchmarks comparing the performance of its MI300X with Nvidia's H100 GPU to showcase its generative AI inference capabilities. On the Llama 2 70B model, a system with eight Instinct MI300X accelerators paired with an EPYC Genoa CPU reached a throughput of 21,028 tokens per second in server mode and 23,514 tokens per second in offline mode. Those numbers fall slightly short of an eight-GPU Nvidia H100 system, which hit 21,605 tokens per second in server mode and 24,525 tokens per second in offline mode when paired with an unspecified Intel Xeon processor.

When paired with an EPYC Turin processor instead, the MI300X fared a little better, reaching 22,021 tokens per second in server mode, slightly ahead of the H100. In offline mode, however, it still trailed the H100 system, reaching only 24,110 tokens per second.
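To put "slightly" in perspective, a quick back-of-the-envelope calculation using the figures quoted above shows the gaps are all within a few percent. The script below is purely illustrative and simply recomputes the relative differences from the reported MLPerf Inference v4.1 numbers:

```python
# Relative gaps between MI300X and H100 systems, tokens/second for eight-GPU
# Llama 2 70B runs, using the MLPerf Inference v4.1 figures quoted above.
results = {
    "server  (MI300X + Genoa vs H100)": (21_028, 21_605),
    "offline (MI300X + Genoa vs H100)": (23_514, 24_525),
    "server  (MI300X + Turin vs H100)": (22_021, 21_605),
    "offline (MI300X + Turin vs H100)": (24_110, 24_525),
}

for scenario, (mi300x, h100) in results.items():
    gap = (mi300x - h100) / h100 * 100  # positive means the MI300X is ahead
    print(f"{scenario}: {gap:+.1f}%")
```

Running it gives roughly -2.7% and -4.1% for the Genoa configuration, and +1.9% (server) and -1.7% (offline) for the Turin configuration.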

The MI300X offers higher memory capacity than the H100, potentially allowing it to run a 70-billion-parameter model like Llama 2 70B at FP8 precision on a single GPU, thereby avoiding the network overhead of splitting the model across multiple GPUs. For reference, each Instinct MI300X features 192 GB of HBM3 memory and delivers a peak memory bandwidth of 5.3 TB/s. In comparison, the Nvidia H100 supports up to 80 GB of HBM3 memory with up to 3.35 TB/s of memory bandwidth.
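The single-GPU argument comes down to simple arithmetic: at FP8, each weight takes one byte, so the model's weights alone need roughly 70 GB. The sketch below is a weights-only estimate of our own, not AMD's methodology, and it ignores KV cache, activations, and framework overhead, so real deployments need more headroom:

```python
# Weights-only memory estimate for a 70B-parameter model.
# Ignores KV cache, activations, and framework overhead.
PARAMS = 70e9

def weights_gb(bytes_per_param: float) -> float:
    """Approximate weight footprint in decimal gigabytes."""
    return PARAMS * bytes_per_param / 1e9

print(f"FP16 weights: ~{weights_gb(2):.0f} GB")  # ~140 GB, exceeds one 80 GB H100
print(f"FP8 weights:  ~{weights_gb(1):.0f} GB")  # ~70 GB, fits in one 192 GB MI300X
```

By this rough measure, FP16 weights already overflow a single 80 GB H100, while FP8 weights fit comfortably within the MI300X's 192 GB, which is the basis for AMD's single-GPU claim.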

The results largely align with Nvidia's recent claims that its Blackwell and Hopper chips offer massive performance gains over competing solutions, including the AMD Instinct MI300X. Nvidia's own data showed that in Llama 2 tests, a system with eight MI300X processors reached only 23,515 tokens per second at 750 watts in offline mode, while the H100 achieved 24,525 tokens per second at 700 watts. The numbers for server mode are similar, with the MI300X hitting 21,028 tokens per second and the H100 scoring 21,606 tokens per second at lower wattage.

Source: techspot.com
