

AMD 'Advancing AI' Event: Everything Revealed in 9 Minutes

All right, a big, big welcome to Advancing AI 2024. Today, I'm super excited to launch our fifth-gen EPYC portfolio. It all starts with our latest Zen 5 core. We designed Zen 5 to be the best in server workloads, and that means delivering an average of 17% higher IPC than Zen 4 and adding support for full AVX-512. Turin is fantastic. It features up to 150 billion transistors across 17 chiplets and scales up to 192 cores and 384 threads. And one of the things that's very special about fifth-gen EPYC is we actually thought about it from the architectural standpoint, in terms of how we build the industry's broadest portfolio of CPUs that covers both all of the new cloud workloads and all of the important enterprise workloads, and things like building a fifth-gen EPYC CPU at 5 gigahertz. That was because there was a new workload: when you think about industry-leading performance for AI head nodes, frequency becomes really important, and that's an example of where we've broadened the portfolio with Turin.

Now, this version of Turin is actually 128 cores, and it's optimized for scale-up workloads. It has sixteen 4-nanometer compute chiplets; if you look outside of the ring, you see the sixteen 4-nanometer chiplets and a 6-nanometer IO die in the center. We've optimized this for the highest performance per core, because that's extremely important in enterprise workloads: when software is licensed on a per-core basis, you want the highest possible performance per core. Now, this is also Turin. This is the 192-core version, and it's optimized for scale-out workloads. Here we use twelve 3-nanometer compute chiplets with the same 6-nanometer IO die in the center. This version of Turin is really optimized for cloud, so applications that benefit from maximum compute per socket. This is what we do.

So let's now take a look at Turin performance. We're going to compare many of the things in the next few slides to the competition's top of stack, Emerald Rapids. When you look at the competition's top of stack, a dual-socket fourth-gen EPYC server is already 1.7 times faster on SPECrate 2017, and with fifth-gen EPYC the performance is fantastic: we extend that lead to 2.7 times more performance. Now, we know it's a very competitive space, and we fully expect, as our competition launches their next-generation CPUs and they ship in volume, that Turin will continue to be the leader. For the enterprise, there are many commercial software stacks that are licensed per core, and CIOs want to optimize cost by running solutions on the fewest possible cores. When running these workloads on-prem, fourth-gen EPYC is already the performance leader, and with fifth gen we deliver 1.6 times more performance per core than the competition. That's 60% more performance with no additional licensing cost.

Today, I'm very excited to launch MI325X, our next-generation Instinct accelerator with leadership generative AI performance. MI325X again leads the industry with 256 gigabytes of ultrafast HBM3E memory and 6 terabytes per second of bandwidth. When you look at MI325X, we offer 1.8 times more memory, 1.3 times more memory bandwidth, and 1.3 times more AI performance in both FP16 and FP8 compared to the competition.
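To put those memory figures in rough perspective, here is a quick back-of-envelope sketch. The 256 GB per GPU and the eight-GPU platform configuration are taken from the keynote; the model sizes, the choice of FP16/FP8 weights, and the assumption that only the weights count (no KV cache, activations or runtime overhead) are illustrative assumptions, not AMD sizing guidance.

```python
# Back-of-envelope sketch: which dense-model weight footprints fit in HBM.
# 256 GB per GPU and the 8-GPU platform come from the keynote figures above;
# the model sizes and bytes-per-parameter values below are assumptions for
# illustration only, and ignore KV cache, activations and framework overhead.

HBM_PER_GPU_GB = 256        # MI325X HBM3E capacity cited in the keynote
GPUS_PER_PLATFORM = 8       # eight-GPU platform -> 2 TB aggregate, as cited

BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}

def weights_gb(params_billion: float, dtype: str) -> float:
    """Approximate weight footprint in GB for a dense model."""
    return params_billion * BYTES_PER_PARAM[dtype]   # 1e9 params * bytes / 1e9

for params in (70, 405):                 # hypothetical model sizes, in billions
    for dtype in ("fp16", "fp8"):
        gb = weights_gb(params, dtype)
        print(f"{params}B @ {dtype}: ~{gb:.0f} GB of weights | "
              f"fits one GPU: {gb <= HBM_PER_GPU_GB} | "
              f"fits 8-GPU platform: {gb <= HBM_PER_GPU_GB * GPUS_PER_PLATFORM}")
```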
And when you look at that across some of the key models, we're delivering between 20 and 40% better inference performance and latency on things like Llama, Mistral and Mixtral. And importantly, one of the things that we wanted to do was really keep a common infrastructure, so MI325X leverages the same industry-standard, OCP-compliant platform design that we used on MI300, and that makes it very easy for our customers and partners to bring solutions to market. Now, when you look at the overall platform with eight GPUs, we deliver significantly more AI compute and memory as well. The eight-GPU version features 2 terabytes of HBM3E memory and 48 terabytes per second of aggregate memory bandwidth, enabling our customers to run more models, as well as larger models, on a single MI325X server.

Today, I'm very excited to give you a preview of our next-generation MI350 series. The MI350 series introduces our new CDNA 4 architecture. It features up to 288 gigabytes of HBM3E memory and adds support for new FP4 and FP6 data types. And again, what we're thinking about is how we can get this technology to market the fastest: it drops into the same infrastructure as MI300 and MI325X, and it brings the biggest generational leap in AI performance in our history when it launches in the second half of 2025. Looking at the performance, CDNA 4 delivers over seven times more AI compute and, as we said, increases both memory capacity and memory bandwidth. We've also designed it for higher efficiency, reducing things like networking overhead so that we can increase overall system performance. In total, CDNA 4 will deliver a significant 35 times generational increase in AI performance compared to CDNA 3.

We have been relentlessly focused on performance, from the latest public models to the flagship proprietary models, and with each ROCm release we've delivered significant performance gains. Our latest release, ROCm 6.2, delivers 2.4 times the performance for key inference workloads compared to our 6.0 release from last year. These gains have been made possible by a number of enhancements: improved attention algorithms, graph optimizations, compute libraries, framework optimizations and many, many more things. Similarly, ROCm 6.2 delivers over 1.8 times improvement in training performance. And again, these gains have been made possible by improved attention algorithms, like the FlashAttention v3 that is now supported, improved compute and communication libraries, parallelization strategies and framework optimizations.

The AMD Pensando team has now delivered the third-generation P4 engine, featuring over 200 fully programmable match processing units and a bunch of table engines. What that means is a super-high-performance, fully programmable data path engine that can deliver 400 gigabits per second of line-rate performance while multiple advanced services run concurrently on top of that engine. These services can be coded and changed at the speed of software while matching the performance of hardwired solutions. Salina offers 400 gigabits per second of throughput while running SDN, security and encryption services simultaneously. It provides greater than 2x the performance of our previous generation while being fully backward compatible. And that fully programmable pipeline means we can continuously deliver innovations and features to our customers, and they can add their own innovation on top.
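The "fully programmable match processing units" and "table engines" described above are the building blocks of a match-action style data path. The toy sketch below is a conceptual illustration only, not AMD Pensando's P4 programming model or any real API; the packet fields, tables and actions are invented to show how lookup stages whose actions live in software can implement services like routing or encryption without changing the hardware.

```python
# Conceptual sketch only: a toy match-action pipeline. It illustrates the
# general idea behind a programmable packet data path (match units consulting
# table engines, then applying actions), NOT AMD Pensando's P4 code or APIs.
# Packet fields, table entries and actions are invented for illustration.

from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

@dataclass
class Packet:
    dst_ip: str
    port: int
    encrypted: bool = False
    out_iface: Optional[str] = None

@dataclass
class MatchActionTable:
    """One pipeline stage: exact match on a single header field -> action name."""
    key_field: str
    entries: Dict[object, str] = field(default_factory=dict)

    def lookup(self, pkt: Packet) -> Optional[str]:
        return self.entries.get(getattr(pkt, self.key_field))

# "Programmable" services are plain functions the pipeline can be re-pointed
# at -- the software-speed flexibility the keynote emphasizes.
ACTIONS: Dict[str, Callable[[Packet], None]] = {
    "route_leaf1": lambda p: setattr(p, "out_iface", "leaf1"),
    "encrypt":     lambda p: setattr(p, "encrypted", True),
    "drop":        lambda p: setattr(p, "out_iface", None),
}

PIPELINE = [
    MatchActionTable("dst_ip", {"10.0.0.7": "route_leaf1"}),
    MatchActionTable("port",   {443: "encrypt", 23: "drop"}),
]

def run_pipeline(pkt: Packet) -> Packet:
    """Send one packet through every table stage in order."""
    for table in PIPELINE:
        action = table.lookup(pkt)
        if action is not None:
            ACTIONS[action](pkt)
    return pkt

print(run_pipeline(Packet(dst_ip="10.0.0.7", port=443)))
# -> Packet(dst_ip='10.0.0.7', port=443, encrypted=True, out_iface='leaf1')
```

The keynote's point is that hardware like Salina runs stages of this kind at 400 Gb/s line rate while they remain reprogrammable in software.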
Salina will power high-performance front-end networks for AI systems and meet the increased demands of general compute clouds powered by EPYC CPUs that need to be fed data. But I'm equally excited to announce AMD's Pollara 400. It uses the same third-generation P4 engine to enable what we expect will be the industry's first Ultra Ethernet Consortium-ready AI NIC. It will deliver the performance benefits of UEC-ready RDMA and ensure that AMD's customers can continue to innovate at a rapid pace and achieve the fastest time to production. The AMD networking teams are delivering extremely well, and I'm pleased to announce that Salina and Pollara will both be available early next year.

The AMD Ryzen AI PRO 300 series resets the bar for what a business PC can do. We combine our high-performance Zen 5 CPU, our new RDNA 3.5 graphics and our new XDNA 2 NPU with 50-plus TOPS of AI performance, and all of this is within Copilot+ PCs.

Source: cnet.com
