Intel Stability Issues - The Story So Far

Let's talk Intel and the CPU crashing issues. If you've been closely following this situation, there won't be anything new in this article, other than our opinion on the problem and how we're handling reviews until the problem is fully addressed. For the past few weeks, TechSpot has been reporting the different angles of the story, from the CPU stability concerns to the financial and business side of it. However, in terms of hands-on testing, we've just been waiting for something to come up, and it sounds like we're still a few weeks away from the microcode update that's meant to address this problem.

In the meantime, reviews of AMD's new Zen 5 processors will be out, and rather than talk about the Intel stability issues on the side in that content, we figured a dedicated piece before those reviews go live was a better way of going about it. So, let's get to it, and we'll start by quickly summarizing the situation to date...

Things have not been going well over at Intel

Intel's stock price has tanked, and the company recently announced massive job cuts. While the writing has been on the wall for some time now, it's always sad when you hear 15,000 people are going to lose their jobs. It's also bad news for the industry as a whole, though perhaps not for Intel's direct competitors.

It's vital for us PC enthusiasts that Intel remains competitive because history has shown us that a single dominant force generally leads to stagnation. So the hope is Intel's next-generation products are good, really good, and they're going to need to be because the damage that's been done to the Intel brand over the past few months is substantial.

Truth is Intel has been struggling for a while now, and although it wasn't necessarily visible on the balance sheets, demand for their processors was slowing. Demand for their 11th, 12th, 13th, and now 14th generation desktop processors has been weak, and the situation seemed to worsen with each new generation.

There have been a number of factors at play. First, AMD has gained a lot of traction, delivering class-leading performance at highly competitive prices, on platforms that aren't nearly as locked down, ultimately making them more appealing to consumers (the server market has seen similar trends, although that's been a harder market for AMD to penetrate). More recently, however, Intel's stability concerns have scared off buyers, and a quick look at Amazon's top-selling CPU list won't show a single 13th or 14th-gen Intel processor in the top 10, which is crazy to see.

What's been crashing?

To put it simply, owners of some 13th- and 14th-generation Intel CPUs have been experiencing crashes when performing certain tasks, such as loading games or running game servers. This would sometimes appear as an "out of video memory" issue, despite it being a CPU problem, not a GPU problem. It has primarily affected high-end K-series parts such as the 13900K and 14900K, but there are many other Raptor Lake processors that can suffer from the problem.

The mass stability concerns for 13th and 14th gen processors are something we first touched on a few months ago in our article "Intel CPUs Are Crashing and It's Intel's Fault." We were very direct with that title, but we felt we needed to be. Quite a few online commentators and even some in the tech media were incorrectly blaming Intel's partners, the board makers, for these stability issues, when in reality, we knew Intel was squarely to blame, and I think now that has certainly proven to be true.

In that article (back in late April), we tested what at the time were new "Intel Baseline" profiles developed by Asus and Gigabyte. These profiles were meant to cap the power limits of Intel's K-SKU processors in the hope of addressing the stability issues. The idea was that the crashes were caused by board partners running the CPUs above safe limits and above what Intel recommends or specifies to their partners.

The entire process was a mess, though. Not all board partners issued BIOS updates, and while Asus and Gigabyte used the "Baseline" terminology, neither used the same settings. The performance hit when using the Gigabyte Baseline profile was quite extreme, so naturally, customers weren't happy, and Intel stepped in, taking the opportunity to place the blame on their partners.

Then in early May, Intel requested that all board partners implement the new official Intel "Default Settings Profile" as the BIOS default by May 31st to fix 13th and 14th-gen Core CPU stability issues. This made it seem as though the only adult in the room was stepping up to put the kids in line – the kids in this scenario being the board partners.

Playing the BIOS blame game

In typical Intel fashion, though, this just added to the confusion. It wasn't just consumers and PC enthusiasts who were confused; Intel's partners seemed just as confused. This is because Intel issued three profiles as the default profiles. How does that work, you might be wondering? To this day, no one knows.

The three profiles were labeled 'Baseline,' 'Performance,' and 'Extreme,' and even today, board makers seem to keep switching between what the default profile is for their boards. Some started with 'Performance' and then moved to 'Extreme,' and basically, none of them were rolled out by Intel's promised date of May 31st.

The thing is, none of these solved the issue. Crash reports were still coming in, and by this point, major game developers and server companies were starting to go public with information regarding mass crashing of Intel K-SKU powered servers, used for their class-leading single-core performance.

Alderon Games was one of the developers that went public, stating that "Intel is selling defective 13th-14th-gen CPUs." The developer of the dino survival game Path of Titans switched all of its servers to AMD, claiming that AMD systems experience 100 times fewer crashes than Intel ones and that it's only a matter of time before affected Intel CPUs fail.

At the time, we speculated that these BIOS profiles weren't going to fully solve the issue but were rather intended to buy Intel time – either to bury/ignore the story in the hope that it goes away or come up with a plan to minimize costs.

It seems as though Intel has tried to avoid taking any real responsibility for this. As we said, they tried blaming their partners, and that didn't work, so it seemed the next step was to blame you, the customer. There's no real reason for Intel to create a 125-Watt profile for 253-Watt K-SKU parts, other than to make a power profile so limited that any CPU, almost no matter how degraded, would work.

Intel advertises parts like the Core i9-13900K and 14900K to run at a continuous 253 Watts, and at no point do these CPUs need to drop below that limit when under load. But it seems as though this limit could be degrading some of these CPUs too rapidly, and if that is the case, rather than replace those CPUs, Intel went with the 'Performance' profile as a fail-safe. Simply load that profile, and it will reduce the long-duration power limits, and the hope is that will stabilize the system – and Intel won't need to replace your CPU under warranty.

This hasn't appeared to help, though, and the stability issues affecting 13th and 14th-gen processors (both Raptor Lake chips) have continued. About three weeks ago, on July 22, Intel issued a report addressing the situation, in which they claimed that after extensive analysis, the cause of the instability was determined to be elevated operating voltage, stemming from a microcode algorithm that results in incorrect voltage requests to the processor.

Degradation or not?

At the time, Intel said they were targeting mid-August for the release of a patch to fix the issue, so presumably, we will see something in the next week or two. That's what we're currently waiting on. That's essentially where we are after all these months – we're still waiting on Intel to release a fix. The question on everyone's mind is, will this updated microcode even fix the issue, and is the issue even fixable at this point?

As we speculated months ago, this appears to be a degradation issue, and when we speculated that this was the problem, many Intel fans were very unhappy. However, given what we knew at the time, rapid degradation seemed like a very likely scenario.

During the Computex trade show in Taiwan (early June), multiple sources told us that processor degradation was occurring, and that this was the root cause of the instability issues with Intel CPUs. The sources that confirmed this are very close to the issue; these people really know what they're talking about, so we have no reason to doubt the information we were given.

But Intel will neither confirm nor deny that degradation has occurred to the point where it's creating these stability issues. Even when asked directly, they dodge the question. They just say that the upcoming patch will stop it from happening but don't directly confirm that it has happened. Based on that response, you could safely assume that it has.

There's also the via oxidation issue, which Intel confirmed took place during 2023. That's roughly a year, which is a very significant timeframe. But Intel has denied that oxidation is the main cause of current stability issues. Despite that, Intel has not disclosed batch numbers for the affected CPUs. When asked directly about it, Intel simply said, "Intel will continue working with its customers on Via Oxidation-related reports and ensure that they are fully supported in the exchange process."

In the end, it's Intel's customers that are left in the dark. Did Intel sell you a CPU with a manufacturing defect that they should replace immediately, or are your stability issues related to something else? No one knows, and that's how Intel likes it.

Intel also said to The Verge that they will not be recalling Raptor Lake CPUs despite the stability issues and will not be halting sales or performing channel inventory recalls while it validates the update. This means that even though Intel has acknowledged there's an issue and hasn't released a fix yet for that issue, they are still perfectly happy to sell you a CPU – a CPU that might be defective.

All of that said, Intel has just announced a two-year warranty extension on all of their boxed 13th and 14th-gen desktop processors. They have also said any of their customers who are affected should contact customer support. The problem is that Intel still has to process those RMA claims, and we're hearing mixed reports from users on whether they managed to get a replacement CPU at all, so that's a developing story.

What's next?

Let's be honest, throughout this saga, Intel has tried to deflect blame and minimize the problem at every turn. They tried to throw board partners under the bus and pushed a power profile fix that would likely leave buyers without a replacement CPU, stuck instead with reduced performance because the previous settings were "not Intel's recommendation."

Now they are saying, "trust me, just RMA your CPU multiple times and maybe one of those times we'll replace it for you." At this point, Intel can't be trusted to do the right thing for customers.

This is where we are right now. There have been many more angles to this story and loads of speculation, but at the end of the day, what we know for sure is that 13th and 14th-gen CPUs are experiencing stability issues en masse, with the 13900K and 14900K most affected. Intel says this issue can also impact 65W parts, so base model Core i5 parts such as the locked 13400 can also run into the same issue.

So, finally, let's talk about our plans moving forward. As it stands, we can't and don't recommend anyone purchase Intel 13th and 14th-gen processors, and our stance on this won't change until the issue is fully rectified. This means no more crashing, and those who have 13th or 14th-gen processors that are still crashing, presumably due to irreversible degradation, are provided a replacement part, so Intel honors their warranty.

We will continue to benchmark and include 13th and 14th-gen processors in our reviews, using the Intel Extreme profile, but we won't be recommending them. The next step for us will be to re-test all those Intel CPUs once the microcode fix is released in a week or two (hopefully), and at that point, we will have to re-evaluate the situation.

As for Intel, they've really fumbled this situation. They've taken way too long to address this and they've yet to do so properly. These issues have been known as far back as mid-2023, and Intel announced that they were officially investigating months ago.

So there are two possibilities. Either Intel knows exactly what the problem is and isn't willing to bear the costs involved to fix it, so they're buying time and trying to dodge the problem, hoping it will blow over as people move on to the next thing. Or Intel really has been searching for answers all this time, which would suggest they don't understand their own products nearly as well as they should, and that's potentially an even bigger concern.

We think Intel knows exactly what the issue is. CPUs have degraded and will need to be replaced at a huge cost to them, and they are actively trying to avoid replacing degraded parts. That's just our opinion, but after all this time, it's looking more and more likely that this is indeed the case.

So, to reiterate, we're not ignoring these issues. Our plan was never to push Intel for six years to clearly define a power specification for their processors, then be among the first in the tech media to discuss the 13th and 14th generation stability issues online, then fly 10 hours to Taiwan to speak with engineers about the issue, only to ignore it moving forward. Clearly, ignoring the problem isn't what we've been doing. We tested the first round of BIOS updates, and since then, there's really been nothing to test. With nothing to test, we can only write an article like this where we discuss the issue but give you nothing concrete.

Source: techspot.com
