CPUs, bottlenecks, and games: the problem with CPU benchmarking

The long-awaited Ryzen 7000X3D series is here, and everyone agrees that the Ryzen 9 7950X3D is the fastest CPU for gaming… but by how much? That’s a tough question to answer because the reviews are all over the place. Some publications found that the 7950X3D was barely any faster than Intel’s Core i9-13900K, while others found larger margins of over 10%. It’s not like reviewers are testing completely different games, and in non-gaming benchmarks like Cinebench R23, the scores are about the same across the board, give or take a percentage point.

This isn’t the first time reviewers have disagreed on how fast CPUs are in games. In fact, it happens with pretty much every CPU, whether it has fancy 3D V-Cache or not. We don’t see this kind of variance in reviews of GPUs, SSDs, or even CPUs in non-gaming benchmarks. So what’s the deal? It ultimately comes down to the unique behavior of CPUs in games and the different testing methodologies used from review to review.

The curious case of the CPU bottleneck


Modern GPUs have anywhere from hundreds to tens of thousands of cores. These cores are highly flexible and ideal for workloads that scale in difficulty, which is why the best gaming GPUs can handle graphics settings spanning a wide range of visual quality and framerates. Lowering graphics settings like resolution makes each frame cheaper to render, so more frames can be produced per second; conversely, if frames are harder to render, fewer are produced per second.

The role of the CPU in games is vastly different from that of the GPU. Since the early 2000s, much of the work that used to run on the CPU has moved to the GPU, leaving the CPU with relatively little to do. The CPU’s most important job is simply to finish that remaining work as quickly as possible.

But there are two major problems. First, these tasks can’t be spread evenly across all cores and threads, so adding cores doesn’t always improve performance. Second, bigger cores with more computational power often go underused because the individual tasks are so simple. These factors make clock speed and cache size disproportionately important for gaming. Cache reduces the time the CPU spends waiting for data, which is a significant source of lost performance, while clock speed is the only realistic way to speed up workloads that can’t take advantage of a modern chip’s raw horsepower.


A PC’s gaming performance is determined mainly by the GPU and the CPU (storage and RAM are usually secondary factors), but not both at the same time: at any given moment, performance is limited by either the GPU or the CPU. That naturally leads to one big question: when is a PC limited by the CPU, and when by the GPU? This question gets to the heart of one of the most confusing things about gaming benchmarks, because the difference between GPU and CPU bottlenecks isn’t very intuitive.

When your PC is GPU limited, the graphics card runs at or close to 100% usage, consuming as many of its resources as possible and usually hitting its maximum power consumption. In that state, you can trade frames for visual quality and vice versa. But in most games, graphics settings don’t directly impact the CPU, and even in games with CPU-related settings, there are usually only a few.

Increasing graphics settings isn’t what creates a CPU bottleneck in games. In fact, raising graphics settings practically ensures you’ll never see one. Remember, the CPU’s workload is mostly fixed: there are few, if any, settings you can tweak to give it more to do, but you can raise the framerate, and with it the demand on the CPU, by lowering graphics settings.


Running into a CPU bottleneck is simple: push the framerate high enough that the GPU can render more frames than the CPU can prepare. In other words, every CPU has a ceiling on how many frames per second it can deliver in any given game. There are only two realistic ways to remove a CPU bottleneck in games: faster RAM with a higher frequency and tighter timings for a small performance boost, or a lower framerate, and it’s that second option that creates problems for benchmarking.

Imagine a reviewer is testing two hypothetical CPUs, Gamma and Zeta. In a big-budget, graphically intense game like Atomic Heart, Gamma can get up to 200 FPS while Zeta can achieve 300. Depending on how the reviewer tests the CPUs and how hard they push the framerate, they could find that both CPUs are roughly equal, that Zeta has a slight advantage, or that Zeta has a commanding lead. This is why reviewers so often come to different conclusions about CPU performance in games.
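
To make the Gamma/Zeta example concrete, here’s a minimal sketch in Python. The 200 and 300 FPS limits come from the hypothetical above, while the GPU ceilings are made-up values standing in for different graphics settings and resolutions. The framerate you actually measure is simply whichever limit is lower, the CPU’s or the GPU’s.

```python
def observed_fps(cpu_limit: float, gpu_limit: float) -> float:
    """The framerate you measure is capped by whichever component is slower."""
    return min(cpu_limit, gpu_limit)

GAMMA_CPU_LIMIT = 200  # hypothetical slower CPU from the example above
ZETA_CPU_LIMIT = 300   # hypothetical faster CPU

# GPU ceilings for three test setups (assumed values for illustration only)
for label, gpu_limit in [("4K ultra", 120), ("1440p high", 220), ("1080p low", 400)]:
    gamma = observed_fps(GAMMA_CPU_LIMIT, gpu_limit)
    zeta = observed_fps(ZETA_CPU_LIMIT, gpu_limit)
    print(f"{label:>10}: Gamma {gamma} FPS, Zeta {zeta} FPS, "
          f"Zeta lead {100 * (zeta - gamma) / gamma:.0f}%")
```

Run it and the same two CPUs show a 0%, 10%, or 50% gap depending entirely on how high the GPU lets the framerate climb.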

Therein lies the basic dilemma of reviewing CPUs in games. You need to push the framerate as high as possible to expose CPU bottlenecks and thus show the true limits of each CPU, which often results in an unrealistic benchmark. As you can imagine, this tension has been causing controversy for years.

The dilemma of benchmarking CPUs in games


Most enthusiasts take one of two positions when it comes to CPU benchmarking. The first position advocates for a more scientific approach that exposes the bottleneck without regard for realistic settings, while the second argues that reviewers should test at settings that mean more to readers who want to make purchasing decisions.

Each school of thought has its strengths and weaknesses. Proponents of the scientific position (usually reviewers and fans of the company with the fastest CPU) are undoubtedly correct that this approach reveals a CPU’s true limits in gaming. However, they also often argue that these tests accurately predict future performance: the idea is that when you upgrade your GPU and suddenly have headroom for higher framerates, you’ll want the CPU that proved faster in those tests.

This argument about future performance has been debunked multiple times. While AMD’s FX CPUs initially launched to poor results in gaming compared to Intel’s offerings, over time, chips like the FX-8350 actually gained ground and even overtook their Core i5 counterparts as games started to use more cores and threads. Additionally, I would argue that gamers rarely upgrade graphics cards purely for higher framerates. Gamers want better framerates and better quality settings, including higher resolutions. This reduces the chances of exposing a CPU bottleneck after a GPU upgrade.


The argument for “realistic” settings is more intuitive and easier to follow, but most of the rhetoric is just about how bad 1080p is for testing high-end CPUs. The thing is, can you even properly test a high-end CPU against a midrange or budget chip at a higher resolution? If you have a Core i9-13900K, you’re simply more likely to aim for higher framerates because your PC probably also has a high-end GPU like the RTX 4090, while a user with a Core i3-13100 is unlikely to aim much past 60 FPS because they probably have a lower-end GPU like an RX 6500 XT. Do you test at settings realistic for the 13900K or for the 13100?

That being said, I think this second camp makes some valid points. I can’t say for certain what the average user wants, but as a long-time member of this community, I would imagine that most target anywhere from 60 to 144 FPS, since 60Hz and 144Hz are very popular refresh rates, monitors at those refresh rates often come with G-SYNC or FreeSync, and surpassing the refresh rate breaks those technologies. 144 FPS isn’t a demanding target for modern CPUs, so CPU bottlenecking is less likely, and consequently, benchmarks that show CPUs hitting 300 FPS probably aren’t very useful to most users.

This debate goes back at least six years; I first encountered it when the first-generation Ryzen series launched in 2017. Reviewers have mostly stuck with the scientific point of view or remained indifferent to the debate in their testing. Readers, on the other hand, mostly get upset when their preferred brand loses in the reviews, but they do bring up some good points. I believe there’s a middle path that can satisfy both philosophies: a way of benchmarking that uses realistic settings while still producing results that expose meaningful differences between CPUs.

Why the framerate itself is a key part of a CPU benchmark


I’ve always been fascinated by testing methodology and by ways to show people results that actually mean something. What follows is more of a thought experiment than a serious proposal, and it’s something I use for fun, but I’ve come up with my own CPU testing methodology.

We can’t ignore the maximum framerate the GPU makes possible, because it determines both how CPUs perform relative to each other and how realistic the test is for users. What I propose is to turn this concept on its head and select settings to achieve a target framerate, rather than picking a specific preset or setting everything to minimum.

Here’s the basic methodology. Select a control CPU that every other CPU will be compared against. Since CPUs have a performance limit, the control chip should be the fastest CPU you’re testing, such as a Core i9-13900K or a Ryzen 9 7950X3D. Next, start at higher graphics settings, run your benchmarks, and keep tweaking the settings until your control CPU achieves your desired framerate. For example, in esports titles like Counter-Strike: Global Offensive, your desired framerate should probably be at least 240 FPS on average — if not higher.
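
Here’s a rough sketch of that calibration step in Python. It captures only the logic; run_benchmark and lower_settings are hypothetical callables you would wire up to your actual benchmarking tools and the game’s settings menu.

```python
from typing import Callable

def calibrate_settings(
    settings: dict,
    run_benchmark: Callable[[dict], float],  # hypothetical: returns avg FPS on the control CPU
    lower_settings: Callable[[dict], dict],  # hypothetical: drops the costliest setting one notch
    target_fps: float = 240.0,               # e.g. an esports title where you'd want 240 FPS or more
    tolerance: float = 5.0,
) -> dict:
    """Step graphics settings down until the control CPU reaches the target framerate.

    If the control CPU becomes the bottleneck before the target is reached, lowering
    settings stops helping, so we also bail out once the framerate stops improving.
    """
    last_fps = 0.0
    while True:
        avg_fps = run_benchmark(settings)
        if avg_fps >= target_fps - tolerance or avg_fps <= last_fps:
            return settings
        last_fps = avg_fps
        settings = lower_settings(settings)
```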


Once you’ve found the settings that achieve your preferred framerate on the control CPU, use those settings when testing other chips. The idea is to show how much faster the control CPU can be compared to theoretically slower CPUs in a test that is both scientific and realistic. What people want to know is whether a higher-end CPU is worth the money, and this kind of methodology is very good at showing that.
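
Once the settings are locked in, the comparison itself is just relative math. A small continuation of the sketch above, with made-up numbers purely for illustration:

```python
def compare_to_control(results: dict[str, float], control: str) -> dict[str, float]:
    """Express each CPU's average FPS at the calibrated settings as a
    percentage of the control CPU's result."""
    baseline = results[control]
    return {cpu: 100.0 * fps / baseline for cpu, fps in results.items()}

# Hypothetical results at the settings found by calibrate_settings():
results = {"Control (7950X3D)": 238.0, "Midrange CPU": 204.0, "Budget CPU": 152.0}
for cpu, pct in compare_to_control(results, "Control (7950X3D)").items():
    print(f"{cpu}: {pct:.0f}% of the control CPU")
```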

There’s one obvious problem with this kind of benchmarking, though: it takes time. Tweaking graphics settings and rerunning benchmarks until the control CPU hits the right framerate is time-consuming, and not using presets can mean dialing in individual settings for each game on every new CPU. New CPUs and games also require recalibration, perhaps to the point where a different CPU needs to become the control. Just choosing a preset or setting everything to minimum is far easier.

There are alternatives to this methodology that are much easier to implement. Many reviewers test at multiple resolutions to show the shifting CPU bottleneck, with 1080p having the most CPU bottlenecking and 1440p or 4K the least. Techspot and Anandtech sometimes test multiple GPUs to achieve the same effect since faster GPUs will have a higher potential framerate that may reveal CPU bottlenecking.

Analysis is even more important than methodology


A good testing methodology and high-quality data are only half of what makes a review comprehensive. The other half is analysis, where the reviewer explains what the results actually mean. Many users can draw their own conclusions from the data, but not everybody into PC gaming is an enthusiast.

If a review shows a benchmark where one CPU hits 500 FPS and another 300, there should be some context on what that means. In an esports title, that difference could matter to anyone who wants to play competitively and needs the highest possible framerates. In most other games, the performance advantage of the faster CPU is unlikely to be fully realized or appreciated. I’ve seen reviews show benchmarks with these kinds of results in very old games and hype up the faster CPU, while other reviews found much more modest margins in more realistic tests.
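
One way to provide that context is to convert framerates into frame times. A minimal sketch (the 500 and 300 FPS figures come from the example above; the interpretation is my own):

```python
def frame_time_ms(fps: float) -> float:
    """Convert an average framerate to an average frame time in milliseconds."""
    return 1000.0 / fps

for fps in (500, 300, 144, 60):
    print(f"{fps:>3} FPS = {frame_time_ms(fps):.2f} ms per frame")

# 500 FPS is 2.00 ms per frame and 300 FPS is 3.33 ms, a difference of about 1.3 ms
# that only shows up if the GPU and monitor can actually deliver frames that fast.
# At a 144 Hz ceiling (about 6.94 ms per frame), both CPUs have plenty of headroom.
```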

Ultimately, a CPU review is supposed to show what’s worth buying and what isn’t, and although reviews are the product of many hours of hard work, not every review critically analyzes the data. I appreciate the reviewers who take a moment to discuss CPU bottlenecks and how they grow or shrink with different GPUs and graphics settings. It’s certainly true that some CPUs are faster than others and better for gaming, but it’s never clear-cut whether that makes them the better choice for every single user.