AMD Epyc vs Threadripper vs Ryzen 9950X3D: Pros and Cons

Ladies/Gents,

I know this is an ever-moving target with new hardware coming out, but I would still love to hear the debate: why would you opt to build a system with a current or even older Epyc versus a Threadripper regular/Pro versus the new Ryzen 9950X3D, for a non-gaming workstation environment? I realize this is hard to weigh with cores, clocks, cache and PCIe lanes all differing, yet I feel people could give examples and pros and cons of one CPU build versus the others.

3 Likes

Epyc 7002/7003 series

  • Cheap on eBay
  • Lots of PCIe 4.0 lanes (128 IIRC)
  • 8 channels of 3200MT/s DDR4 memory, which is also cheap.
  • Can have two sockets to double CPU and memory capacity.
  • Lower boost frequencies (4.0GHz for the 7443, maybe slightly higher for the frequency-optimized parts).
  • Memory bandwidth around 200GB/s per CPU (see the quick math below).
  • TDPs up to 280W.
  • Have to look out for vendor locked second-hand CPUs.

Epyc 9004/9005 series

  • Some QS (qualification sample) versions going cheap on eBay, but otherwise current and expensive.
  • Lots of PCIe 5.0 lanes (128 IIRC)
  • Can have two sockets to double CPU and memory capacity.
  • 12 channels of 4800/6000MT/s DDR5 memory, which is expensive.
  • Memory bandwidth around 400-600GB/s per CPU depending on generation and model (quick math below).
  • Higher boost clocks than earlier series, going as high as 5.0GHz in the 9575F.
  • TDPs up to 500W.
  • Have to look out for vendor locked second-hand CPUs.
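For reference, both generations' bandwidth figures fall straight out of channels × transfer rate × 8 bytes per 64-bit transfer. A quick sketch (these are theoretical peaks at the officially supported speeds; sustained real-world numbers land lower):

```python
# Theoretical peak DRAM bandwidth: channels * MT/s * 8 bytes per 64-bit transfer.
def peak_bw_gbs(channels: int, mts: int) -> float:
    return channels * mts * 8 / 1000  # megatransfers/s * bytes -> GB/s

print(peak_bw_gbs(8, 3200))   # Epyc 7002/7003: 8ch DDR4-3200  -> 204.8 GB/s
print(peak_bw_gbs(12, 4800))  # Epyc 9004:      12ch DDR5-4800 -> 460.8 GB/s
print(peak_bw_gbs(12, 6000))  # Epyc 9005:      12ch DDR5-6000 -> 576.0 GB/s
```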

Threadripper Pro 3000/5000 series

  • Still surprisingly pricey, possibly due to low volumes.
  • Lots of PCIe 4.0 lanes (motherboards can have up to 7 x16 slots).
  • 8 channels of 3200MT/s DDR4 memory, which is cheap.
  • Higher boost clocks than Epycs, typically in the 4.3GHz range.
  • Have to look out for vendor locked second-hand CPUs.
  • TDPs in the 280W range, but they go up with overclocking.

Threadripper 7000 series

  • Current generation, so pricey.
  • A reasonable number of PCIe 4.0 and 5.0 lanes (motherboards typically have three x16 PCIe 5.0 slots, one x16 PCIe 4.0 slot, and support for several M.2 drives).
  • 4 channels of 4800/6000MT/s DDR5 memory.
  • Boost clocks up to 5.3GHz.
  • TDPs around 360W, but they go up with overclocking.

Threadripper Pro 7000 series

  • Very expensive.
  • Lots of PCIe 5.0 lanes (up to 7 x16 PCIe 5.0 slots plus several M.2 drives).
  • 8 channels of 4800/6000MT/s DDR5 memory.
  • Boost clocks up to 5.3GHz.
  • TDPs around 360W, but they go up with overclocking.

Ryzen 9 9950X3D

  • Comparatively cheap.
  • Motherboards typically provide one x16 PCIe 5.0 slot, a couple of M.2 slots, and another slot carrying any remaining lanes (lane budget sketched after this list).
  • 2 channels of 6000MT/s+ DDR5 memory.
  • Boost clocks up to 5.7 GHz.
  • TDP of 170W, but goes up with overclocking.
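To make that lane constraint concrete, here is a rough budget of the 28 PCIe 5.0 lanes an AM5 CPU exposes. The exact split varies by board, so treat this as a typical layout rather than any specific motherboard:

```python
# Rough AM5 CPU lane budget (28 PCIe 5.0 lanes total; split varies by board).
am5_lanes = {
    "x16 GPU slot": 16,
    "M.2 slot #1": 4,
    "M.2 slot #2 (or USB4 controller)": 4,
    "chipset uplink (runs at Gen 4)": 4,
}
assert sum(am5_lanes.values()) == 28
for use, n in am5_lanes.items():
    print(f"{use}: x{n}")
```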

My take on the situation is this:

  1. If you want a fast general purpose workstation with a single GPU, decent networking and a reasonable amount of fast storage, then go for the 9950X3D. Also the cheapest to run.
  2. If you (think you) need two GPUs instead of one, you can find motherboards which give you two x16 PCIe slots at 8x speeds, and still use the 9950X3D.
  3. A Threadripper 7000 series build would suit a software development workstation with the extra cores and memory bandwidth. It would also work as an intermediate option if you need up to 4 GPUs for AI or ML.
  4. If you want a cheap LLM inference rig, then a cheap Epyc 7002/7003 system gives you the connectivity and memory you need. A Threadripper Pro 3000/5000 system might also work, but is likely to cost more and not support as much memory.
  5. Memory-bound workloads such as LLM inference or CFD benefit from Epyc 9000 series CPUs. The Threadripper Pro 7000 series may also suit, with its reduced memory channel count compared to Epyc mostly compensated for by being able to run the memory faster. (Rough throughput estimate below.)
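On points 4 and 5: for memory-bound token generation, a common rule of thumb is that every generated token streams roughly the whole set of active weights from RAM, so tokens per second is bounded by bandwidth divided by model size. A sketch with hypothetical numbers (the 40GB figure is just an example, roughly a 70B model at 4-bit quantization):

```python
# Crude upper bound for memory-bound LLM token generation:
# each token streams (roughly) all active weights from RAM once.
def max_tokens_per_s(mem_bw_gbs: float, weights_gb: float) -> float:
    return mem_bw_gbs / weights_gb

print(max_tokens_per_s(200, 40))  # Epyc 7002/7003-class bandwidth -> ~5 tok/s
print(max_tokens_per_s(460, 40))  # Epyc 9004-class bandwidth      -> ~11.5 tok/s
```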

I hope that gets the party started. :slight_smile:

11 Likes

I have been a long time HEDT fan, starting when I bought my Core i7-920 on the x58 platform back in 2008. I later used a Core i7-3930k on the x79 platform which lasted me until I got my Threadripper 3960x in 2019.

I like having a capable workstation, but I also like playing the occasional game, so I have gravitated towards HEDT style boards and big top of the line GPU’s.

Right now I have a Threadripper 3960x (24C/48T). Normally I would have upgraded by now, but I have been struggling for years with the fact that the HEDT product segment has essentially vanished. Neither Intel nor AMD makes them anymore.

HEDT used to be a prosumer, near-workstation product segment that could serve workstation duty thanks to workstation-like features (large numbers of PCIe lanes, quad-channel (or more) RAM, etc.), but did so without sacrificing consumer and gaming performance. Essentially you could have the best of both worlds: a competent workstation-like product that could also excel at consumer workloads, including games (and often even outperform consumer high-end systems at gaming).

Current product lines are either Consumer or true Workstation, with nothing like HEDT left in between.

Current Threadrippers - even the non-Pro models - are workstation only. They take only registered ECC RAM, which slows things down terribly, and have multi-CCD/NUMA configurations which make them terrible at many consumer workloads and games.

You can no longer have a powerful all-in-one machine that does everything you need, and that is sad and a huge loss.

I partially blame DDR5 for forcing the issue, but honestly the HEDT concept was dying long before that.

Both AMD and Intel have been cramming more and more cores into their consumer offerings in a lame attempt to make up for this, but the fact that they are still limited to only dual-channel RAM and a pathetically small 28 PCIe lanes means they could never serve as my main workstation. The dual-channel RAM limitation also creates a RAM bottleneck for the top CPU variants in highly threaded workloads.

As for why I blame DDR5: with DDR3 and DDR4, registered and unbuffered RAM were pin compatible and fit in the same sockets on motherboards. If your memory controller supported both, you could just populate whichever you needed. Wanted a HEDT screamer? Go with highly clocked non-ECC unbuffered RAM. Didn't care about the added latency and just needed massive quantities of ECC RAM? Populate it with registered ECC modules.

With DDR5, motherboard makers have to plan ahead, as the RAM slots can only support unbuffered or registered RAM, not both. They could theoretically have separate slots for separate types of RAM, but that would quickly turn into a signalling and real-estate nightmare, and drive up cost.

So, since you can no longer have both anyway, the entire market is segmenting into workstation and consumer, and the HEDT product category is dying off. Why would you optimize multiple CCDs / NUMA nodes for games if it is now really only a workstation product anyway?

This is why, when the new Threadripper 7000 series launched, it performed horribly in games. The Threadripper 7000 series is just not suitable for anything beyond casual gaming.

…and even a many-core top-end AM5 chip really isn't suitable as a workstation anymore, due to seriously limited PCIe lanes and dual-channel RAM that bottlenecks all those cores.

So, if you need a workstation, buy something Epyc, Threadripper or Xeon. If you play games, buy something AM5. (I wouldn’t bother with Intel’s Core Ultra as they are falling behind there)

This is what I’ve decided to do.

I’m in the process of converting my Threadripper 3960x to be my dedicated workstation. I’m removing my 4090, installing a more basic GPU (Radeon RX 7600) and may eventually even switch to ECC RAM.

I don’t care that it is a little older. For my type of work, the core type and per-core performance aren’t particularly important (within reason). The older 3960x will likely last me a long time yet in that regard. What I really need are all those PCIe lanes (64 available lanes on the 3960x).

I am also in the process of building a dedicated machine for enjoying games. It will be either a 9800x3d or a 9950x3d (haven’t decided yet, need to spend some more time reading reviews and benchmarks). This machine will inherit my RTX 4090 until such time as something faster is readily available and isn’t being scalped.

I have been putting off this decision for years, hoping against hope that something, anything that could serve as both would come along.

I was thinking, maybe, just maybe, when the AMD 800 series chipset came along, it would move to PCIe Gen 5 and have a beefier integrated PCIe switch, able to provide more downstream older-gen PCIe slots off the chipset. But no such luck. The 800 series chipset uplink is still limited to only four Gen 4 lanes, and can thus only provide a limited number of chipset PCIe lanes.

In the end I have had to admit defeat. It is a sad day. An end of an era, IMHO.

It’s not all negative though. Building two systems has a bit of upfront cost, but once that is done, I envision only chasing upgrades on the game system. On my workstation build, chasing expensive workstation upgrades likely won’t make sense. I’ll probably just move my old Supermicro server boards into workstation duty when I upgrade them. Currently my server is an EPYC 7543 in a Supermicro H12SSL-NT board with 512GB of RAM.

That will likely be my next workstation upgrade.

And when I do, it will be fine. I run some VMs and stick a lot of expansion cards in my workstation, but I have absolutely zero interest in rendering, encoding, machine learning or AI bullshit, so the fact that it will be older and slower will be just fine.

This might actually save money in the long run, as upgrading expensive HEDT or workstation components costs a lot of money. Just look at those Threadripper 7000 + motherboard prices. Sheesh.

Anyway, those are my thoughts.

2 Likes

Great summary!

This is an option I have also considered - kind of returning to the dual-system builds of 5-ish years ago used for DAW work, live-streaming games or video production. It covers many bases and keeps the costs down, leaving only the really high-end use cases where high frequency, high core count and high memory bandwidth are all needed at once.

I think we’ll at least get more memory channels in the next desktop generation. AMD already has APUs with quad-channel interfaces, so I can’t believe desktop is supposed to stick with dual-channel.

Thanks for the input, great stuff. For me, right now I have a retrofitted Supermicro 846 4U box with a Gigabyte MZ32-AR0 board, currently hosting an EPYC 7302 and 256GB (half populated) of memory. Not only do I want to build another system to replace an old Dell Xeon Precision workstation, but I might also upgrade the existing 4U with either an EPYC 7543 or EPYC 7763, and might fill in the rest of the DIMMs, etc.

I work with a lot of data in quant system development and quant trading, so I guess I’m looking at another EPYC or Threadripper build, but can’t decide, never mind the chip-generation decision, keeping the pricing in mind. :cold_sweat:

Every time I look into Epyc or Threadripper builds, I get hung up on the same things. The lack of motherboard options. The lack of current CPU features like AVX-512 on older Epyc/TR CPUs. The huge TDP paired with slow per-core performance, which impacts a lot of workloads. The fact that many workloads don’t scale linearly across all cores, while at the same time I don’t actually have any workloads where I need to run multiple sets of data in parallel to saturate the CPUs. The fact that the higher number of PCIe lanes on Epyc/TR is largely useless when many motherboards don’t have PCIe slot configurations good enough to make use of them, unless you move up to E-ATX or other giant-sized builds or some kind of rack server, which I am not interested in. And of course the price points are massively higher.

The end result is that for my own usage, in every situation, I would be paying 3-5x more for a physically larger Epyc/TR system that sucks massive amounts of electricity to do anything, which also implies it’s going to run hotter, and that heat has to go somewhere (my office), while the benefits compared to Ryzen are minuscule.

Ryzen has its own issues, especially on AM5, but I have ended up just sticking with it regardless, because the leap to the next step up is not a good value proposition, for me. Maybe it is for you, though.

Perhaps the most appreciated thing, though, is that all of these support ECC memory. Otherwise I could have simply jumped to Intel.

2 Likes

I’ve actually found the power use of my EPYC 7543 with 512GB of RAM to be quite reasonable for what it does. Here is the output from the power stats page in the BMC:

Now, a typical average of ~300W may seem like a lot, but I also have this thing loaded up pretty heavily with hardware. There are 12x spinning 7200rpm hard drives in there, which on their own use between 65W and 80W (depending on load).

There is a hot and power-hungry 4x 10gig Ethernet adapter on board, an LSI 9305-24i SAS HBA, a 40Gig Intel XL710 NIC, 16x NVMe drives, and probably something else I am forgetting right now.

I’ve never measured the power use without all of the drives and PCIe cards installed, but if I had to guess, I bet the average usage would be down at ~130W or so, which really isn’t bad. In my neck of the woods, that’s less than $250 a year in power, which really does seem reasonable for what you get.
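For what it's worth, the math behind that estimate (the ~$0.20/kWh rate is my stand-in; plug in your local rate):

```python
# Annual electricity cost for a constant average draw.
def annual_cost_usd(avg_watts: float, usd_per_kwh: float) -> float:
    return avg_watts / 1000 * 24 * 365 * usd_per_kwh

print(annual_cost_usd(130, 0.20))  # guessed stripped-down average -> ~$228/yr
print(annual_cost_usd(300, 0.20))  # measured loaded average       -> ~$526/yr
```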

That, and the Supermicro H12SSL-NT motherboard is a regular ATX board. (12" x 9.6")

Here is a pic from when I was testing it before upgrading my server:

Here it is all loaded up:

It certainly uses a lot less power than the aging dual socket Xeon E5 server it replaced.

You could - of course - go for a smaller AM5-based EPYC 4004 series system, but then you are going to have the same limitations from a PCIe perspective as you discuss above. Still, they are a nifty lighter-weight option to have, for sure, and for many of my applications the dual x8 PCIe is more than enough. It’s a bit of a shame that they don’t seem to be planning a Zen 5 AM5 Epyc. Another will be coming with Zen 6, though.

But if you want all of the PCIe expansion, there really aren’t any other options than going bigger.

Though AMD’s Zen 4c (Siena) lineup has some lower-power options, including an 80W 8C/16T model (EPYC 8024PN). And these still keep most of the PCIe lanes of their larger siblings, at 96 lanes (down from 128). They only boost up to about 3GHz, so per-core performance is not going to be as high as a desktop chip, but that comes with the territory. It’s a nice “tweener” option to have.

They really have a dizzying array of EPYC chips now. I stopped paying attention after my last upgrade, when I went with a Milan 7543 (as I usually do, and then I have to furiously read up again when it is time for the next upgrade), and since then they seem to have really cranked up the rate of releases. It doesn’t feel like I bought my Milan Epyc 7543 and H12SSL-NT that long ago, but it is Zen 3, so I guess time flies…

That said, I don’t usually buy this stuff new. The Epyc 7543 and H12SSL-NT were already 2-3 years old when I picked them up. The CPU was used, but the motherboard was new. Brand-new latest-gen server stuff is a little rich for my blood, considering this is just a hobby for me, not a money-maker. I definitely did not pay the $3,761 new price for that CPU :sweat_smile:

My server builds usually last me a long time. It’s similar to my car-buying cadence, actually. I buy something gently used, 2-3 years old and well taken care of, once every 5 years or so, and then rinse and repeat. Wow, I never thought about how close that is to how I buy cars before :sweat_smile:

5 Likes

The ideal ratio of cores to memory channels/bandwidth - the “sweet spot” that avoids bottlenecking the cores with inadequate memory access - varies markedly with the work you are performing. If you are planning on using the system for software development, especially for very large builds (like myself when I built my Threadripper Pro 64-core system), then the recommendations I found (and which worked well for me) were:

  • for 32 cores, 4 channels of memory (i.e. Threadripper);
  • for 64 cores, 8 channels of memory (i.e. Threadripper Pro);
  • if you are going higher than 64 cores, then you may want the 12 channels available on the current generation of Epyc processors (see the quick per-core numbers after this list).
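As a sanity check on that ratio: at DDR5-6000 all three configurations work out to roughly the same peak bandwidth per core, which is presumably the point of scaling channels with cores. A sketch (at DDR5-6000; exact numbers shift with memory speed and real build-system traffic):

```python
# Peak DRAM bandwidth per core for the recommended configurations.
def bw_per_core_gbs(channels: int, mts: int, cores: int) -> float:
    return channels * mts * 8 / 1000 / cores

print(bw_per_core_gbs(4, 6000, 32))   # Threadripper, 32C:      6.0 GB/s per core
print(bw_per_core_gbs(8, 6000, 64))   # Threadripper Pro, 64C:  6.0 GB/s per core
print(bw_per_core_gbs(12, 6000, 96))  # Epyc 12-channel, 96C:   6.0 GB/s per core
```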

Fundamental problems with Threadrippers:

  • They carry a big price premium per core, but lag a generation behind.
  • Similar problem with upgrades - when Zen 6 comes out, how long before the matching TR?
  • They aren’t crazy enough. They are supposed to be absolute no-compromise performance monsters, and yet they have a similar TDP to EPYC. I’d expect to see 800W chips that can rev up like a dragster and turbo on all cores at stock specs. And then be manually overclocked.
  • Same thing with memory. Why doesn’t this thing have fast C/UDIMMs at crazy frequencies and an insanely beefed-up IMC?
  • If you are going to go crazy with TR, why faff around with memory channels? If EPYC can have 12 channels, why would TR have any fewer? TR only makes sense as an extreme weapon of performance, be it single-core or multi-core.
  • PCIe lanes - if EPYC can have a bazillion, why not TR? Surely not for the sake of (now unlimited) power draw. Maybe it’s about IF bandwidth; if so, TR’s internal connectivity would have to be significantly widened for this to make sense, and the TDP constraints are much looser here. If a server needs connectivity, then an insane muscle machine needs artillery - a bunch of BIG GPUs, FPGAs, DSPs, etc.

With all that, I would never use TR as a workstation. But as CPU muscle in my private little cloud, it’d be a potent weapon.

I’d do all my daily work on a frugal little Ryzen 9 CPU, perhaps with a small GPU, and then offload Blender/simulation/whatever sessions to the “muscle”.

Sadly, AMD has pussied out with TR. They made TR a crippled offshoot of EPYC.

2 Likes

This is an incredibly stupid post, no offense. Threadripper is great for gaming. If you can afford a high-end, fast CPU like a Threadripper, you will likely have a high-end GPU and will be targeting modern games with ray tracing and path tracing at 4K 60-120+ fps.

Your GPU will be the bottleneck, and if your GPU is not the bottleneck (you like ancient games) then your FPS will already be so extremely high that any difference does not matter and it’s completely moot. Games have almost nothing to do with your CPU (short of offloading non-render related calculations like simulations or NPC behavior). Gaming is a GPU activity. To see this nonsense repeated here is a bit concerning.

CPU does not matter for gaming, unless you like playing old games at ancient 1080p, where you might see a small difference, and even then it’s irrelevant. If you are serious about gaming, then Threadripper is the superior option, because you actually have the lanes for things like capture cards (basically required for Twitch/YouTube streaming), NVMe SSD RAIDs (which Microsoft updated DirectStorage to support), or just a lot of fast storage in general for your games and streaming VODs. On AM5 you can’t even run more than one SSD without bottlenecking if you have a GPU plugged into an x16 slot.

And that’s not even getting into mGPU being natively supported by DX12 and Vulkan, which performs much better than SLI and is something that many game engines will likely support eventually (with how demanding things like path tracing and VR are). Tim Sweeney has already floated adding it to Unreal Engine.

Almost everything in this post is false. Threadripper is a no-compromises performance monster, which is why the CPUs can be overclocked, unlike Epyc, which cannot be. Obviously AMD is not going to ship them at 600W or 800W out of the box, because they need to support reasonable cooling solutions, but Threadripper 7000 is an overclocking monster. If you have the cooling for it, it’s a free ~30% performance boost. My watercooled 7985WX boosts close to 5GHz all-core and consumes about 800W at peak.

Your Ryzen 9 is a heavily locked down toy in comparison.

1 Like

Honestly, you don’t have a clue what you are talking about.

I don’t disagree, but primarily in regard to the fewer PCIe lanes, fewer memory channels and fewer cores.

The 7000 series Threadrippers are monster workstation CPUs, and they do that better than any other CPU out there, but several things hold them back when it comes to games.

Among them:
1.) Registered-only RAM gives it a memory latency penalty, which most of the time isn’t significant but happens to make a big difference in games. This is compounded by the fact that registered RAM is usually not available at clock speeds as high as non-ECC UDIMMs. The Threadripper 7000 has a huge amount of memory bandwidth, but games tend to care less about bandwidth (within reason) than they do about latency.

2.) The multi-CCD implementation does not have the same level of optimization and core-parking features as the many-core consumer X3D parts, so it takes pretty bad cross-CCD performance hits and cache thrashing. AMD finally sorted out their multi-CCD penalties for the Ryzen 9000 series consumer multi-CCD X3D chips (9950X3D and 9900X3D) through updated platform drivers (this was a constant thorn in the side of owners of the previous-generation 7950X3D and 7900X3D). This, combined with the CPU optimization profiles embedded in Microsoft’s Game Bar, means there is little to no penalty from multi-CCD anymore. But because Threadripper is now a workstation platform, they haven’t bothered doing this for Threadrippers.

To be clear, this wasn’t the case back with the Threadripper 3000 series. My Threadripper 3960x performs very similarly in games to the similarly clocked consumer Ryzen 7 3800x. It has more cores, more PCIe lanes and more memory channels, but those don’t really make a difference in lightly threaded loads like games.

But things changed after the Threadripper 3000. The 5000 series was only released as Threadripper Pro SKUs, and has the same kinds of performance penalties in games and some other consumer workloads as the Threadripper 7000 series. The 7000 series exists in a non-Pro SKU, but that is in name only. It is still a workstation CPU.

But the numbers speak for themselves. Every single review of the Threadripper 7000 series that tested games found that it performed worse than a budget Ryzen 5 7600.

There was a time when you could buy a more expensive “higher than consumer level” CPU and platform, and it would dominate the consumer chips in almost everything and perform equivalently in the rest. Intel called that HEDT. They sold it as their X58, X79, X99 and X299 platforms. It existed between consumer and workstation in the product lineup.

These CPUs had some workstation-like features (more CPU cores, more PCIe lanes, more memory channels) but were still optimized like consumer chips, not taking the “stability above all else” approach that server and true workstation platforms did. They also didn’t require either registered or ECC RAM, keeping their memory latency nice and low.

Unfortunately, Intel’s X299 (2017) and AMD’s Threadripper 3000 (2019) were the end of HEDT. These days you have to choose consumer or workstation. There is no in-between, best-of-both-worlds solution like we used to have.

It’s dead, gone and buried.

Threadripper 7000 is indeed an impressive platform for when you need to throw cores at a problem, but it suffers in game workloads due to its optimization for stability and lack of CCD optimization for games, and its high-latency buffered/registered RAM requirement takes it down yet another notch.

The analogy is a very heavy and capable tractor trailer vs. a light and nimble sportbike. Both excel at what they are intended for, but they are not intended for the same thing.

I think you are stuck in the past.

It used to be the case that games were almost always GPU limited, at least unless you were using a very old CPU, but that hasn’t been the case for quite some time now.

Two things have changed.

1.) The younger gaming population has higher framerate expectations than we did 20-25 years ago, when the good old “60fps” acceptability rule of thumb was coined. They want to see at least 90-120fps, or the game feels “laggy” to them. It is very common for modern gamers to play at minimum settings to make games as responsive as possible, which shifts the load away from the GPU to the CPU.

2.) Many recent titles (for example Starfield and S.T.A.L.K.E.R. 2) have been very CPU dependent, even when just trying to achieve the framerates we used to expect.

Here are the frame rates that S.T.A.L.K.E.R 2 is CPU limited at:

There isn’t a CPU in existence that can achieve what I normally consider “acceptable minimum framerates”, never dipping below 60. The closest is the 9800x3d (and probably the 9950x3d, though I don’t have those numbers).

I tried playing it on my Threadripper 3960x when the game first released late last year. In the prologue I was CPU limited at between 45-52fps average, with dips that went way, way lower. And it only got worse as I got into the game proper. I stopped and decided to wait until I upgraded.

My GPU was a 4090. I could turn down the resolution, minimize graphics settings or enable upscaling, and nothing made a difference. It was maxed out at 45-52fps due to being CPU limited.

Articles at the time were suggesting “It doesn’t matter, just enable frame generation”. I rolled my eyes and tried that. Sure, it looked way smoother, but the input lag was still horrendous. The mouse feel was as if I was playing at 30fps.

To be fair, a 7000 series Threadripper would perform a bit better than my 3960x, but not enough to make it playable. At least not by my standards.

But don’t take my word for it. Take it from professional game reviewers:

(I included the 1480-second timestamp, taking you directly to the game section of the review.)

It’s been a long time since the CPU was mostly irrelevant to game performance. Probably over a decade at this point.

TLDR version?

The 7000 series Threadrippers are absolutely brilliant in workstation/productivity/scientific/rendering/encoding/etc. loads. Nothing comes even close.

In games - however - they are middling at best. In many titles this may not matter since - as you suggest - you are likely going to be GPU limited anyway. But there is a growing number of titles where CPU performance very much does matter. I guess you are fine if you aren’t interested in those titles.

The era of “one platform to rule them all”, “no holds barred, best at everything” ended in 2019. There is no product like that anymore.

Now you have to choose. Will it excel at consumer, or will it excel at workstation?

Eh, if your workloads shard, it makes more sense to buy two or three 9900X[3D]s or 9950X[3D]s rather than a 7960X or 7975WX or something. More cores, more lanes, quite possibly more DDR, and probably with a latency edge. Shimada Peak’ll help reduce the disadvantage in core performance but probably won’t do much about non-Pro’s tendency towards a disadvantage in cores per channel, though if it turns out DDR5-6400 1:1 is supported, that’d be a bit of a mitigation.

Storm Peak benched slower than Raphael on a goodly number of workloads, and it’s not looking like Shimada Peak’ll change that with respect to Granite Ridge. Given how much can be done with 96-256GB in dual channel and 16-32 threads, it’s hard for me not to feel like Threadripper’s mostly a solution in search of a problem. There are niches where it’s the right compute shape and thus very useful, but none of the workstation/productivity/scientific/rendering/encoding stuff I do gets enough of a speedup for the high cost and all the hassle to be worth it.

Right now I’ve effectively got a 9950X + 2x9900X cluster and can’t write code or set up data fast enough to keep the 9950X busy, much less either 9900X. There’s occasional work modes where I’ve something to hand off but even the 9950X spends most nights and weekends sleeping. Looking ahead I don’t see this changing, if we get more compute limited it’ll probably be in dGPU bound workflows.

I got nothing. ^^

I see the 9900X tie the 9950X core for core, so there aren’t many real-world workloads where the 9950X is much of an advantage. I’d hoped to get a look at how our stuff interacts with V-Cache, but the 9950X3D was total unobtainium when I built the 9950X. It’s only just starting to drop below MSRP here.

I wish I could get my hands on a TR to see if I can figure out WHY the performance is oddly poor. It doesn’t make a lot of sense. As I mentioned in the other thread, the windows scheduler has a lot of issues, many of which can be worked around or fixed with simple settings changes.

I see the same basic problems you’ve mentioned on AM5. In my workloads with partially active cores it looks like 11 is a substantial downgrade with respect to 10 in terms of maintaining core affinity.

Unfortunately I’ve had too many other things going on at work to free up a couple days to dig into it. Some of our in house benches with continuously running threads also show degradation on 11.

The affinity issues happen on multi-CCD AM5. However, AM5 is much, much less susceptible to the power limit issues related to parking. It has a much higher W/Core ratio than the high end TR.

Everything in this post is incorrect. That you are cherry-picking ancient 1080p benchmarks only shows that you A) have no idea what you’re talking about and B) are making my point for me. To the extent that computation does matter in gaming, Threadripper is better, because it’s fundamentally threaded work. Another example of this is the recent render-thread parallelization work in Unreal Engine 5.4 and 5.6.

A system drawing frames faster in a non-GPU-limited scenario, based on things like lower latency due to core layout or more L3 cache, has nothing to do with CPU performance and is completely useless, because if your GPU is not the bottleneck then your FPS is going to be so high that the differences do not matter.

You cannot show benchmarks at modern resolutions (4K and higher) with recent games, because you will find no statistically significant difference. Meanwhile, with a weak low-end CPU like a 9800x3D, everything you do on the computer will be 300% slower, and you will have no connectivity for serious gamer essentials like capture cards.

Not only that, but it’s doubly silly, because with a Threadripper, if you want, you can just disable the other CCDs while still maintaining the superior overclocking capabilities of Threadripper. This will reduce latency.

Anyone buying a low-end CPU “for gaming” - when the CPU has little to nothing to do with gaming, and when there is no difference in modern titles at modern resolutions - is not only gimping themselves, but has no idea what they’re talking about, to the point of computer illiteracy.

1 Like

I’d figure single-CCD as well. Even if state’s not evicted from L3 while a thread blocks, moving the thread between cores on the same die still costs a context save and restore, L1 and L2 refill, an L3 slice shift, and maybe a core power-state transition. There’s probably also a cost to switching a scheduled thread between a core’s two hardware threads, which 11 doesn’t seem to be stable about either.

I don’t know if there’s any way to get scheduler diagnostics at the needed level of granularity but one possibility which fits the observational data I have is 11 systematically underestimates relocation costs while overestimating the benefits of following core performance order.

I’d expect that the larger the power transitions, the greater the amount of IPC throttling required. What I see is that 11 tends to be most performant at single thread, at thread counts matching a single CCD’s cores (so six or eight threads), or at thread counts using all cores (12 or 16). Even then, it’s often all over the place as to which logical processors the threads land on, which might avoid outright parking but still implies constant power transitions.

One basic test I’ve been thinking to run is to step through workloads from a single thread up to the number of cores and compare what happens with threads unpinned, pinned to physical cores, and pinned to logical cores. It’s not hard, but writing the new task modules, changing the workloads to use them, and running everything will take some time. In some cases, thread-type permutations with respect to the core performance order are called for.
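For what it's worth, a minimal sketch of that sweep in Python with psutil. It uses processes instead of threads to sidestep the GIL, a toy busy-loop as a stand-in for the real task modules, and assumes Windows-style numbering where logical CPUs 2n and 2n+1 are the two hardware threads of physical core n - all of which are my assumptions, not anything from this thread:

```python
import multiprocessing as mp
import time

import psutil


def spin(iterations: int) -> None:
    # Toy CPU-bound worker; a stand-in for a real task module.
    x = 0
    for _ in range(iterations):
        x += 1


def run(n_workers: int, pin: str, iterations: int = 50_000_000) -> float:
    """Time n_workers completing fixed work, unpinned or pinned."""
    procs = []
    start = time.perf_counter()
    for i in range(n_workers):
        p = mp.Process(target=spin, args=(iterations,))
        p.start()
        if pin == "physical":
            # One worker per physical core: logical CPUs 0, 2, 4, ...
            # (assumes SMT siblings are adjacent logical CPUs).
            psutil.Process(p.pid).cpu_affinity([2 * i])
        elif pin == "logical":
            # Pack logical CPUs in order, filling SMT siblings first.
            psutil.Process(p.pid).cpu_affinity([i])
        procs.append(p)
    for p in procs:
        p.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    cores = psutil.cpu_count(logical=False)
    for n in range(1, cores + 1):
        for mode in ("unpinned", "physical", "logical"):
            print(f"{n:2d} workers, {mode:9s}: {run(n, mode):6.2f}s")
```

Pinning happens just after each process starts, so there's a brief unpinned window; good enough for coarse comparisons, and the unpinned runs should show whatever migration behavior the scheduler gets up to.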