Wendell, can you explain these PyPrime results?

@wendell There is one thing that you could help me with, because I am stumped for an explanation.

ScatterBencher made a video on overclocking the 9700X, so I decided to see what I could get out of my 9950X if I disabled one CCD.

Now ScatterBencher was doing the whole PBO + CO thing whereas I was configuring my 9950X my way, which admittedly gives a tad less Single Core performance, but a lot more Multi Core performance.

Be that as it may, ScatterBencher ran PyPrime, which is a Single Core workload that runs at RealTime priority, and his CPU should have been boosting to about 5.7 GHz.

My CPU was confined to a max single core clockspeed of 5.55 GHz.

So here was his result (lower is better) with his stock result in blue and his overclocked result in green:

I ran the benchmark and was gobsmacked when I saw the following and I have absolutely no way of explaining it:

It gets even weirder though. When I went back to 16 Cores/32 Threads with my 9950X where it runs at a max clockspeed of 5.375 GHz I got the following result:

This is still a far better result than ScatterBencher managed to achieve.

I don’t know if ScatterBencher was running Windows 11 or Windows 10.

My results are from Windows 10.

PyPrime is basically a memory benchmark, so the most apt comparison is between how you have your memory tuned and the memory tuning in the ScatterBencher guide.

When Pieter OC’d the memory in that same article he got this:

It’s running close enough to his.

I decided to do the experiment with the Hynix 32 GB RAM I have with faster timings, and you appear to be right, PyPrime results are better.

So I can use PyPrime as CineBench for my RAM.

Thank you for your reply, because I was totally stumped. :grin:

I only run PyPrime at the default size, so I have no idea whether your score is good or not.

Are you on the latest AGESA? Is CCD to CCD latency fixed?

I run PyPrime on default.

I’m on AGESA 1.2.0.1a, which now enables me to configure my 9950X completely in the BIOS.

This was not the case with AGESA 1.2.0.0a Patch A.

With regard to the latency, I have no idea, because I don’t have access to the tool Anandtech was using.

So it is a case of “Trust me bro” on the part of Anandtech, and considering the Tech Media’s track record when it comes to Ryzen over the past five years - and calling it pathetic would be a compliment - I wouldn’t put much stock in the results; especially considering that they were running on WinTel 11.

Try PyPrime 2B size. A tuned Zen 4 should be less than 10s.

My 9950X is running at 5.375 GHz for CCD0 and 5.3 GHz for CCD1, which gives me the following CineBench 2024 score:

Here are my stable memory timings:

I decided to play around with PyPrime and got the following results:

6200 MT/s UCLK=MEMCLK FCLK 2000

6200 MT/s UCLK=MEMCLK/2 FCLK 2000

6200 MT/s UCLK=MEMCLK FCLK 2167

6400 MT/s (not stable) UCLK=MEMCLK FCLK 2000

6400 MT/s (not stable) UCLK=MEMCLK/2 FCLK 2000

Try 6200/2066. It might have better latencies since FCLK is matched 2:3 to the memory speed.
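To spell out the arithmetic behind that suggestion, here is a minimal sketch. It assumes the usual DDR convention that MEMCLK is half the MT/s rating; the function names are made up for illustration and are not from any tool.

```python
def memclk_mhz(data_rate_mts: float) -> float:
    """DDR transfers data twice per clock, so MEMCLK is half the MT/s rating."""
    return data_rate_mts / 2


def matched_fclk_mhz(data_rate_mts: float, ratio=(2, 3)) -> float:
    """FCLK that sits at the given FCLK:MEMCLK ratio (2:3 by default)."""
    num, den = ratio
    return memclk_mhz(data_rate_mts) * num / den


if __name__ == "__main__":
    data_rate = 6200  # MT/s, as used in the posts above
    target = matched_fclk_mhz(data_rate)
    print(f"MEMCLK: {memclk_mhz(data_rate):.0f} MHz, 2:3 FCLK target: {target:.1f} MHz")

    # FCLK values mentioned in the thread, compared against the 2:3 target
    for fclk in (2000, 2066, 2167):
        print(f"FCLK {fclk}: {fclk - target:+.1f} MHz from the 2:3 point")
```

At 6200 MT/s, MEMCLK is 3100 MHz, so the 2:3 point lands at roughly 2067 MHz, which is why 2066 is the setting to try. Whether the matched ratio actually improves latency on a given chip is something only testing will show.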

6200 MT/s UCLK=MEMCLK FCLK 2167

This one does not make sense to me. It should not be so slow. Probably it is unstable. On Zen 4 (and I assume Zen 5), the Infinity Fabric has error correction mechanisms that slow the CPU down when the fabric is not fully stable.

Possibly you can gain some performance in the secondary/tertiary timings, but tuning those can be very tedious. You could try the infamous Buildzoid timings

or this sheet with suggested/calculated timings

But the most important timings are tREFI and tRFC, which you have already tuned.

If 6200 is stable at VSOC 1.2, and 6400 UCLK=MCLK boots, you could probably get 6400 stable with somewhat higher VSOC. Just hypothetically of course :crazy_face:

PyPrime is single-threaded AFAIK, so higher single-core clocks might help a bit, but it is mostly a memory latency benchmark. y-cruncher is more bandwidth sensitive, if you want to cover that too.

I agree with you, and with my 7950X, setting FCLK to 2167 gave me more performance.

With my 9950X it results in lower performance.

I will give an FCLK of 2066 a go, to see if that results in better performance.