Intel CPU performance oscillates, while AMD CPUs are very stable?

It’s probably too vague a question, but I can’t google anything even remotely similar and don’t know how to narrow it down.

So, I have a bunch of Linux systems that are loaded in a repeating pattern: 80ms of CPU compute, 5ms of transmitting results, then 15ms of idling before the next command arrives.

That’s 80% all-core load, 20% idle.

On AMD CPUs (2600, 3600, 5950X, 3975WX), the compute part takes 80ms, plus or minus 2. It’s rock solid.

On Intel CPUs (8700K, 10900K, 12900K), the compute part takes anywhere from 80ms to 200ms. It’s all over the place.
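For anyone who wants to reproduce the pattern, here is a minimal sketch of a duty-cycle timing loop. It is not the actual workload: `busy_work` is a hypothetical stand-in for the compute step, it runs on a single thread rather than all cores, and the 100ms period is taken from the 80/5/15 split described above.

```python
import time

def busy_work(iterations=200_000):
    """Hypothetical stand-in for the real compute step: a fixed amount of CPU work."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total

def run_duty_cycle(work, cycles=50, period_s=0.100):
    """Run `work` once per period, idle for the rest, and return each run's duration in ms."""
    durations_ms = []
    for _ in range(cycles):
        start = time.perf_counter()
        work()
        elapsed = time.perf_counter() - start
        durations_ms.append(elapsed * 1000.0)
        # Idle for whatever is left of the period, like the gap before the next command.
        time.sleep(max(0.0, period_s - elapsed))
    return durations_ms

def spread(durations_ms):
    """Return (min, median, max); for even counts the upper median is used."""
    ordered = sorted(durations_ms)
    return ordered[0], ordered[len(ordered) // 2], ordered[-1]
```

Calling `spread(run_duty_cycle(busy_work))` on each machine gives a quick min/median/max to compare. The iteration count would need tuning so one run takes roughly 80ms on the machine in question.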

Anyone have any idea why this might be?

I looked at frequencies, and both have them jumping around more or less the same.

I tried turning boost off, but that had no effect.

Locking the CPU frequency does fix it, but is a rather suboptimal and crutchy workaround.

No idea where else to look.

Is this a time-sensitive operation? I can’t say I know why this is happening, but I’m curious about the workload, as that may enlighten.

Intel tries to do a lot of strange things, in my experience, with regard to frequencies and optimization.

Do the cores turbo up during the compute portion?

Some of these CPU scaling governors are tunable: https://www.kernel.org/doc/html/latest/admin-guide/pm/cpufreq.html

As @SgtAwesomesauce is strongly hinting at, step 1 should be to gather and log CPU performance data together with the timing information you’re already logging.
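A rough sketch of what that logging could look like, assuming the standard Linux cpufreq sysfs layout (`cpuN/cpufreq/scaling_cur_freq`, values in kHz). The function names and the `base` parameter are made up for illustration, and `work` stands for whatever the compute step is.

```python
import glob
import os
import time

SYSFS_CPU = "/sys/devices/system/cpu"  # standard Linux location; overridable for testing

def read_freqs_khz(base=SYSFS_CPU):
    """Return {cpu_name: current frequency in kHz} for every CPU exposing cpufreq."""
    freqs = {}
    pattern = os.path.join(base, "cpu[0-9]*", "cpufreq", "scaling_cur_freq")
    for path in glob.glob(pattern):
        cpu = path.split(os.sep)[-3]  # the "cpuN" directory name
        try:
            with open(path) as f:
                freqs[cpu] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # CPU offline or file unreadable; skip it
    return freqs

def timed_with_freqs(work, base=SYSFS_CPU):
    """Run `work` and return (duration_ms, freqs_before, freqs_after)."""
    before = read_freqs_khz(base)
    start = time.perf_counter()
    work()
    duration_ms = (time.perf_counter() - start) * 1000.0
    after = read_freqs_khz(base)
    return duration_ms, before, after
```

Sampling only before and after each run is coarse, so a frequency dip in the middle of the 80ms would be missed. It should still be enough to see whether slow runs start from a low clock.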


It’s a matter of entertaining a what if.
I’ve been playing with making a software raytracing renderer for one of my games, and at higher settings it was giving me 3 FPS on a 32-core Threadripper, which is not particularly playable. So I wondered what would happen if I tried to distribute the rendering over all the computers I can get my hands on within reasonable ping.
It worked, and got it up to 8 FPS… Which is still kinda choppy, but getting close to the edge of being playable.
So yeah, it has to be realtime.

Yeah, both kinds of CPUs turbo up and clock down many times a second.

Ok, now that’s a clue.
All the Intel CPUs use intel_pstate, while the AMD ones use acpi-cpufreq.
Adding the intel_pstate=disable kernel parameter to the former makes them use acpi-cpufreq as well, and that solves the problem.
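To confirm which driver a given box is actually using, something like the sketch below should work. It assumes the usual sysfs layout where each `policyN` directory under `/sys/devices/system/cpu/cpufreq` has `scaling_driver` and `scaling_governor` files. The function name is made up for illustration.

```python
import glob
import os

def cpufreq_policies(base="/sys/devices/system/cpu/cpufreq"):
    """Return {policy_name: (scaling_driver, scaling_governor)} for each cpufreq policy."""
    result = {}
    for policy_dir in sorted(glob.glob(os.path.join(base, "policy[0-9]*"))):
        values = []
        for name in ("scaling_driver", "scaling_governor"):
            try:
                with open(os.path.join(policy_dir, name)) as f:
                    values.append(f.read().strip())
            except OSError:
                values.append(None)  # file missing or unreadable on this system
        result[os.path.basename(policy_dir)] = tuple(values)
    return result
```

On the machines in this thread, the expectation would be `intel_pstate` on the Intel boxes by default and `acpi-cpufreq` on the AMD ones, or on Intel after booting with `intel_pstate=disable`.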

One remaining snag is that on everything but the 12900K this also disables boosting.

So now the question is how do you get intel_pstate to be stable or get acpi-cpufreq to boost?
The slow runs did coincide with frequency drops on intel_pstate, so I suspect it’s just not as responsive, at least with default settings…
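One place to start looking is the intel_pstate sysfs directory. The kernel documentation describes `status`, `no_turbo`, `min_perf_pct` and `max_perf_pct` files there. The sketch below only reads them, the function name is made up, and whether changing any of them (for example raising `min_perf_pct`, or booting with `intel_pstate=passive`) actually helps with this workload is an untested assumption.

```python
import os

PSTATE_FILES = ("status", "no_turbo", "min_perf_pct", "max_perf_pct")

def read_intel_pstate(base="/sys/devices/system/cpu/intel_pstate"):
    """Return {file_name: contents} for the intel_pstate files that exist and are readable."""
    settings = {}
    for name in PSTATE_FILES:
        try:
            with open(os.path.join(base, name)) as f:
                settings[name] = f.read().strip()
        except OSError:
            continue  # not present, e.g. driver not loaded or older kernel
    return settings
```

An empty result would mean the directory is absent, which is what I’d expect on the AMD machines or after `intel_pstate=disable`.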

AMD has recently added their own p-state driver to the Linux kernel for Zen 2/Zen 3 CPUs. I’m wondering how your newer CPUs perform with that enabled, and whether they become more Intel-like.

That’d be this one patchwork coming in 5.17?

It’s a remarkably short piece of code, I don’t know why I was expecting something more complicated.

There’s also this zenpower3, which seems to be mostly about sensors IIUC; that’s not what you meant?

It’s not in a released kernel yet; AMD (and Valve, for their AMD-powered Steam Deck) are working on it, though.