AMD Threadripper 3970X under heavy AVX2 load: Defective design? (No, but there is an issue)

@Lt.Broccoli

Great idea, will do now!

Just to be clear: here are the settings I chose and the sample output. Is this what you’re trying to run?

No issues so far on my 3970X + ASRock TRX40 Creator. This board has “SRC Spread Spectrum” disabled by default, with the claim that enabling it reduces stability.

[Image: screenshot of the chosen settings and sample output]

These are the correct settings indeed.

How long have you been running the test?

I think it’s also disabled by default on my motherboard but I’ll need to check next time I reboot.

I let it run for > 20 minutes.

Be aware that some spread spectrum settings can be hidden behind other options. For example, on the Asus motherboard I’m using, I couldn’t disable VRM spread spectrum without first disabling VRM power-saving – but disabling spread-spectrum is what stopped the instability.

Thanks for posting. Please have a look at my thread, where I posted an update: every 4 ms the VRM misses a switching cycle and actively discharges Vcore, so a voltage dip occurs. Oscilloscope screenshot here:
https://forum.level1techs.com/uploads/default/original/4X/e/9/8/e98317514c2e8f565a8c6c76ce078779581b98bb.png
My comments on it in this post:

Well, I think the fix can be done via a BIOS update. I am running a 3960X with an Aorus Master. On the F4h BIOS I could run the test for about 10 minutes before failure. When I upgraded to the F5a BIOS, the test fails immediately, with spread spectrum enabled or disabled.

I am running Prime95 v29.8 on Linux (Siduction, based on Debian sid). The F4h BIOS was missing the spread spectrum settings.

Well, as I recall, Gigabyte had a bug in, I believe, the F4 BIOS on the Master. It caused the VRM to run abnormally warm, even though it has an active cooling fan. The weird thing is that the Extreme did not seem to have that issue, even though the VRMs are physically the same. I believe Gigabyte fixed the problem with the F5 BIOS on the Master, but never really stated what the actual issue was.

Still, I kind of doubt that the VRM is the culprit when it comes to failures under heavy AVX2 load. I wonder whether this issue might instead be related to the CPU actually hitting its power limit.

Hi,

Sorry you are having issues. Can you reach out to me at [email protected] so we can get you connected with some folks over here to help?

Thanks-
Drew

Hey everyone, we (mods) are in the process of confirming @dprairie_AMD’s identity. We advise holding off on reaching out to him in the meantime.

Assuming you’re legit, welcome @dprairie_AMD. We’re happy to have an AMD rep among us.

I sure hope that is a real AMD rep! This is a serious issue and I hope it will be fixed in a future BIOS or AGESA update. My entire workload depends on AVX512, and I just plunked down $5000+ on a 3970X build. I haven’t put it together yet, but I’m dreading having it crash or, worse, output incorrect calculations.

Are you trolling? Your whole workload depends on AVX512, but you bought a CPU that doesn’t support AVX512, never did, and was never advertised as supporting it.

Sorry, I meant the smaller AVX; I believe 256-bit?

I’m not really sure about the technical naming here, I just know my software uses AVX.
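For anyone else unsure which AVX variants their CPU actually reports, here is a minimal Python sketch, assuming Linux on x86 where `/proc/cpuinfo` has a `flags` line. The helper names `cpu_flags` and `avx_support` are made up for this example; `avx`, `avx2` and `avx512f` are the flag names the kernel uses.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (e.g. not an x86 Linux system)


def avx_support(flags: set[str]) -> dict[str, bool]:
    """Map each AVX generation to whether its kernel flag is present."""
    return {
        "AVX": "avx" in flags,
        "AVX2": "avx2" in flags,
        "AVX-512": "avx512f" in flags,  # avx512f is the AVX-512 foundation flag
    }


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        for name, present in avx_support(cpu_flags(path.read_text())).items():
            print(f"{name}: {'yes' if present else 'no'}")
```

This only shows what the CPU advertises; whether a given program uses those instructions is a separate question.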

I’ve tried disabling Spread Spectrum in my BIOS (it was set to Default, whatever that means) but it didn’t help. Prime95 with min/max FFT sizes set to 16K still fails instantly (within a fraction of a second).

A quick LinkedIn search reveals that AMD’s director of corporate communications is called Andrew Prairie, usually referred to as Drew.
The mail address in his post also seems pretty legit, seeing as it is listed in several official AMD documents.

He’s legit. We got in touch with him and AMD is currently trying to repro the problem on their end. Keeping you posted.

Yes confirmed.

You mean AVX2?

If I want to test this under Linux (on EPYC), which program and settings should I use? mprime?

I used option 16 and then option 2 for most of my testing, on all available threads. mprime from the CLI is fine, but I tested on both Windows and Linux.
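Since people in this thread are comparing how long the test survives (instantly, 10 minutes, more than 20 minutes), a rough Python sketch for timing the first error in the torture test output might help. The function name `first_failure` and the marker strings in `ERROR_MARKERS` are assumptions, not taken from this thread, so adjust them to whatever your mprime version prints.

```python
import time
from typing import Callable, Iterable, Optional, Tuple

# Assumed substrings that mark a failed test; check them against your own output.
ERROR_MARKERS = ("FATAL ERROR", "Hardware failure detected")


def first_failure(
    lines: Iterable[str],
    markers: Tuple[str, ...] = ERROR_MARKERS,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[Tuple[float, str]]:
    """Scan output lines and return (seconds elapsed, line) for the first error.

    Returns None if the stream ends without any marker appearing.
    """
    start = clock()
    for line in lines:
        if any(marker in line for marker in markers):
            return clock() - start, line.rstrip()
    return None
```

The `lines` argument can be any iterable of text lines, for example the `stdout` of a `subprocess.Popen` started with `stdout=subprocess.PIPE, text=True`. I believe `mprime -t` starts the torture test directly, but treat that switch as an assumption and check the readme that ships with your version.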