AMD Threadripper 3970X under heavy AVX2 load: Defective design? (No, but there is an issue)

Well, I ran the test that had been failing, this time on BIOS F5C (which contains AGESA 1.0.0.3B), and it ran for 20+ minutes without any errors.

OS: Siduction (Debian sid) x86_64
Kernel: 5.5.6-towo.1-siduction-amd64
Packages: 3239 (dpkg), 26 (flatpak)
Shell: zsh 5.8
Resolution: 3840x2160
DE: GNOME 3.34.4
WM: Mutter
WM Theme: Plata-Noir-Compact
Theme: Plata-Compact [GTK2/3]
Icons: Flat-Remix-Red-Dark [GTK2/3]
Terminal: tilix
CPU: Ryzen Threadripper 3960X 24- (48) @ 3.8GHz
GPU: NVIDIA GeForce GTX 970
GPU: NVIDIA GeForce GTX 1080
Memory: 7194MiB / 64254MiB
GPU Driver: NVIDIA 440.64
Motherboard: Aorus Master

Prime95 build: Linux64,Prime95,v29.8,build 6

I don’t see F5D. F4, I think, is still on the Gigabyte site, but I was running F4H previously.

That’s for the Aorus Xtreme, sorry.

No worries. On the Master, the F5C BIOS seems to resolve the issue, at least for me.

Settings: XMP enabled @ 3600, Infinity Fabric set to 1800, SVM enabled, IOMMU enabled, PBO enabled, XFR enabled with the Auto Scalar setting, and Spread Spectrum set to Auto.
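
For anyone wondering why 1800 goes with 3600 in the settings above: DDR memory transfers data twice per clock, so a DDR4-3600 kit runs at an actual memory clock of 1800 MHz, and setting the Infinity Fabric clock to the same value keeps the two running 1:1. A minimal sketch of that arithmetic follows; the function names are made up for illustration and nothing here reads real hardware.

```python
def memclk_mhz(ddr_rating_mts: int) -> float:
    """Return the real memory clock in MHz for a DDR rating in MT/s.

    DDR transfers data on both clock edges, so the clock is half the rating.
    """
    return ddr_rating_mts / 2


def fabric_is_coupled(ddr_rating_mts: int, fclk_mhz: int) -> bool:
    """True when the fabric clock equals the memory clock (1:1 ratio)."""
    return memclk_mhz(ddr_rating_mts) == fclk_mhz


if __name__ == "__main__":
    # Values taken from the settings listed in the post above.
    print(memclk_mhz(3600))
    print(fabric_is_coupled(3600, 1800))
```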

They apparently fixed it. You should see if it works for you now. :)

OP here: Yes, the latest BIOS from GIGABYTE should fix the issue with Prime95.

Hmm… solve one problem, find another. With the latest BIOS, Windows To Go hard-locks at the login screen. Rebooting then leaves the system unbootable until I clear the CMOS. This happens with and without XMP, even with the optimized defaults.

I use Windows To Go to set my RAM and RGB Fusion colors.

Does this happen with a clean install of Windows?

Haven’t tried it; I am trying to avoid a dual-boot setup. Windows To Go is like a persistent live Linux.

I’ve plugged the Windows To Go drive into multiple systems without issue.

It might be time to get a new ISO for Windows To Go, or just do a fresh install of it using the TR system as the starting machine.

Anybody else have Windows instability?
Seeing as no one else has posted any complaints, it seems like the problem is unique to you as of now.

I am also running Windows To Go, off a Samsung T5 external drive. Microsoft isn’t developing Windows To Go anymore. I think it’s just unique to me. I ended up using OpenRGB in Linux to set my RAM and motherboard colors.

An update on our investigation with AMD: [SOLVED] 3970X - Prime95 stability?

It would still be nice if ASRock and MSI boards could also be tested for this issue. But there seem to be enough complaints for AMD to take a close look at it, and they seem to have figured out the issue with Gigabyte boards.

So what was the fix in the end? I don’t see it in the thread here.

Was it just a power stage problem with the motherboard specifically?

@Jimster480 I’m preparing a conclusion post.

To add some information for anyone searching for it…

I had issues on one core with the 16k AVX2 Prime95 load when running the CPU at stock. Running PBO or auto OC made the problem go away.

I updated my BIOS (Zenith II Extreme Alpha) to 0902 last night, and it’s completely stable at stock now.

EDIT:
Looks like that BIOS update included “02. [Q][E] Update CastlePeakPI1.0.0.3 Patch B”

I apologize for the lack of recent updates on this topic. Obviously the current health crisis has not sped up the process.

On March 7th I wrote in this thread that the latest BIOS from GIGABYTE should fix the issue with Prime95.

It turns out that is not the case, at least on my system. I recently switched back to the GIGABYTE TRX40 Aorus Xtreme motherboard (after a few weeks on the ASUS Zenith II Extreme Alpha), and GIGABYTE’s latest BIOS version (“F4d”, AGESA 1.0.0.3 B) does not fix the instability under Prime95.

As can be witnessed in this thread, AMD has been extremely responsive and helpful. They do have a fix for the instability that works on my system, but either GIGABYTE screwed up when merging it into their F4d BIOS, or they introduced another issue.

That’s the current situation. At this point, and per my current understanding of the situation, I believe the pressure should be put on GIGABYTE, not AMD.

Ideally GIGABYTE would wake up, get in touch with us and join our conversation with AMD. Unfortunately there’s no sign that they’re willing to do that.

I’m personally done with switching motherboards. I’ve spent far too much time on this issue.

(I’m marking this topic as unsolved again.)

Hello @FranzB, I saw this thread and, since I have a 3960X with an Aorus Master board, I went ahead and tested Prime95.

I currently can’t reproduce any issue with Prime95 under Windows 10 or Server 2019 (I have a dual-boot system). My BIOS version is F5c.

However, I tried a live Ubuntu Linux 20.04 and downloaded Prime95. I started the torture test, but once all threads have started I get a “killed” message in the terminal window. It doesn’t seem to be related to the issue you have; it may instead be an issue with using the Ubuntu live session.

Unfortunately, Ubuntu 20.04 ships with kernel 5.4, and only the newer kernel 5.6 fully supports the Ryzen 3000 series power states and PBO. I can see that Ubuntu currently defaults to the lower P-state (2,200 MHz), so I decided not to install Ubuntu on one of my hard drives. I can also see a lot of ACPI errors under Linux on boot. (Oddly, it looks like the Ubuntu support declared on AMD’s CPU page is not accurate.)
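
As a rough way to check the two things mentioned above on a given Linux box (the kernel version and the clock the CPU is currently running at), here is a minimal Python sketch. The 5.6 threshold is simply the figure quoted in the post, not something verified here. The function names are made up for illustration, and the sysfs path assumes the kernel exposes the standard cpufreq interface; if it does not, the frequency helper just returns `None`.

```python
import platform
import re
from pathlib import Path


def parse_kernel_release(release: str) -> tuple:
    """Extract (major, minor) from a release string such as '5.4.0-26-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release: str, minimum: tuple = (5, 6)) -> bool:
    """Compare the parsed release against a (major, minor) threshold.

    The default of 5.6 is an assumption taken from the post above.
    """
    return parse_kernel_release(release) >= minimum


def current_freq_mhz(cpu: int = 0):
    """Return the current frequency of one CPU in MHz, or None if unavailable."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_cur_freq")
    try:
        # The sysfs value is reported in kHz.
        return int(path.read_text().strip()) / 1000.0
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    release = platform.release()
    print("kernel:", release, "| meets threshold:", kernel_at_least(release))
    print("cpu0 MHz:", current_freq_mhz(0))
```

If the reported frequency stays near the base P-state while a load is running, that would be consistent with what is described above, though it does not by itself say why.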

I had the same issue on my TRX40 Designare: Prime95 seemed to work on Windows and was dying like that on Linux.
This was fixed by the latest firmware, F4C I think.
EDIT: maybe not the same issue. I got the issue of some torture threads failing, like what started this discussion.