Threadripper Reset Fixes

Hey all,

For anyone interested, I made a package for Arch Linux that contains the PCIe reset patchset.

I can now start and stop a VM without having to reboot the machine. Unfortunately I have yet to solve the guest video adapter sleep state issue.

Threadripper 1950x
Gigabyte X399 Designare EX
Host AMD RX 560D
Guest EVGA GTX 970 FTW+

The last experiment.

HostOS: Xubuntu 17.10 x64 + vanilla 4.16.12 + geof patch (GTX 960 2GB)
GuestOS: FreeBSD 11.1 + Geforce 396.18beta (GTX 1080Ti)

screenshot (VDPAU)

http://www.monitos.cz/tmp/kvm_freebsd_11_1_GFX_passthrough_GTX1080Ti.png

HW: X399 Taichi, 1950X, 4x 8GB 3200CL14, 2xNVMe, 1x SSD, 1xHDD, Corsair RM-1000X , …

GuestOS: Windows 7x64 works too.

Screenshot

http://www.monitos.cz/tmp/libvirt_gfx_passthrough_win7.png

HostOS: HW/OS/…: Same as above.

Was gnif’s patch committed to the Linux kernel yet?
Or are the kernel devs still holding this up by individually marking their territory on the patch? :joy:

I have been informed that AGESA 1.1.0.0 fixes this problem, though I have yet to test and confirm it.

I have quasi-confirmed this, but more testing is underway.
Vega GPUs still have their own set of bugs, but the issue does appear to be fixed for BIOSes shipping AGESA 1.1.0.0.

I wonder how long it will take ASUS to roll out this BIOS. I’m on v1003, which has AGESA 1.0.0.5 (and the ability to not disable C-States, which gives me firmware bug errors at boot time).

I’ve a TR 1920x and want to try this vfio stuff with looking glass, but I need to get the system to clean boot first :confused:

This is exactly why I avoid ASUS these days: their boards are great, but they stop producing BIOS updates fairly fast.

Gigabyte is also, I think, working on a totally new UEFI experience that will soon be kick-ass.

With MSI and ASRock I can always get beta BIOSes. Gigabyte too, with a bit of prodding.

This makes me sad. I too love ASUS, but as mentioned above, BIOS updates are very slow to be released, if they are released at all.

I like the ROG Zenith Extreme because it has two x16 and two x8 PCIe slots, which I need (two x8 for LSI controllers and two x16 for GPUs).

However, I don’t think I’ll be supporting ASUS again any time soon because of these BIOS delays. I’ve had my TR and Zenith Extreme since they first came out, switched to Linux about 2 months ago, and am still trying to get some of the errors out of my boot logs.

Yeah, the lack of BIOS updates for the Zenith has been very disappointing. Probably the last time I go ASUS. I would’ve gone for the Gigabyte Designare board had that been available at launch, oh well.

The update can’t be that far off since I’m assuming they will have something ready for TR2 launch, hopefully.

My last ASUS board was an AM2 thing… I had an early Phenom in it, but when the next generation came out (AM2+) I was unable to use those CPUs without manually hacking an updated AGESA into the BIOS. To date, ASUS has never released a new BIOS for the board.

So is this not going to be patched in the Linux kernel anymore?

I’ve been holding off purchase of parts for my planned Threadripper build because of this bug. For 6 months or so, already.
Apart from the monster delay, meanwhile the DDR4 prices have of course gone up.
I plan on buying the ASRock Fatal1ty X399 Professional Gaming.

The new AGESA fixes it, so no kernel patch is needed. It behaves like the Intel platform now.

Vega resets are still another issue.

Thanks chaps! I’ll check with Asrock about AGESA 1.1.0.0 update.
(I don’t know how I missed gnif’s response earlier.)

On Ubuntu 18.04, I patched kernel 4.15.18 with tr.patch, then compiled and installed it. The system boots and Looking Glass works great. However, after exiting the VM and trying to start it again, I’m getting:
error: Failed to start domain win10
error: internal error: Unknown PCI header type '127'

Host: Radeon Pro WX 7100 Graphics
Passthrough: Asus Radeon RX Vega 56 OC 8GB

The classic 127.

That means the reset issue is still present. The device won’t reinitialize, and even if you force it, you will get a system lockup because you’re trying to reinitialize something that can’t be reinitialized.
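
For anyone wondering where the 127 comes from, here is a rough Python sketch. It rests on a few assumptions of mine that are not from this thread: that the number is the Header Type register (offset 0x0E in PCI config space) with the multi-function bit masked off, that a card stuck after a failed reset reads back as all 0xFF, and that you read the config space from sysfs. The function names and the example address are made up for illustration.

```python
from pathlib import Path
from typing import Tuple

# Offset of the Header Type register in standard PCI config space.
HEADER_TYPE_OFFSET = 0x0E


def decode_header_type(config: bytes) -> Tuple[int, bool]:
    """Return (layout, multifunction) decoded from raw config space bytes."""
    if len(config) <= HEADER_TYPE_OFFSET:
        raise ValueError("config space dump is too short")
    raw = config[HEADER_TYPE_OFFSET]
    # Low 7 bits are the header layout, bit 7 is the multi-function flag.
    return raw & 0x7F, bool(raw & 0x80)


def looks_unresponsive(config: bytes) -> bool:
    """True when the first 64 bytes of config space all read back as 0xFF."""
    header = config[:64]
    return len(header) == 64 and all(b == 0xFF for b in header)


def read_config(address: str) -> bytes:
    """Read config space from sysfs; address is e.g. '0000:0a:00.0' (hypothetical)."""
    return Path(f"/sys/bus/pci/devices/{address}/config").read_bytes()


if __name__ == "__main__":
    # Simulated dump of a device that no longer answers config reads.
    dead = bytes([0xFF]) * 64
    print(decode_header_type(dead), looks_unresponsive(dead))
```

On a real host you would pass `read_config("<your device address>")` instead of the simulated dump. If the layout comes back as 127 and the whole header is 0xFF, the card is not answering at all, which matches the cold-boot advice in this thread.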

The only solution is to kill power to the entire system (literally kill all power) and start the system from a pure cold boot. I had this problem non-stop with the Fresco Logic FL1009 controller, which has an older firmware that can’t be updated.

Oh okay :confused: Can it be solved by adding a dedicated USB 3.0 card and passing it to the VM?

Maybe something like this one:

By the way, I’m on an ASRock X399 Taichi.

Did ASRock take down BIOS 3.x with AGESA 1.1.0.0?

I see only 2.30 for X399 Taichi today.
https://www.asrock.com/mb/AMD/X399%20Taichi/index.asp#BIOS

No, it’s your Vega graphics card that has the reset issue.
