Update #2:
One thing troubled me after I set up the whole VFIO thing. My host system would freeze if I tried to:
- move a window
- open a new app
- resize a window
- separate a browser tab from the main browser window
- probably also when closing a window but I don’t remember
BTW, when my host system froze, my guest VM continued to run just fine, USB passthrough still worked, etc. So it was likely GNOME that hung, not the whole host.
At a certain point I realized all these things happen when a window’s size/location needs to be recalculated. I recalled that before starting the VM, Fedora would actually use both monitors. So I made a change in display settings, so that Fedora only uses the monitor connected to the host GPU.
Good news, my host system freezing issue is resolved.
Bad news, now I cannot get my GPU fan to spin, at all. No matter what I try (rebooting the Windows VM or the host system, switching the display back to both monitors), nothing works. My GPU fan just won’t spin even if I leave it on overnight (I know it sounds weird, but previously I was able to get the GPU fan to spin by simply waiting anywhere from 1 minute to a few hours after booting up the Windows VM).
I’m not sure if the fact that my host machine boots up with two monitors both displaying at the login screen is related (after logging in, it only displays on 1 monitor).
Update #1:
So I discovered this by accident. I left GPU fan on 100% (but it was not spinning) and went out to do some shopping. When I came back, the GPU fan was spinning REEEEEEEEEEEally loud…yep it was spinning at 100%. I changed it back to automatic control and it appeared to work properly.
Now, if I reboot Windows, it stops working again, but it “fixes” itself after a while.
So the issue is not related to the PCI devices with missing drivers or the Vega PCI bridge devices…
I played a few games and most of them were fine except one that suffered severe jittering and/or frame loss.
I guess it’s not unacceptable to wait a while before I can play games, but I still want to know why.
OP:
Fedora 33
X399 Taichi
TR 1950X
Guest GPU is an ASUS Vega 64
Currently the passthrough kind of works. I have output from the VM, I can launch games and FPS seems fine. The problem is that the fan does not spin even if I manually crank the fan curve up to 100%. As a result, when playing more demanding games, my GPU overheats and shuts down the VM.
I have verified that the card works in another system running Windows bare metal. The fan spins up when the temperature goes up. So it’s not a hardware issue unless it’s the X399 Taichi motherboard, which I doubt because the fan on my host (Linux) GPU spins just fine.
I suspect it’s because a couple of PCI devices are not passed into the VM properly. I can see them in lspci (09:00.0 and 0a:00.0):
$ lspci | grep -i vega
09:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Vega 10 PCIe Bridge (rev c1)
0a:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Vega 10 PCIe Bridge
0b:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 XL/XT [Radeon RX Vega 56/64] (rev c1)
0b:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Vega 10 HDMI Audio [Radeon RX Vega 56/64]
And I have put all four of them in the /usr/sbin/vfio-pci-override.sh file.
However, when I try to “Add New Virtual Hardware” to the guest Windows VM, 09:00.0 and 0a:00.0 are not showing up in PCI Host Device list, but 0b:00.0 and 0b:00.1 are.
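One way to check whether it is expected for 09:00.0 and 0a:00.0 to be missing from that list is to look at each device's class code and bound driver in sysfs. PCI-to-PCI bridges report class 0x0604xx, and as far as I understand, bridges normally stay on the host's pcieport driver rather than being assigned to a guest, so only the two 0b:00.x functions would be passthrough candidates. The sketch below is only an illustration: `pci_report` and the `SYSFS_ROOT` override are names made up here, and the addresses are the four from the lspci output above.

```shell
#!/usr/bin/env bash
# Sketch: print PCI class and bound kernel driver for each address given.
# SYSFS_ROOT is a hypothetical override (useful for testing); default is /sys.

pci_report() {
    local root addr dev class kind driver
    root="${SYSFS_ROOT:-/sys}"
    for addr in "$@"; do
        dev="$root/bus/pci/devices/0000:$addr"
        if [ ! -r "$dev/class" ]; then
            echo "$addr: not found"
            continue
        fi
        class=$(cat "$dev/class")            # e.g. 0x060400 or 0x030000
        case "$class" in
            0x0604*) kind="bridge" ;;        # PCI-to-PCI bridge
            *)       kind="non-bridge" ;;
        esac
        if [ -L "$dev/driver" ]; then
            # The driver entry is a symlink; its target's basename is the driver name.
            driver=$(basename "$(readlink "$dev/driver")")
        else
            driver="none"
        fi
        echo "$addr: class=$class kind=$kind driver=$driver"
    done
}

# Addresses taken from the lspci output in this post.
pci_report 09:00.0 0a:00.0 0b:00.0 0b:00.1
```

If the two 0b:00.x functions show vfio-pci as their driver and the two bridges show up as kind=bridge, the missing bridge entries are probably not the cause of the fan problem.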
Meanwhile, in the guest Windows VM Device Manager I see two devices without a proper driver installed. I’m not sure if they are related though.
The drivers for this device are not installed. (Code 28)
There are no compatible drivers for this device.
PCI Device (PCI bus 6, device 0, function 0)
Hardware Id: PCI\VEN_1AF4&DEV_1045&SUBSYS_11001AF4&REV_01
Device PCI\VEN_1AF4&DEV_1045&SUBSYS_11001AF4&REV_01\4&1743037d&0&0015 requires further installation.
PCI Simple Communications Controller (PCI bus 3, device 0, function 0)
Hardware Id: PCI\VEN_1AF4&DEV_1043&SUBSYS_11001AF4&REV_01
Device PCI\VEN_1AF4&DEV_1043&SUBSYS_11001AF4&REV_01\4&1ab0bb95&0&0012 requires further installation.
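For what it's worth, vendor ID 1AF4 is the one used by virtio devices, and under the virtio 1.x numbering a modern PCI device ID is 0x1040 plus the virtio device type. That would make DEV_1045 a memory balloon (type 5) and DEV_1043 a console/serial device (type 3), i.e. emulated VM devices that need the virtio Windows drivers rather than anything belonging to the Vega card. A rough sketch of that decoding follows; `decode_virtio_hwid` is a made-up helper name and the type table is deliberately short.

```shell
#!/usr/bin/env bash
# Sketch: decode a Windows hardware ID string for a (modern) virtio PCI device.
# Assumes the virtio 1.x rule: PCI device ID = 0x1040 + virtio device type.

decode_virtio_hwid() {
    local hwid ven dev type
    hwid=$1
    # Pull the 4-hex-digit vendor and device IDs out of the string, upper-cased.
    ven=$(printf '%s\n' "$hwid" | sed -n 's/.*VEN_\([0-9A-Fa-f]\{4\}\).*/\1/p' | tr 'a-f' 'A-F')
    dev=$(printf '%s\n' "$hwid" | sed -n 's/.*DEV_\([0-9A-Fa-f]\{4\}\).*/\1/p' | tr 'a-f' 'A-F')
    if [ -z "$ven" ] || [ -z "$dev" ]; then
        echo "unparseable hardware ID"
        return 1
    fi
    if [ "$ven" != "1AF4" ]; then
        echo "not a virtio device (vendor $ven)"
        return 0
    fi
    type=$(( 0x$dev - 0x1040 ))
    case "$type" in
        1) echo "virtio network device" ;;
        2) echo "virtio block device" ;;
        3) echo "virtio console (serial)" ;;
        5) echo "virtio memory balloon" ;;
        *) echo "virtio device, type $type (not in this short table)" ;;
    esac
}

# The two hardware IDs reported by Device Manager in this post.
decode_virtio_hwid 'PCI\VEN_1AF4&DEV_1045&SUBSYS_11001AF4&REV_01'
decode_virtio_hwid 'PCI\VEN_1AF4&DEV_1043&SUBSYS_11001AF4&REV_01'
```

If that reading is right, these two Code 28 entries are most likely unrelated to the fan issue.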
I’ve run out of ideas why my GPU fan is not spinning (no rookie errors as far as I’m aware). I’m not sure if the symptoms I describe above are even related. Any help or pointers would be appreciated.