
GPU passthrough issues 2700X/Vega64

helpdesk
amdgpu
passthrough

#1

Hey guys, this is a little weird. Maybe it’s me doing things slightly out of order; I generally don’t follow guides/wikis to the T and like to work it out for myself, and I feel I am very close.

Manjaro Linux, used the Arch wiki as a guide.
2700X @ stock + PBS
32 GB RAM
280X + Vega 64 in the lower x16 slot. I tried it on top but got no boot picture; I have since found I can change that in the BIOS, so maybe the card needs to be in the top slot…
My IOMMU groups are good: the list shows the Vega as its own group, not even mixed with the card’s HDMI audio. If I am using HDMI for video out, do I need to pass through its audio device as well, even if it’s not connected to a monitor with sound?
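One way to double-check the grouping is to walk /sys/kernel/iommu_groups, where each group directory holds a devices/ entry per PCI function. The sketch below is a minimal take on the usual listing loop. The function name and its optional root argument are my own (the argument exists only so the function can be pointed at a test directory), and it prints bare PCI addresses, which you could then look up with lspci -nns.

```shell
#!/bin/sh
# Print one "group N: ADDRESS" line per PCI function.
# The optional first argument is a hypothetical override for dry runs;
# on a real host it defaults to /sys/kernel/iommu_groups.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        # Skip the unexpanded glob when no groups exist (IOMMU disabled).
        [ -e "$dev" ] || continue
        group="${dev%/devices/*}"
        printf 'group %s: %s\n' "${group##*/}" "${dev##*/}"
    done
}

# Typical use on the host:
#   list_iommu_groups
```

If the GPU and its HDMI audio function show up in different groups, they can be bound to vfio-pci independently, though both usually end up being passed to the guest.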

I can load the VM with the virtual SPICE display, and I have bound the Vega to vfio-pci successfully, since I can add it while the VM is running. But when I restart Win10 for the drivers to finish installing, I get a pegged CPU core and nothing shows up anywhere. I have a mouse and keyboard passed through with no problem, and a separate monitor plugged into the second video card.

I ended up deleting the whole VM and starting again, trying to pass through everything right at the start, but as soon as I boot it I get the pegged CPU core and it just locks up.

Anyone got any ideas? I tried searching a bit, but I started this thinking it would take me a few hours, and it’s now 1am. :joy:


#2

Man it always helps to try asking a question/…

I answered it myself: yes, the HDMI audio must also be passed through. Guess tonight’s a late one lol. May as well get Windows installed a second time for no reason.

Actually it might not be that… seems I need a full system restart after force killing the VM.

Seems like every time the VM closes, the PC needs to be restarted… Definitely has a problem.


#3

Most, possibly all, Vega cards do not reset properly and often require a full host reboot, especially after a forced shutdown of the VM.

Some technical info on the vega reset bug-


#4

Cool, thanks for that. While this is annoying news, it seems like something I might be able to “fix” with a little direction, if indeed all you need to do is port part of the code from the amdgpu driver to something else to call a function… I assume it would take me a few weeks, but I might have time for that. Guess I need to get in contact with that guy about what is involved in more depth; it seems like he just doesn’t have the time to try it.

I have about 500 hours of experience hacking bitcoin C/C++ code so far this year, and consider myself not exactly bad at it, but far from an expert.

Well, I found this gem of info… it seems a lot easier, but we shall see. Fixing it properly is definitely better.

Otherwise you can get around the issue by putting your system to sleep before starting the VM


#5

cat /usr/bin/reset_vega.sh
echo "1" | sudo tee -a /sys/bus/pci/devices/0000:0d:00.0/remove
echo "1" | sudo tee -a /sys/bus/pci/devices/0000:0d:00.1/remove
systemctl suspend
read input
echo "1" | sudo tee -a /sys/bus/pci/rescan
This does the trick for me. Simply run it on the host after the VM has shut down; it takes about 10 s and allows the VM to start again. It just means the Windows VM cannot reboot, only shut down.
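Since 0000:0d:00.0 and 0000:0d:00.1 are specific to this board, here is a rough sketch of the same remove, suspend, rescan sequence with the addresses passed as arguments. The function names and the PCI_SYSFS override are made up for illustration (the override exists only for dry runs against a fake directory tree), and the script assumes it is run as root rather than calling sudo per line.

```shell
#!/bin/sh
# Sketch of a parameterised reset helper; run it as root (e.g. via sudo).
# PCI_SYSFS is a hypothetical override used only for dry runs.
PCI_SYSFS="${PCI_SYSFS:-/sys/bus/pci}"

# Detach one PCI function from the host by writing 1 to its "remove" attribute.
remove_pci_device() {
    attr="$PCI_SYSFS/devices/$1/remove"
    if [ ! -e "$attr" ]; then
        echo "no such device: $1" >&2
        return 1
    fi
    echo 1 > "$attr"
}

# Ask the kernel to re-enumerate the bus so the removed functions come back.
rescan_pci_bus() {
    echo 1 > "$PCI_SYSFS/rescan"
}

# Remove every listed function, suspend, wait for Enter after resume, rescan.
reset_gpu() {
    for addr in "$@"; do
        remove_pci_device "$addr" || return 1
    done
    systemctl suspend
    printf 'Press Enter once the host has resumed: '
    read -r reply
    rescan_pci_bus
}

# Example (addresses from the script above):
#   reset_gpu 0000:0d:00.0 0000:0d:00.1
```

Checking that the device exists before removing it avoids suspending the host when an address has been mistyped.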


#6

I have also heard of people getting around it by disabling the GPU in Windows right before guest shutdown, then re-enabling it at startup.

I think they used devcon.exe in a bat script and stuck it in Task Scheduler, although I think group policy startup/shutdown scripts would also work.


#7

I just tried this again and yes, this hardware reset workaround works for me on my PowerColor Red Devil V64 too.
Thanks for putting it into a script too, I was thinking of how I would do that.

Is there anything different about running

echo "1" | sudo tee -a /sys/bus/pci/rescan

in a shell script vs. on the command line? Each time I tried it on the command line I got either “invalid argument” or “permission denied”, so I had to

sudo chmod 777 /sys/bus/pci/rescan

sudo echo 1 > /sys/bus/pci/rescan
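For what it’s worth, the “permission denied” case has a known cause that has nothing to do with script vs. command line: in `sudo echo 1 > FILE` the redirection is opened by the calling, unprivileged shell before sudo starts, so only echo runs as root. Piping into `sudo tee` works because tee itself opens the file. A minimal sketch, where write_attr and the SUDO variable are hypothetical names (SUDO can be set to empty to try the function on an ordinary file):

```shell
#!/bin/sh
# Write VALUE to PATH through tee, so the process that opens the file is the
# one running under sudo. SUDO is a hypothetical knob: unset means "sudo",
# empty means "run tee directly" (useful for trying this on a normal file).
write_attr() {
    printf '%s\n' "$2" | ${SUDO-sudo} tee "$1" >/dev/null
}

# On the host:
#   write_attr /sys/bus/pci/rescan 1
#
# Equivalent alternative: do the redirection inside a root shell.
#   sudo sh -c 'echo 1 > /sys/bus/pci/rescan'
```

With either form the chmod 777 on the sysfs file should not be needed.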

I have the Windows scheduler scripts in the VM too but they didn’t seem to work before.


#8

I ended up using the kernel patch, but now I have random screen blanks… Like it just goes black for a few seconds and then comes back. Really not sure what it is… I removed all O/C (except the power limit), but it’s unlikely to be that, because it does it even at the lowest power state. :frowning: