
[5700XT] Use AMD GPU on Host and VM -> Rebind drivers to vfio and back to amdgpu

I've been trying this for quite a long time, first with an Nvidia GPU, but now I've switched to an R9 380 for my displays and a 5700XT to pass to the VM. I want to use the 5700XT on the host (Manjaro) too.

I got the DRI setup working: my GPU doesn't show in `xrandr --listproviders`, and I can run applications on it. So far so good.

Now I'm trying to rebind the GPU to the vfio driver, which is not working yet. This is what I tried.
Running start.sh:

#!/bin/bash
modprobe vfio-pci
modprobe vfio_iommu_type1
modprobe vfio
echo "1002 ab38" > /sys/bus/pci/drivers/vfio-pci/new_id
echo "0000:20:00.0" > /sys/bus/pci/devices/0000:20:00.0/driver/unbind
echo "0000:20:00.0" > /sys/bus/pci/drivers/vfio-pci/bind
echo "1002 ab38" > /sys/bus/pci/drivers/vfio-pci/remove_id

gives these errors:

start.sh: line 6: /sys/bus/pci/devices/0000:20:00.0/driver/unbind: No such file or directory

start.sh: line 7: echo: write error: No such device

lspci -nnk: (after running the script)

20:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon RX 5700 / 5700 XT] [1002:731f] (rev c1)
  Subsystem: Sapphire Technology Limited Navi 10 [1da2:e411]
  Kernel modules: amdgpu

Which logs could I check? Any ideas on how to bind the GPU to vfio and back to amdgpu?

Here is my lspci -nnk output: (before running the script)

1d:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Tonga PRO [Radeon R9 285/380] [1002:6939] (rev f1)
  Subsystem: PC Partner Limited / Sapphire Technology Radeon R9 380Nitro 4G D5 [174b:e308]
  Kernel driver in use: amdgpu
  Kernel modules: amdgpu
1d:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Tonga HDMI Audio [Radeon R9 285/380] [1002:aad8]
  Subsystem: PC Partner Limited / Sapphire Technology Radeon R9 285/380 HDMI Audio [174b:aad8]
  Kernel driver in use: snd_hda_intel
  Kernel modules: snd_hda_intel
1e:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1478] (rev c1)
  Kernel driver in use: pcieport
1f:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1479]
  Kernel driver in use: pcieport
20:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 [Radeon RX 5700 / 5700 XT] [1002:731f] (rev c1)
  Subsystem: Sapphire Technology Limited Navi 10 [1da2:e411]
  Kernel modules: amdgpu
20:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 HDMI Audio [1002:ab38]
  Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 HDMI Audio [1002:ab38]
  Kernel driver in use: snd_hda_intel
  Kernel modules: snd_hda_intel

From what I know, reset still doesn't work properly on anything AMD, so as far as I understand this is currently not possible.

@gnif worked on the reset fix for Vega. I haven’t followed it that much, but I think he got it working. I’m not sure if he’s working on the RX as well though (I think he was still crowdsourcing for GPUs to work on).

@peterge Firstly, I have edited your post to fix the formatting. In future, please create pre-formatted blocks by using three backticks, like this:

```
some
preformatted
content
here
```

As for unbinding, the amdgpu driver does not support it at all; your mileage may vary depending on kernel version and GPU. On the Vega 10 series (Vega 56/64) it works most of the time, with complaints, but it can cause a kernel panic.

This is useless if you’re binding using the bus IDs, it will just confuse things.

No driver has been loaded for the GPU, so there is nothing to unbind. In fact, the `driver` directory won't even exist if there is no loaded kernel module.
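A minimal sketch of checking for a bound driver before attempting the unbind. It assumes the usual sysfs layout, where `driver` is a symlink that only exists while a driver is bound. The function names and the `SYSFS_PCI` variable are hypothetical; `SYSFS_PCI` exists only so the functions can be tried against a fake directory tree.

```shell
#!/bin/bash
# SYSFS_PCI is a hypothetical override for dry runs; on a real host it
# falls back to /sys/bus/pci.

# Print the name of the driver bound to a PCI device, or "none".
current_driver() {
    local dev="$1"
    local link="${SYSFS_PCI:-/sys/bus/pci}/devices/$dev/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/amdgpu
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Unbind the device only when a driver is actually bound to it.
unbind_if_bound() {
    local dev="$1"
    local link="${SYSFS_PCI:-/sys/bus/pci}/devices/$dev/driver"
    if [ -L "$link" ]; then
        echo "$dev" > "$link/unbind"
    else
        echo "$dev: no driver bound, nothing to unbind"
    fi
}
```

Run as root, `unbind_if_bound 0000:20:00.0` would skip the write in the situation shown in the `lspci -nnk` output above, where the 5700XT has no "Kernel driver in use" line.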

This is likely caused by you mixing the device-ID and bus-ID bind methods.
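A rough sketch of sticking to the bus-ID method only, using the kernel's `driver_override` and `drivers_probe` sysfs files instead of `new_id`/`remove_id`. The `rebind` function and the `SYSFS_PCI` variable are hypothetical names for illustration, and this has not been verified on a 5700XT; it does nothing about the amdgpu unbind and reset problems mentioned above.

```shell
#!/bin/bash
# SYSFS_PCI is a hypothetical override for dry runs; on a real host it
# falls back to /sys/bus/pci.

# Bind a PCI device to the named driver using only its bus ID.
rebind() {
    local dev="$1" drv="$2"
    local base="${SYSFS_PCI:-/sys/bus/pci}"
    local devdir="$base/devices/$dev"

    # Pin the device to the requested driver; no device IDs are involved.
    echo "$drv" > "$devdir/driver_override"

    # Release the device from its current driver, if one is bound.
    if [ -L "$devdir/driver" ]; then
        echo "$dev" > "$devdir/driver/unbind"
    fi

    # Ask the kernel to probe the device again; the override selects the driver.
    echo "$dev" > "$base/drivers_probe"
}
```

As root, with the modules loaded, this would be called as `rebind 0000:20:00.0 vfio-pci` before starting the VM and `rebind 0000:20:00.0 amdgpu` afterwards. The audio function at `0000:20:00.1` would presumably need the same treatment if it sits in the same IOMMU group.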