Pass through Vega to Ubuntu guest in Proxmox?

Asus Prime X370-Pro, Ryzen 1700, Vega Frontier, latest Proxmox. I've enabled IOMMU and SVM in the BIOS. I'm getting hella confused by the numerous (sometimes conflicting) tutorials and wikis showing how to pass a PCIe card through to a guest VM. Once I have the BIOS stuff enabled, what do I need to do to configure the guest to use the card? Any advice greatly appreciated, thanks :slight_smile:

edit: It's a headless setup, so the host doesn't need a GPU; I'm OK with the guest having exclusive access. I did add the following to /etc/modules:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

I also changed /etc/default/grub to include GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on"
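
A quick way to sanity-check both changes after a reboot is to compare what is actually loaded against the list above. This is only a sketch: `missing_modules` is a made-up helper name, and it assumes `lsmod` prints its usual one-line header followed by one module per line.

```shell
# Print every module named as an argument that is NOT present in
# `lsmod`-style output read from stdin (first line is the header).
missing_modules() {
    loaded=$(awk 'NR > 1 { print $1 }')
    for m in "$@"; do
        # -x: whole-line match, so "vfio" does not match "vfio_pci"
        printf '%s\n' "$loaded" | grep -qx "$m" || printf '%s\n' "$m"
    done
}

# On the host, with the module names from /etc/modules:
#   lsmod | missing_modules vfio vfio_iommu_type1 vfio_pci vfio_virqfd
# And to see whether the running kernel got the grub option:
#   grep -o 'amd_iommu=on' /proc/cmdline
```

No output from `missing_modules` means every listed module is loaded. If the `/proc/cmdline` check prints nothing, the grub change probably has not been applied yet (grub needs to be updated and the host rebooted).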

Be sure to click "Show Full Post" :wink:

Don't forget to edit your "vmid".conf file

Can Ryzen boot without a GPU?

Of course, a GPU is completely irrelevant for booting a system. As long as you have a way to access it without graphics (web console, SSH, etc.) then it's fine.

OK, I'm back home now, so here is some more info. Unfortunately I wiped my Proxmox drive and VMs three days ago, so I can't look things up anymore, but I had a comparable setup: a Proxmox host with a couple of VMs, one of which (Windows) had GPU passthrough (also with only one GPU on the motherboard). I started by following this guide, but it raised more questions than it answered, so I switched to this one with some changes.

Here is a recap of how I got it working:
- Install the VM like normal and enable remote desktop (the noVNC console in the Proxmox web manager won't work anymore after passthrough)
- Look up the GPU with lspci
- Edit grub; I also ended up adding the interrupt part (GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1")
- Update grub and reboot
- Add the vfio lines to /etc/modules like you already have
- Use dmesg | grep -e DMAR -e IOMMU to confirm it is working
- Add lines to vmid.conf
- Start up the VM, check if the GPU is recognised in the Device Manager and install the AMD GPU drivers
- Pass through USB and SATA ports, install Steam, play games :slight_smile:
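
The two lookup steps in the recap (finding the GPU and confirming the IOMMU) can be sketched as small filters. The function names are made up for illustration, and the `AMD-Vi` pattern is my own addition on the assumption that an AMD host prefixes its IOMMU messages that way; the `DMAR` and `IOMMU` patterns are the ones from the recap.

```shell
# Keep only GPU-looking entries from `lspci -nn` output read on stdin.
gpu_lines() {
    grep -iE 'vga|display|3d controller'
}

# Keep only IOMMU-related kernel messages from `dmesg` output read on stdin.
# DMAR and IOMMU are from the recap; AMD-Vi is an assumption for AMD hosts.
iommu_lines() {
    grep -e DMAR -e IOMMU -e AMD-Vi
}

# On the Proxmox host:
#   lspci -nn | gpu_lines
#   dmesg | iommu_lines
```

Both functions read from stdin, so they also work on output saved to a file, which helps when comparing before and after a reboot.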

The interrupt remapping part raised some questions for me; I couldn't get it working, so in the end I added the allow_unsafe_interrupts option. Does "dmesg | grep ecap" give you any result?

I am by no means an expert, but I was able to play games so I was a happy camper.

edit: I also recommend starting your VM from the CLI and not by pressing the start button in the Proxmox web manager. Using the CLI I got error messages that I would not get in the web manager. To start a VM from the command line use "qm start vmid". Replace vmid with the ID of the VM you want to start.
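
If you want to keep those error messages around for posting, a small wrapper along these lines might help. It is a sketch only: `start_vm_logged` and the default log file name are made up, and the only Proxmox command it relies on is the "qm start" mentioned above.

```shell
# Start a VM from the CLI and keep a copy of everything qm prints.
# Usage: start_vm_logged <vmid> [logfile]
start_vm_logged() {
    vmid=$1
    log=${2:-"qm-start-$vmid.log"}   # hypothetical default log name
    qm start "$vmid" >"$log" 2>&1    # capture stdout and stderr together
    status=$?
    cat "$log"                       # still show the messages on screen
    return "$status"                 # pass qm's exit status through
}

# Example: start_vm_logged 100
```

The function returns the exit status of qm, so a failed start can still be detected by whatever calls it.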

As far as I know, Vega FE should support SR-IOV, so once Linux / Proxmox supports that fully for your GPU you should have some very fancy passthrough setup possibilities.

:smiley:

I appreciate the info. I did get passthrough "kind of" working: a Windows Server VM was at least able to install the drivers. That was enough to hopefully confirm that passthrough was working, but here is the problem now: initialization of the card seems to fail during guest startup when I pass the card through to an Ubuntu 16.04 guest. I'm OK with Linux, but not good enough to see what's going on here. Any help greatly appreciated :slight_smile:

root@ubuntu:~# dmesg | grep amd
[ 4.620388] [drm] amdgpu kernel modesetting enabled.
[ 5.031112] amdgpu 0000:01:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
[ 5.059961] amdgpu 0000:01:00.0: VRAM: 16368M 0x000000F400000000 - 0x000000F7FEFFFFFF (16368M used)
[ 5.060762] amdgpu 0000:01:00.0: GTT: 32093M 0x000000F7FF000000 - 0x000000FFD4DEFFFF
[ 5.063160] [drm] amdgpu: 16368M of VRAM memory ready
[ 5.063882] [drm] amdgpu: 32093M of GTT memory ready.
[ 5.068217] amdgpu 0000:01:00.0: amdgpu: using MSI.
[ 5.069340] [drm] amdgpu: irq initialized.
[ 5.488787] amdgpu: [powerplay] amdgpu: powerplay sw initialized
[ 5.490875] amdgpu 0000:01:00.0: fence driver on ring 0 use gpu addr 0x000000f7ff000008, cpu addr 0xffff9daf645f2008
[ 5.491937] amdgpu 0000:01:00.0: fence driver on ring 1 use gpu addr 0x000000f7ff000010, cpu addr 0xffff9daf645f2010
[ 5.493096] amdgpu 0000:01:00.0: fence driver on ring 2 use gpu addr 0x000000f7ff000018, cpu addr 0xffff9daf645f2018
[ 5.494133] amdgpu 0000:01:00.0: fence driver on ring 3 use gpu addr 0x000000f7ff000028, cpu addr 0xffff9daf645f2028
[ 5.495155] amdgpu 0000:01:00.0: fence driver on ring 4 use gpu addr 0x000000f7ff000030, cpu addr 0xffff9daf645f2030
[ 5.496157] amdgpu 0000:01:00.0: fence driver on ring 5 use gpu addr 0x000000f7ff000038, cpu addr 0xffff9daf645f2038
[ 5.497160] amdgpu 0000:01:00.0: fence driver on ring 6 use gpu addr 0x000000f7ff000048, cpu addr 0xffff9daf645f2048
[ 5.498161] amdgpu 0000:01:00.0: fence driver on ring 7 use gpu addr 0x000000f7ff000050, cpu addr 0xffff9daf645f2050
[ 5.499168] amdgpu 0000:01:00.0: fence driver on ring 8 use gpu addr 0x000000f7ff000058, cpu addr 0xffff9daf645f2058
[ 5.501089] amdgpu 0000:01:00.0: fence driver on ring 9 use gpu addr 0x000000f7ff000068, cpu addr 0xffff9daf645f2068
[ 5.503201] amdgpu 0000:01:00.0: fence driver on ring 10 use gpu addr 0x000000f7ff000070, cpu addr 0xffff9daf645f2070
[ 6.755841] amdgpu 0000:01:00.0: fence driver on ring 11 use gpu addr 0x000000f403f80600, cpu addr 0xffffb8ef02a5a600
[ 6.757070] amdgpu 0000:01:00.0: fence driver on ring 12 use gpu addr 0x000000f7ff000098, cpu addr 0xffff9daf645f2098
[ 6.758230] amdgpu 0000:01:00.0: fence driver on ring 13 use gpu addr 0x000000f7ff0000b0, cpu addr 0xffff9daf645f20b0
[ 6.761279] amdgpu 0000:01:00.0: fence driver on ring 14 use gpu addr 0x000000f7ff0000c8, cpu addr 0xffff9daf645f20c8
[ 6.762443] amdgpu 0000:01:00.0: fence driver on ring 15 use gpu addr 0x000000f7ff0000d8, cpu addr 0xffff9daf645f20d8
[ 6.763527] amdgpu 0000:01:00.0: fence driver on ring 16 use gpu addr 0x000000f7ff0000f0, cpu addr 0xffff9daf645f20f0
[ 6.778584] amdgpu 0000:01:00.0: [mmhub] VMC page fault (src_id:0 ring:157 vm_id:0 pas_id:0)
[ 6.779712] amdgpu 0000:01:00.0: at page 0x000000f7ff200000 from 18
[ 6.780744] amdgpu 0000:01:00.0: VM_L2_PROTECTION_FAULT_STATUS:0x0000013A
[ 6.906787] [drm:psp_hw_init [amdgpu]] ERROR PSP firmware loading failed
[ 6.907709] [drm:amdgpu_device_init [amdgpu]] ERROR hw_init of IP block failed -22
[ 6.908715] amdgpu 0000:01:00.0: amdgpu_init failed
[ 8.053600] Modules linked in: amdkfd amd_iommu_v2 amdgpu(+) cirrus i2c_algo_bit ttm drm_kms_helper hid_generic syscopyarea sysfillrect usbhid sysimgblt fb_sys_fops ahci psmouse hid libahci drm
[ 8.056541] [] amdgpu_gtt_mgr_fini+0x39/0x70 [amdgpu]
[ 8.056541] [] amdgpu_ttm_fini+0xce/0x220 [amdgpu]
[ 8.056541] [] amdgpu_bo_fini+0x12/0x40 [amdgpu]
[ 8.056541] [] gmc_v9_0_sw_fini+0x32/0x40 [amdgpu]
[ 8.056541] [] amdgpu_fini+0x2af/0x460 [amdgpu]
[ 8.056541] [] amdgpu_device_init+0xf68/0x11b0 [amdgpu]
[ 8.056541] [] ? amdgpu_driver_load_kms+0x28/0x230 [amdgpu]
[ 8.056541] [] amdgpu_driver_load_kms+0x5d/0x230 [amdgpu]
[ 8.056541] [] amdgpu_pci_probe+0xbe/0xf0 [amdgpu]
[ 8.056541] [] amdgpu_init+0x93/0xa4 [amdgpu]
[ 8.105485] [drm] amdgpu: ttm finalized
[ 8.106176] amdgpu 0000:01:00.0: Fatal error during GPU init
[ 8.106764] [drm] amdgpu: finishing device.
[ 8.110492] amdgpu: probe of 0000:01:00.0 failed with error -22
root@ubuntu:~#
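
For a log this long it can help to cut it down to the lines that look like failures before reading or posting it. A rough sketch, with a made-up function name and a keyword list that is only a guess at what is interesting:

```shell
# From `dmesg` output on stdin, keep amdgpu lines that mention a problem.
# The keyword list (error, fail, fault, invalid) is an assumption, not
# something the driver documents.
amdgpu_failures() {
    grep -i amdgpu | grep -iE 'error|fail|fault|invalid'
}

# In the guest:
#   dmesg | amdgpu_failures
```

Run against the output above, this would keep lines such as the invalid PCI ROM header signature, the VMC page fault, "PSP firmware loading failed" and the final failed probe, and drop the fence driver lines.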