VFIO/Passthrough in 2023 - Call to Arms

Thank you. Will try that. I don’t think I came across this solution before.

PS: Same issue. I don’t think Linux has the kernel module for the Ryzen 7000 iGPU. The font rendering looks primitive to me compared to the R7 240.

I am using a Fractal Design Meshify 2, so the case is not a problem for me.
The AM5 platform just doesn’t have enough PCIe lanes. As you said, one option is to get SATA SSDs. Actually, that is very practical if performance is not important.
Alternatively, I can use USB 3 to U.2/NVMe converters. Performance-wise, they should be on par with SATA.

1 Like

You have two main paths to running “many” NVMe drives in a single system:

  1. A hardware platform with a ton of PCIe lanes (as you’ve noted)—so think Threadripper (Pro), Sapphire Rapids, Epyc, etc.—and a bunch of four-slot carrier cards in PCIe x16 slots bifurcated to x4/x4/x4/x4.
  2. A tri-mode HBA and breakout cables (or a U.3 backplane). Each SlimSAS port can break out to eight NVMe drives at x1 each, four at x2 each, etc. Please note that you’d need 2.5" U.2/U.3 drives (or U.2/U.3 enclosures for M.2 drives) in this configuration.

It’s worth noting that operating fast NVMe drives at full bore requires commensurate amounts of CPU horsepower and system memory bandwidth. Wendell has covered this extensively on the channel. So to really push e.g. a dozen Gen4 NVMe drives at 7GB/sec or 1M IOPS each you’d probably want something like a Sapphire Rapids workstation with eight channels of DDR5 anyway, making the first option the obvious choice.
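
For a rough sense of scale: a dozen drives at 7 GB/s is about 84 GB/s of raw storage throughput, while eight channels of DDR5-4800 top out around 8 × 38.4 GB/s ≈ 307 GB/s theoretical - and real workloads touch that data more than once, so the headroom disappears quickly.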

However, if you’re looking for a relatively simple alternative to SAS or SATA SSDs where maximum performance is not a priority—i.e. for a NAS or what have you—the second option would get you a nice solution without requiring an absolutely monstrous hardware platform. Drop a 9500-16i into any CPU-attached Gen4 x8 slot and you’ll be good to go.

Oh my.

Which Linux distro are you using?

I’ve been using PCI passthrough on my past builds as a daily driver. I like the technology very much. I recently built my first AMD PC. Here’s my experience.

CPU: AMD 7950X
MoBo: ASRock X670E Steel Legend (AGESA 1.0.0.7b)
RAM: Corsair DDR5 6000 MHz CL40 2x2x32
GPU1: MSI RTX 4080 (in the PCIe 5.0 slot)
GPU2: Gigabyte GTX 1080 (in a PCIe 3.0 slot)
Other: 1 NVMe SSD, 2 SATA SSDs, 2 SATA HDDs
Host OS: Arch, kernel 6.4 (GTX 1080)
Guest OS: Win 10 guest 1 (RTX 4080 passed through)
Win 10 guest 2 (Raphael iGPU passthrough, work in progress)

Initial setup:
BIOS: No overclocking, IOMMU enabled, advanced error reporting enabled, ACS enabled, reBAR enabled, SR-IOV disabled, primary output set to external GPU.
VR headset and one monitor connected to the 4080, one monitor connected to the 1080, no monitor connected to the iGPU.
Followed the Arch Linux OVMF guide, with virtualization hidden.
Notes: binding vfio-pci through a kernel parameter was unsuccessful, while binding it through /etc/modprobe.d/vfio.conf did the trick (sketch at the end of these notes).
Graphics output only through the 4080 during the boot sequence.
The host would freeze during boot, after initramfs and loading udev.
After disconnecting the monitor and the VR headset from the 4080, the host would pause for a moment and resume booting.
Connecting the monitors back to the 4080 and starting guest 1 produces a black screen.
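
For reference, the modprobe.d approach boils down to something like this (the device IDs are placeholders - substitute your GPU and audio IDs from lspci -nn):

# /etc/modprobe.d/vfio.conf - IDs are placeholders, take yours from `lspci -nn`
options vfio-pci ids=10de:xxxx,10de:yyyy
softdep nvidia pre: vfio-pci
softdep drm pre: vfio-pci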

reBAR disabled:
The host still freezes during boot; guest 1 still shows a black screen.
Passing the qemu arg “-fw_cfg opt/ovmf/X-PciMmio64Mb,string=65536”,
guest 1 shows “Guest has not initialized the display (yet)”.
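
For anyone wanting to replicate that workaround, the qemu arg goes into the domain XML roughly like this (note the qemu XML namespace on the <domain> element):

<domain xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0' type='kvm'>
  ...
  <qemu:commandline>
    <qemu:arg value='-fw_cfg'/>
    <qemu:arg value='opt/ovmf/X-PciMmio64Mb,string=65536'/>
  </qemu:commandline>
</domain>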

efifb:off, amdgpu bound through /etc/X11/xorg.conf.d/10-gpus.conf (example below), RTX 4080 vBIOS supplied in the hostdev section of the XML:
The host outputs the BIOS and bootloader through the 4080 only.
The host’s video output through the 4080 freezes during the boot sequence, then the output shifts to the 1080.
Guest 1 passthrough achieved.
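
The 10-gpus.conf in question is basically one Device section per host GPU, pinned by BusID; a minimal sketch (the BusID values are examples - check yours with lspci):

Section "Device"
    Identifier "iGPU"
    Driver     "amdgpu"
    BusID      "PCI:16:0:0"
EndSection

Section "Device"
    Identifier "GTX1080"
    Driver     "nvidia"
    BusID      "PCI:4:0:0"
EndSection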

iGPU passthrough (still troubleshooting):
Added vesafb:off, nofb, pci=realloc, and updated /etc/X11/xorg.conf.d/10-gpus.conf for all 3 GPUs.
Successfully bound vfio-pci to the AMD Raphael iGPU and its sound devices; the iGPU and sound devices were in different IOMMU groups.
The IOMMU groups passed to guest 2 without crashing the host, but guest 2 shows Code 43 and the AMD driver installer could not properly identify the iGPU.
Could not find a vBIOS for the Raphael iGPU.
Tried to dump the vBIOS using UBU from the MoBo BIOS and amdvbflash from the live host OS; both failed. The former returned an empty folder, while the latter errored out with “adapter not found”.
Found a vBIOS dump on this forum, obtained via cat /sys/kernel/debug/dri/0/amdgpu_vbios > vbios.rom (commands below). The code name was misspelled as “rafael”, but the device ID matched.
Others claimed it is in fact a valid vBIOS, but it still got me a black screen for guest 2.
Found many recent posts on Reddit complaining about trouble passing through the AMD 7000 series Raphael iGPU.
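
For anyone who wants to try the same dump, the debugfs route plus a quick sanity check looks something like this (run as root; the dri/0 index may differ on your system):

cat /sys/kernel/debug/dri/0/amdgpu_vbios > vbios.rom
xxd vbios.rom | head -n 2              # a valid option ROM starts with the 55 aa signature
strings vbios.rom | grep -i raphael    # look for the ASIC name string, if present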

Side note:
The MoBo has 3 USB 3.0 controllers and 8 ports: one is connected to the CPU, while the other two are connected to the chipset, which shares a PCIe 4.0 link with the other PCIe slots.
Passing the controller directly connected to the CPU crashes the host.
The other two controllers don’t crash the host, but they can only be passed together, even though they are in different IOMMU groups.
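
For anyone mapping out the same thing, a quick way to see which IOMMU group each USB controller lands in (assumes the IOMMU is enabled, of course):

# list USB controllers and the IOMMU group each one sits in
for dev in $(lspci -Dnn | grep -i 'usb controller' | cut -d' ' -f1); do
    group=$(basename "$(readlink /sys/bus/pci/devices/$dev/iommu_group)")
    echo "group $group: $(lspci -s $dev)"
done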

2 Likes

This is where I get stuck too. The AMD website will pick out a driver for me, but the actual installer will complain and bail. Device manager shows error code 43.

I’ve been using the Zen 4 iGPU since kernel 6.1, and 4K/120 Hz would probably not be possible without suitable drivers, or what do you think?
I haven’t had any problems with kernel 6.1; since 6.2 I need “amdgpu.sg_display=0”, without it I get a white flickering screen instead of the KDE login.
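
For anyone else hitting that flicker, on a GRUB-based install the parameter goes in roughly like this (file path and regeneration command differ with other bootloaders/distros):

# /etc/default/grub - append to whatever parameters you already have
GRUB_CMDLINE_LINUX_DEFAULT="quiet amdgpu.sg_display=0"
# then regenerate the config, e.g.:
sudo grub-mkconfig -o /boot/grub/grub.cfg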

Did you select the iGPU in your BIOS?

I am running Ubuntu Server without a desktop environment. I am pretty sure that once I install the desktop, everything will be fine.

I get most if not all of my hardware from recyclers, so I don’t really get to choose. I’ve got a Haswell Xeon HP box that has its IOMMU groups laid out fairly well - literally everything is in its own IOMMU group, excluding chipset stuff. I’m passing a Maxwell Quadro that hasn’t put up a fight so far.

I do have a Radeon Pro W5700 that I’m using for the host, with Looking Glass for video, and Fedora as the OS, which has a broken Xorg for some reason, so I’m on Wayland with buggy X11 windows and a disappearing mouse; not sure what happened there. Looking Glass seems to work fine without its kernel module, so I’m not sure what’s happening there either.

I used the kernel cmdline arguments for the passthrough and never got specific PCIe device binding working. I might take a crack at it next month though, as I need to host some more Windows VMs on it for other folks soon.

I can say that it’s been extremely stable when I use it, although suspend does break the VM, and it has to be rebooted. I’ve been using it on and off on the weekly, and it hasn’t failed me thus far.

Using macOS as a guest instead of Windows does have its issues, most notably that I can’t use anything older than Maxwell due to UEFI support, and there aren’t any drivers, so yeah, a fail there. I did try a Kepler GPU, and macOS did detect it, but it couldn’t output video on Big Sur.

My only gripe about this setup is that it doesn’t have enough PCIe power, and that can’t easily be fixed.

For the life of me, I’m missing a step. I had to reload due to my own mistakes. I am running an Arch-based distro and I cannot recall how I got the VFIO drivers to load for the 1050 I want to pass through. Since the other card is a 2080 Super, I can’t really blacklist the nvidia driver or I just boot to a black screen. If anyone has a tip on what I am missing, that would be great. I have added the lines back to the GRUB loader for the card, and a vfio.conf in modprobe.d with the following: options vfio-pci.ids=10de:1c82,10de:0fb9

To add my recent success with VFIO on Proxmox, using an Intel Arc A770M in a NUC12SNKI72, which had its own reset issue.

I never posted my setup, but it’s as follows:

EPYC 7443p
Supermicro H12SSL-i
512GB RAM
A2000 for the host
A4000 for the Windows VM
Samsung NVMe boot drive
Samsung NVMe VM drive
Some 60T of spinny rust for storage and Plex/Jellyfin etc
Ubuntu 23.04 vanilla, no special sauce in the software stack

Sprinkled with a healthy dose of RGB and maglev fans for silent, yet visually loud, computing.

2 Likes

Thanks, I’ve now tried Ubuntu 23.04. With “local APIC mode” set to Auto, I can boot normally and the iGPU works without any problems.
With x2APIC I get all sorts of errors and the system won’t boot.

I mean, I don’t really need x2APIC; so far everything works without it, but it still annoys me.

Edit: Windows 10 also won’t boot with x2APIC, and it looks like it doesn’t make sense to bother ASUS with it either, because it seems to be a Zen 4 problem and a known issue:

https://lore.kernel.org/xen-devel/[email protected]/T/

You should save the USB converters as a last resort. You know that every kind of converter has the ability to “just not want to work” in specific hardware/software configs… :-/
If you are building a media server or are just an r/datahoarder, then SATA would be plenty fast enough. And you wouldn’t have to worry at all about not having enough PCIe lanes, since each SATA expansion card only requires an x1 slot.

I’d love to experiment more with GPU virtualisation, since I’d like to have my games and their pesky anticheats contained in a separate env, plus serve some cloud gaming to my not-so-rich friends.

IMHO vGPU comes in three flavors: SR-IOV, time slicing (vGPU-P), and Nvidia’s own enterprise vGPU, which is a whole mess of setup and licensing to deal with.

I wonder if we could get something close to a Hyper-V vGPU-P but in Proxmox.

I mean, it’s already working on most consumer-grade Nvidia cards without any hacks. It also works flawlessly on some AMD cards (no info on Intel).

I believe it’s the way to go, because it doesn’t require the GPU vendor to enable anything, and that’s very unlikely to change.

1 Like

When adding the GPU and its audio device, you need to make sure that they share the same PCI slot in the guest, with the GPU on function 0x0 and the audio on function 0x1, and that multifunction is enabled in the XML for the GPU.
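
A rough sketch of what that looks like (the host-side bus/slot values here are just examples):

<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>   <!-- GPU on the host -->
  </source>
  <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0' multifunction='on'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>   <!-- GPU audio on the host -->
  </source>
  <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x1'/>  <!-- same guest slot, function 0x1 -->
</hostdev>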

If you added the GPU and the audio from the libvirt configuration by clicking “Add Host Device”, they are both added as separate devices, when in reality they are part of the same device, and this breaks the GPU drivers. Can you confirm how you added the GPUs to the VM config?

As for your vfio driver loading issue: you don’t need both the kernel line and the modprobe line - for me the kernel parameter alone is enough. Also, make sure that the hooks in your mkinitcpio.conf are defined properly - are you following the Arch wiki?
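
Roughly what the Arch wiki route amounts to (using the IDs from your post; adjust if your initramfs tooling differs):

# kernel command line (e.g. in the GRUB entry)
vfio-pci.ids=10de:1c82,10de:0fb9

# /etc/mkinitcpio.conf - load vfio early; the modconf hook must be present in HOOKS
MODULES=(vfio_pci vfio vfio_iommu_type1)   # older kernels also want vfio_virqfd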

1 Like

I am adding them in the manner you mention. Thank you for the clarification on that; this has to be the first time I have heard this, although I won’t promise that, as I have read so many wikis etc… I am using Garuda, and mkinitcpio is not used - it’s initramfs/dracut, I think. So I have been doing, for lack of a better way of saying it, on-the-job training in getting this working. While it is Arch-based and I love Garuda, trying to get this working has made me desire some other distro. Everything is currently working well other than the audio from the card. I will try adding it in a different fashion.

The Arch wiki has been my go-to for the most part. As a newbie it’s a bit difficult at times, but I’m having fun learning :wink:

So I should add them both, then edit the audio device to use the same PCI slot as the GPU, basically?

Here is the current entry:

GPU:

<hostdev mode="subsystem" type="pci" managed="yes">
  <driver name="vfio"/>
  <source>
    <address domain="0x0000" bus="0x04" slot="0x00" function="0x0"/>
  </source>
  <alias name="hostdev0"/>
  <rom file="/home/brett/gpu-20230825T210850.rom"/>
  <address type="pci" domain="0x0000" bus="0x06" slot="0x00" function="0x0"/>
</hostdev>

Sound Card:

<hostdev mode="subsystem" type="pci" managed="yes">
  <driver name="vfio"/>
  <source>
    <address domain="0x0000" bus="0x04" slot="0x00" function="0x1"/>
  </source>
  <alias name="hostdev4"/>
  <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
</hostdev>

So after much reading here I went ahead and set up Looking glass. Much success.

My setup

Ryzen 5800x
Asus X570 Tuf Gaming
32GB DDR4 4200MHz G.Skill kit
Passthrough GPU - Asus 3070ti - Bought at the height of Pandemic prices, so this sucker has to last
Desktop GPU - Intel Arc 750 - Bought at a discount
Base OS - Arch Linux, no major modifications from the base install (actually just moved over from Debian)

All went fine with one minor self-imposed hiccup. The Looking Glass client was having trouble connecting and complained about bad versions, but that was because I cloned the repo and built the client myself - the standard wrong assumption on my part rather than reading what was written. 30 minutes lost is the worst that happened there. Otherwise the docs are solid and easy.

I am still messing around with CPU flags and topology to get Windows to be happy with Memory Integrity under Core Isolation, but that looks pretty straightforward. There are apparently performance costs there, so I’ll play and see for myself.
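
For reference, the libvirt side of that tinkering is mostly the CPU block; a minimal sketch with a core/thread split matching a 5800X (illustrative, not a finished config):

<cpu mode='host-passthrough' check='none'>
  <topology sockets='1' dies='1' cores='8' threads='2'/>
</cpu>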

At the same time, I have been rockin’ VFIO passthrough in the traditional Proxmox-to-Plex LXC container setup.

Platform: (Basically half of VVK’s setup :stuck_out_tongue:)
EPYC 7282
Supermicro H12SSL-i
128GB ECC RAM
No GPU for the host (relying on the BMC)
Intel Arc A380 for the LXC. Works like a charm.

Nothing out of the ordinary there, and it’s been pretty solid for years now. I used to have an Nvidia card in there, but the Arc is amazing. I used to occasionally upgrade the kernel and forget to reinstall the Nvidia kernel modules, and the difference between GPU transcoding and CPU transcoding is significant. The Intel stuff is nicer, as there aren’t shitty drivers with Nvidia peculiarities attached. IIRC it was just install via apt and pass it through for basic support. (Minor note: intel_gpu_top won’t run or show meaningful results in an unprivileged container, of course - do it on the host.)
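
For anyone replicating it, the container side boils down to a couple of lines in the LXC config (226 is the DRM major number; unprivileged containers also need the right idmap/permissions on the render node):

# /etc/pve/lxc/<ctid>.conf
lxc.cgroup2.devices.allow: c 226:* rwm
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir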

I had some good fortune: I was struggling to get Plex to see the GPU and use it (the container could see it, but Plex wouldn’t use it), but then, some four days after I put the A380 in, they updated Plex with explicit GPU selection and it was happy with the DG2.
Eventually I want to get everything in AV1, as that plays nicely with everything I use now.

For the desktop setup it’s all gaming related. The OBS plugin is super helpful, as I do some speedrun races that require OBS streaming (and they’re not graphically intensive). Many of the tracking tools are Windows-only and don’t play nicely with Wine or Proton.

So huge thanks to everyone sharing their setups and experiences, and especially to those providing support - it gave me much to search through.

Gnif’s troubleshooting support, both here and on the LG Discord, was great for searching and finding just about every question and minor pitfall I had. Solid stuff.

All of this setup was so much fun. Never thought I’d be such a fan of intel GPUs for so many things.

3 Likes

I never used this, what is it good for?

My GPU configuration looks like this; I haven’t noticed any problems with it yet:

 <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
      </source>
      <rom file='/etc/libvirt/navi31.rom'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x1'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x2'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x2'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x0000' bus='0x03' slot='0x00' function='0x3'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x3'/>
    </hostdev>

Not sure about the hostdev entries; it placed that in there on its own.

The video seems a little faster with it set to VFIO.