Ubuntu 17.04 -- VFIO PCIe Passthrough & Kernel Update (4.14-rc1)

wendell · September 24, 2017, 1:14am

DRAFT of the article for Ubuntu passthrough setup from the live stream. Images are missing for now. Any comments? A video is being edited that goes through the how-to step by step.

Step 1
Install Ubuntu.
Step 2
Install ukuu:

sudo apt-add-repository -y ppa:teejee2008/ppa
sudo apt-get update
sudo apt-get install ukuu

Use ukuu to install the latest kernel. In our case we want Kernel 4.14-rc1 because that is the latest kernel as of the time of this writing and this kernel is known to work on this hardware configuration, even though it is a pre-release kernel. This kernel, or later, is also important for AMD Ryzen or Threadripper users as it contains a number of updates for those platforms.

The update utility will often fail the first time due to the cache folders not being created automatically. Just run it again and it will work the second time.

Notice that as it updates the kernel, it also automatically runs update-grub and update-initramfs and that it complains about missing microcode. It isn’t clear if the microcode is required, but the microcode is non-free. It can be found on the internet, however.

We must also add some kernel parameters and re-run the update commands above, but that can be done later.

Step 3 (optional, skip)
Find the microcode and copy it to /var/lib/firmware.

Step 4 (doing the VFIO stuff)
We are ready to set up a virtual machine and pass through real graphics hardware to the virtual machine. To start, we need to make sure the host platform has been configured correctly: that hardware virtualization extensions are enabled, and that the IOMMU groupings are suitable.

First, make sure that VT-d (or SVM on the AMD platform) is enabled in the UEFI. This varies from platform to platform, but Level1 Techs motherboard reviews typically cover where these options are and what the IOMMU groupings are for particular motherboards.

Most Linux distros do not enable IOMMU by default. You will need to update your grub bootloader config to support IOMMU. Fortunately, Ubuntu makes it fairly straightforward.

Edit /etc/default/grub and find the line below. Append “iommu=1 intel_iommu=on” to the quoted string. e.g.

GRUB_CMDLINE_LINUX_DEFAULT="iommu=1 intel_iommu=on"

(Or if you are on the AMD platform, amd_iommu=on is more appropriate.)

Another perfectly valid command line might be something like:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash iommu=1 intel_iommu=on"
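If you would rather script the edit than do it by hand, something like the following could work. This is only a sketch and was not part of the stream: the function name add_grub_params is made up for illustration, and it assumes the file has exactly one GRUB_CMDLINE_LINUX_DEFAULT="..." line with nothing after the closing quote and no | or & characters inside it. It edits whatever file you point it at, so try it on a copy first.

```shell
#!/bin/bash
# add_grub_params FILE PARAM...   (hypothetical helper, not from the stream)
# Appends each PARAM to the GRUB_CMDLINE_LINUX_DEFAULT="..." line in FILE,
# skipping parameters that are already present on that line.
add_grub_params() {
  local file=$1; shift
  local current param
  # Extract the text between the double quotes.
  current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
  for param in "$@"; do
    case " $current " in
      *" $param "*) ;;                               # already present
      *) current="${current:+$current }$param" ;;    # append with a space
    esac
  done
  # Write the line back (GNU sed in-place edit).
  sed -i "s|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT=\"$current\"|" "$file"
}
```

For example, as root: add_grub_params /etc/default/grub iommu=1 intel_iommu=on, and then run update-grub so the change is picked up.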

Reboot the system; then it is a good idea to check that IOMMU grouping is working properly. This script will help you:

#!/bin/bash
for d in /sys/kernel/iommu_groups/*/devices/*; do
  n=${d#*/iommu_groups/*}; n=${n%%/*}
  printf 'IOMMU Group %s ' "$n"
  lspci -nns "${d##*/}"
done

This is a shell script. Paste this into a file such as ls-iommu.sh and then run
chmod +x ./ls-iommu.sh
to mark it as executable. It should be run with
./ls-iommu.sh

You should see output similar to the following:

If you do not see any “IOMMU Group X” in the output, it means that IOMMU has not been properly enabled. You can run the dmesg command to look for clues after a fresh bootup.
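One quick clue, before digging through dmesg, is whether the parameters you added to /etc/default/grub actually reached the running kernel. The sketch below is our own illustration (the function name iommu_params is made up); it just picks the IOMMU-related words out of a kernel command line string.

```shell
#!/bin/bash
# iommu_params CMDLINE   (hypothetical helper)
# Prints the IOMMU-related parameters found in a kernel command line string,
# one per line. Prints nothing if none are present.
iommu_params() {
  local -a words
  local word
  read -r -a words <<< "$1"          # split on whitespace, no globbing
  for word in "${words[@]}"; do
    case $word in
      iommu=*|intel_iommu=*|amd_iommu=*) printf '%s\n' "$word" ;;
    esac
  done
}
```

On a live system you would call it as iommu_params "$(cat /proc/cmdline)". If it prints nothing, the grub change has most likely not been applied yet (for example because update-grub was not re-run).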

Notice that our two “VGA compatible controller” entries (our graphics cards) show up as the R9 Fury and the Polaris12. The Polaris12 is our RX550. Each graphics card has an accompanying audio device, and each card’s audio device is in the same IOMMU group as the parent card.

We are ready to set up our devices to be used inside a virtual machine. To do this, we must bind the vfio-pci driver to the device(s) we want to pass through to the virtual machine, and this is most easily done by PCI device ID. Note the PCI device IDs: [1002:699f] and [1002:aae0] from the ls-iommu.sh script. (The device IDs can also be seen with the lspci -nn command).
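Copying the IDs by hand is easy to get wrong, so here is a small sketch that pulls them out of lspci -nn style text and joins them with commas, which is the form the vfio_pci ids= option uses later on. The function name pci_ids_csv is made up for this example, and it assumes the IDs appear in lowercase hex inside square brackets as lspci prints them.

```shell
#!/bin/bash
# pci_ids_csv   (hypothetical helper)
# Reads `lspci -nn`-style lines on stdin and prints every [vendor:device] ID
# found, de-duplicated in order of appearance and joined with commas.
pci_ids_csv() {
  grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
    | tr -d '[]' \
    | awk '!seen[$0]++' \
    | paste -sd, -
}
```

A possible use, assuming the card and its audio device sit in IOMMU group 15 on your system (the group number here is only a placeholder): ./ls-iommu.sh | grep 'IOMMU Group 15 ' | pci_ids_csv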

It is important to understand the boot process on Ubuntu 17.04: The initial ram disk (/boot/initrd*) contains drivers and “other stuff” necessary to bootstrap the system from nothing, mount the root filesystem and continue with the rest of the boot up process. Because graphics drivers tend to load early, and because the amdgpu driver will want to grab hold of both of our graphics cards, we must carefully ensure that the vfio-pci driver is given the opportunity to load ahead of the amdgpu driver so that it takes priority. This is accomplished by modifying the initial ramdisk as follows:

Modify the
/etc/initramfs-tools/modules
file and add the following lines:

softdep amdgpu pre: vfio vfio_pci

vfio
vfio_iommu_type1
vfio_virqfd
options vfio_pci ids=1002:699f,1002:aae0
vfio_pci ids=1002:699f,1002:aae0
vfio_pci
amdgpu

(substituting your own device IDs for 1002:699f and 1002:aae0, of course). The syntax here is probably imperfect; it should not be necessary to specify both an options line and vfio_pci with ids. The author should clean this up later.

To keep our system consistent, we should also modify the “regular” bootup module options as well.
Add the following contents to the
/etc/modules
file.

vfio
vfio_iommu_type1
vfio_pci ids=1002:699f,1002:aae0

We will also create explicit configurations for the modules in /etc/modprobe.d. Again, this is likely overkill, but we had some difficulty on the live stream.

/etc/modprobe.d/amdgpu.conf 

softdep amdgpu pre: vfio vfio_pci

/etc/modprobe.d/vfio_pci.conf

options vfio_pci ids=1002:699f,1002:aae0
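The two files above can also be generated from a script, which makes it harder to mistype the IDs in one place and not the other. This is only a sketch: the function name write_vfio_modprobe is invented here, and it takes the target directory as an argument so you can write into a scratch directory and inspect the result before touching /etc/modprobe.d.

```shell
#!/bin/bash
# write_vfio_modprobe DIR IDS   (hypothetical helper)
# Writes amdgpu.conf and vfio_pci.conf, with the contents shown above,
# into DIR. IDS is a comma-separated list such as 1002:699f,1002:aae0.
write_vfio_modprobe() {
  local dir=$1 ids=$2
  # Ask modprobe to load vfio and vfio_pci before amdgpu.
  printf 'softdep amdgpu pre: vfio vfio_pci\n' > "$dir/amdgpu.conf"
  # Tell vfio_pci which devices to claim.
  printf 'options vfio_pci ids=%s\n' "$ids" > "$dir/vfio_pci.conf"
}
```

For the real thing you would run, as root, write_vfio_modprobe /etc/modprobe.d 1002:699f,1002:aae0 (with your own IDs).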

Once all that is done, reboot the system and run lspci -nnv |less and search the command output for the device ID that you bound to vfio. You should see something similar to the following:

  Kernel driver in use: vfio-pci
  Kernel modules: amdgpu

This indicates that while the amdgpu driver understands this card, the driver servicing this card is vfio-pci. This is what one wants to see; it should be the same for both the graphics card and its audio device.
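Rather than paging through the output by eye, the check can be scripted. The sketch below is our own (the name driver_for_id is made up); it reads lspci -nnk output, which includes the same “Kernel driver in use” lines, and assumes that device header lines start in the first column while detail lines are indented, as lspci prints them.

```shell
#!/bin/bash
# driver_for_id ID   (hypothetical helper)
# Reads `lspci -nnk` output on stdin and prints the "Kernel driver in use"
# value for each device whose header line contains [ID].
driver_for_id() {
  awk -v id="[$1]" '
    # A header line starts with the hex bus address in column one.
    /^[0-9a-f]/ { wanted = (index($0, id) > 0) }
    # Indented detail line belonging to a device we care about.
    wanted && /Kernel driver in use:/ {
      sub(/.*Kernel driver in use:[ \t]*/, "")
      print
    }
  '
}
```

For example: lspci -nnk | driver_for_id 1002:699f should print vfio-pci once the binding is working. If it prints amdgpu instead, the initramfs changes have not taken effect.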

Next we need to install virt-manager and qemu-kvm:

apt install virt-manager qemu-kvm ovmf

If you have reached this stage, you have done well. We can now set up our virtual machine, but first we need to deal with AppArmor, a security package for Ubuntu. AppArmor, in a nutshell, prevents programs from doing suspicious things.

Edit the /etc/apparmor.d/abstractions/libvirt-qemu file and find the # for usb access section and modify it thusly:

# for usb access
 /dev/bus/usb/** rw,
 /etc/udev/udev.conf r,
 /sys/bus/ r,
 /sys/class/ r,
 /run/udev/data/* rw,

It is a known issue:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1708013

Note that if you intend to pass through a “real” block device to QEMU/libvirt, you will also need to open up things a bit more, adding lines such as:
/dev/sda rw,

This applies if, for example, you intend to use /dev/sda as a passthrough block device with your virtual machine. If you are using a qcow2 or image file, you don’t need to do this.

With the file modified, you must restart apparmor with

service apparmor restart

If you find AppArmor is still causing problems, you can remove it entirely (not recommended):

service apparmor stop
service apparmor teardown
update-rc.d -f apparmor remove
apt remove apparmor

How do you know if AppArmor is causing problems? It will give you clues when you run the dmesg command or check the logs located in /var/log:

audit: type=1400 audit(1506210770.241:42): apparmor="DENIED" operation="open" profile="libvirt-407d2e4f-3a8e-47ba-abfb-45a333b7fd8f" name="/dev/bus/usb/" pid=2714 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=64055 ouid=0
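Those log lines are long, so a filter helps when deciding which paths to add to the AppArmor file. The following is a sketch of our own (the function name apparmor_denied_paths is made up); it assumes the record format shown in the line above, with the path in a name="..." field.

```shell
#!/bin/bash
# apparmor_denied_paths   (hypothetical helper)
# Reads kernel log lines on stdin and prints the path from the name="..."
# field of every AppArmor DENIED record, de-duplicated.
apparmor_denied_paths() {
  grep 'apparmor="DENIED"' \
    | sed -n 's/.* name="\([^"]*\)".*/\1/p' \
    | sort -u
}
```

Run it as dmesg | apparmor_denied_paths right after a failed VM start; each path it prints is something AppArmor refused to let QEMU touch.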

Start Virtual Machine Manager, and create a new VM. Be sure to check “Customize Configuration” on the last step.

You must change the Firmware from BIOS mode to OVMF, which is a UEFI bios. You must also remove “Video QXL” and “Display Spice” and any other displays from the config. Use “Add Hardware” to add both the graphics card and its companion audio device. The author recommends plugging in a second mouse and keyboard and assigning those USB devices here as well. When the virtual machine is powered on, it will assign those USB devices to the virtual machine.

It would also be a good idea to add an operating system installation ISO if one was not specified during the initial setup wizard from the Virtual Machine Manager. This ISO should support UEFI boot mode.

We downloaded the Windows ISO from Microsoft and set it up using this method.

Finally, install drivers for your hardware. Note that some graphics hardware suffers from what is known as a “Reset Bug” – this means that the graphics card can only be initialized one time and the machine must be power cycled to support reinitialization. The Fury cards are known to suffer from this bug.

If you are using an Nvidia card, you are likely to get a cryptic “Code 43” error. The Nvidia drivers do not support running in a virtual machine, but this can be worked around by modifying your virtual machine configuration to hide the fact that a hypervisor is present.

So you need a raw disk? Virsh edit that sucker:

<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sda'/>
  <target dev='vdb' bus='sata'/>
</disk>

(Don’t forget to edit your AppArmor config to, for example, allow r/w access to /dev/sda or /dev/whatever. Also, virtio is waaaaaay faster than sata, but you need to install the virtio drivers THEN switch to virtio from sata…)

<disk type='block' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source dev='/dev/sda'/>
  <target dev='vda' bus='virtio'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x0d' function='0x0'/>
</disk>

Note that Fedora doesn’t care about not having driver/address, or it didn’t care, but Ubuntu does care about having address type pci/etc. Probably also need bootable=1?

<boot order='1'/>

Might be needed, depending on the rest of your VM XML config. It is safe to make this edit through the GUI too.

When you switch from sata to virtio for the boot disk, you have to give it a push sometimes. Add a second virtio disk to be sure the drivers are working properly, THEN switch the primary disk from sata to virtio.

Build 1830 of Windows is buggy. You have to ignore the MSRs in KVM:

options kvm ignore_msrs=1
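An options line like this belongs in a file under /etc/modprobe.d, the same way as the vfio_pci options earlier (the file name, say kvm.conf, is up to you and is an assumption here). After a reboot or a module reload you can confirm the setting took effect by reading the parameter back from sysfs. A sketch, with a made-up function name:

```shell
#!/bin/bash
# kvm_ignore_msrs_enabled [FILE]   (hypothetical helper)
# Succeeds if the kvm module reports ignore_msrs as enabled. FILE defaults
# to the module parameter in sysfs; it is an argument only so the check can
# be tried against a plain text file.
kvm_ignore_msrs_enabled() {
  local param_file=${1:-/sys/module/kvm/parameters/ignore_msrs}
  # Boolean module parameters read back as Y or N.
  [ -r "$param_file" ] && [ "$(cat "$param_file")" = "Y" ]
}
```

Usage: kvm_ignore_msrs_enabled && echo "ignore_msrs is on"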

And if Nvidia is giving you code 43, you need to set the vendor id and hide kvm:

<hyperv>
  <relaxed state='on'/>
  <vapic state='on'/>
  <spinlocks state='on' retries='8191'/>
  <vendor_id state='on' value='whatever'/>
</hyperv>
<kvm>
  <hidden state='on'/>
</kvm>

inside the <features> section of your config.
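It is easy to save the XML and later find that one of the two pieces did not stick. The sketch below is our own quick check (the function name nvidia_workaround_present is made up); it just greps the domain XML for the two elements above and assumes attributes are single-quoted, as in the snippets in this article.

```shell
#!/bin/bash
# nvidia_workaround_present   (hypothetical helper)
# Reads a libvirt domain XML on stdin and succeeds only if both the
# hyperv vendor_id override and the kvm hidden flag are switched on.
nvidia_workaround_present() {
  local xml
  xml=$(cat)
  grep -q "<hidden state='on'/>" <<< "$xml" \
    && grep -q "<vendor_id state='on'" <<< "$xml"
}
```

For example, with a VM named win10 (a placeholder name): virsh dumpxml win10 | nvidia_workaround_present && echo "both settings present"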

DrewSaga · September 24, 2017, 1:35am

So does this enable us to run a Virtual Machine on the same Keyboard and Mouse as the Host?

1ncanus · September 24, 2017, 1:40am

When I tested Ubuntu with my PCI passthrough, I used 16.04.3 with the latest HWE kernel at the time (I believe 4.12.3?). Because of how my mobo looks for the GPU to use as the main screen, I had to blacklist “amdgpu” so that vfio_pci gets loaded first. For whatever reason the amdgpu driver continually insisted on loading before the vfio_pci driver until I did that.

I have not tested with 17.04, nor with your exact steps, but it may be something that someone might need to do to get their PCI passthrough functional.

I also had to tell the X server which GPU to use, in the “Device” Section, based on the Bus ID:

Section "Device"
…Identifier "Card1"
…BusID "PCI:2:0:0"
EndSection

This is from memory, so apologies if I forgot anything.

wendell · September 24, 2017, 1:42am

we cheated, we did it with

softdep amdgpu pre: vfio vfio_pci

which finally worked after some fiddling.

1ncanus · September 24, 2017, 1:43am

I missed that. That would work as well. thumbs up

wendell · September 24, 2017, 1:43am

Yes, but not with this config; we didn’t take it that far.

1ncanus · September 24, 2017, 1:55am

One more thing: I initially needed the “Display Spice” and “Video QXL” until I had installed the OS and then the correct driver for the GPU, and THEN removed the Spice and QXL stuff. I couldn’t get mine working otherwise. Possibly a hardware difference explains why yours worked and mine required it this way?
Processor: i7-6850k
Mobo: Asrock X99 Extreme4
GPU(passthrough): RX580

DrewSaga · September 24, 2017, 3:35pm

Oh, also I forgot to ask if this would also work on a single display, I figured if that were the case I could use this type of hardware passthrough on a laptop with an iGPU and dGPU…

mihawk90 · September 24, 2017, 8:23pm

Mobile CPUs typically don’t have the necessary Virtualisation extensions. Check that first because chances are that will end your endeavour prematurely.

But yes it should work on a single display as far as I understood from the previous videos.

DrewSaga · September 24, 2017, 8:40pm

Looks like it does:

I was considering going for this laptop in particular:

mihawk90 · September 24, 2017, 8:49pm

Yeah, the CPU seems to support it. The question then is whether it can be enabled in the UEFI; if it can, you’re good to go.

DrewSaga · September 24, 2017, 8:51pm

That’s a good question, I have no idea myself as far as this laptop goes.

hondaman · September 24, 2017, 9:02pm

Let’s get that NPT bug bounty fundraiser started!

Marf · September 28, 2017, 8:05pm

Hi Wendell,

YOU ARE MY HERO!!!

I have had GPU passthrough running on Ubuntu for around 3 weeks, thanks to your original article on IOMMU & Virtualization with Ryzen, but I have some minor issues. Due to lack of time I couldn’t post an update on the original forum thread. But I think with the guidance of your new HowTo here I can get rid of the issues.

Anyway, I have some notes to your guide here:

Step 2)
Just informational: maybe you don’t wish to register a kernel PPA? You can easily install the newer kernel manually:

Go to http://kernel.ubuntu.com/~kernel-ppa/mainline/
Choose your desired kernel version, for instance 4.14-rc2: http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14-rc2/
Download the linux-headers, linux-headers-generic and linux-image packages. In this example, the following:
- linux-headers-4.14.0-041400rc2_4.14.0-041400rc2.201709242031_all.deb
- linux-headers-4.14.0-041400rc2-generic_4.14.0-041400rc2.201709242031_amd64.deb
- linux-image-4.14.0-041400rc2-generic_4.14.0-041400rc2.201709242031_amd64.deb
Open a command line, go to the download directory and install the packages via:

sudo dpkg -i linux-*4.14.0-041400rc2*.deb

Step 4) - Modules
Question: Does the redundant config of the vfio_pci stuff in the /etc/modprobe.d and /etc/initramfs-tools/modules files help to load the right module at runtime? Why I ask: I just configured the vfio modules in the /etc/modules file and passed the device IDs for pci_stub or vfio_pci via the kernel parameters in the /etc/default/grub file.

Step 4) - before install qemu-kvm and virt-manager
Hmmm I missed the update of the grub-config and the initramfs in your guide. Before you proceed with the VM stuff you should issue the following commands:
sudo grub-mkconfig; sudo update-grub; sudo update-initramfs -k all -c

And now why you are my hero:

“Modify the
/etc/initramfs-tools/modules
file and add the following lines:
softdep amdgpu pre: vfio vfio_pci…”

I’ve been looking for such a parameter for weeks XD I was stuck blacklisting amdgpu in /etc/modprobe.d/blacklist.conf and loading it via /etc/modules as the last module. Your solution is way more robust! Great!

“Edit the /etc/apparmor.d/abstractions/libvirt-qemu…”

Woahh… I missed 4 of the 5 privileges -.- I’m dumb. I ended up removing all USB devices before starting my Windows VM and had to add them while the VM booted. Thanks again!

I presented my findings about Windows virtualization and IOMMU this Monday at a department event in my company. I managed to do some benchmarks with Unigine Valley and Superposition. Here are my results on a slightly overclocked Sapphire RX 560 Pulse 4GB graphics card:

Unigine Superposition - High preset without AA - 1080p [Points]
Native:  2579 (FPS: min 15.9; max 23.3; avg 19.3)
Virtual: 2466 (FPS: min 10.7; max 23.1; avg 18.5)

Unigine Valley - High preset without AA - 1080p [Points]
Native:  2248 (FPS: min 27.0; max 100.6; avg 53.7)
Virtual: 2051 (FPS: min 10.9; max 101.9; avg 49.0)

Some of my coworkers now want to set up a fully virtualized Windows too!

Thanks for your awesome work here!

Edit @ 2017-10-05:
Hmmmm … I figured out: if I change the Performance Settings of a Virtual Disk (for instance in QCOW2 format) from Hypervisor default to Cache mode: directsync and IO mode: threads,
I can improve my Valley performance by around 1 FPS or a bit more on average, resulting in a score of 2100 up to 2120 points. Even better: the benchmark sequence was noticeably less choppy.

I tested XEN virtualization 4 weeks ago but I didn’t get the pci_back module working right to claim my Sapphire RX 560.
Maybe I’ll manage to test it again this weekend.

xekon · October 16, 2017, 2:54am

I appreciate the time spent to write this up, as well as the videos you have made.

I had to replace amdgpu with nvidiafb when I modified "/etc/initramfs-tools/modules"
I found it was nvidiafb by checking the kernel module listed when using lspci -nnv |less

I followed this guide to a T, but I selected the latest kernel, which is v4.14-rc4, and this resulted in it not using the kernel driver vfio-pci.
Once I installed v4.14-rc3 it started loading the vfio-pci driver.

However, with rc2, rc3, and rc4 my menu bar takes a couple of minutes to load, and in the meantime I have to use alt+f2 to search for and launch programs; rc1 worked without issue.
I am running ubuntu 17.04 with plasma-desktop (kde5)

so I am now at the fun part where I get to test out the VM, however I cannot until tomorrow when my displayport cable arrives.

liefer · November 26, 2017, 6:30am

Thank you for this! With this I was finally able to get everything to work perfectly. No other resource out there seems to offer these 2 details:

softdep amdgpu pre: vfio vfio_pci

and

AppArmor

Thanks again!

kitliasteele · July 18, 2018, 12:20am

One thing to note about AppArmor is that you can actually tell it to ‘aa-complain /etc/apparmor.d/libvirtd/*’ and it will whitelist the entire libvirt, including the dynamic files that are created there. Something you may want to add to your guide, @wendell. I tested it and it reports ALLOWED in dmesg, so we can keep AppArmor going. I’m still suffering an issue with an error that may still be related. Not sure if you can give insight on it.

EDIT: dmesg output-

[63433.893882] audit: type=1400 audit(1531873183.044:8024): apparmor="ALLOWED" operation="file_inherit" profile="/usr/sbin/libvirtd//null-libvirt-dbf05741-fb3e-4564-ba55-1a889636b7dc//null-/usr/bin/kvm" name="/dev/net/tun" pid=56568 comm="kvm-spice" requested_mask="wr" denied_mask="wr" fsuid=1000 ouid=0

Second EDIT: Another thing to note about AppArmor: the command to whitelist is ‘sudo ln -s /etc/apparmor.d/libvirt/* /etc/apparmor.d/disable/’, but you want to carefully time it with entities like Looking Glass, since libvirt will delete the profiles the moment the VM stops. So you want to hit the command the moment you start the VM, so the symlink is retained on everything that was there at that moment. This allows you to keep using AppArmor. I’m still investigating the bootup issues, but making progress as I go.

inthebrilliantblue · July 25, 2018, 12:45am

Tried this guide on my Ryzen 1700 and Vega 64, and it works when the V64 is not the first GPU on my board. When I move it to the first PCIe slot to have the full x16 bandwidth, the VM refuses to boot. It just freezes. I know that the location IDs for the PCIe passthrough change, so I made sure the V64 is still on the right ID. Checked the other devices as well and everything is correct. So why wouldn’t the VM work when moving the hardware?

wendell · July 25, 2018, 12:46am

Change uefi to init other gfx card first?

On Ryzen 1700 the first slot is x8 if the second slot is occupied. It’s x16/x0 or x8/x8.

If you want the board to boot from graphics off the chipset’s lanes, you need a board that lets you pick slot init order. Like Gigabyte.

inthebrilliantblue · July 25, 2018, 12:47am

I did, Ubuntu 18.04 boots to the secondary card and displays.