I am trying to set up Windows passthrough on my new Threadripper system and I have run into a problem. I didn’t realize when purchasing my components that having two identical video cards would give me such a headache.
Anyway, I have a working install of Win7 running on QEMU through virt-manager (using the guide from Arch Linux), and I know that PCI passthrough in general must be working, as I was able to pass through my NVMe drive for the install and it works.
However, I am failing to get PCI passthrough to work on my video cards. I have identical cards (Radeon Pro WX 3100) and, with some frustration, I have even managed to get the system running where one card is bound to the amdgpu driver and the other is bound to the vfio-pci driver.
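For anyone wanting to double-check the binding, a small sketch like the following reads the `driver` symlink that sysfs exposes for each PCI device. The function name `bound_driver` and the `sysfs_root` parameter are just names I made up for illustration; the addresses are the ones that show up in my log below, and I am assuming the usual `/sys/bus/pci/devices` layout.

```python
import os


def bound_driver(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to pci_addr, or None.

    Assumes the standard sysfs layout, where each device directory has a
    'driver' symlink pointing at the driver it is currently bound to.
    """
    link = os.path.join(sysfs_root, pci_addr, "driver")
    if not os.path.islink(link):
        # No driver bound, or the device does not exist.
        return None
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # Addresses taken from the log: GPU function and its audio function.
    for addr in ("0000:08:00.0", "0000:08:00.1"):
        print(addr, bound_driver(addr))
```

On my setup I would expect the passed-through card and its audio function to report `vfio-pci`, and the other card to report `amdgpu`.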
When I edit the VM in virt-manager to pass through the card bound to vfio-pci (and the associated audio device), things fail.
Virt-manager seems to show something running and there is a visible qemu process, but nothing comes up on the display. X becomes much less responsive and my logs show that a lot is going wrong:
Output
Jun 1 14:07:13 threadripper kernel: br0: port 3(vnet0) entered disabled state
Jun 1 14:07:13 threadripper kernel: device vnet0 left promiscuous mode
Jun 1 14:07:13 threadripper kernel: br0: port 3(vnet0) entered disabled state
Jun 1 14:07:13 threadripper kernel: input: Belkin Corporation Flip CC as /devices/pci0000:40/0000:40:07.1/0000:44:00.3/usb5/5-4/5-4.1/5-4.1:1.0/0003:050D:3201.0008/input/input17
Jun 1 14:07:13 threadripper kernel: belkin 0003:050D:3201.0008: input,hiddev96,hidraw0: USB HID v1.10 Device [Belkin Corporation Flip CC] on usb-0000:44:00.3-4.1/input0
Jun 1 14:07:14 threadripper kernel: nvme nvme1: pci function 0000:42:00.0
Jun 1 14:07:14 threadripper kernel: nvme 0000:42:00.0: enabling device (0400 -> 0402)
Jun 1 14:07:14 threadripper kernel: nvme1n1: p1 p2 p3
Jun 1 14:07:14 threadripper ntpd[5133]: Deleting interface #12 vnet0, fe80::fc54:ff:fe64:842e%9#123, interface stats: received=0, sent=0, dropped=0, active_time=33 secs
Jun 1 14:09:41 threadripper kernel: nvme nvme1: failed to set APST feature (-19)
Jun 1 14:09:41 threadripper kernel: br0: port 3(vnet0) entered blocking state
Jun 1 14:09:41 threadripper kernel: br0: port 3(vnet0) entered disabled state
Jun 1 14:09:41 threadripper kernel: device vnet0 entered promiscuous mode
Jun 1 14:09:41 threadripper kernel: br0: port 3(vnet0) entered blocking state
Jun 1 14:09:41 threadripper kernel: br0: port 3(vnet0) entered forwarding state
Jun 1 14:09:43 threadripper kernel: vfio_ecap_init: 0000:42:00.0 hiding ecap 0x19@0x168
Jun 1 14:09:43 threadripper kernel: vfio_ecap_init: 0000:42:00.0 hiding ecap 0x1e@0x190
Jun 1 14:09:43 threadripper kernel: vfio-pci 0000:08:00.0: enabling device (0002 -> 0003)
Jun 1 14:09:43 threadripper kernel: vfio_ecap_init: 0000:08:00.0 hiding ecap 0x19@0x270
Jun 1 14:09:43 threadripper kernel: vfio_ecap_init: 0000:08:00.0 hiding ecap 0x1b@0x2d0
Jun 1 14:09:43 threadripper kernel: vfio_ecap_init: 0000:08:00.0 hiding ecap 0x1e@0x370
Jun 1 14:09:43 threadripper kernel: vfio-pci 0000:08:00.1: enabling device (0000 -> 0002)
Jun 1 14:09:44 threadripper kernel: usb 5-4.1: reset low-speed USB device number 3 using xhci_hcd
Jun 1 14:09:44 threadripper ntpd[5133]: Listen normally on 13 vnet0 [fe80::fc54:ff:fe64:842e%10]:123
Jun 1 14:09:45 threadripper kernel: usb 5-4.1: reset low-speed USB device number 3 using xhci_hcd
Jun 1 14:09:45 threadripper kernel: vfio_bar_restore: 0000:08:00.1 reset recovery - restoring bars
Jun 1 14:09:45 threadripper kernel: vfio_bar_restore: 0000:08:00.0 reset recovery - restoring bars
Jun 1 14:09:45 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:45 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:45 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:46 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:46 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:46 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:46 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:46 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:46 threadripper kernel: AMD-Vi: Event logged [
Jun 1 14:09:46 threadripper kernel: IOTLB_INV_TIMEOUT device=08:00.0 address=0x000000083d4fbdb0]
Jun 1 14:09:46 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:46 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:46 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:47 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:47 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:47 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:47 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:47 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:47 threadripper kernel: AMD-Vi: Event logged [
Jun 1 14:09:47 threadripper kernel: IOTLB_INV_TIMEOUT device=08:00.0 address=0x000000083d4fbde0]
Jun 1 14:09:47 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:47 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:47 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:48 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:48 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:48 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:48 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:48 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:48 threadripper kernel: AMD-Vi: Event logged [
Jun 1 14:09:48 threadripper kernel: IOTLB_INV_TIMEOUT device=08:00.0 address=0x000000083d4fbe10]
Jun 1 14:09:48 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:48 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:48 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:49 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:49 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:49 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:49 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:09:49 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
.....
Jun 1 14:10:06 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:10:06 threadripper kernel: AMD-Vi: Event logged [
Jun 1 14:10:06 threadripper kernel: IOTLB_INV_TIMEOUT device=08:00.0 address=0x000000083d4fa170]
Jun 1 14:10:06 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:10:06 threadripper kernel: AMD-Vi: Completion-Wait loop timed out
Jun 1 14:10:07 threadripper kernel: AMD-Vi: Command buffer timeout
Jun 1 14:10:07 threadripper kernel: WARNING: CPU: 15 PID: 22340 at drivers/iommu/amd_iommu.c:1255 __domain_flush_pages+0xe1/0xf0
Jun 1 14:10:07 threadripper kernel: Modules linked in: rfcomm snd_hda_codec_realtek snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm amdkfd amdgpu mfd_core chash gpu_sched ttm vfio_pci vfio_virqfd ip_set_hash_net xt_set xt_recent xt_comment xt_addrtype ipt_rpfilter ip_set_hash_ip xt_hashlimit xt_CT xt_multiport xt_LOG nf_conntrack_sane nf_nat_snmp_basic nf_conntrack_snmp nf_conntrack_broadcast nf_nat_pptp nf_nat_proto_gre nf_conntrack_pptp nf_conntrack_proto_gre devlink bnep iwlmvm mac80211 uas iwlwifi btusb btrtl btbcm usbhid btintel cfg80211 bluetooth igb rfkill efivarfs
Jun 1 14:10:07 threadripper kernel: CPU: 15 PID: 22340 Comm: qemu-system-x86 Not tainted 4.16.3-gentoo #4
Jun 1 14:10:07 threadripper kernel: Hardware name: Gigabyte Technology Co., Ltd. X399 DESIGNARE EX/X399 DESIGNARE EX-CF, BIOS F1 09/06/2017
Jun 1 14:10:07 threadripper kernel: RIP: 0010:__domain_flush_pages+0xe1/0xf0
Jun 1 14:10:07 threadripper kernel: RSP: 0018:ffff904ec79bfc30 EFLAGS: 00010282
Jun 1 14:10:07 threadripper kernel: RAX: 00000000fffffffb RBX: ffff8a12c2264210 RCX: 0000000000000000
Jun 1 14:10:07 threadripper kernel: RDX: 0000000000000000 RSI: 0000000000000286 RDI: ffff8a13bd409814
Jun 1 14:10:07 threadripper kernel: RBP: ffff904ec79bfc78 R08: 0000000000000780 R09: 0000000000000001
Jun 1 14:10:07 threadripper kernel: R10: 0000000000000002 R11: 0000000000000001 R12: ffff8a12c2264210
Jun 1 14:10:07 threadripper kernel: R13: 00000000fffffffb R14: 0000000000000000 R15: 7fffffffffffffff
Jun 1 14:10:07 threadripper kernel: FS: 00007f0154012b80(0000) GS:ffff8a13bddc0000(0000) knlGS:0000000000000000
Jun 1 14:10:07 threadripper kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Jun 1 14:10:07 threadripper kernel: CR2: 0000555731935000 CR3: 0000000704722000 CR4: 00000000003406e0
Jun 1 14:10:07 threadripper kernel: Call Trace:
Jun 1 14:10:07 threadripper kernel: amd_iommu_unmap+0x6c/0xa0
Jun 1 14:10:07 threadripper kernel: __iommu_unmap+0xed/0x190
Jun 1 14:10:07 threadripper kernel: vfio_unmap_unpin+0xfe/0x1d0
Jun 1 14:10:07 threadripper kernel: vfio_remove_dma+0x1d/0x50
Jun 1 14:10:07 threadripper kernel: vfio_iommu_type1_ioctl+0x74a/0x9b0
Jun 1 14:10:07 threadripper kernel: do_vfs_ioctl+0xba/0x6c0
Jun 1 14:10:07 threadripper kernel: ? syscall_trace_enter+0x181/0x320
Jun 1 14:10:07 threadripper kernel: SyS_ioctl+0x91/0xa0
Jun 1 14:10:07 threadripper kernel: do_syscall_64+0x5b/0x100
Jun 1 14:10:07 threadripper kernel: entry_SYSCALL_64_after_hwframe+0x3d/0xa2
Jun 1 14:10:07 threadripper kernel: RIP: 0033:0x7f014e2d2607
Jun 1 14:10:07 threadripper kernel: RSP: 002b:00007fff7d55d498 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Jun 1 14:10:07 threadripper kernel: RAX: ffffffffffffffda RBX: 00007fff7d55d590 RCX: 00007f014e2d2607
Jun 1 14:10:07 threadripper kernel: RDX: 00007fff7d55d4a0 RSI: 0000000000003b72 RDI: 000000000000002b
Jun 1 14:10:07 threadripper kernel: RBP: 00000000000c0000 R08: 000000007ff40000 R09: 000000007fffffff
Jun 1 14:10:07 threadripper kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 0000555732416270
Jun 1 14:10:07 threadripper kernel: R13: 000000007ff40000 R14: 0000555732416260 R15: 000000007fffffff
Jun 1 14:10:07 threadripper kernel: Code: 8b 1b 4c 39 e3 75 dd 45 85 ed 75 1f 48 8b 44 24 18 65 48 33 04 25 28 00 00 00 75 13 48 8d 65 d8 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b eb dd e8 96 67 95 ff 66 0f 1f 44 00 00 53 48 8d 9f c0 fe
Jun 1 14:10:07 threadripper kernel: ---[ end trace 0b9754d6ffe42242 ]---
Jun 1 14:10:07 threadripper kernel: AMD-Vi: Command buffer timeout
Jun 1 14:10:07 threadripper kernel: AMD-Vi: Command buffer timeout
This goes on for quite a while, more or less repeating until I kill the system.
I don’t even know where to begin troubleshooting this problem.
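To get a sense of how much of the log is just the same few messages, a short script along these lines could tally them. This is only a sketch: the function name `summarize` and the counter keys are made up for illustration, while the matched strings are copied from the log above.

```python
import re
from collections import Counter

# Matches the device address in lines such as
# "IOTLB_INV_TIMEOUT device=08:00.0 address=0x..."
IOTLB_RE = re.compile(r"IOTLB_INV_TIMEOUT device=(\S+)")


def summarize(lines):
    """Count the repeating AMD-Vi messages in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        if "Completion-Wait loop timed out" in line:
            counts["completion_wait_timeout"] += 1
        elif "Command buffer timeout" in line:
            counts["command_buffer_timeout"] += 1
        match = IOTLB_RE.search(line)
        if match:
            # Keep one counter per device so a second device would stand out.
            counts["iotlb_inv_timeout " + match.group(1)] += 1
    return counts
```

Feeding it the lines of the system log (for example `summarize(open(path))`, with `path` pointing at wherever the kernel messages are stored) gives one count per message type, and one per device for the IOTLB invalidation timeouts.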
Any suggestions?