Hang during recent driver installation; Vega 56 passthrough

Hi L1TECHS,

I need help solving an issue with my passthrough setup.

Issue: The Windows 10 KVM guest hangs when the AMD GPU driver installation is about to finish. Older drivers work fine: the most recent one that works for me is 18.9.3, from the 18.x releases. I have been using that driver for the last year and a half without any problems, and games perform as I would expect from the card's raw power. However, I am compelled to upgrade to a recent 20.x release because it appears to be a strict dependency for Cyberpunk 2077; according to the Cyberpunk logs, the game's GPU process crashes. Please help me solve this, as it is costing me my sanity. Under no circumstances will I run Windows on bare metal, which also means my setup is not eligible for support from CDPR.

Hardware

  • AMD Ryzen 2600X
  • B450 TOMAHAWK
  • MSI VEGA 56 OC

Software

  • AMD ComboPI1.0.0.4 Patch B (SMU v46.54)
  • Gentoo GNU+Linux
  • Custom-compiled kernel

Script

#!/bin/bash

# disable efi framebuffer
if [[ -e /sys/bus/platform/drivers/efi-framebuffer/efi-framebuffer.0 ]]
then
        echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind
fi

# amd reset bug workaround
sleep 3
rtcwake --verbose --utc --mode mem --seconds 7
sleep 5

# path config
workdir=/root/scripts
vbios=$workdir/vega-vbios/romfile.rom
ovmfcode=/usr/share/edk2-ovmf/OVMF_CODE.fd
ovmfvars=$workdir/ovmf_vars/OVMF_VARS-win10-name.fd
hugetlbfs=/mnt/hugetlbfs
iso=/home/sdb2/installers/isofile.iso
virtio=/home/sdb2/installers/virtio-windows-drivers/virtio-win-version.iso
rootdisk=/home/nvme0n1p4/qemu-img/windows10-rootdisk.img
userdisk1=/home/md0/qemu-img/windows10-userdisk1.img
userdisk2=/home/sda2/qemu-img/windows10-userdisk2.img

# performance options
ulimit_original=$(ulimit -l)
ulimit_target=$((28*1024*1024))
ulimit -l $ulimit_target
mkdir --verbose --parents $hugetlbfs
mount --verbose --types hugetlbfs hugetlbfs $hugetlbfs
sysctl vm.nr_hugepages=1024
/root/scripts/qemu-isolate.sh

# start qemu kvm vm
qemu-system-x86_64 \
        -nodefaults -nographic -display none -vga none -monitor stdio -enable-kvm -name win10,debug-threads=on \
        -m 26G -mem-prealloc -object memory-backend-file,id=mem1,size=2G,align=2M,mem-path=$hugetlbfs,share=off,discard-data=on,merge=on,prealloc=on \
        -machine pc,accel=kvm,kernel_irqchip=on,vmport=off \
        -no-reboot -boot menu=on \
        -rtc clock=host,base=localtime \
        -cpu host,kvm=off,svm=off,topoext,hv_relaxed,hv_spinlocks=0x1fff,hv_time,hv_vapic,hv_vendor_id=0xDEADBEEFFF,hv_vpindex,hv_synic,hv_stimer,hv_frequencies \
        -smp 10,sockets=1,cores=5,threads=2 \
        -device vfio-pci,host=28:00.0,multifunction=on,x-vga=on,romfile=$vbios,rombar=1 \
        -device vfio-pci,host=28:00.1 \
        -device vfio-pci,host=2a:00.3 \
        -device vfio-pci,host=21:00.0 \
        -device qemu-xhci,id=xhci0 -device usb-host,bus=xhci0.0,hostbus=1,hostport=8 -device usb-host,bus=xhci0.0,hostbus=1,hostport=9 \
        -device qemu-xhci,id=xhci1 -device usb-host,bus=xhci1.0,hostbus=4,hostport=1 -device usb-host,bus=xhci1.0,hostbus=4,hostport=2 \
        -device qemu-xhci,id=xhci2 -device usb-host,bus=xhci2.0,hostbus=1,hostport=7 -device usb-host,bus=xhci2.0,hostbus=1,hostport=10 \
        -device qemu-xhci,id=xhci3 -device usb-host,bus=xhci3.0,hostbus=1,hostport=3 -device usb-host,bus=xhci3.0,hostbus=1,hostport=4 \
        -device qemu-xhci,id=xhci4 -device usb-host,bus=xhci4.0,hostbus=2,hostport=3 -device usb-host,bus=xhci4.0,hostbus=2,hostport=4 \
        -drive if=pflash,format=raw,readonly=on,file=$ovmfcode \
        -drive if=pflash,format=raw,file=$ovmfvars \
        -device virtio-scsi,id=scsi0 -device scsi-hd,bus=scsi0.0,drive=root -drive file=$rootdisk,id=root,format=raw,cache=none,aio=native,if=none \
        -device virtio-scsi,id=scsi1 -device scsi-hd,bus=scsi1.0,drive=user1 -drive file=$userdisk1,id=user1,format=raw,cache=none,aio=native,if=none \
        -device virtio-scsi,id=scsi2 -device scsi-hd,bus=scsi2.0,drive=user2 -drive file=$userdisk2,id=user2,format=raw,cache=none,aio=native,if=none

# revert performance options
ulimit -l $ulimit_original
sysctl vm.nr_hugepages=0
umount --verbose $hugetlbfs
/bin/bash /root/scripts/qemu-revert.sh

# reboot host
if [[ -z $(who | grep -i -E 'ttyS|ttyUSB|pts') ]]; then reboot; else (echo -e "\nActive SSH Sessions:" && who); fi
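
As a side note on the hugepage numbers in the script above, here is a minimal sketch of the arithmetic. It assumes 2 MiB hugepages (the usual x86_64 default; the Hugepagesize line in /proc/meminfo shows the real value). The function names hugepages_needed and host_hugepage_kib are made up for illustration and are not part of my script.

```bash
#!/bin/bash

# Hypothetical helper: number of hugepages needed to back <guest_gib> GiB
# of memory with hugepages of <page_kib> KiB each, rounded up.
hugepages_needed() {
        local guest_gib=$1 page_kib=$2
        local guest_kib=$((guest_gib * 1024 * 1024))
        echo $(( (guest_kib + page_kib - 1) / page_kib ))
}

# Hypothetical helper: hugepage size the host actually uses, in KiB.
# Assumption: /proc/meminfo has a "Hugepagesize:" line (hugetlbfs enabled).
host_hugepage_kib() {
        awk '/^Hugepagesize:/ {print $2}' /proc/meminfo
}

hugepages_needed 26 2048   # pages for -m 26G           -> 13312
hugepages_needed 2 2048    # pages for the size=2G object -> 1024
```

Under that assumption, vm.nr_hugepages=1024 matches the size=2G memory backend object but is far short of what the full 26G guest would need.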

Module Parameters

options kvm ignore_msrs=1
options kvm report_ignored_msrs=0
options kvm_amd avic=0
options kvm_amd sev=1
options vfio-pci ids=1002:687f,1002:aaf8,1022:1457,10ec:8168
options vfio-pci disable_idle_d3=1
options vfio_iommu_type1 allow_unsafe_interrupts=1
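
To confirm that these options take effect, the driver bound to each passed-through device can be read from sysfs. This is a minimal sketch, assuming PCI domain 0000 and the four host addresses from the QEMU command line above; bound_driver is a hypothetical helper name.

```bash
#!/bin/bash

# Hypothetical helper: print the kernel driver bound to a PCI device,
# or "none" if no driver is bound.
# $1: bus address such as 28:00.0 (PCI domain 0000 is assumed)
# $2: sysfs PCI device directory (defaults to the real one)
bound_driver() {
        local addr=$1 root=${2:-/sys/bus/pci/devices}
        local link=$root/0000:$addr/driver
        if [[ -L $link ]]; then
                # The driver entry is a symlink whose last component is the driver name.
                basename "$(readlink "$link")"
        else
                echo none
        fi
}

# Addresses taken from the -device vfio-pci lines of the script.
for addr in 28:00.0 28:00.1 2a:00.3 21:00.0; do
        echo "$addr -> $(bound_driver "$addr")"
done
```

Each of the four devices should report vfio-pci before the VM is started; anything else would mean the ids= option did not claim that device.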

GRUB

rw forcefsck ipv6.disable=1 amd_iommu=on iommu.passthrough=1 iommu=pt pcie_acs_override=downstream,multifunction qemu_guest_vm_name=windows10
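
Since pcie_acs_override is on the command line, it may help to see how the devices end up grouped. Below is a minimal sketch that lists every device per IOMMU group by walking sysfs; list_iommu_groups is a hypothetical helper name, and the optional argument exists only so the function can be pointed at another directory.

```bash
#!/bin/bash

# Hypothetical helper: print one "group N: address" line per device.
# $1: sysfs iommu_groups directory (defaults to the real one)
list_iommu_groups() {
        local root=${1:-/sys/kernel/iommu_groups}
        local dev group
        for dev in "$root"/*/devices/*; do
                # Skip the unexpanded glob when the IOMMU is off or no groups exist.
                [[ -e $dev ]] || continue
                group=${dev#"$root"/}   # strip the directory prefix
                group=${group%%/*}      # keep only the group number
                echo "group $group: ${dev##*/}"
        done
}

list_iommu_groups
```

The GPU functions 28:00.0 and 28:00.1 from the script would be the lines to look for in the output. Groups are printed in glob order, so group 10 sorts before group 2.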

What I have tried:

  • Enabling/Disabling BIOS CSM Mode.
  • Enabling/Disabling PCI rombar.
  • Installing driver using QXL/RDP.
  • Switching between Q35 and PC.
  • Enabling/Disabling AVIC for kvm_amd module.

I can’t edit my post anymore. Is there a limit on the number of edits a user is allowed?

Since I can no longer edit my post here, I am linking my updated Reddit post instead. If you are willing to help me solve this issue, please read the updated post.