Increasing VFIO VGA Performance

Hello @gnif,

I too am having freezing/game crashes even though games otherwise run fine. I've been troubleshooting everything from drivers to a bad card.

I never thought of a ‘pinning issue or non-local memory access’.

I am pinning my CPUs via libvirt. As far as memory is concerned, supposedly all of the VM's memory is allocated on the NUMA node associated with the CPUs I have pinned the VM to.
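For reference, a rough sketch of how the same pinning and memory placement can be checked and applied from the command line (the domain name "win10" and the core/node numbers below are placeholders, not my actual layout):

# see which host cores and memory belong to which NUMA node
numactl --hardware
lscpu --extended

# pin guest vCPUs to cores on a single node ("win10" is a placeholder domain name)
virsh vcpupin win10 0 2 --config
virsh vcpupin win10 1 3 --config

# keep the guest's memory on the same NUMA node as the pinned cores
virsh numatune win10 --mode strict --nodeset 0 --config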

I was just curious whether you meant that CPUs simply need to be pinned, or whether pinning can sometimes cause issues itself, and how to troubleshoot that.

This may be getting too off topic; sorry if it's a loaded question.

Pinning will not affect stability of the VM, only performance.

Correct; please start a new thread. I apologise if this feels like a short answer, but please note that my time is limited between the projects I have on at the moment and I cannot help debug a qemu fault of this nature.


@gnif, looks like qemu 3.1 has been released. I'm not sure if your patch made it in; just looking at the pcie.c file, GitHub shows it hasn't been touched for at least a year.

In case it didn't reach the release, can you please provide a clear-cut way to apply the patch to qemu 3.1? Is there a catch to it? I'm not entirely literate in C, but I can compile things from source (like Apache and PHP). (I'm using Fedora 29, if that matters.)

I am sorry, but I do not have the time at present to document how to apply the patches, nor do I need to; this information is freely available. See git am for applying patches from a mailbox file.
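In short, it amounts to something like the following (the tag and mbox filename below are only placeholders for whatever release and patch series you are actually working with):

git clone https://github.com/qemu/qemu.git
cd qemu
# start from the release you want to patch
git checkout v3.1.0
# apply the series saved from the mailing list as a single mbox file (placeholder path)
git am /path/to/pcie-link-speed-series.mbox
# then configure and build as usual
./configure --target-list=x86_64-softmmu
make -j"$(nproc)"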

I am not tracking when this patch will make it into qemu; I do, however, know it's being actively worked on, as I am seeing updates from Alex almost daily.

Went to do a new build and the patch set failed to apply; it seems that as of Dec 19 the patch set was committed to the qemu master branch. Awesome!


Yep, lots of commits on GitHub on the 19th of December relating to PCIe link speeds. :slight_smile:

For those of us on OSes where we aren't able to compile and install from master (like unRaid): am I correct in assuming we'll see these features once we get an OS update which includes QEMU 4.0?


Unraid gets an update when it gets an update.

Typically, yes, now that it’s in master, the next step would be waiting for a stable release that your distro maintainer will build and ship.

4.0 is when it will default to using the higher link speeds; last I read, however, the 3.2 and later builds have these patches, but you must specify the link speed explicitly. I have not checked, as I have been on break, and could have the versioning wrong :slight_smile:


I applied the changes provided by @gnif by hand on the 3.1 release and it got the job done. NVIDIA system information reports an x8 PCIe 3.0 connection.

Then I tried the qemu from the master branch on GitHub; NVIDIA system information reports x1 1.1 operation or something like that, just like the unpatched qemu. However, GPU-Z can run a basic test and determines that under load the card runs at x8 PCIe 3.0. Games run fine, though. Might be a power-saving thing, idk.

I'm looking for a configuration example with virt-manager for the master branch (compiled just yesterday) showing how to configure the speed and width of the ports. I tried the configuration mentioned by @nibbloid, but I think he applied some patches not yet available on GitHub; the VM wouldn't boot, and a popup in virt-manager said the options are not supported.

To install AMD drivers in Windows 10 I use 2 monitors:

  • monitor #2 connected normally via DisplayPort to the VM (it is also connected to the host via HDMI)
  • monitor #1 connected via DVI-D to the Linux host

On the Linux host I connect to the VM's Remote Desktop (i.e. over RDP) with Remmina (this makes Windows 10 log out on monitor #2). Leave monitor #2 on the logout screen.

I then install the AMD drivers over the RDP session. The RDP session normally drops at around 55-60% of the driver installation progress, and I just reconnect with RDP to finish the installation.

This method will install the current 18.12.x / 19.1.1 drivers.

I have done some testing on this recently and came across your post. Windows definitely seems to run better for me when the hypervisor flag is left on.

Here is a 3DMark bench I did with the hypervisor flag off: https://www.3dmark.com/fs/18253341

And here is one with it on: https://www.3dmark.com/fs/18253824

Almost a 500-point difference. In general, when actually using the OS I notice some slight sluggishness with the hypervisor flag off as well (it seems to negatively impact 2D performance). With it on, there's no perceptible difference from bare metal.
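For anyone wanting to try the comparison themselves, the flag in question is the CPUID hypervisor bit; a minimal sketch of the two configurations on the raw QEMU command line (all other arguments omitted, and in libvirt this corresponds to enabling or disabling the "hypervisor" CPU feature):

# hypervisor bit exposed to the guest (the default, and the faster case for me)
qemu-system-x86_64 -enable-kvm -cpu host ...
# hypervisor bit hidden ("hypervisor off" in the runs above)
qemu-system-x86_64 -enable-kvm -cpu host,hypervisor=off ...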

Hi All,

Apologies if this is hijacking the original topic of the thread, but I do feel it's still related to an extent…

The PCIe patch got pushed to us UnRaid users not long ago in an RC for the next version. We can see the correct slot info populated in the NVIDIA driver, and we are seeing the expected performance improvement (which is great).

The UnRaid devs have asked WHY we are even using the Q35 machine type in the first place for a Windows VM with PCIe passthrough, as i440fx would be better suited and doesn't suffer from the issue with PCIe speeds that Q35 had. They're even considering removing the option to select Q35 in their GUI when creating a Windows VM because of this: https://forums.unraid.net/topic/77499-qemu-pcie-root-port-patch/?do=findComment&comment=724656

I've always been under the impression that the latency/performance improvements could only really be taken advantage of on the Q35 machine type…

So I guess my question is… Why do you guys use Q35 over i440fx? Should we be using Q35 at all if all we're doing is passing through a PCIe graphics card and NVMe? Is i440fx being developed and improved? Will we still see latency and performance improvements down the line using i440fx?

At the end of the day, I don't care what machine type my VM uses; I just want the closest possible performance to bare metal!

This thread was never about Q35 being faster or lower latency; it's about how those of us who do use it, whether for improved compatibility with passthrough or for other reasons, can obtain optimal performance from a Q35 system.

Please see: https://wiki.qemu.org/Features/VT-d

These features are only available on the Q35 platform, and some of us require them to either get optimal performance out of our hardware or even get passthrough to work properly in the first instance.
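For example, the emulated VT-d IOMMU described on that page can only be attached to a Q35 machine; roughly (a sketch only, with the rest of the options omitted):

qemu-system-x86_64 -machine q35,accel=kvm,kernel-irqchip=split -device intel-iommu,intremap=on ...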

The idea that UnRaid would remove Q35, or even warn against using a newer platform topology that the QEMU developers themselves are trying to push, is idiotic. Not only will it make it harder for their users to use UnRaid where Q35 is a requirement, it will make people shy away from an option that is completely valid and may, for their particular configuration, yield performance gains.

Also, we have evidence showing that when the driver detects a link speed of 0x, or that it's not on PCIe at all (i.e. i440fx), it programs the SoC differently. The evidence comes both from benchmarks and from the AMDGPU source code in the Linux kernel, which has a TODO to implement the PCIe 3.0-specific configuration: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/soc15.c#L446


It's also the complete opposite direction from the oVirt team, who recently added Q35 options for BIOS type (with choices for legacy BIOS, UEFI, and SecureBoot) to help provide more PCI passthrough options. It's definitely a step in the right direction that they included this; it will help when I decide to add a GPU to my media server VM for transcoding 4K.


QEMU 3.1.90 monitor - type 'help' for more information
(qemu) qemu-system-x86_64: -device pcie-root-port,id=root_port1,chassis=0,slot=0,bus=pcie.0: Property '.speed' not found

I compiled the latest qemu from git, which seems to have this patch already applied. I'm using the following configure:

./configure --prefix=/opt/qemu --target-list="x86_64-softmmu" --audio-drv-list="" --cpu=x86_64 --disable-blobs --disable-bluez --disable-strip --enable-kvm --enable-libiscsi --disable-xen

Can someone tell me what I’m doing wrong?

-device pcie-root-port,id=root_port1,chassis=0,slot=0,bus=pcie.0
-set device.root_port1.speed=8
-set device.root_port1.width=16
-device vfio-pci,host=05:00.0,bus=root_port1,addr=00.0,multifunction=on
-device vfio-pci,host=05:00.1,bus=root_port1,addr=00.1
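In case it helps, I assume the properties a given build exposes on the root port can be listed with something like this (run against the freshly built binary):

./x86_64-softmmu/qemu-system-x86_64 -device pcie-root-port,help | grep -iE 'speed|width'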

3.1.90 is an old version… 3.10 or later please.

Perhaps I’m blind, but I can’t find anything later than 3.1. I’m pulling the latest version from git which is apparently 4.0.0-rc0

root@ws:/usr/src/qemu# ./x86_64-softmmu/qemu-system-x86_64 --version
QEMU emulator version 3.1.90 (v4.0.0-rc0-dirty)
Copyright © 2003-2019 Fabrice Bellard and the QEMU Project developers

Edit: Perhaps you could tell me what commit you’re using and I can try that?

Ah, 4.0 already does it (qemu versioning is odd), and the option names were changed to x-speed and x-width.

See: https://wiki.qemu.org/ChangeLog/4.0

Generic PCIe root port link speed and width enhancements: Starting with the Q35 QEMU 4.0 machine type, generic pcie-root-port will default to the maximum PCIe link speed (16GT/s) and width (x32) provided by the PCIe 4.0 specification. Experimental options x-speed= and x-width= are provided for custom tuning, but it is expected that the default over-provisioning of bandwidth is optimal for the vast majority of use cases. Previous machine versions and ioh3420 root ports will continue to default to 2.5GT/x1 links.
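So, adapting your earlier snippet for 4.0, something along these lines should do it (untested on my end, and with the 4.0 machine type the defaults should already give you the maximum anyway):

-device pcie-root-port,id=root_port1,chassis=0,slot=0,bus=pcie.0,x-speed=8,x-width=16
-device vfio-pci,host=05:00.0,bus=root_port1,addr=00.0,multifunction=on
-device vfio-pci,host=05:00.1,bus=root_port1,addr=00.1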

Those arguments work, I built the 4.0.0-rc0 release :slight_smile:

Now I am getting Windows throwing error 0xc0000225 whenever I try to boot (even from the ISO itself). I get the error even without the compile options.

If I switch back to Debian's packaged qemu and change back to the ioh3420, the VM boots fine, and so does the ISO.

I’m stumped. I don’t understand why the ISO can’t boot.

Edit: I have a feeling this has something to do with the default drivers in the new versions of qemu. I'm going to try adding my VM to Virtual Machine Manager and see what happens.

I'm having a (possibly similar) issue with Windows 10 on the v4.0.0-rc0 release; I wrote about it here:

https://forum.level1techs.com/t/qemu-4-0-0-rc0-released/140315/6

I'm hoping for the same benefit of the default max speed and width on the PCIe ports.