Increasing VFIO VGA Performance

silviu · January 3, 2019, 9:28am

I applied the changes provided by @gnif by hand to the 3.1 release and it got the job done: nVidia System Information reports an x8 PCIe 3.0 connection.

Then I tried QEMU from the master branch on GitHub. nVidia System Information reports x1 PCIe 1.1 operation or something like that, just like unpatched QEMU. However, GPU-Z can run a basic test and determines that under load the card runs at x8 PCIe 3.0. Games run fine, though. Might be a power-saving thing, idk.

I’m looking for a virt-manager configuration example for the master branch (compiled it just yesterday) showing how to set the speed and width of the ports. I tried the configuration mentioned by @nibbloid, but I think he applied some patches not yet available on GitHub: the VM wouldn’t boot, and a popup in virt-manager says the options are not supported.

itoffshore · January 25, 2019, 4:17pm

To install AMD drivers in Windows 10 I use 2 monitors:

monitor #2 connected normally via displayport to VM (it is also connected to host via HDMI)
monitor #1 connected via DVI-D to Linux host

On the Linux host I connect to the VM’s Remote Desktop (i.e. over RDP) with Remmina (this makes Windows 10 log out on monitor #2). Leave monitor #2 on the logout screen.

I then install the AMD drivers over the RDP session. The RDP session normally exits at around 55-60% driver installation progress & I just reconnect with RDP again to finish the installation.

This method will install the current 18.12.x / 19.1.1 drivers.

x3sphere · February 10, 2019, 5:35am

I have done some testing on this recently and came across your post. Windows definitely seems to run better for me when the hypervisor flag is left on.

Here is a 3Dmark bench I did with Hypervisor off: https://www.3dmark.com/fs/18253341

And here is one with it on: https://www.3dmark.com/fs/18253824

Almost a 500pts difference. In general, when actually using the OS I notice some slight sluggishness with hypervisor off as well (It seems to negatively impact 2D performance). With it on there’s no perceptible difference to bare metal.

billington.mark · February 25, 2019, 11:05am

Hi All,

Apologies if this is hijacking the original topic of the thread, but I do feel it’s still related to an extent…

The PCIe patch got pushed to us UnRaid users not long ago in an RC for the next version.
We can see the correct slot info populated in the nvidia driver, and we are seeing the expected performance improvement (which is great).

The UnRaid devs have asked WHY we are even using the Q35 machine type in the first place for a Windows VM with PCIe passthrough, as i440fx would be better suited and doesn’t suffer from the issue with PCIe speeds that Q35 had. They’re even considering removing the option in their GUI to select Q35 when creating a Windows VM because of this: https://forums.unraid.net/topic/77499-qemu-pcie-root-port-patch/?do=findComment&comment=724656

I’ve always been under the impression that latency/performance improvements could only really be taken advantage of with the Q35 machine type…

So I guess my question is… Why do you guys use Q35 over i440fx? Should we be using Q35 at all if all we’re doing is passing through a PCIe graphics card and NVMe? Is i440fx being developed and improved? Will we still see latency and performance improvements down the line using i440fx?

At the end of the day, I don’t care what machine type my VM uses, I just want the closest performance possible to bare metal!

gnif · February 25, 2019, 3:09pm

This thread was never about how q35 is faster or lower latency. It’s about how those of us who do use it, whether for improved compatibility with passthrough or for whatever other reason, can obtain optimal performance from a q35 system.

Please see: https://wiki.qemu.org/Features/VT-d

These features are only available on the q35 platform, and some of us require them either to get optimal performance out of our hardware or even to get passthrough to work properly in the first instance.

The idea that UnRaid will remove Q35, or even warn about using a newer platform topology that even the QEMU developers are trying to push, is idiotic. Not only will they make it harder for their users to use UnRaid where Q35 is a requirement, they will also make people shy away from the option when it is completely valid and may, for their particular configuration, yield performance gains.

Also, we have evidence that when the driver detects a link speed of 0x, or that it’s not on PCIe (i.e., i440fx), it programs the SoC differently. The evidence comes both from benchmarks and from the AMDGPU source code in the Linux kernel, which has a TODO to implement the PCIe 3.0-specific configuration: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/soc15.c#L446

lessaj · February 26, 2019, 1:44am

It’s also the complete opposite of the direction taken by the oVirt team, who recently added Q35 options for BIOS Type (with options for Legacy BIOS, UEFI BIOS, and SecureBoot) to help provide more PCI passthrough options. Including this is definitely a step in the right direction; it will help when I decide to add a GPU to my media server VM for transcoding 4K.

Josh_Jameson · March 20, 2019, 1:28pm

QEMU 3.1.90 monitor - type 'help' for more information
(qemu) qemu-system-x86_64: -device pcie-root-port,id=root_port1,chassis=0,slot=0,bus=pcie.0: Property '.speed' not found

Compiled the latest qemu from git, which seems to have this patch already applied. I’m using the following configure command:

./configure --prefix=/opt/qemu --target-list="x86_64-softmmu" --audio-drv-list="" --cpu=x86_64 --disable-blobs --disable-bluez --disable-strip --enable-kvm --enable-libiscsi --disable-xen

Can someone tell me what I’m doing wrong?

-device pcie-root-port,id=root_port1,chassis=0,slot=0,bus=pcie.0
-set device.root_port1.speed=8
-set device.root_port1.width=16
-device vfio-pci,host=05:00.0,bus=root_port1,addr=00.0,multifunction=on
-device vfio-pci,host=05:00.1,bus=root_port1,addr=00.1

gnif · March 20, 2019, 2:05pm

3.1.90 is an old version… 3.10 or later please.

Josh_Jameson · March 20, 2019, 2:11pm

Perhaps I’m blind, but I can’t find anything later than 3.1. I’m pulling the latest version from git which is apparently 4.0.0-rc0

root@ws:/usr/src/qemu# ./x86_64-softmmu/qemu-system-x86_64 --version
QEMU emulator version 3.1.90 (v4.0.0-rc0-dirty)
Copyright (c) 2003-2019 Fabrice Bellard and the QEMU Project developers

Edit: Perhaps you could tell me what commit you’re using and I can try that?
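
For anyone comparing builds, the banner above carries two different numbers: the numeric version (3.1.90) and the git tag in parentheses (v4.0.0-rc0-dirty). Here is a minimal Python sketch that splits them apart. The function name `parse_qemu_version` is made up for illustration, and treating a micro number of 90 or higher as a release candidate of the next release is an assumption based on the output in this post, not something checked against every QEMU release.

```python
import re

# Matches e.g. "QEMU emulator version 3.1.90 (v4.0.0-rc0-dirty)"
_BANNER = re.compile(r"version (\d+)\.(\d+)\.(\d+)(?:\s*\(([^)]*)\))?")


def parse_qemu_version(banner):
    """Split a `--version` banner into its numeric version and build tag.

    Returns ((major, minor, micro), build_tag, is_prerelease); build_tag is
    None when the banner has no parenthesised suffix.
    """
    match = _BANNER.search(banner)
    if match is None:
        raise ValueError("not a QEMU version banner: %r" % (banner,))
    version = tuple(int(match.group(i)) for i in (1, 2, 3))
    build_tag = match.group(4)
    # Assumption: a micro number of 90 or higher marks a release candidate
    # of the *next* release, as 3.1.90 does for v4.0.0-rc0 in this post.
    is_prerelease = version[2] >= 90
    return version, build_tag, is_prerelease


if __name__ == "__main__":
    print(parse_qemu_version("QEMU emulator version 3.1.90 (v4.0.0-rc0-dirty)"))
```

With that reading, a 3.1.90 build is newer than the 3.1 release, not older.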

gnif · March 20, 2019, 2:14pm

Ah, 4.0 already does it (QEMU versioning is odd: 3.1.90 is the 4.0.0-rc0 build), and the option names were changed to x-speed and x-width.

See: https://wiki.qemu.org/ChangeLog/4.0

Generic PCIe root port link speed and width enhancements: Starting with the Q35 QEMU 4.0 machine type, generic pcie-root-port will default to the maximum PCIe link speed (16GT/s) and width (x32) provided by the PCIe 4.0 specification. Experimental options x-speed= and x-width= are provided for custom tuning, but it is expected that the default over-provisioning of bandwidth is optimal for the vast majority of use cases. Previous machine versions and ioh3420 root ports will continue to default to 2.5GT/x1 links.
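
As a rough sketch of how the renamed options fit into the command line used earlier in the thread, the Python helper below builds the `-device pcie-root-port` argument with `x-speed` and `x-width` set inline. The helper name `root_port_args` is hypothetical, and the accepted value sets (`2_5`, `5`, `8`, `16` for speed; 1 to 32 lanes for width) are my understanding of the QEMU 4.0 properties, so treat them as assumptions and check them against your build.

```python
# Assumed value sets for the experimental properties (QEMU 4.0 era).
VALID_SPEEDS = ("2_5", "5", "8", "16")   # GT/s; "2_5" means 2.5 GT/s
VALID_WIDTHS = (1, 2, 4, 8, 12, 16, 32)  # lanes


def root_port_args(port_id, chassis, slot, speed="8", width=16, bus="pcie.0"):
    """Return the QEMU arguments for one pcie-root-port with a fixed link."""
    if speed not in VALID_SPEEDS:
        raise ValueError("unsupported x-speed: %r" % (speed,))
    if width not in VALID_WIDTHS:
        raise ValueError("unsupported x-width: %r" % (width,))
    device = (
        "pcie-root-port,id=%s,chassis=%d,slot=%d,bus=%s,x-speed=%s,x-width=%d"
        % (port_id, chassis, slot, bus, speed, width)
    )
    return ["-device", device]


if __name__ == "__main__":
    # PCIe 3.0 (8 GT/s) x16 root port, mirroring the example earlier in the thread.
    print(" ".join(root_port_args("root_port1", 0, 0, speed="8", width=16)))
```

Passing the properties inline on `-device` avoids the separate `-set device.root_port1.…` lines; both forms should address the same properties.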

Josh_Jameson · March 20, 2019, 3:38pm

Those arguments work; I built the 4.0.0-rc0 release.

Now Windows throws error 0xc0000225 whenever I try to boot (even from the ISO itself).
I get the error even without the compile options.

If I switch back to Debian’s packaged qemu and change back to the ioh3420 the VM boots fine, as well as the ISO.

I’m stumped. I don’t understand why the ISO can’t boot.

Edit: I have a feeling this is something to do with the default drivers in the new versions of qemu. I’m going to try adding my VM to Virtual Machine Manager and see what happens.

silviu · March 26, 2019, 1:15am

I’m having a (possibly similar) issue with Windows 10 on the v4.0.0-rc0 release. I wrote about it here:

https://forum.level1techs.com/t/qemu-4-0-0-rc0-released/140315/6

I wish for the same benefit of default max speed and width of pcie ports.

gnif · March 26, 2019, 1:17am

This is not the place to ask about general QEMU issues.

You’re using 4.0-rc0, which is not a stable release; if you intend to do such things, you’re on your own for support.
You’re having a VM crash/freeze. It could be due to passthrough or any number of other things; ask on the QEMU mailing list, not here.

silviu · March 26, 2019, 6:54pm

All due respect, asking a small question or sharing tips and tweaks is far from what “support” means, no one actually demands support.

I understand and respect your point of view, though.

Now, going back to the subject and gnif’s initial observation:

I see they released v4 rc1, which seems to have fixed some issues with Windows 10 (the ones I was concerned about), so I’m able to experiment with the q35 4.0 machine type. They do over-provision the VM: a q35-3.1 machine had an x1 bus as reported by the nVidia control panel (check System Information), and a q35-4.0 machine now has an x16 bus. GPU-Z detects 8 lanes in either case, which reflects the actual configuration in my case (2 slots running x8/x8).

gnif · March 26, 2019, 9:28pm

No, but you asked an off-topic question; this thread isn’t about why QEMU might be crashing. This forum is full of very smart people willing to help, but if cross-posting to off-topic posts is how people are going to behave, it’s going to drive them/us away.

The topic of this post and discussion here is highly technical and it’s hard enough to follow as it is simply due to the evolving nature of this thread as the discovery process went.

It is not “released”, it has been made available, RC stands for “Release Candidate” which means, it might be ready for release, but needs testing and may contain serious bugs, etc.

Yes, they now default to the maximum specified configuration of PCIe 4, but you can (and really should) set it to what it really is by specifying the values via the x-speed and x-width parameters.

Just because GPU-Z sees it correctly doesn’t mean the driver is identifying the card the same way; it may be programming the GPU registers incorrectly. We already know that GPU-Z doesn’t see it the same as when you view the “System Information” in the nVidia control panel. There you get the true configuration of the card as the driver sees it.
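
One way to find out "what it really is" from the Linux host is to read the link attributes the kernel exposes in sysfs for the passed-through device. The Python sketch below reads them and converts them to the value format assumed for `x-speed` and `x-width`. The function names are made up for illustration, and the speed mapping is an assumption. Note that the current speed can drop at idle because of power management, so it is worth comparing the current and max values before picking one.

```python
import os
import re

# sysfs speed in GT/s -> value for QEMU's x-speed property (assumed mapping)
_SPEED_TO_X_SPEED = {2.5: "2_5", 5.0: "5", 8.0: "8", 16.0: "16"}
_VALID_WIDTHS = (1, 2, 4, 8, 12, 16, 32)


def to_x_speed(sysfs_speed):
    """Convert e.g. '8.0 GT/s PCIe' or '2.5 GT/s' to an x-speed value."""
    match = re.match(r"\s*(\d+(?:\.\d+)?)\s*GT/s", sysfs_speed)
    if match is None or float(match.group(1)) not in _SPEED_TO_X_SPEED:
        raise ValueError("unrecognised link speed: %r" % (sysfs_speed,))
    return _SPEED_TO_X_SPEED[float(match.group(1))]


def to_x_width(sysfs_width):
    """Convert a sysfs lane count such as '8' to an x-width value."""
    width = int(sysfs_width.strip())
    if width not in _VALID_WIDTHS:
        raise ValueError("unrecognised link width: %r" % (sysfs_width,))
    return width


def read_link(device_dir):
    """Read the raw link attributes the kernel exposes for one PCI device."""
    names = ("current_link_speed", "current_link_width",
             "max_link_speed", "max_link_width")
    info = {}
    for name in names:
        with open(os.path.join(device_dir, name)) as handle:
            info[name] = handle.read().strip()
    return info
```

On the host this could be called as `read_link("/sys/bus/pci/devices/0000:05:00.0")`, where the path is a hypothetical example using the 05:00.0 address from earlier in the thread and assuming PCI domain 0000.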

gnif · March 28, 2019, 1:22am

Start a new thread asking this question; it is indeed off topic. This is NOT a QEMU support thread.

SonWon · March 28, 2019, 9:19am

I deleted the post to clean up the thread and created a new post, sorry for the distraction!

gnif · April 18, 2019, 11:32pm

2 posts were split to a new topic: Improving Looking Glass Capture Performance

FurryJackman · April 24, 2019, 5:56am

QEMU 4.0.0 has just been released. The patch from this thread is now in stable code.

teacult · March 13, 2022, 8:32am

This thread is very valuable for learning, thank you very much. I have passed through a 6500 XT.
I struggled with the same problem: GPU-Z shows the bus as PCI instead of PCIe. Since I am using Ubuntu 18.04 I thought it belongs here. After updating to the latest Arch kernel, QEMU and libvirt, it works at bare-metal performance.
I also tried ioh and xio downstream and upstream PCIe switches on Ubuntu 18.04, replicating the structure on both QEMU 2.11 with libvirt and QEMU 6.1 on the command line, and even that did not work. I guess the trick is in the kernel.