This is a bit unusual: it is not a request for help but a solution that took me a day to figure out. I could not find anything useful for my search keywords: nvidia RTX 3090, qemu, black screen. Hopefully someone finds this while searching for a fix and saves a day or two of debugging. Here goes.
tl;dr: If you’re trying to use an RTX 3090 with qemu and all you’re getting is a black screen in the guest VM, try disabling resizable BAR in BIOS. Yes, even if enabling that setting worked just fine with another card.
(To be honest, I did find a similar post saying that disabling “Above 4G decoding” in BIOS fixes the issue. That was not an option for my setup, and it is only an indirect fix: on most boards, disabling Above 4G decoding also automatically disables “Resizable BAR”. Resizable BAR alone was the real issue here.)
Longer version: I’ve been using virtualization with GPU passthrough for my work Windows VM for many years now. I have a tiny Radeon Pro WX2100 (in slot 2) as my host GPU, and I have passed through various GPUs over the years, all without any real issues.
The GPU I had been using for passthrough before all this mess was a Vega 56 (in slot 1). Passthrough to Windows just worked, no surprises.
Then I got an RTX 3080, added that to slot 3, passed it through to the VM and it just worked. Windows even told me resizable BAR was enabled and everything. Cool.
Then I found out I needed a lot more VRAM for a bunch of projects and got an RTX 3090. So out with the Vega 56, in with the RTX 3090 and after adding the card my VM refused to boot. Just an instant black screen, no error messages in any log. The RTX 3080 still passed through fine, but the RTX 3090 gave me the black screen with no feedback.
I tried pretty much everything:
- switching around the cards in different slots
- initializing the cards with either nvidia-driver or vfio
- passing the VBIOS ROM in various variations (self-extracted, downloaded and modified, etc)
- all kinds of tricks and settings from various wikis, forums and blog posts
- all kinds of qemu command line parameters
- trying the git versions of various components in the chain
- different chipsets for the VM, BIOS/EFI boot
- even tried booting the VM using SeaBIOS instead of OVMF (yeah, I know…)
- a ton of other things I already forgot
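While switching between nvidia-driver and vfio, it helps to confirm which kernel driver actually owns the card before starting the VM. A minimal sketch, assuming a Linux host with the usual sysfs layout; the function name and the PCI address in the usage note are made up for illustration:

```python
import os

def bound_driver(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI device, or None.

    sysfs exposes the binding as a 'driver' symlink inside the device
    directory; the last path component of its target is the driver name.
    """
    link = os.path.join(sysfs_root, pci_address, "driver")
    if not os.path.islink(link):
        return None  # no driver bound, or no such device
    return os.path.basename(os.readlink(link))
```

Calling `bound_driver("0000:0b:00.0")` (hypothetical address, take yours from lspci) should return something like `"vfio-pci"` or `"nvidia"`, or `None` if nothing is bound.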
I then stumbled over the parameter “max-ram-below-4g=1G”, suggested as a fix for people having initialization issues with high-memory Tesla cards. No luck for me.
But that last “fix” gave me the idea that my low-level initialization problem might be related to the amount of memory on the new card. After all, the RTX 3080 is basically the same card with less VRAM, and it worked just fine. So I tried disabling “Resizable BAR” in the system BIOS and voilà, the VM booted up again without any problems. Now I can even pass through both the RTX 3090 and the RTX 3080 at the same time. No VBIOS loading needed, none of the tricks and hacks you read about required. Plain passthrough just works.
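If you want to see what the BIOS setting actually changes on the host, you can look at the sizes of the memory regions (BARs) the card exposes. A minimal sketch, assuming a Linux host where each line of the device's sysfs `resource` file holds start, end and flags in hex; the function names and the PCI address below are my own placeholders. As far as I understand it, with Resizable BAR active one region is typically about as large as the card's VRAM (rounded up to a power of two) instead of the traditional small window:

```python
from pathlib import Path

def parse_bar_sizes(resource_text):
    """Return {region_index: size_in_bytes} for every populated region."""
    sizes = {}
    for index, line in enumerate(resource_text.splitlines()):
        fields = line.split()
        if len(fields) != 3:
            continue  # not a "start end flags" line
        start, end, _flags = (int(field, 16) for field in fields)
        if start == 0 and end == 0:
            continue  # unused region
        sizes[index] = end - start + 1
    return sizes

def format_size(num_bytes):
    """Format a byte count using binary units up to GiB."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if num_bytes < 1024 or unit == "GiB":
            return f"{num_bytes:g} {unit}"
        num_bytes /= 1024

def report(pci_address, sysfs_root="/sys/bus/pci/devices"):
    """Print the size of each populated region of one PCI device."""
    text = (Path(sysfs_root) / pci_address / "resource").read_text()
    for index, size in parse_bar_sizes(text).items():
        print(f"region {index}: {format_size(size)}")
```

Run `report("0000:0b:00.0")` (hypothetical address) once with Resizable BAR enabled and once with it disabled, then compare the largest region.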
So again: If you have issues passing a high-memory card like an nvidia RTX 3090 to a guest VM, try disabling Resizable BAR in your UEFI. I’m sure this will be fixed eventually and this post will become obsolete. However, since Resizable BAR has only recently become common, there’s currently not much information on this potential issue.
May this help someone in a similar situation.