Yeah, I’ve only tested on Intel motherboard chipsets.
I forgot to mention before that I’ve disabled S states S3 and above in the motherboard BIOS. Not all motherboard BIOSes have this option.
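If you want to double check from the Linux host that S3 really is gone after changing the BIOS setting, here is a rough Python sketch. It assumes your kernel exposes /sys/power/mem_sleep (reasonably recent kernels do), where "deep" normally corresponds to S3 and the bracketed entry is the currently selected mode. The function names are just mine for illustration, and I haven't checked this on every board.

```python
from pathlib import Path


def parse_mem_sleep(text):
    """Split the contents of /sys/power/mem_sleep into (available, selected).

    The file looks like "s2idle [deep]": every token is an available
    suspend mode and the bracketed one is the mode currently selected.
    """
    available = []
    selected = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            selected = token
        available.append(token)
    return available, selected


def s3_exposed(text):
    """Return True if "deep" (normally ACPI S3) is listed as available."""
    available, _ = parse_mem_sleep(text)
    return "deep" in available


if __name__ == "__main__":
    # Assumption: this path exists on the host; older kernels may lack it.
    path = Path("/sys/power/mem_sleep")
    if path.exists():
        contents = path.read_text()
        print("modes:", parse_mem_sleep(contents))
        print("S3 exposed:", s3_exposed(contents))
    else:
        print("no /sys/power/mem_sleep on this kernel")
```

If "deep" is still listed after you disabled S3 in the BIOS, the setting probably didn't take, or the board words the option differently.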
To be safe I also disable suspend and monitor power-off in the GNOME power settings. Then I just use Win/Menu key + L when I go AFK. The GNOME lock screen will blank and power off the monitors after about a minute, but it won’t power off the GPUs. Combined with the devcon enable/disable startup/shutdown scripts, this works consistently. I have now tested everything from Ubuntu 16.04 LTS all the way to Fedora 27 with the 4.15+ kernels installed. Without all this, I found that on Ubuntu 16.04 LTS the guest GPU doesn’t just fail to reinitialize; the host sometimes actually freezes (hard locks) when the guest is rebooted.
I was really excited when I heard from a bunch of people that this was all fixed in kernels 4.15 onward. I broke out all my AMD cards and put myself through hell testing. Found this to be Fake News Bigly on most cards. It was a real letdown.
I’ve noticed in a bunch of AMD documentation that they consistently state they don’t test or validate S4 power states, even with the AMDGPU-PRO drivers and Radeon Pro GPU firmware.
kernel: vfio-pci 0000:0d:00.0: Invalid PCI ROM header signature: expecting 0xaa55, got 0xffff
Yeah, I’ve seen messages similar to this in the VM logs on my host machine a lot. I’ve also seen similar messages that don’t pertain to vfio at all, just the PCI ROM header and the GPU, in dmesg.
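For anyone wondering what that log line is actually checking: a PCI expansion ROM image starts with the bytes 0x55 0xAA, which read as the little-endian 16-bit value 0xaa55. A value of 0xffff usually means the read came back as all ones, which is commonly what you get when the device isn't responding. Here is a minimal Python sketch of the same check run against a ROM dump. The helper names are mine, and where the bytes come from (for example a dump taken from the device's sysfs rom file) is an assumption on my part.

```python
import struct

# A PCI expansion ROM begins with bytes 0x55 0xAA (0xaa55 little-endian).
PCI_ROM_SIGNATURE = 0xAA55


def read_rom_signature(data):
    """Return the first two bytes as a little-endian 16-bit value, or None."""
    if len(data) < 2:
        return None
    return struct.unpack_from("<H", data, 0)[0]


def describe_rom_header(data):
    """Describe the ROM header in the same style as the kernel message."""
    signature = read_rom_signature(data)
    if signature is None:
        return "ROM data too short to contain a signature"
    if signature == PCI_ROM_SIGNATURE:
        return "valid PCI ROM header signature"
    return "invalid PCI ROM header signature: expecting 0x%04x, got 0x%04x" % (
        PCI_ROM_SIGNATURE,
        signature,
    )


if __name__ == "__main__":
    # Hypothetical inputs: a healthy header and an all-ones read.
    print(describe_rom_header(b"\x55\xaa\x00\x00"))
    print(describe_rom_header(b"\xff\xff\xff\xff"))
```

Running it on a dump taken while the card is in the bad state versus a freshly booted one would at least show whether the ROM is readable at all.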
It has sparked a glimmer of an idea for me. There is a mod/hack that GPU crypto miners who BIOS-mod their AMD cards use to get the cards to function properly after modifying the firmware/BIOS. They flash some code onto the card to disable the hardware’s stupid check of the signature that AMD/AIBs sign their firmware/BIOSes with, which the card always validates when it is initialized. No idea if this will improve things, but it might be worth a try.