Hello everyone,
I’m facing a persistent issue trying to get GPU acceleration working in a VM and I’ve hit a wall after extensive troubleshooting. I’m hoping someone with experience with this specific hardware combination might have some insight.
The Goal: Successfully pass through an Intel Data Center GPU Flex 170 to a Rocky Linux 9.4 VM running on a Proxmox host. The final goal is to use the GPU for hardware-accelerated 3D graphics in a ThinLinc remote desktop session, using VirtualGL.
The Hardware & Software Stack:
- Host Server: Dell PowerEdge R7525
- Host CPU: AMD EPYC 7702
- Host OS: Proxmox VE 8.x (latest version, fully updated)
- GPU: Intel Data Center GPU Flex 170
- Guest OS: Rocky Linux 9.4 (fully updated)
- Key Software: ThinLinc, VirtualGL 3.1.1
What Works So Far (The Successes):
We have successfully managed the entire hardware passthrough process.
- BIOS is Correctly Configured: IOMMU (AMD-Vi), SR-IOV Global Enable, and the crucial AMD IOHC Workaround are all Enabled. Memory Mapped I/O Limit is set to a high value (the equivalent of Above 4G Decoding).
- Proxmox Host is Correctly Configured: Using an aggressive, multi-layered approach, we have successfully prevented the host’s i915 and xe drivers from claiming the GPU. After reboot, lspci -k on the host shows the GPU is unbound, with no “Kernel driver in use”, making it available for VFIO. The IOMMU groups are clean, with the GPU being perfectly isolated.
- Passthrough to VM is Successful: The Intel Flex 170 GPU appears correctly inside the guest VM when checked with lspci.
- Guest Permissions are Correct:
  - A regular user (rockydesktop1) has been created.
  - This user is a member of the vglusers and render groups.
  - vglserver_config has been run successfully.
  - Permissions on /dev/dri/card0 and /dev/dri/renderD128 correctly show vglusers as the group owner with read/write permissions.
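For anyone who wants to reproduce these checks, here is a minimal sketch of how they could be scripted. The helper names (driver_in_use, describe_node) are hypothetical and only for illustration; the first reads lspci -k output on stdin, the second assumes GNU stat. The host-side PCI address is not shown in this post, so it appears as a placeholder variable.

```bash
# Hypothetical helpers, for illustration only.

# Read `lspci -k` output for one device on stdin and print the bound
# kernel driver, or "none" if there is no "Kernel driver in use" line.
driver_in_use() {
    drv=$(sed -n 's/^[[:space:]]*Kernel driver in use: //p' | head -n 1)
    echo "${drv:-none}"
}

# Print the group owner and mode of a device node, or "missing".
describe_node() {
    if [ -e "$1" ]; then
        stat -c '%n: group=%G mode=%A' "$1"
    else
        echo "$1: missing"
    fi
}

# On the Proxmox host (BDF is a placeholder for the GPU's PCI address):
#   lspci -k -s "$BDF" | driver_in_use
# In the guest:
#   describe_node /dev/dri/card0
#   describe_node /dev/dri/renderD128
#   id -nG rockydesktop1
```

On the host, "none" would match the unbound state described above; in the guest, the two nodes should report vglusers as the group.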
The Problem: The Final Step Fails
Despite the successful passthrough and correct permissions, the final software link-up fails.
When logged into a graphical desktop via ThinLinc as the user rockydesktop1, running the final verification command fails with a consistent error:
    $ vglrun glxinfo -B
    name of display: :10
    [VGL] ERROR: Could not open display :0.
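As I understand it, :10 in that output is the ThinLinc session's own X display, while :0 is the display vglrun tries to open as the 3D X server by default. The following is a minimal sketch for checking whether any X server is listening on :0 at all, assuming the usual local socket directory /tmp/.X11-unix. The helper names x_socket_path and x_display_status are made up for illustration.

```bash
# Map an X display name such as ":0" or ":10.0" to its local socket path.
x_socket_path() {
    num=${1##*:}      # drop everything up to and including the last colon
    num=${num%%.*}    # drop the optional screen suffix
    echo "/tmp/.X11-unix/X${num}"
}

# Report whether a local X server socket exists for the given display.
x_display_status() {
    if [ -S "$(x_socket_path "$1")" ]; then
        echo "$1: socket present"
    else
        echo "$1: no socket"
    fi
}

x_display_status :0
x_display_status "${DISPLAY:-:10}"
```

If :0 reports no socket while the session display reports one, that would be consistent with the X server on the GPU never having started.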
What We’ve Tried to Solve the “:0 Display” Error:
We have tried to solve the display :0 error by forcing a dedicated X server to run on the GPU, based on a detailed troubleshooting guide.
- Forced X11 Mode: Confirmed Wayland is disabled in /etc/gdm/custom.conf.
- Attempted to Manually Configure X.org:
  - We first tried to install xorg-x11-drv-intel, but the package does not exist in the Rocky 9.4 repositories.
  - We then pivoted to using the modern modesetting driver. We manually created an /etc/X11/xorg.conf file specifying Driver "modesetting" and the correct BusID "PCI:1:0:0" for the GPU.
  - After restarting gdm, the “Could not open display :0” error persists, indicating that the X server still fails to start on the headless, passed-through GPU.
- Ruled out SELinux: Temporarily running sudo setenforce 0 did not change the outcome.
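One detail worth double-checking in the xorg.conf step: as far as I know, lspci prints bus addresses in hexadecimal while the X.org BusID field expects decimal, so the two only look the same for small numbers such as 01:00.0. The sketch below generates a Device section like the one described above from an lspci-style address. The Identifier "FlexGPU" and both function names are placeholders, and the guest address 01:00.0 is inferred from the BusID in our config.

```bash
# Convert an lspci address (hex, e.g. "01:00.0" or "0000:41:00.0")
# to the decimal form used by the xorg.conf BusID field.
lspci_to_busid() {
    addr=$1
    case "$addr" in
        *:*:*) addr=${addr#*:} ;;   # strip a leading PCI domain such as "0000:"
    esac
    bus=${addr%%:*}
    rest=${addr#*:}
    dev=${rest%%.*}
    fn=${rest#*.}
    printf 'PCI:%d:%d:%d\n' "$((0x$bus))" "$((0x$dev))" "$((0x$fn))"
}

# Print a Device section for the modesetting driver on the given device.
# "FlexGPU" is an arbitrary identifier chosen for this sketch.
device_section() {
    cat <<EOF
Section "Device"
    Identifier "FlexGPU"
    Driver     "modesetting"
    BusID      "$(lspci_to_busid "$1")"
EndSection
EOF
}

device_section 01:00.0
```

The output is only printed to the terminal; it would still have to be placed into /etc/X11/xorg.conf by hand.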
Final Diagnosis: We have reached an impasse. The passthrough works, the permissions are correct, but VirtualGL cannot function because the underlying X.org server fails to initialize on the passed-through GPU inside the VM. This seems to be a fundamental incompatibility or bug in the interaction between the kernel driver (xe or i915), the X.org modesetting driver, and this specific virtualized hardware context.
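To narrow down why the X server does not come up, its own error output is probably the next thing to capture. Below is a small sketch, assuming the X.org convention of marking error lines with (EE). The two log paths are typical locations and may differ on this system, and xorg_errors is a hypothetical helper name.

```bash
# Print the lines marked "(EE)" from an Xorg log file, if it is readable.
xorg_errors() {
    if [ -r "$1" ]; then
        grep -F '(EE)' "$1" || echo "$1: no (EE) lines"
    else
        echo "$1: not readable"
    fi
}

# Typical log locations (assumptions): system-wide, and per-user
# for an X server that runs without root.
for log in /var/log/Xorg.0.log "$HOME/.local/share/xorg/Xorg.0.log"; do
    xorg_errors "$log"
done
```

If neither file exists, journalctl -b -u gdm may hold the same messages.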
Any advice, suggestions for kernel parameters I might have missed, or similar experiences would be hugely appreciated. Thank you.