Rather than binding a second (more powerful) GPU to vfio-pci in early boot and using it exclusively for VFIO passthrough, I’d like to make use of it directly in the host OS.
Eventually the plan is to rebind the GPU between the host and guest as required, but for the time being I’ll settle for getting it working as a second display/output in Xorg, preferably as an extended screen spanning all monitors.
I assume this should be straightforward enough, but despite trying various configuration options I can’t seem to get it working. I’m hoping someone can spot where I’m going wrong.
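For context, the eventual host/guest rebinding I have in mind is the usual sysfs unbind/bind dance, roughly as sketched below. This is only a sketch, not something I’ve actually wired up yet (the bus address is the 1660 Ti’s on my system):

# Sketch only: move the 1660 Ti from the nvidia driver to vfio-pci at runtime.
# 0000:0f:00.0 is the GPU function on my system; its HDMI audio function
# would need the same treatment.
GPU=0000:0f:00.0
sudo modprobe vfio-pci
echo "$GPU"   | sudo tee /sys/bus/pci/devices/$GPU/driver/unbind
echo vfio-pci | sudo tee /sys/bus/pci/devices/$GPU/driver_override
echo "$GPU"   | sudo tee /sys/bus/pci/drivers/vfio-pci/bind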
System details:
- Nvidia GT 710 (primary PCIe slot), monitor #1 attached (1920x1200)
- Nvidia GTX 1660 Ti (secondary PCIe slot), monitor #2 attached (1920x1080)
- Nvidia proprietary driver 440.59
- Ubuntu 19.10
- GNOME 3.34.2
Attempt 1 - default configuration
Removing the early binding to vfio-pci and allowing Xorg to automatically configure both GPUs results in a single screen on the GT 710. As far as GNOME is concerned, the second GPU/monitor doesn’t exist.
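(For reference, the early binding I removed was the usual modprobe approach, roughly as below; the vendor:device IDs are placeholders rather than my exact values:)

# /etc/modprobe.d/vfio.conf (removed for these tests)
# placeholder IDs standing in for the 1660 Ti and its HDMI audio function
options vfio-pci ids=xxxx:xxxx,xxxx:xxxx
softdep nvidia pre: vfio-pci

plus an update-initramfs -u and a reboot for the change (and its removal) to take effect. With the binding gone, xrandr only sees a single provider: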
$ xrandr --listproviders
Providers: number : 1
Provider 0: id: 0x278 cap: 0x1, Source Output crtcs: 4 outputs: 3 associated providers: 0 name:NVIDIA-0
Amongst other things, Xorg.0.log reports automatic configuration of a screen on the GT 710, an error creating a GPU screen (more on this later), and correct detection of the 1660 Ti:
(==) NVIDIA(0): No modes were requested; the default mode "nvidia-auto-select"
(==) NVIDIA(0): will be used as the requested mode.
(==) NVIDIA(0):
(II) NVIDIA(0): Validated MetaModes:
(II) NVIDIA(0): "CRT-0:nvidia-auto-select"
(II) NVIDIA(0): Virtual screen size determined to be 1920 x 1200
(--) NVIDIA(0): DPI set to (93, 92); computed from "UseEdidDpi" X config
(--) NVIDIA(0): option
[...]
(==) NVIDIA(G0): Depth 24, (==) framebuffer bpp 32
(==) NVIDIA(G0): RGB weight 888
(==) NVIDIA(G0): Default visual is TrueColor
(==) NVIDIA(G0): Using gamma correction (1.0, 1.0, 1.0)
(II) Applying OutputClass "nvidia" options to /dev/dri/card1
(**) NVIDIA(G0): Option "AllowEmptyInitialConfiguration"
(**) NVIDIA(G0): Enabling 2D acceleration
(EE) NVIDIA(G0): GPU screens are not yet supported by the NVIDIA driver
(EE) NVIDIA(G0): Failing initialization of X screen
[...]
(II) NVIDIA(1): NVIDIA GPU GeForce GTX 1660 Ti (TU116-A) at PCI:15:0:0
(II) NVIDIA(1): (GPU-1)
(--) NVIDIA(1): Memory: 6291456 kBytes
(--) NVIDIA(1): VideoBIOS: 90.16.20.40.60
(II) NVIDIA(1): Detected PCI Express Link width: 16X
nvidia-smi reports the presence of both cards:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.59 Driver Version: 440.59 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GT 710 Off | 00000000:0E:00.0 N/A | N/A |
| N/A 44C P8 N/A / N/A | 252MiB / 1992MiB | N/A Default |
+-------------------------------+----------------------+----------------------+
| 1 GeForce GTX 166... Off | 00000000:0F:00.0 Off | N/A |
| 0% 42C P8 9W / 160W | 1MiB / 5944MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 Not Supported |
+-----------------------------------------------------------------------------+
Attempt 2 - xorg.conf file
Use NVIDIA X Server Settings to produce an xorg.conf file.
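(I believe much the same file can be generated from the command line with nvidia-xconfig, although I actually used the GUI and the flag below is from memory:)

$ sudo nvidia-xconfig --enable-all-gpus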
`xorg.conf`
# nvidia-settings: X configuration file generated by nvidia-settings
# nvidia-settings: version 440.44
Section "ServerLayout"
Identifier "Layout0"
Screen 0 "Screen0" 1920 0
Screen 1 "Screen1" LeftOf "Screen0"
InputDevice "Keyboard0" "CoreKeyboard"
InputDevice "Mouse0" "CorePointer"
Option "Xinerama" "0"
EndSection
Section "Files"
EndSection
Section "Module"
Load "dbe"
Load "extmod"
Load "type1"
Load "freetype"
Load "glx"
EndSection
Section "InputDevice"
# generated from default
Identifier "Mouse0"
Driver "mouse"
Option "Protocol" "auto"
Option "Device" "/dev/psaux"
Option "Emulate3Buttons" "no"
Option "ZAxisMapping" "4 5"
EndSection
Section "InputDevice"
# generated from default
Identifier "Keyboard0"
Driver "kbd"
EndSection
Section "Monitor"
# HorizSync source: edid, VertRefresh source: edid
Identifier "Monitor0"
VendorName "Unknown"
ModelName "DELL 2405FPW"
HorizSync 30.0 - 81.0
VertRefresh 56.0 - 76.0
Option "DPMS"
EndSection
Section "Monitor"
# HorizSync source: unknown, VertRefresh source: unknown
Identifier "Monitor1"
VendorName "Unknown"
ModelName "LG Electronics LG TV"
HorizSync 30.0 - 83.0
VertRefresh 58.0 - 62.0
Option "DPMS"
EndSection
Section "Device"
Identifier "Device0"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GT 710"
BusID "PCI:14:0:0"
EndSection
Section "Device"
Identifier "Device1"
Driver "nvidia"
VendorName "NVIDIA Corporation"
BoardName "GeForce GTX 1660 Ti"
BusID "PCI:15:0:0"
EndSection
Section "Screen"
Identifier "Screen0"
Device "Device0"
Monitor "Monitor0"
DefaultDepth 24
Option "Stereo" "0"
Option "nvidiaXineramaInfoOrder" "CRT-0"
Option "metamodes" "nvidia-auto-select +0+0"
Option "SLI" "Off"
Option "MultiGPU" "Off"
Option "BaseMosaic" "off"
SubSection "Display"
Depth 24
EndSubSection
EndSection
Section "Screen"
Identifier "Screen1"
Device "Device1"
Monitor "Monitor1"
DefaultDepth 24
Option "Stereo" "0"
Option "metamodes" "nvidia-auto-select +0+0 {AllowGSYNC=Off}"
Option "SLI" "Off"
Option "MultiGPU" "Off"
Option "BaseMosaic" "off"
SubSection "Display"
Depth 24
EndSubSection
EndSection
Observations:
- xrandr still shows the same single provider. I can move the mouse cursor onto the second monitor, but the desktop does not extend, nor is it possible to drag application windows across.
- However, it is possible to run certain applications on the second monitor with the DISPLAY environment variable, e.g. DISPLAY=:1.1 glxgears (see the check after this list).
- The Xorg.{0,1}.log files contain no obvious errors.
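A rough way of checking which GPU a given X screen is rendering on, assuming glxinfo (from the mesa-utils package) is installed; this is only a sanity check:

$ DISPLAY=:1.0 glxinfo | grep "OpenGL renderer"
$ DISPLAY=:1.1 glxinfo | grep "OpenGL renderer"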
Attempt 3 - Xinerama
Using NVIDIA X Server Settings, enable the “Enable Xinerama” option, which manifests as Option "Xinerama" "1" in the xorg.conf file. After restarting GDM (systemctl restart display-manager.service), the monitor that used to work now shows a black screen with a blinking cursor.
Observations
- No outright errors in Xorg.0.log, and confirmation that Xinerama is enabled:
(**) Option "Xinerama" "1"
(==) Automatically adding devices
(==) Automatically enabling devices
(==) Automatically adding GPU devices
(==) Automatically binding GPU devices
(**) Xinerama: enabled
- But the Nvidia driver complains about it:
(WW) NVIDIA: The Composite and Xinerama extensions are both enabled, which
(WW) NVIDIA: is an unsupported configuration. The driver will continue
(WW) NVIDIA: to load, but may behave strangely.
(WW) NVIDIA: Xinerama is enabled, so RandR has likely been disabled by the
(WW) NVIDIA: X server.
Nvidia’s documentation suggests that the Composite extension can be disabled with nvidia-xconfig --no-composite, which adds this to xorg.conf:
Section "Extensions"
Option "COMPOSITE" "Disable"
EndSection
Attempt 4 - Disable Xinerama, disable composite
Results in this lovely screen
Oh no! Something has gone wrong.
A problem has occurred and the system can’t recover. Please contact a system administrator
Attempt 5 - Enable Xinerama, disable composite
Results in a black screen with a blinking cursor, the same as Attempt 3.
Attempt 6 - PRIME offload
If I can’t extend the desktop across multiple GPUs/displays, I’d at least be happy with offloading rendering to the more powerful GPU. If Nvidia’s documentation is to be believed, this should be possible with its PRIME Render Offload feature. The main caveat is that Xorg requires a set of patches to be applied, something that, fortunately, has already happened in Ubuntu.
Create a minimal xorg.conf:
Section "ServerLayout"
Identifier "layout"
Option "AllowNVIDIAGPUScreens"
EndSection
After restarting the display server, Xorg.0.log confirms creation of the GPU screen (the same “G0” seen in the logs earlier):
(==) NVIDIA(G0): Depth 24, (==) framebuffer bpp 32
(==) NVIDIA(G0): RGB weight 888
(==) NVIDIA(G0): Default visual is TrueColor
(==) NVIDIA(G0): Using gamma correction (1.0, 1.0, 1.0)
(II) Applying OutputClass "nvidia" options to /dev/dri/card1
(**) NVIDIA(G0): Option "AllowEmptyInitialConfiguration"
(**) NVIDIA(G0): Enabling 2D acceleration
(II) NVIDIA: The X server supports PRIME Render Offload.
(--) NVIDIA(0): Valid display device(s) on GPU-1 at PCI:15:0:0
(--) NVIDIA(0): DFP-0
(--) NVIDIA(0): DFP-1
(--) NVIDIA(0): DFP-2
(--) NVIDIA(0): DFP-3
(--) NVIDIA(0): DFP-4 (boot)
(--) NVIDIA(0): DFP-5
(--) NVIDIA(0): DFP-6
(II) NVIDIA(G0): NVIDIA GPU GeForce GTX 1660 Ti (TU116-A) at PCI:15:0:0
(II) NVIDIA(G0): (GPU-1)
(--) NVIDIA(G0): Memory: 6291456 kBytes
(--) NVIDIA(G0): VideoBIOS: 90.16.20.40.60
(II) NVIDIA(G0): Detected PCI Express Link width: 16X
xrandr displays two providers:
Providers: number : 2
Provider 0: id: 0x3f1 cap: 0x1, Source Output crtcs: 4 outputs: 3 associated providers: 0 name:NVIDIA-0
Provider 1: id: 0x198 cap: 0x0 crtcs: 0 outputs: 0 associated providers: 0 name:NVIDIA-G0
Running vkcube (from the vulkan-tools package) with the environment variables described in the documentation works, but I can’t tell if it’s actually making use of the second GPU:
$ __NV_PRIME_RENDER_OFFLOAD=1 vkcube
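The closest I’ve come to verifying this is watching per-GPU utilisation while vkcube runs. It’s a crude check, and I’m assuming these nvidia-smi query flags behave the same across driver versions:

# GPU 1 (the 1660 Ti) should show non-zero utilisation if offload is working
$ nvidia-smi --query-gpu=index,name,utilization.gpu --format=csv --loop=1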
The second test from the documentation fails:
$ __NV_PRIME_RENDER_OFFLOAD=1 __GLX_VENDOR_LIBRARY_NAME=nvidia glxinfo
name of display: :1
XIO: fatal IO error 17 (File exists) on X server ":1"
after 47 requests (47 known processed) with 0 events remaining.
XIO: fatal IO error 17 (File exists) on X server ":1"
after 47 requests (47 known processed) with 0 events remaining.
As a final test, Shadow of the Tomb Raider (launched with and without the environment variables) crashes with the error message:
Vulkan device has no suitable graphics queue families
After more searching around, it seems as though others attempting this kind of configuration have discovered that, ironically, it doesn’t work with two Nvidia cards.
Any help finding a solution is greatly appreciated.