Automatically check hundreds of laptops for GPU pass-through compatibility

I recently heard of a few people getting GPU pass-through to work on gaming notebooks. Unfortunately it is pretty much impossible to know if a notebook is compatible with GPU pass-through before you actually get your hands on one. And I have still not been able to find a good list of compatible/incompatible notebooks.

So my plan is to go to Europe’s biggest electronics store and check every single notebook they have for GPU pass-through compatibility and then publish the list.

So I guess what I need would be a USB stick:

  • with a Linux live distro
  • that has the kernel parameters intel_iommu=on and amd_iommu=on enabled
  • the kernel version should be very recent (at least 4.16(?))
  • it has to be booted in EFI/UEFI mode (is this possible for a live distro?)
  • it should automatically run a bash script on startup to parse the output of the iommu-group-script* and check if there is a dedicated GPU that doesn’t have other important (?) devices in its iommu group
  • check if the connection between the GPUs and the notebook’s screen is MUX-less or MUXed and, in case it is MUXed, whether it uses the new or the old technology
  • boot a test KVM and check if the GPU can actually be passed-through successfully
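
A minimal sketch of how the first few requirements on this list could be verified from the booted stick, assuming bash and GNU coreutils. The function names are invented for illustration, and the optional path arguments exist only so the checks can be tried against fake files:

```bash
#!/bin/bash
# Preflight sketch for the live stick. All function names are hypothetical.

# Succeeds if the system was booted in UEFI mode: the kernel only
# exposes /sys/firmware/efi when it was started by UEFI firmware.
booted_in_uefi() {
    [ -d "${1:-/sys}/firmware/efi" ]
}

# Succeeds if parameter $1 is on the kernel command line.
# $2 is the file to inspect (defaults to /proc/cmdline).
has_kernel_param() {
    grep -qwF -- "$1" "${2:-/proc/cmdline}"
}

# Succeeds if kernel version $2 (default: running kernel) is at least $1.
# Relies on the version sort of GNU sort (-V).
kernel_at_least() {
    local min="$1" cur="${2:-$(uname -r)}"
    [ "$(printf '%s\n' "$min" "$cur" | sort -V | head -n 1)" = "$min" ]
}

preflight() {
    local param
    if booted_in_uefi; then echo "UEFI boot: yes"; else echo "UEFI boot: no"; fi
    for param in intel_iommu=on amd_iommu=on; do
        if has_kernel_param "$param"; then
            echo "$param: set"
        else
            echo "$param: missing"
        fi
    done
    if kernel_at_least 4.16; then echo "kernel: ok"; else echo "kernel: too old"; fi
}

preflight
```

The 4.16 threshold is just the guess from the list above, not a verified minimum.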

*iommu-group-script:

#!/bin/bash
shopt -s nullglob
for d in /sys/kernel/iommu_groups/*/devices/*; do
    n=${d#*/iommu_groups/*}; n=${n%%/*}
    printf 'IOMMU Group %s ' "$n"
    lspci -nns "${d##*/}"
done;

So in the end the only thing I’d have to do is to plug in the USB stick, go to the UEFI, disable secure boot, enable VT-x/AMD-V & VT-d/IOMMU, save, enter the boot menu, select the USB stick, boot it in EFI mode and wait.

Does anyone of you have a lot of experience with this and could tell me if my idea could work and how I could automate the Linux stuff? I’m especially curious about how I could properly determine if the iommu groups are okay automatically. And how I could find out if the GPU to screen connection is MUXed etc.
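
One possible starting point for the IOMMU group question, as a sketch only: walk sysfs, pick every device whose PCI class starts with 0x03 (display controllers), and list whatever else sits in its IOMMU group apart from the GPU's own functions. The function names and the `SYSFS` override are made up for illustration. The sketch is deliberately strict and reports any other device, including PCI bridges, so its output still needs a human verdict:

```bash
#!/bin/bash
# Sketch: for every GPU, list the other devices that share its IOMMU group.
# SYSFS can be overridden for testing; by default the real /sys is used.
SYSFS="${SYSFS:-/sys}"

# Print the PCI addresses of all display-class devices (PCI class 0x03xxxx).
list_gpus() {
    local dev
    for dev in "$SYSFS"/bus/pci/devices/*; do
        [ -r "$dev/class" ] || continue
        case "$(cat "$dev/class")" in
            0x03*) echo "${dev##*/}" ;;
        esac
    done
}

# Print devices in the same IOMMU group as GPU $1 that are not functions
# of the same physical device (i.e. that differ in domain:bus:slot).
group_outsiders() {
    local gpu="$1" slot="${1%.*}" peer
    for peer in "$SYSFS/bus/pci/devices/$gpu/iommu_group/devices"/*; do
        [ -e "$peer" ] || continue
        peer="${peer##*/}"
        [ "${peer%.*}" = "$slot" ] || echo "$peer"
    done
}

for gpu in $(list_gpus); do
    if [ ! -e "$SYSFS/bus/pci/devices/$gpu/iommu_group" ]; then
        echo "$gpu: no IOMMU group (IOMMU off or kernel parameter missing?)"
        continue
    fi
    outsiders=$(group_outsiders "$gpu" | tr '\n' ' ')
    if [ -z "$outsiders" ]; then
        echo "$gpu: alone in its IOMMU group"
    else
        echo "$gpu: shares its group with $outsiders"
    fi
done
```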

Edit: I got it to work on some notebooks and with my gained experience decided to start this project: https://github.com/T-vK/MobilePassThrough

For a list of GPU passthrough compatible notebooks see: https://gpu-passthrough.com

This sounds like a great idea, and might be partially successful.

The Caveat

Often the display laptops are only a shell of the laptop they are emulating and contain just enough hardware to run the on-screen demo. Their UEFIs might not be exactly the same as on the retail units, so any data collected from them would not be accurate.

I’m pretty sure the last time I was there (maybe 2 years ago) this wasn’t the case. They would allow you to interact with the laptops and they would even sell you the display items if you asked. I can’t say if they run special UEFIs, but I kinda doubt it.

If they offer to sell the displays, then yes, it is indeed the product as advertised.

If not, then it is a fake.

Thinking about it, I would find that extremely weird. I mean, I couldn’t run a benchmark, I couldn’t check how heavy it feels, … I pretty much couldn’t do anything other than look at it… it would basically destroy any purpose I could see in going there instead of ordering from Amazon in the first place.

Never seen anyone really use faux display models for laptops. For phones sure, but not for laptops.

One caveat that’s more realistic though:
Most laptops that have a VT-x & VT-d or AMD-V & IOMMU option may have it disabled by default in the BIOS.

So you may need to enable that first.

But overall you will be screwing with so damn many machines, rebooting changing bios options etc to test these that you should not be surprised if you get kicked out for looking suspicious. Prepare for that scenario.

And I’m quite certain that asking for permission in advance won’t help your effort unless you can convince a really nice tech guy there.

Well, my plan is to go there at a peak time when no employee would bother looking over my shoulder anyway. But even if they did, I have my doubts they have ever even heard the word BIOS… not to mention UEFI. The last time I was there it was pretty much impossible to get hold of any employee to ask a simple question because you basically had to wait in line for that.

But yes, you are of course right, this is most likely the most time consuming part… But I’m more than willing to spend my day there to check all the notebooks.

Good plan.
I recommend doing it in stages on random days to get through all of the machines. Keep good track of what you have and haven’t tested so as not to lose track.

Document both the supported and unsupported machines.
You have the potential to produce some really good data here.

If you can and want to, get video footage of the BIOS screens when you need to change stuff, but that’s really optional.

All the best :smiley_cat:

My recommendation for how to do this is to create a persistent Live disk.
Basically it’s got a rw partition and a read-only partition for the ISO, with the home or some other path mapped to the rw partition.

Then just autorun your script at boot and dump cpuinfo, dmidecode & iommu lists into a file with the time as filename.
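
A rough sketch of such a boot-time collector, assuming the stick's writable partition is reachable under the home directory. `LOG_DIR` and `collect_logs` are placeholder names, and `sudo -n` is used so that a missing password-less sudo setup fails fast instead of hanging at a prompt:

```bash
#!/bin/bash
# Sketch: dump hardware info into one file named after the current time.
# LOG_DIR is an assumed location on the writable partition.
LOG_DIR="${LOG_DIR:-$HOME/gpu-pt-logs}"

collect_logs() {
    local out d n
    out="$LOG_DIR/$(date +%Y-%m-%d_%H-%M-%S).log"
    mkdir -p "$LOG_DIR" || return 1
    {
        echo "### /proc/cpuinfo"
        cat /proc/cpuinfo
        echo "### dmidecode"
        # dmidecode needs root; record the failure rather than aborting.
        sudo -n dmidecode || echo "(dmidecode unavailable)"
        echo "### IOMMU groups"
        for d in /sys/kernel/iommu_groups/*/devices/*; do
            [ -e "$d" ] || continue
            n=${d#*/iommu_groups/}; n=${n%%/*}
            printf 'IOMMU Group %s ' "$n"
            lspci -nns "${d##*/}"
        done
    } > "$out" 2>&1
    sync    # flush to the stick before it gets pulled out
    echo "$out"
}
```

Calling `collect_logs` from the autorun script prints the path of the file it wrote, so the evaluation can happen later at home.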

Evaluate after the fact.

Basically this for a pre-existing thing

Better would be to craft a custom ultralight arch version for stupid fast booting and evaluation.

https://wiki.archlinux.org/index.php/Installing_Arch_Linux_on_a_USB_key

Some more alternative pointers if you need it:

http://allican.be/blog/2016/02/04/creating_custom_persistent_arch_live_iso.html

Yes, I would of course try to document as much as possible and keep all the logs etc. Filming and taking photos is not allowed in there though. Probably to prevent evidence of illegal price fixing from being collected… I don’t know…

I have no experience with Arch yet. I’ll look into it, but I’d prefer Fedora which I’ve been using for around 2 years now. Does Arch come with a recent kernel like Fedora or do you have to manually go through the hassle of installing a new one?

Arch is basically identical; it’s what Manjaro is based on anyway, just with less stuff preinstalled.

It’s always bleeding edge.

BTW I really recommend Arch over Manjaro as it’s much simpler for your purposes. For what you’re trying to do you don’t even need a desktop or many tools. Just a kernel, a few binaries like dmidecode, and generic bash scripts to run.

Heck, if I get the time tomorrow I might make a custom USB image to do just that.

Not really possible, as things stopped being shipped with BIOS around ~2008 (I think?). Those people are not 18 yet.

I used to work in tech retail; we had to know how to at least get into the BIOS/UEFI to boot our diagnostic flash drive. These people are generally not stupid.

There’s at least one tech who actually knows what they’re doing.


Regarding showing up at peak times, you probably won’t have to do that. Most of the time the employees have shit to do, so if you’re not going to buy anything then they usually won’t pester you.

Just do what you’re going to do, and if they ask say you work in IT for some company and that you’re recording information about the computer needed to see if it will work with your business software.

People used to come into the store all the time for info. I’d generally comply as best I could. I really didn’t care if they fucked with the display models, they’d just be re-imaged anyway; creates some extra tech labor as a bonus so if anything it might help them. Just don’t steal them.

Turns out you can simply install Fedora on a USB stick instead of a hard drive to get a portable Linux with persistent storage. So I went with that.

I have also been able to verify that booting with all kernel parameters (those for Intel and those for AMD at the same time) actually works. So I will most likely really only need one USB stick. I have also managed to automate the process of enabling the kernel parameters.

It’s all in this Github repo: https://github.com/T-vK/GPU-pass-through-compatibility-check
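
For anyone wanting to do the same by hand, here is a minimal sketch of one way to add the parameters, assuming a GRUB-based install with a `GRUB_CMDLINE_LINUX="..."` line in /etc/default/grub and GNU sed. It is not necessarily what the scripts in the repo do, and `add_kernel_params` is a made-up helper name:

```bash
#!/bin/bash
# Sketch: append kernel parameters to GRUB_CMDLINE_LINUX if they are missing.
# add_kernel_params is a hypothetical helper, not part of any existing tool.
# Usage: add_kernel_params FILE PARAM...
add_kernel_params() {
    local file="$1" param
    shift
    for param in "$@"; do
        # Leave the file alone if the parameter is already on the line.
        if grep '^GRUB_CMDLINE_LINUX=' "$file" | grep -qwF -- "$param"; then
            continue
        fi
        # Insert the parameter just before the closing quote (GNU sed -i).
        sed -i "s|^\(GRUB_CMDLINE_LINUX=\".*\)\"|\1 $param\"|" "$file"
    done
}
```

Run as root it would be called like `add_kernel_params /etc/default/grub intel_iommu=on amd_iommu=on`, followed by regenerating the GRUB config with grub2-mkconfig. On Fedora, `grubby --update-kernel=ALL --args="intel_iommu=on amd_iommu=on"` should achieve the same in one step.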

edit:
Okay, I found a way to automatically run a script as the user gets logged in on boot. Also, the login happens automatically now and sudo doesn’t require a password anymore. So it should be possible to fully automate it now. You can find the changes in the github repo. The setup.sh automatically configures the system in the way I just described. The gpu-pt-check.sh is the script that still needs to be completed and I will most likely need your help with that.

edit2:
If you’d like to help me and you have Linux with the IOMMU kernel params enabled, it would help a lot to see what the output of the iommu-group-script and “sudo lshw -class display” is on your system.
I need that information in order to properly parse the output of the iommu-group-script to find out which entries are dedicated GPUs and whether the IOMMU groups would even allow GPU pass-through.

Here is an example output of the iommu-group-script*:

*iommu-group-script:

#!/bin/bash
shopt -s nullglob
for d in /sys/kernel/iommu_groups/*/devices/*; do
    n=${d#*/iommu_groups/*}; n=${n%%/*}
    printf 'IOMMU Group %s ' "$n"
    lspci -nns "${d##*/}"
done;

And here is an example output of “sudo lshw -class display” on the same system:

Not a laptop. But a Ryzen desktop. Has no IGPU of course.

So probably not quite what you need.

  *-display                 
       description: VGA compatible controller
       product: Ellesmere [Radeon RX 470/480/570/570X/580/580X]
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:26:00.0
       version: e7
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi vga_controller bus_master cap_list rom
       configuration: driver=amdgpu latency=0
       resources: irq:41 memory:e0000000-efffffff memory:f0000000-f01fffff ioport:e000(size=256) memory:fe800000-fe83ffff memory:c0000-dffff
  *-display
       description: VGA compatible controller
       product: Ellesmere [Radeon RX 470/480/570/570X/580/580X]
       vendor: Advanced Micro Devices, Inc. [AMD/ATI]
       physical id: 0
       bus info: pci@0000:27:00.0
       version: e7
       width: 64 bits
       clock: 33MHz
       capabilities: pm pciexpress msi vga_controller bus_master cap_list rom
       configuration: driver=amdgpu latency=0
       resources: irq:43 memory:c0000000-cfffffff memory:d0000000-d01fffff ioport:d000(size=256) memory:fe700000-fe73ffff memory:fe740000-fe75ffff
IOMMU Group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 10 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 11 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]
IOMMU Group 12 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 59)
IOMMU Group 12 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU Group 13 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0 [1022:1460]
IOMMU Group 13 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1 [1022:1461]
IOMMU Group 13 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2 [1022:1462]
IOMMU Group 13 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3 [1022:1463]
IOMMU Group 13 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4 [1022:1464]
IOMMU Group 13 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5 [1022:1465]
IOMMU Group 13 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6 [1022:1466]
IOMMU Group 13 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 [1022:1467]
IOMMU Group 14 01:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961 [144d:a804]
IOMMU Group 15 03:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b9] (rev 02)
IOMMU Group 15 03:00.1 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b5] (rev 02)
IOMMU Group 15 03:00.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:43b0] (rev 02)
IOMMU Group 15 1d:00.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU Group 15 1d:01.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU Group 15 1d:02.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU Group 15 1d:03.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU Group 15 1d:04.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU Group 15 1d:06.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU Group 15 1d:07.0 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] 300 Series Chipset PCIe Port [1022:43b4] (rev 02)
IOMMU Group 15 1f:00.0 Ethernet controller [0200]: Intel Corporation I211 Gigabit Network Connection [8086:1539] (rev 03)
IOMMU Group 16 26:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] [1002:67df] (rev e7)
IOMMU Group 16 26:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580] [1002:aaf0]
IOMMU Group 17 27:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 470/480/570/570X/580/580X] [1002:67df] (rev e7)
IOMMU Group 17 27:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Ellesmere [Radeon RX 580] [1002:aaf0]
IOMMU Group 18 28:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device [1022:145a]
IOMMU Group 19 28:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor [1022:1456]
IOMMU Group 1 00:01.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 20 28:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) USB 3.0 Host Controller [1022:145c]
IOMMU Group 21 29:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device [1022:1455]
IOMMU Group 22 29:00.2 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
IOMMU Group 23 29:00.3 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller [1022:1457]
IOMMU Group 2 00:01.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 3 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 4 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 5 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 6 00:03.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe GPP Bridge [1022:1453]
IOMMU Group 7 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 8 00:07.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) PCIe Dummy Host Bridge [1022:1452]
IOMMU Group 9 00:07.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B [1022:1454]

Thanks for the data @catsay it was actually pretty helpful. I have made a lot of changes to the gpu-pt-check.sh script in the repository. It should now be able to report for every GPU in your system if it could be passed through to a VM or if the IOMMU groups wouldn’t allow it.
If I mock your data, the script reports:

GPU ID: 26:00.0 - GPU IOMMU group: 16
Success: GPU with ID '26:00.0' could be passed through to a virtual machine!
GPU ID: 27:00.0 - GPU IOMMU group: 17
Success: GPU with ID '27:00.0' could be passed through to a virtual machine!
----------------------------------------------
Success: There seems to be at least one GPU in this system that can be passed through to a VM!

Which is probably correct. There is still room for improvement. But I’d need to collect more data from other computers first.
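
For evaluating saved output after the fact, the core idea can be sketched roughly as follows. This is not the actual gpu-pt-check.sh logic, just an illustration under the assumption that the input has the exact format the iommu-group-script prints. `check_gpu_groups` is a hypothetical name and bash 4 or newer is assumed for the associative array:

```bash
#!/bin/bash
# Sketch: read iommu-group-script output on stdin and print one line per GPU:
#   <PCI address> <IOMMU group> <isolated|shared>
# check_gpu_groups is a hypothetical name; requires bash 4+.
check_gpu_groups() {
    local -A members=()     # group number -> PCI addresses in that group
    local -a gpus=()        # "<group> <address>" for every display controller
    local w1 w2 group addr rest entry peer verdict
    # The first bracketed field is the PCI class; 03xx means display controller.
    local gpu_re='^[^[]*\[03[0-9a-f]{2}\]:'

    while read -r w1 w2 group addr rest; do
        [ "$w1 $w2" = "IOMMU Group" ] || continue
        members[$group]+="$addr "
        if [[ $rest =~ $gpu_re ]]; then
            gpus+=("$group $addr")
        fi
    done

    for entry in "${gpus[@]}"; do
        group=${entry% *}
        addr=${entry#* }
        verdict=isolated
        for peer in ${members[$group]}; do
            # Functions of the same device (e.g. the GPU's HDMI audio)
            # share bus:slot and are expected to travel with the GPU.
            [ "${peer%.*}" = "${addr%.*}" ] || verdict=shared
        done
        echo "$addr $group $verdict"
    done
}
```

Piping the output of the iommu-group-script into `check_gpu_groups` gives one verdict per GPU. The sketch is stricter than it probably needs to be: a PCI bridge in the same group also counts as "shared".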

The next thing I’ll try to add to the script will be some logic to check if the virtualization technologies have been enabled in the UEFI. I also wonder if there is a simple way to tell if an iGPU has been disabled in the UEFI.

I have still no clue how I could check how the GPUs and the notebook screen are connected.

edit:
The gpu-pt-check.sh script now automatically checks if AMD-V / VT-x and AMD’s IOMMU / Intel’s VT-d are enabled in the UEFI. It also checks if the IOMMU kernel parameters are set. I also improved the logic for checking if the IOMMU groups allow a GPU to be passed to a VM and made the script print the information in a more readable way.
I think this looks pretty okay:

In some cases it may still report inaccurately. For instance I found a case where you could only pass all GPUs together:

Edit2:
The script can now handle cases like the one mentioned above.
I also added logic to

  • detect if the device is a laptop
  • detect the laptop name
  • detect the BIOS version
  • detect if the laptop is MUXed or MUX-less (I couldn’t test this one yet)

And the script is now automatically logging all the information to a log file and it also creates extremely detailed hardware/system logs now.
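
For readers who want to reproduce some of these checks, here is a sketch of how they could look. It is an approximation under stated assumptions, not a copy of gpu-pt-check.sh: the function names are hypothetical, the DMI data is read from /sys/class/dmi/id, and MUX detection is left out because I know of no reliable way to sketch it:

```bash
#!/bin/bash
# Sketch of the system checks. Function names are hypothetical; the optional
# path arguments default to the real locations and exist mainly for testing.

# Succeeds if the CPU advertises VT-x ("vmx") or AMD-V ("svm").
# Note: on some kernels the flag is shown even when the feature is
# switched off in the firmware, so treat this as a necessary condition only.
cpu_has_virt_flag() {
    grep -Eqw 'vmx|svm' "${1:-/proc/cpuinfo}"
}

# Succeeds if the kernel created at least one IOMMU group, which only
# happens when the IOMMU is enabled in the firmware and in the kernel.
iommu_active() {
    local groups=("${1:-/sys/kernel/iommu_groups}"/*)
    [ -e "${groups[0]}" ]
}

# Succeeds for SMBIOS chassis types used by portable machines:
# 8 Portable, 9 Laptop, 10 Notebook, 14 Sub Notebook, 31 Convertible, 32 Detachable.
is_laptop() {
    case "$(cat "${1:-/sys/class/dmi/id}/chassis_type" 2>/dev/null)" in
        8|9|10|14|31|32) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints one DMI field ($1), e.g. product_name or bios_version.
dmi_field() {
    cat "${2:-/sys/class/dmi/id}/$1" 2>/dev/null
}

report() {
    if cpu_has_virt_flag; then echo "CPU virtualization flag: present"; else echo "CPU virtualization flag: absent"; fi
    if iommu_active; then echo "IOMMU: active"; else echo "IOMMU: not active"; fi
    if is_laptop; then echo "Chassis: laptop"; else echo "Chassis: not a laptop"; fi
    echo "Model: $(dmi_field product_name)"
    echo "BIOS version: $(dmi_field bios_version)"
}

report
```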

Edit3:
I’m not sure how I should go about the automatic VM creation.
If this is possible, I think it would be best to first create a normal VM and virtual disk, then install windows and all the tools / drivers on it that would be required to verify that it works as expected. Then I would actually delete the VM, but keep the virtual disk. And every time I’d check another laptop a new VM would be created automatically using something like this:

# --import boots the existing disk image instead of starting an installer.
sudo virt-install --name=win-gpu-pt-vm \
--import \
--disk path="$VHD_FILE" \
--machine q35 \
--boot uefi \
--vcpus=2 \
--memory=2048 \
--os-variant=win10 \
--host-device "$GPU_ID"

This way I wouldn’t have to go through the hassle of fixing the VM’s config because the whole hardware layout changed etc. I could also simply specify to use something like 50% of the available RAM and CPU cores (which is bound to vary on different laptops).
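
The 50% idea could be sketched like this, assuming `nproc` and awk are available on the stick. `half_cpus` and `half_mem_mib` are placeholder names, and the memory value is computed in MiB because that is the unit virt-install expects for --memory:

```bash
#!/bin/bash
# Sketch: derive VM resources from the host. Function names are hypothetical.

# Half of the logical CPUs ($1, default: nproc), but at least one.
half_cpus() {
    local n=${1:-$(nproc)}
    echo $(( n / 2 > 0 ? n / 2 : 1 ))
}

# Half of MemTotal in MiB, read from a meminfo-style file
# ($1, default: /proc/meminfo; the kernel reports the value in kB).
half_mem_mib() {
    awk '/^MemTotal:/ { print int($2 / 2 / 1024) }' "${1:-/proc/meminfo}"
}
```

The results would then replace the fixed numbers, e.g. `--vcpus="$(half_cpus)"` and `--memory="$(half_mem_mib)"`.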

I’m also trying to figure out the exact implications of MUXed/MUX-less laptops when it comes to GPU passthrough.

I have some news. Today I went to a small local electronics store to check what kind of notebooks they have and to test my USB stick in its current state.
They do actually have about 6 gaming notebooks (and also about 30 normal notebooks/ultrabooks). I managed to check the UEFI of 3 of the gaming notebooks. (The others were installing Windows updates after I initiated the shutdown and didn’t finish before I left…)
The 3 gaming notebooks all had a VT-X and VT-D option in the UEFI which is great! And they also had an iGPU related option (I think it was how much RAM the iGPU should get) indicating that the iGPU is not disabled as I have seen it on other gaming notebooks in the past.
After a while an employee came to me asking if he could help me. And I just said something along the lines of “Maybe. Can you tell me which of these models have a VT-d or IOMMU option in the UEFI, or which of them have the dGPU in a separate IOMMU group, or which of them have a multiplexed GPU to screen connection?” To which he replied something like “VT what? Sorry I would have to look that up myself. Just go ahead, I trust you don’t change any settings?” …
I didn’t tell him that I was about to disable secure boot and boot into my Linux stick, but unfortunately I couldn’t get it to fully boot. It always got stuck at some point and the output on the screen was sort of glitched and only showed the last 5 lines which were of no help. (The quiet kernel parameter was not set.)

I’m not really sure what to do now. I guess I could go there again tomorrow and try to boot a normal live distro, but since I don’t have Internet there, I couldn’t install all the stuff that I need. But I guess I could manually add the iommu kernel params before the live distro boots and then I could plug in another USB stick that has the IOMMU script.

On the other two gaming notebooks that I managed to check out, I didn’t understand how to get into the boot menu or how to configure the boot order in the UEFI so that it would boot from my USB stick… Maybe some of you could explain it to me. I secretly took a few photos:

The boot order option that you see selected in the photo only had two choices: “Windows Boot Manager …” and “Disabled”.

When I clicked Add New Boot Option I got into this menu:

(Sorry the photo came out so blurry. I didn’t want to risk getting kicked out, so I didn’t take another one.)

I didn’t understand what exactly I was supposed to do in that menu and decided not to do anything because I was too scared I would actually somehow mess with the Windows boot loader.

Edit:
In case any of you would like to try something like that as well: it was not trivial to get into the UEFI. I had to hold Shift while selecting Shutdown in Windows, then hold F2 while turning the notebook on and keep holding it until the UEFI showed up.

Update (sort of):
I’ve created gpu-passthrough.com, which for now simply hosts a list of tested notebooks. I will add a few more to the list over the next few days.

If you know of a device that is compatible or not, it would be nice if you’d submit an issue or pull request on the GitHub project for the website.

If you want to test a device, you might want to check out: [How to] Set up GPU passthrough on notebooks with one click (or two)