2 GPUs, poor performance on one

To start with, here’s the hardware I’m trying to get working in tandem:
- AMD Vega 56 (top PCI-E slot)
- AMD R7 250 (lowest x8 PCI-E slot)
- Asus Z97-AR mobo

Running Pop!_OS 20.10

When I test with either GPU installed on its own in its respective PCI-E slot, performance is excellent and rendering goes as smoothly as you can expect for its class.

But for whatever reason, when both are connected at the same time, I suddenly get absurdly terrible performance on the poor little R7 250 if I either designate it as the “primary monitor” via the Settings > Displays tool in a spanned desktop, or disable the Vega 56 and attempt to use only the R7 250.
I’m talking (exactly) 1 frame per second on a consistent basis.

While I haven’t figured out how I’m supposed to offload GPU rendering in xrandr yet, the same thing happens if I turn off the Vega 56 using “--output <name> --off”. (I think the offloading problem is just me being dumb; my attempts at the “--setprovider…” switches either don’t seem to do anything or throw an error. Will post a log if deemed important.)
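For reference, this is roughly what I’ve been trying, cribbed from the Arch wiki’s PRIME page; the provider indices are just what “--listproviders” happened to report on my machine, so they’re not gospel:

```
# List the render/output providers the X server knows about
xrandr --listproviders

# Try to render on the Vega (source, index 0 on my machine) and
# scan out through the R7 250 (output provider, index 1 here)
xrandr --setprovideroutputsource 1 0

# Alternatively, make the Vega (0) available as a per-application
# offload renderer for the R7 250's (1) screen
xrandr --setprovideroffloadsink 0 1
DRI_PRIME=1 glxinfo | grep "OpenGL renderer"
```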

The mouse cursor always updates at the display’s refresh rate, but applications, DE/GUI elements, and the terminal are almost exclusively stuck updating at what seems to be exactly 1 frame per second, which makes me near certain it’s a software issue. (The one exception is a video game called “Freedom Planet” that can somehow escape this after some time and run at full speed???)

For whatever reason, the Vega 56 is the inverse: it always performs wonderfully regardless of what I’m doing with the R7 250 or whether that card is enabled or disabled.

Briefly testing in a Manjaro XFCE live session seems to yield the same behavior, so I’m probably not getting Ubuntu’d or Gnome’d. The two GPUs also work perfectly fine in this arrangement under Windows 10 (version 2004), so I’m really thinking it’s not a hardware/UEFI issue.

I’m awfully sorry, but I may need some guidance on how to report which Mesa drivers I’m using, whether this distro is even running X.org or Wayland, and how to set up or fetch logs of system state if needed. I’m not really sure where I should be going or what I should be doing at this point, tbh lol.
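From what I’ve gathered so far, these seem to be the usual commands for pulling that info (glxinfo comes from the mesa-utils package, if I understand right); happy to post the output of any of them:

```
# x11 or wayland?
echo $XDG_SESSION_TYPE

# Each GPU and the kernel driver actually bound to it
lspci -nnk | grep -A3 -i vga

# Mesa version and which GPU OpenGL is rendering on
glxinfo | grep -iE "opengl (renderer|version)"

# GPU/DRM kernel messages from the current boot
journalctl -b | grep -iE "amdgpu|radeon|drm"
```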

I was kinda’ hoping I could render on the beefy Vega 56 and output through the R7 250, so ideally I want to get the two working in harmony. To avoid any possible confusion: I am not attempting VM passthrough; I simply have 2 GPUs in the PC and everything is running bare metal.

Thanks for helping a noob if you got this far lol

I followed up with more testing to eliminate some possible variables.

Thinking that the age disparity of the GPUs could poteeeeentially be a contributing factor, I decided to swap in two GPUs from my parts bin that are basically identical (and Nvidia to boot): a GTX 750 and a 750 Ti.

I tested only with the Nouveau drivers, but unfortunately the exact same behavior occurred on the GPU I dedicated to bottom-slot VGA-out.

I removed all other PCI-E devices just on the off-chance that perhaps they were somehow causing issues, but the results were identical.

With a freed-up x8 PCI-E slot, I moved the GPU from the bottom-most to the middle slot; same results.

I also changed from my typical UEFI settings to default, but this made no difference.

Feeling like I’m probably going to have to git gud at Linux and alter something pretty distro-integral to fix this issue, whatever it is exactly, but I’m kinda’ shooting blind at my level of knowledge.

[This information is completely incorrect; I misread the title. Skip to later posts.]


I have exactly zero experience with this, so I could be way off, but:

Is one CPU, for want of a better term, a slave? A secondary CPU under the control of the primary one, as in it cannot directly manage itself and has to confer with CPU/Socket 1 to make sure whatever CPU/Socket 2 does is not going to mess things up with a conflict.

So the performance of a GPU attached to CPU2 suffers because it has to communicate:

CPU1>CPU2>GPU2>CPU2>CPU1 to complete one instruction cycle.

Thanks for the reply.

When you say “CPU”, I’m guessing you mean a core or thread as managed by a governor within the OS?
I’m not sure how I could monitor or confirm such a thing.
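(Best guess at how I’d even check that; lscpu at least shows whether there’s more than one socket or NUMA node in play, though on a Z97 board it should just be the one socket, I’d think:

```
# Socket and NUMA topology summary
lscpu | grep -iE "socket|numa"
```
)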

I would think it pretty odd that apparently many popular distros are set up with defaults that could cause conflicts like that. There must be users out there who use multiple GPUs for things other than VM passthrough…

Sorry, I completely misread the title and thought you had a 2-socket, server-style motherboard.

Totally my mistake; I misread the G for a C.

Okay, back to reality.

I assume, this being Z97, that it only has 16 lanes direct from the CPU at PCIe 3.0 speeds, and those would go to your main GPU.

The other lanes will be split out from the chipset over a DMI 2.0 link, which, if I remember correctly, is the equivalent of 4 PCIe 2.0 lanes, possibly multiplied out a few times and made available to the other slots and PCIe devices, potentially including an M.2 slot (not sure, given the age).

That in itself might be the limit right there. I know GPUs mostly get along fine with 8 lanes of PCIe 2.0, but those 4 DMI lanes might be a bottleneck, and if anything else is hanging off the DMI alongside the GPU, it will definitely be getting juggled with them to keep everything running.

EDIT: Yes, Intel’s ARK confirms my suspicions: up to x4 PCIe lanes at 2.0 speeds, which can be broken out to x8 lanes.
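If you want to see what link each card actually negotiated, lspci will tell you; the 01:00.0 address below is just an example, substitute whatever your cards enumerate as:

```
# Find the PCI addresses of the GPUs
lspci | grep -i vga

# LnkCap = what the device supports, LnkSta = what was actually negotiated
sudo lspci -vv -s 01:00.0 | grep -E "LnkCap|LnkSta"
```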

This would be a problem for 3D rendering, yes. But PCI Express lanes have little performance impact on regular desktop work; it would certainly not result in 1 FPS on the desktop.

You can always confirm this by swapping the cards’ positions: if the Vega still performs fine, the issue is not there. Some BIOSes also have a setting for x8/x8 mode, giving both GPUs enough lanes.

I assume this is also a software problem, whether or not the PCIe lanes compound it.
And honestly, I would check the driver; those architectures are bound to have different approaches in the driver.
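If I’m right about the ages, the R7 250 is old enough GCN that it likely sits on the radeon kernel module while the Vega uses amdgpu, so it’s worth checking which module each card actually got:

```
# Each GPU plus the kernel driver bound to it
lspci -nnk | grep -A3 -i vga

# If the R7 250 says "Kernel driver in use: radeon", amdgpu can
# (experimentally) drive Southern Islands cards instead. From memory,
# so double-check against your kernel docs:
#   radeon.si_support=0 amdgpu.si_support=1
```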


I’d have to remove the PSU to make clearance for the Vega in the bottom-most slot (too chonky), but I did actually remove those AMD cards and test with a GTX 750 and 750 Ti, which gave me the same results: The card in the bottom slot with VGA out performed poorly.

I feel like the Nouveau driver must treat those two almost identically.

I’m also skeptical this is a PCI-E lane issue because I can operate the two AMD cards under Windows 10 (version 2004) just fine.

I’m not totally writing off these possibilities manifesting in some manner, but I’m all ears for any other suggestions.

I wonder if Linux doesn’t treat the DMI lanes the same way Windows does? Like it’s stuck at a much lower speed or number of lanes?
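The kernel exposes the negotiated vs. maximum link values in sysfs, so that should be easy to rule in or out (the path is an example; use the slow card’s PCI address):

```
# Negotiated vs. maximum link speed/width, per the kernel
cat /sys/bus/pci/devices/0000:02:00.0/current_link_speed
cat /sys/bus/pci/devices/0000:02:00.0/max_link_speed
cat /sys/bus/pci/devices/0000:02:00.0/current_link_width
cat /sys/bus/pci/devices/0000:02:00.0/max_link_width
```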