I have been working to improve Looking Glass performance with the Arc DG2 after @Wendell kindly sent a care package containing one and I have run into a few problems. I am curious to know if anyone out there has run into these issues also and might have any hints as to what might be the cause and resolution?
Firstly, my motherboard (Supermicro H12SSL-i) doesn't officially support REBAR, but I can use the support that Alex Williamson ported across from the amdgpu driver to resize the BAR of my NVIDIA and AMD GPUs without issue. Unfortunately this doesn't work for the DG2, as it sits behind three(!) Intel PCI bridge devices (WHY Intel??).
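For anyone wanting to try the same resize on their own hardware: recent kernels expose per-device `resourceN_resize` files in sysfs, and the value written is log2 of the desired BAR size in MB. A rough sketch, with a placeholder PCI address and BAR index (yours will differ; find them with lspci):

```shell
# Placeholder PCI address and BAR index -- find yours with lspci -vv.
GPU=0000:0a:00.0
BAR=1
RESIZE=/sys/bus/pci/devices/$GPU/resource${BAR}_resize

# The resize file takes log2 of the size in MB: 8 = 256 MB, 13 = 8 GB, 14 = 16 GB.
log2_mb() { local mb=$1 n=0; while [ "$mb" -gt 1 ]; do mb=$((mb / 2)); n=$((n + 1)); done; echo "$n"; }

if [ -w "$RESIZE" ]; then
  cat "$RESIZE"               # sizes the device advertises
  log2_mb 16384 > "$RESIZE"   # request a 16 GB BAR
fi
```

Note the device must have no driver bound while resizing, and the write can still fail if the upstream bridge windows can't accommodate the larger BAR.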
Alex wrote an RFC patch that might address this issue, however it doesn't work for me. I am curious to know whether it has worked for anyone at all and, if so, what motherboard/chipset, etc. you are using.
Secondly, while REBAR is not working, I can still use the GPU without issue as far as I can tell, except for one application… Looking Glass. Everything works fine provided I do not make the client window too large, or, get this… set the guest resolution to 800x600. If I run LG on Sway instead of Xorg I get the same result, unless I go full screen… sometimes… I have to toggle full screen a few times before performance becomes what I would expect.
As it's clearly possible to get good performance from the GPU without REBAR, I am certain that it's not the cause, however I am quite beside myself as to what else it could be. Lowering the guest resolution to 800x600 reduces the memory bandwidth required and the texture size used in the client application, so it should be blistering fast. Instead it LOWERS the performance.
While I obviously can’t rule out a problem with the code in LG and how we are making use of the GPU, I’d love to know if anyone else has seen any other applications that exhibit similar behaviour on this GPU.
I've been finding some GPU driver bugs; Arc power modes seem to get more aggressive under varying conditions. There isn't much to go on in Intel's list of known driver bugs. It still seems like an encoder bug, as I've been testing AV1.
Yeah, in this instance though I don't think it's related. intel_gpu_top reports the GPU running at 2GHz+ while it's performing poorly. It's very likely a driver/Mesa bug, as full-screen output (unredirection) in Sway, after several attempts, resolves the issue.
Running vblank_mode=0 glxgears I see 4000+ FPS on Xorg, until I resize the window; as it approaches the size of the monitor, the rate drops to below 100 FPS.
Doing the same on Sway gives 4000+ FPS at any window size, but if I run the application full screen, the frame rate drops to around 100 FPS and there is clear jittering in the output.
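For anyone who wants to reproduce the comparison, the test boils down to the following (the `parse_fps` helper and the fixed run length are my own additions for convenience, not anything from glxgears itself):

```shell
# parse_fps: extract the FPS figure from glxgears' periodic
# "NNN frames in 5.0 seconds = X.Y FPS" report lines.
parse_fps() { awk '/frames in/ { print $(NF-1) }'; }

# With a display attached, run with vsync disabled and watch the numbers
# while resizing the window / toggling full screen:
if command -v glxgears >/dev/null && [ -n "$DISPLAY" ]; then
  vblank_mode=0 timeout 12 glxgears 2>/dev/null | parse_fps
fi
```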
Running Minecraft yields the same performance drop that LG suffers from; the instant the window resolution exceeds about 1000x1300, the FPS starts to tank.
Hacking on LG with the legacy OpenGL renderer, adjusting the viewport to render at 1/4 the size results in good performance (obviously useless in practice, though).
I am completely convinced there is some underlying bug in either the kernel module or mesa.
Another find: using the Vulkan translation layer completely resolves the performance issue. It introduces some other issues, but it points to a possible problem with iris_dri in Mesa.
Did you write an email to Supermicro about this? ASRock, for instance, was more than happy to provide a beta BIOS with ReBAR for my ASRock Rack ITX board for EPYC.
Thanks, yes, I have. I have had what seems like a positive response, basically requesting more details, but also stating that it's a long shot at best. I am still curious, though, whether that patch has helped anyone.
Ok, something very interesting… it’s not specifically a DG2 issue, but rather an issue with iris_dri in general.
I decided to test it out on my laptop, which only has a 1080p display. As performance has always been quite acceptable for a laptop I had never thought there could be an issue here either, but there is.
Running Looking Glass directly, if I full screen the window the frame rate drops quite a bit, which I expected; but as I generally do not game on my laptop, I had never noticed that the issue was far worse than I realised.
I set the guest to a custom resolution with a refresh rate of 330Hz (the highest my guest GPU will let me set), and LG, with the new D12 capture backend, can provide this update rate. If the LG window is smaller I can achieve 200+ FPS, but full screen, only 50-80 FPS.
Using the zink Vulkan wrapper, both on the DG2 and on the iGPU in my laptop (UHD Graphics P630), performance is improved enormously. With it in use, my laptop can easily run full screen at the full 330 FPS.
I'd be very curious whether people find that using the zink wrapper fixes all manner of performance issues on Intel-based GPUs. To use it, simply run the application as follows:
MESA_LOADER_DRIVER_OVERRIDE=zink glxgears
If you're running Looking Glass, you will see that it's in use as it will report the renderer as zink Vulkan 1.3(...)
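A quick sanity check that the override actually took effect, assuming `glxinfo` from mesa-utils is available (the `is_zink` helper is just a throwaway for illustration):

```shell
# is_zink: succeed if the renderer string on stdin mentions zink.
is_zink() { grep -qi 'renderer.*zink'; }

# Force the override for a single glxinfo run and report the result.
if command -v glxinfo >/dev/null && [ -n "$DISPLAY" ]; then
  MESA_LOADER_DRIVER_OVERRIDE=zink glxinfo -B | is_zink && echo 'zink active'
fi
```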
Seems like this bug hits programs like Maya, found this in the release notes of the driver:
“Autodesk Maya* may experience an application crash while running SPECAPC* benchmark.”
Found a strange bug on 11th gen: if you don't have a second screen plugged in, Iris Xe performance drops into energy-save mode. It doesn't matter whether you're running W11 or Linux, so there is an ugly Iris-specific bug.
While I am convinced there is a performance issue here, further testing needs to be performed, as today I got REBAR working on the H12SSL-i with the Intel A770 via a fairly minor kernel patch, thanks to some help from Alex Williamson of Red Hat.
As it's his work I don't want to spoil anything; hopefully this is something that can be upstreamed into the kernel, as it's currently a bit of a hack. If he decides it's not suitable for upstreaming, I will get his permission before posting it here.
Anyway, the result is that performance is now where it should be without the zink layer, but it does raise some questions.
1. Why is my laptop's iGPU so slow unless I use the zink layer?
2. Can I force my laptop to assume the iGPU has REBAR support and try to use it anyway? (i.e., is this an undocumented hidden feature?)
3. If 2 holds true, can we get a nice perf boost for Intel iGPU users for general workloads?
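On question 2, one way to probe is to check whether the iGPU exposes the Resizable BAR extended capability at all. Per the PCIe spec, bits 31:4 of the capability register form a supported-sizes bitmask (bit 4 = 1 MB, each next bit doubling the size); here's a hypothetical decoder for a raw register value:

```shell
# rebar_sizes: decode the Resizable BAR "BAR Size Supported" bitmask
# (register bits 31:4; bit 4 = 1 MB, bit 5 = 2 MB, ...) into sizes in MB.
rebar_sizes() {
  local mask=$(( $1 >> 4 )) mb=1
  while [ "$mask" -ne 0 ]; do
    [ $(( mask & 1 )) -eq 1 ] && echo "$mb"
    mask=$(( mask >> 1 )); mb=$(( mb * 2 ))
  done
}

# When the capability is present, lspci shows it directly (placeholder address):
#   sudo lspci -vvs 0000:00:02.0 | grep -A4 'Resizable BAR'
```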
Thanks for the offer, but the A380 is essentially the same part when it comes to this kind of testing and development, so I don't think there is much to be gained here. I will keep it in mind, though.
Did some digging on the laptop iGPU thing. While I believe the functionality is likely there, there seems to be no viable way to make this work, as the PCI configuration space doesn't map the REBAR control register anywhere, and writing random values into the unused config space yields no discernible changes.
Oh well, at least we know that if you use the zink layer for OpenGL applications, you can attain quite a massive performance improvement for some workloads.
Seems like there is a bug in the Iris driver; try doing any QuickSync encoding, such as OBS or transcoding, and performance drifts.
Far too many OEMs cap the memory allocation on the iGPU; you'll find ASUS Vivobooks/Zenbooks have a 256-384 MB limit, and other OEMs use adaptive memory. Strangely, before I gave up on testing Gateway models, their iGPU had been capped to 256 MB regardless of whether it was Intel UHD/Iris or AMD Radeon (Ryzen 3).
I'm thinking it could be a DVR encoder bug with the TV tuner in dual-tuner mode, or something OEM-specific (REBAR); oddly, the encoding glitch doesn't happen on the N95/N100 iGPU, which shares the same Xe driver.
Edit: Seems like the recent Intel W11 driver fixed the encoding (MPEG) for TV tuners; on the Linux side it's still your mileage may vary.