UE4 games on DXVK and Wine/Proton with AVX vs AVX2... Performance difference?

Edit: It’s partly a single-threaded render-thread issue that is common to many UE4 games, but also some systems lacking a TSC clocksource. Using HPET as the clocksource causes UE4 performance in Wine to plummet. Not only that: even if you have the IPC to eliminate a CPU bottleneck, the API bottleneck will arrive before the CPU one does.
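
For anyone who wants to check their own system, the current and available clocksources are exposed in sysfs (standard kernel paths; switching to tsc only works if the kernel considers your TSC stable):

```bash
# Show the clocksource the kernel is currently using
cat /sys/devices/system/clocksource/clocksource0/current_clocksource

# List the clocksources this system offers
cat /sys/devices/system/clocksource/clocksource0/available_clocksource

# Switch to TSC for the current boot, if it is listed as available
echo tsc | sudo tee /sys/devices/system/clocksource/clocksource0/current_clocksource
```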

I’m bringing this up because my Ryzen 5 3600X machine has no bottlenecking issues with DXVK and UE4, but my 4960X, a pre-AVX2 CPU, seems to bottleneck like crazy at all resolutions…

The specific game is Breathedge, which recently got a big map update that sees my framerate dip to 30fps at ALL resolutions on the non-AVX2 CPU. Native Windows doesn’t have this problem on the same CPU.

Could it also be Wine vs Proton? I just don’t understand why one system has such a performance deficit, because they use the same Vulkan driver and I’m using an identical GPU.

Could it be LC_ALL=C? That’s the difference between my 3600X machine and my 4960X machine.
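
(If anyone wants to test the same variable through Steam, it goes in the game’s launch options, with %command% standing in for the game binary:)

```bash
LC_ALL=C %command%
```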

Just crashed with a “Rendering Thread Exception” error after playing for a few hours at 4K on the properly performing 3600X system, so it seems like it may be DXVK-related. UE4 also doesn’t seem to like being run above 1440p for extended periods of time when DXVK is in play.

It’s stuff like this, mostly Unreal Engine related, that makes me want to do GPU passthrough.

Edit: Thinking back, I was using r.ScreenPercentage 80 when it crashed. I could try again with no resolution scaling.

Changing the prefix between Windows 10 and Windows 7 on the non-AVX2 system made no difference to performance. This rules out Wine. It’s for sure because one system doesn’t have AVX2, and something in DXVK likes AVX2 better.
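
(For anyone reproducing this: the Windows version a prefix reports can be flipped with winecfg; the prefix path below is just an example:)

```bash
# Switch an existing prefix between Windows versions
WINEPREFIX=~/.wine-breathedge winecfg -v win7
WINEPREFIX=~/.wine-breathedge winecfg -v win10
```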

Which instruction set that the 4960X lacks does DXVK REALLY like for UE4 graphics calls? This could forewarn people wanting to run UE4 over DXVK that it’s only optimized for newer CPUs.
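
For reference, checking whether a CPU advertises AVX2 at all is a one-liner (empty output means no AVX2):

```bash
grep -o avx2 /proc/cpuinfo | sort -u
```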

And I guarantee you UE4 will not make things easy and go straight up Vulkan. They say they want to do so, but they’ll never do it.

Deep Rock Galactic sees around a 36% performance drop going from Windows to Proton.
This isn’t a big deal at 1080p and 1440p, but at 4K/2160p it hurts a little too much. You will need to run at 1800p to keep close to or above 60fps, and it can still dip in certain situations.

For me, 1800p crashes the game with the “Rendering Thread Exception” after playing for a few hours… so not ideal, but Ryzen 3000/Zen 2 suffers less from the UE4-to-DXVK bottleneck. Older processors seem to be taking a much bigger performance hit, which in my observations comes down to the lack of AVX2.

You are aware you need clearcpuid=514 in the boot options to avoid Ryzen 3000 crashing with Proton, right? There is a fix coming in kernel 5.4, but that’s quite some time off.
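
On Ubuntu that means editing /etc/default/grub and regenerating the config, something like:

```bash
# /etc/default/grub -- append clearcpuid=514 to the kernel command line
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash clearcpuid=514"
```

followed by `sudo update-grub` and a reboot.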

I already did that fix in GRUB, and it’s still crashing. The UMIP crash happens instantly; this one happens over the course of a few hours.

Maybe log your CPU and GPU temps.

I do remember reports of instability with Steam after a couple hours of use or something; can’t remember where I read it. Supposedly fixed, was it in that latest Proton update (-5)?

CPU and GPU temps were fine. I’m using Proton-GE, which is slightly behind vanilla Proton but has the Media Foundation fixes in protonfixes.

Use the latest Proton if you have crashes and see if that fixes it. Not everything needs Media Foundation.

The game I was playing in the original post, Breathedge, needs it; without it the game won’t progress because you cannot skip a video.

DXVK 1.4 makes no difference. The 4960X is simply missing the AVX2 instruction set it would need to regain performance in UE4 games. I’m certain it’s because non-AVX2 CPUs don’t work well with DXVK and UE4.

So if you’re planning to run UE4 games over DXVK, make sure your CPU supports AVX2 or you will suffer up to a 50%+ performance loss.

Just a few more stats if someone doesn’t take my word for it:

The game I mentioned in this thread reaches 7000+ draw calls and 138+ render passes per frame in its most intense area, hitting a low of 65FPS at 4K on the 3600X. The 4960X would only hit 27-29FPS in the same spot.
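
(Those per-frame numbers come from DXVK’s built-in HUD; launched through Steam, something like this overlays FPS, draw calls/render passes, and queue submissions:)

```bash
DXVK_HUD=fps,drawcalls,submissions %command%
```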

For Cinebench R15 performance:

The 3600X scores 1600 multi-core and 196 single-thread; the 4960X scores 1204 multi-core and 161 single-thread. That should not equate to a 50%+ difference, especially since the game engine is properly multi-threaded, but DXVK somehow favors the AVX2 instructions on the 3600X.

For the record, DXVK itself does not use any instruction sets higher than SSE2. Only some glibc internals (like memcmp, memcpy) might use them.
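
One way to rule those glibc paths in or out would be to disable them on the AVX2 machine and compare. glibc exposes a tunable for this; the exact tunable name has changed between glibc versions, so treat this as a sketch:

```bash
# Ask glibc to skip its AVX2-optimized memcpy/memcmp variants (glibc 2.29+ naming)
GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX2_Usable %command%
```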

UE4 in particular is usually bottlenecked by its own render thread which interacts with the D3D11 runtime, not DXVK’s worker thread, and the runtime layer really doesn’t do much besides submitting work to that worker thread. It’s weird to see such a huge difference between the two CPUs.

There might be some other issue at play, such as the i7 not ramping up its clocks due to governor issues or something.

Both were most certainly on the performance CPU governor, Philip. That was most certainly not the issue.
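
(That’s easy to confirm from sysfs; a single line of output means every core agrees on the governor:)

```bash
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor | sort -u
```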

The Cinebench R15 score differences in Wine between the two CPUs are 400 points multi-thread and 40 points single-thread. There shouldn’t be this large of a performance deficit. I’m just as stumped as you.

UE4 games I’ve noticed go up to 15+ queue submissions per frame, while Unity games use at most 6 (with the average being 5).

I just switched from a GTX 1080 to a GTX 1080 Ti and found that the UE4 render thread in this particular game, Breathedge, indeed isn’t spreading its CPU load across multiple threads. The game’s render thread is heavily single-threaded, even on Windows.

Windows sees 52FPS on the 4960X, whereas the same intense spot with Wine and DXVK only nets 28FPS.

Breathedge is a good game for seeing UE4’s worst-case scenario for render-thread optimization.

This is the game: (to run on Wine, you need to install the mf-install script)
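
(mf-install is the community Media Foundation workaround script; if I recall its usage correctly, you point it at the game’s Wine prefix, with the prefix path being whatever yours actually is:)

```bash
# Example invocation against the game's prefix (path is hypothetical)
WINEPREFIX=~/.wine-breathedge ./mf-install.sh
```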

If other games program the render thread better, DXVK optimizations made for this game may carry over to other UE4 games.

Another demo with issues is Backbone: Prologue, whose “Vogue” scene pegs a single thread because of heavy volumetric effects:

Thinking outside the box, but does the Linux system have the Spectre/Meltdown patches? Does the VM have those patches as well?

The 4960X was on the Ubuntu HWE 5.0 kernel with mitigations=off applied. The 3600X uses the hardware Spectre mitigations, so there is no need to turn them off; turning mitigations off on Zen 2 can actually hamper performance.
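
(For anyone wanting to verify the mitigation state on their own box, it’s all in sysfs:)

```bash
grep . /sys/devices/system/cpu/vulnerabilities/*
```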

This was most definitely an unoptimized render thread, made worse by Wine and DXVK overhead stacked on top.

Well, I found the culprit:

By default, the UE4 rendering thread in a default Blueprint project is SINGLE THREADED.

The only way to multi-thread the rendering thread is to use a THIRD-PARTY Blueprint plugin that COSTS MONEY:

This means that on Windows, to get good 4K performance, you need an 8086K at 5.1GHz. On DXVK, that would require a chiller and 5.3GHz on the same chip.

UE4 on the current consoles only targets 30fps because the single rendering thread is overwhelmed at 4K, and the Jaguar cores don’t have the single-threaded clock speed to hit 60fps while keeping the image pretty with volumetric effects.

I wonder how the PS5 and Scarlett chips will handle single-threaded loads… because this problem won’t go away anytime soon if people keep using UE4. Godot Engine should take note: multi-threading the rendering thread should be expected, especially with the next consoles having 8 cores / 16 threads.

Epic has dropped the ball on this.


That plugin is there to let someone using Blueprints write multithreaded code/nodes. Otherwise they would have to use C++, either to do it directly or to expose it to BP.

That plugin does not directly affect rendering at all.

HTH

Edit: I did find this on threading and rendering