Why are "Pro" cards faster in some workloads? (A closer look at the WX 4100)

Remisc · February 3, 2021, 2:36pm

Why are professional cards faster in certain workloads?

The quick answer would be “double precision fp performance!”.
But what I am specifically curious about are workloads where, even with lower FP64 performance, the “pro” cards are faster.

As an example, I looked at the Radeon Pro WX 4100:

vs RX 570:

vs GTX 1080:
https://manatails.net/blog/2017/03/radeon-pro-wx-4100-review/

Specifically wire-frame performance is a sticking point.

WX4100 has FP64 performance of ~150 GFLOPS
RX 570 has ~300
GTX 1080 has ~250

So the FP64 argument goes out the window…
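
For reference, those FP64 numbers roughly follow from shader count × clock × 2 (one FMA counts as two FLOPs), scaled by the FP64:FP32 ratio. A quick sketch of that arithmetic; the shader counts, boost clocks and FP64 ratios are my assumptions taken from public spec sheets, not measurements:

```python
# Back-of-the-envelope peak FP64 throughput.
# All spec values below are ASSUMED from public spec sheets, not measured.
CARDS = {
    # name: (shader_count, boost_clock_mhz, fp64_to_fp32_ratio)
    "WX 4100": (1024, 1201, 1 / 16),
    "RX 570": (2048, 1244, 1 / 16),
    "GTX 1080": (2560, 1733, 1 / 32),
}


def peak_fp64_gflops(shaders, clock_mhz, fp64_ratio):
    """Peak FP64 GFLOPS: 2 FLOPs per shader per clock (FMA), scaled by the FP64 ratio."""
    fp32_gflops = shaders * 2 * clock_mhz / 1000.0
    return fp32_gflops * fp64_ratio


for name, spec in CARDS.items():
    print(f"{name}: ~{peak_fp64_gflops(*spec):.0f} GFLOPS FP64")
```

With these assumed specs the results land in the same ballpark as the figures above, and the ordering is the same: the WX 4100 has the lowest theoretical FP64 peak of the three.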

Some would say “verified drivers”. But from what I can tell, this essentially means no corner-cutting when implementing a particular API (OpenCL, for example): all the corner cases have to be handled correctly instead of doing something ‘kinda correct’ that is good enough for gaming. And there are rigorous validation tests for all those cases.

So that argument would imply the opposite: less performance, but more correctness.

Then I thought about the silicon itself, but the WX 4100, for example, is a Polaris 11 chip, which is also used in RX 400 series consumer GPUs.
It’s possible that some silicon is just locked. But if that’s the case, then what specifically is locked? It’s not like some parts of the API are missing.

If the wire-frame performance is so good then which API calls are faster? Under what circumstances?

The only detailed info I could find about actual AMD performance tuning is here: Developer Guides, Manuals & ISA Documents - AMD
but that’s mostly focused on CPUs, and it didn’t really help with my original question.

TL;DR
Do any of you have any idea why something like wire-frames would be so much faster on “pro” cards?
And if I wanted to develop my own programs targeting “pro” cards, which API calls / instructions specifically would be faster?

Kinda relevant L1 video: Putting the Radeon Pro WX7100 to Work: Testing (Part 1) - YouTube

miak · February 3, 2021, 5:26pm

I don’t think that the only difference between the drivers is validation; they definitely have different optimizations. For Vulkan on the same card there’s a decent amount of variation: RADV vs. AMDGPU-PRO vs. AMDVLK Vulkan Linux Driver Performance - Phoronix

Remisc · February 3, 2021, 7:06pm

Yeah, for sure the validation isn’t the only difference.
amdgpu-pro was supposed to share most of its codebase with the open-source amdgpu, but they keep some secret-sauce parts closed.

But then, if the difference is in the driver fast-paths, what are those fast-paths? How would I go about checking whether they apply to a given workload?
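
The crudest check I can think of is to run the exact same workload under each driver and compare timings. A minimal sketch, assuming a Vulkan workload where the driver can be selected through the loader's `VK_ICD_FILENAMES` environment variable; the manifest paths and the benchmark binary in the usage comment are hypothetical placeholders:

```python
import os
import statistics
import subprocess
import time


def time_command(cmd, extra_env=None, runs=5):
    """Run cmd `runs` times and return the median wall-clock time in seconds."""
    env = dict(os.environ)
    env.update(extra_env or {})
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, env=env, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare_drivers(cmd, driver_envs, runs=5):
    """Time the same command under each named environment: {name: median_seconds}."""
    return {name: time_command(cmd, env, runs) for name, env in driver_envs.items()}


# Example usage (paths and binary are HYPOTHETICAL, adjust to your install):
# drivers = {
#     "radv": {"VK_ICD_FILENAMES": "/usr/share/vulkan/icd.d/radeon_icd.x86_64.json"},
#     "amdgpu-pro": {"VK_ICD_FILENAMES": "/opt/amdgpu-pro/etc/vulkan/icd.d/amd_icd64.json"},
# }
# print(compare_drivers(["./my_wireframe_benchmark"], drivers))
```

This only says whether a workload is faster under one driver, not which calls are responsible, and whole-process timing also includes startup and shader compilation. Narrowing it down further would mean shrinking the benchmark to one rendering feature at a time.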

I would assume AMD would want developers to take advantage of the superior performance of the AMDGPU-PRO drivers, but I can’t find anything dedicated to such developers.

The amdgpu-pro driver package has closed OpenCL, OpenGL and Vulkan drivers, so I assume some of those API calls would be faster.

There is also a proprietary LLVM version. That could mean either some custom optimization passes or maybe some machine instructions that are not publicly documented.

Edit: I would expect something like this guide but dedicated to the “pro” cards:

PaintChips · February 3, 2021, 7:49pm

Generally, the undocumented features of pro GPUs, whether AMD’s (previously ATI’s FireGL) or Nvidia’s Quadro/NVS, tend to be about error correction on the render side (CAD, CGI, etc., which need precise wire-frames). When I had a ThinkPad with a Quadro (based on the 8400), there were cases where it rendered things much more smoothly and accurately. On GeForce, on the other hand, you could see small details being slightly less perfect; even in a small Blender project you could tell the difference between a consumer and a Pro card. OpenGL performance and output quality have always been better on Pro cards.

For day-to-day stuff, I’ve been tempted to get another Pro card for precision-leaning work, but I don’t do enough projects that require a high-end model.

Remisc · February 3, 2021, 8:05pm

Thank you for the insight.

So from what I’ve gathered so far, it seems like the magic is in software, and a difference would only be visible when using the pro drivers (+ maybe the ProRender SDK).

I.e., Pro cards with the open-source stack seem like kind of a waste.
There are of course hardware considerations like built-in ECC, rack-friendly design, etc., but that is outside the scope of this topic.

But have any of you seen benchmarks that would indicate improved CAD performance on “pro” cards with open drivers?

PaintChips · February 4, 2021, 5:42am

Many years ago, before Nvidia locked their drivers down, it was possible to use GeForce drivers on Quadros to squeeze out higher frame rates. Pro card hardware changed rapidly in that period, to the point where AMD/Nvidia would increase the number of encoder/decoder units compared to consumer cards. The greatest difference comes when developers take advantage of the hardware and the software SDK: they can squeeze more performance out of a Pro card, just as game developers are offered deals for SDK access to show off eye candy on consumer GPUs. If there were more commercial software on Linux, the environment could be much different.

There are plenty of benchmarks out there, but the various Linux-leaning sites mainly test mainstream GPUs, plus a limited number of commonly used Pro GPUs that OEMs ship in their workstation models. The main downside is that the “open drivers” lag far behind the closed-source driver when a major GPU has just launched. At the moment Radeon Pro is still a mix of old and recent GPUs, so the open drivers have more than likely gotten closer to the closed ones. On GeForce RTX and Quadro RTX, the open drivers are a mess.

Remisc · February 4, 2021, 11:19am

One more thought: amdgpu-pro is supposedly a usermode driver running on top of the open amdgpu kernel driver.

If there is no kernel-mode voodoo or “authentication”, then it should be possible to take something like an RX 5700 and lie to the pro driver that it is a W5700 (same Navi 10 XL chip).
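
As a starting point, here is a small sketch that prints the PCI identity the kernel exposes for each AMD GPU through sysfs, which is presumably what would have to be spoofed. Whether the pro userspace driver actually keys anything off these IDs is an assumption on my part, and the helper names are made up for illustration:

```python
from pathlib import Path

AMD_VENDOR_ID = 0x1002  # PCI vendor ID for AMD/ATI


def read_pci_identity(device_dir):
    """Return the PCI IDs that sysfs exposes for one device, as ints."""
    device_dir = Path(device_dir)
    fields = ("vendor", "device", "subsystem_vendor", "subsystem_device", "revision")
    ids = {}
    for field in fields:
        path = device_dir / field
        if path.exists():
            # sysfs stores these as hex strings such as "0x1002\n"
            ids[field] = int(path.read_text().strip(), 16)
    return ids


def list_amd_gpus(drm_root="/sys/class/drm"):
    """Return [(card_name, ids)] for every DRM card whose PCI vendor is AMD."""
    root = Path(drm_root)
    found = []
    if not root.is_dir():
        return found
    for card in sorted(root.glob("card[0-9]*")):
        if "-" in card.name:
            continue  # skip connector entries like card0-DP-1
        ids = read_pci_identity(card / "device")
        if ids.get("vendor") == AMD_VENDOR_ID:
            found.append((card.name, ids))
    return found


for name, ids in list_amd_gpus():
    print(name, {key: hex(value) for key, value in ids.items()})
```

Comparing this output between a consumer card and its pro sibling would at least show which IDs differ.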
