Why is the ASUS Pro WS X570-ACE not working with 3 GPUs? Code 43

@wendell

@Seff has also tried running the three GPUs without any risers.

ASUS are being dicks with the Pro WS X570-ACE’s BIOS: you cannot change the PCIe generation in the AMD PBS menu like on various other motherboards. :frowning:

Any idea how to force the motherboard to use Gen3-only and disable the “Auto” detection I assume is on by default?

I only know how to do this in Linux; the script to do it is in the thread where I fixed Linus’ NVMe array.
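
The script referred to above is not reproduced in this thread. As a read-only companion, here is a minimal sketch that reports the negotiated PCIe link speed and width for each device on Linux. It assumes the standard sysfs attributes `current_link_speed`, `max_link_speed`, `current_link_width` and `max_link_width`; the function names are made up for illustration. It only reads values and does not force any generation.

```python
from pathlib import Path

LINK_ATTRS = ("current_link_speed", "max_link_speed",
              "current_link_width", "max_link_width")


def read_pcie_links(sysfs_root="/sys/bus/pci/devices"):
    """Return {pci_address: {attribute: value}} for devices that expose link info."""
    links = {}
    root = Path(sysfs_root)
    if not root.is_dir():          # not Linux, or sysfs not mounted
        return links
    for dev in sorted(root.iterdir()):
        info = {}
        for name in LINK_ATTRS:
            try:
                info[name] = (dev / name).read_text().strip()
            except OSError:        # attribute missing or unreadable: skip device
                break
        else:
            links[dev.name] = info
    return links


def below_maximum(info):
    """True if the negotiated speed or width differs from the advertised maximum."""
    return (info["current_link_speed"] != info["max_link_speed"]
            or info["current_link_width"] != info["max_link_width"])


if __name__ == "__main__":
    for addr, info in read_pcie_links().items():
        note = "  <-- below maximum" if below_maximum(info) else ""
        print(f"{addr}: {info['current_link_speed']} x{info['current_link_width']} "
              f"(max {info['max_link_speed']} x{info['max_link_width']}){note}")
```

A link below its maximum is not necessarily a fault, since GPUs commonly drop the link speed at idle to save power. It is more telling if the link stays low under load.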

Yup, definitely did all the above with risers on/off, and got the same results, sigh.

Last night I switched the CPU on the mobo (from Ryzen 7 3700X to Ryzen 3 3200G - troubleshooting to see what works best)

With the Ryzen 3200G, the motherboard only supports 2 GPUs due to the limited lanes, but even with the 2 GPUs, one or both show up with the code 43 error in Device Manager…However, I’ve gotten both to work by doing the following:

  • 1 GPU was giving code 43 error in Device Manager, so I uninstalled it (no PC reset) then it showed up as working in device manager.

  • Card was also detected in GPUz (No CUDA though until I uninstalled the card)

Then it worked.

At this point, my troubleshooting feels “logicless” as though everything is just working “randomly”…like I can’t find a definitive answer for why it works so randomly…

The randomness feels like PCIe errors.
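
One way to look for such errors, if the machine can be booted into Linux (for example from a live USB), is to read the per-device AER statistics from sysfs. This is a sketch under assumptions: it relies on the kernel exposing an `aer_dev_correctable` file with one `Name count` pair per line, which is only present when AER is supported and enabled. The function names are hypothetical.

```python
from pathlib import Path


def parse_aer_counters(text):
    """Parse 'Name count' lines, e.g. 'BadTLP 3', into a {name: count} dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def devices_with_correctable_errors(sysfs_root="/sys/bus/pci/devices"):
    """Return {pci_address: nonzero counters} for devices exposing aer_dev_correctable."""
    result = {}
    root = Path(sysfs_root)
    if not root.is_dir():          # not Linux, or sysfs not mounted
        return result
    for dev in sorted(root.iterdir()):
        try:
            text = (dev / "aer_dev_correctable").read_text()
        except OSError:            # device or kernel without AER statistics
            continue
        nonzero = {k: v for k, v in parse_aer_counters(text).items() if v}
        if nonzero:
            result[dev.name] = nonzero
    return result


if __name__ == "__main__":
    for addr, counters in devices_with_correctable_errors().items():
        print(addr, counters)
```

Counters that keep climbing on the GPU or on the port above it would fit a signal-integrity problem. An empty result proves little, because the statistics may simply not be available.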

Put that on the list of things that don’t make sense…for a Pro board I was expecting more from the BIOS

Last thing I can think of at the moment: Did you check if the GPUs have gotten VGA BIOS/firmware updates since the end of March?

Hi there,
I own two of these boards, each running 3 GPUs (2x 1650 Super + 1x GT 730/1030) as Parsec remote desktop hosts. Each build runs Proxmox with all GPUs passed through to individual Windows VMs. They all work flawlessly.

I can’t comment on what’s causing your issue, but I can suggest my working config. If I can recall the setting name correctly, please try changing the KVM Mode of the Realtek NIC’s “IPMI” to Remote Multi-Display. Apparently this prevents the “IPMI” function from ‘grabbing’ and using the GPUs (one of them) on boot.

No BIOS updates since release, on Jan 12 2021.

@wendell did mention something that took me back to something I saw in the manual. The cards are PCIe 4.0 cards, but because the board is only compatible with 3.0 as per the manual, maybe that is exactly what’s causing the problem…even though my thoughts were that the cards are backwards compatible…so it shouldn’t be an issue.

Okay, this is helpful, I’ll take a stab at it.

Are your cards PCIe 4.0 / 3.0?

They’re all PCIe 3.0 unfortunately, but I guess it’s worth a shot :grinning:

Hey guys,

Just an update to let you know:

  • After many, many different attempts, I’m now getting code 43 in Device Manager with just 1 GPU plugged into any of PCIe_1/2/3.

  • I switched CPUs and I’m now using a Ryzen 3 3200G, and two GPUs are working fine in PCIe_1 and PCIe_3 directly plugged in.

I have the Ryzen 7 3700X in another motherboard (B450M Aorus) with the 3rd GPU and that is working flawlessly as well.

I am at a loss, but I will try my best to work with ASUS Tech Support to come to a solution (I prefer an advance RMA, but I’ll see how they decide to handle it once I’ve exhausted all of their troubleshooting requests).

Thanks a lot for your assistance guys, I really appreciated your effort in trying to assist me! @aBav.Normie-Pleb @wendell @WarwicK7

@Seff

As mentioned I tried to see if I could recreate your issues. While I don’t have more than one RTX 30 GPU I tried using two Radeon VIIs and ran into something that might result in the issues you’ve been experiencing with NVIDIA GPUs. But I am not certain about this and your difficulties might have a different cause.

(Unfortunately I am cursed regarding technology and am used to it never working as intended at first)


Short answer: The key might have been you taking the CPU out of the socket. It’s not that the 3700X is defective or that the 3200G works better - it’s that there is some setting in the UEFI that is non-volatile (CMOS resets without any effect) that can bug up.


These are the patterns I’ve discovered with a different motherboard model (ASRock X570 Taichi Razer Edition, two completely separate systems where the only shared parts were the two GPUs):


Stage 1 - 1 GPU - Everything fine:

No matter in which slot (PCIe x16 #1 (x16, from CPU), PCIe x16 #2 (x8, from CPU) or PCIe x16 #3 (x4, from X570 chipset)), each of the two GPUs works splendidly.

Stage 2 - 2 GPUs - Let the games begin:

With the second GPU installed, everything works fine at first (using CPU PCIe lanes only, so it’s running at x8/x8); Windows’ Device Manager and the driver both show both GPUs.

Then, some time later, after a power cycle, the second GPU no longer shows up, no matter which one it is, and reseating the AIC has no effect.


Stage 3 - UEFI malfunctions:

!!! At this stage USB BIOS flashback no longer works!!!
(*Note: ErP/Deep Sleep is NOT enabled; otherwise this would be expected, since the motherboard doesn’t get the required 5 V standby power line if this setting is enabled)

The ASRock X570 Taichi Razer Edition has, like many ASRock motherboards, a USB flashback option: you copy the BIOS file, renamed to a specific name, onto a USB stick formatted with the FAT file system and plug it into a certain USB port at the back IO shield of the motherboard. With only a PSU attached and no other components needed, pressing a certain button makes the motherboard reflash the BIOS; an LED flashes until the process is completed.

(The ASUS Pro WS X570-ACE doesn’t have this feature, unfortunately)


Stage 4: No more display output at all :frowning:

Even with only one GPU installed in the system, you can no longer get an image output from it but the computer is booting the OS just fine.


Stage 4.5: Switch to an NVIDIA GPU

If you change the AMD GPU to an NVIDIA GPU now, you again get a display out signal and can enter the UEFI. Reflashing the same UEFI version brings back USB flashback functionality.


Stage 5: Error 00 debug code

If you don’t change to an NVIDIA GPU, some power cycles later even this is no longer possible: the motherboard hangs at POST with the error code 00.

At this point switching the GPU manufacturer doesn’t help anymore. The motherboard appears to be bricked.

CMOS resets via jumper or removing the BIOS battery don’t change anything.

Only taking the CPU out of the socket, waiting a short time and then reseating it helps.

Then, the USB BIOS flashback feature also works again.


@wendell

Could you please find out what exactly this non-volatile piece of shit AMD-specific thingy is that survives CMOS resets but is seemingly not corrupting the BIOS itself?

(Ran into this with a 3950X and a 5950X, both CPUs fine)


I’m a bit pissed about the wasted time and thermal paste until I found out this pattern.

Hey @aBav.Normie-Pleb , thought I would chime in too. I’m pretty curious about this discovery as I had somewhat similar issues (persisting bugs in UEFI across CMOS resets) on one of my Asus X570 Pro WS ACE’s after having flashed and updated the UEFI revision with what I presume was unstable ram settings. My issue was that the Realtek NIC would not initialize on boot at the hardware level half the time.

It’s described in detail in my other recent thread here. Actions, in sequence, that seemed to have caused the issue :

  1. Ram (2x32GB Hynix) had XMP Enabled @ 3200 CL16 1.375V
  2. Added additional, identical 2x32GB sticks.
  3. Did not reset UEFI settings to default and proceeded to flash UEFI
  4. Flash successful, but on subsequent boots the Realtek NIC was at times not detected by hw switch / OS. The only way I managed to temporarily fix it was to drain the PSU of power first.

CMOS/UEFI resets did not help, until I reset the CMOS and then reflashed the UEFI right after. It has been stable since, for about 2 weeks now.

May I ask if the symptoms you have found apply to both the X570 Taichi and the X570 Pro WS ACE? And were both tests (3700x/3200G) on stock/stable UEFI settings ? (E.g. XMP)

?

Did you mean a CPU reseat?

Maybe the issues are related but I want to emphasize that in my case ultimately nothing except physically removing the CPU from the socket helped.

Unfortunately I’m not competent to help with memory settings: ever since I got my very first Ryzen system, I have only used ECC memory with boring DDR4-2666 and -3200 JEDEC timings.

I’m not even sure if I’m out of the woods yet: Spent days with the issues described and for now it seems to be working.

Forgot to mention:

The way to get the two GPUs to seemingly work now consists of the following steps:

  1. Remove every AIC from slots;
  2. Remove CPU from socket;
  3. Remove BIOS battery;
  4. Reseat BIOS battery;
  5. Use USB BIOS flashback to be sure to not have a corrupted BIOS;
  6. Reseat CPU;
  7. Install one GPU in the secondary x8 CPU PCIe slot, not the primary x16 CPU PCIe slot;
  8. Power on system and set up the desired BIOS settings;
  9. Power off the system, disconnect AC power from PSU, drain all charges from components by holding the Power Button for 10 s;
  10. Install second GPU in primary x16 CPU PCIe slot;

My testing was done with two units of ASRock X570 Taichi Razer Edition, a 3950X and a 5950X only - the OP of this thread uses a 3700X and a 3200G.

Sorry for the confusion, what I meant was that my issue was most likely due to a corrupt UEFI flash. In which case a CMOS reset alone did not help, but had to be paired with a reflash as well. Since you mentioned using USB Flashback, then I believe your UEFI should be safe.

In any case, as I mentioned earlier, I have been using 2x triple-GPU setups with the Asus X570 ACE Pro WS to much success, albeit with Linux and PCIe 3.0 cards. Perhaps your findings are more specific to the ASRock X570 Taichi?

Note that the ASRock X570 Taichi (sans “Razer Edition”) is a completely different motherboard that is very mature and offers very reliable operation after a dozen or so UEFI updates.

I thought that I would give the Razer Edition with a refined hardware design a shot, but now I’ve had three units from different retailers at home and every single one of them was a piece of garbage (tested UEFI versions P1.40, L1.42, P1.44 and P1.50).

I was triggered to post my previous wall of text here to advertise that some issues might only disappear after removing the CPU from its socket. @Seff noted changes when he swapped the 3700X for a 3200G (so the CPU had to be taken out of the socket). My gut feeling says that his issues might also be related to this strange “bugged up state” of the X570 platform, but with slightly different symptoms than the ones I experienced.

Another update:

ASUS has offered me an advanced RMA, so they will be sending me a board and I will send them my board 14 days after I receive the “new” one.

It will be a recertified refurbished unit coming from their manufacturing site…which I’m a little concerned about. I also hope that all of this is not being done in vain.

==First Code 43==

@aBav.Normie-Pleb just to note, this would be the second time I’m switching out the CPU. I first started with the 3200G, but realized shortly after that the 3rd card wasn’t working because the 3200G only has 20 PCIe lanes.

==Second Code 43==

Got the 3700X with 24 lanes to solve the PCIe lane issue, but that didn’t solve anything…if anything the problems actually became worse (anywhere from 1, 2 or 3 cards not working and giving code 43).

Have you tried reflashing the BIOS properly outside of the UEFI GUI?

Some UEFI GUI tools don’t actually reflash the BIOS but only check whether the checksum of the “new” one is the same as the installed one is supposed to have. They show that 0 to 100 % progress bar, but nothing really happens. This might potentially “protect” a corrupted installed BIOS.
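
Before reflashing, it can at least be ruled out that the downloaded image itself is damaged. The sketch below computes the SHA-256 of a file so it can be compared against a checksum from the vendor, assuming the vendor publishes one. The function names and the file name in the usage note are hypothetical, and this says nothing about what is actually stored on the flash chip.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with an expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Usage would look like `matches_expected("bios_image.CAP", "<checksum from the download page>")`, where both arguments are placeholders.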

(Not to sound like a dick but) You didn’t actually have a technical problem with the 3200G, it was simply the wrong component choice for the intended purpose due to its externally usable PCIe lane configuration of 8 (PCIe x16 #1) + 4 (M2_1) + 4 (Chipset) PCIe 3.0 lanes.
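
The lane arithmetic in that breakdown can be written out as a tiny sketch. The figures are the ones quoted above; the dictionary and function names are made up for illustration.

```python
# Externally usable PCIe 3.0 lanes of the 3200G, as listed in the post above.
LANES_3200G = {"PCIe x16 #1": 8, "M2_1": 4, "Chipset": 4}


def total_lanes(allocation):
    """Sum all externally usable lanes in an allocation."""
    return sum(allocation.values())


def lanes_for(allocation, consumers):
    """Sum the lanes that reach the named consumers (slots or devices)."""
    return sum(allocation.get(name, 0) for name in consumers)


if __name__ == "__main__":
    print("usable lanes:", total_lanes(LANES_3200G))
    print("lanes reaching the primary GPU slot:",
          lanes_for(LANES_3200G, ["PCIe x16 #1"]))
```

With only 8 CPU lanes reaching the primary slot, there is nothing left over to feed further CPU-attached GPU slots, which fits the two-GPU limit described earlier for this CPU.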

Your “actual” issues began after the 3700X had first taken a seat.

You’re correct, thanks for highlighting that (the simple things we typically miss in frustration :smiley: )

I flashed the BIOS twice with the EZ Flash tool and a USB drive from within the BIOS (3402, then 3501).

Could free up one of my ASUS Pro WS X570-ACE units and a Radeon Pro WX 4100 so I can build a configuration with a 3700X and three GPUs, too.

Unfortunately these are all PCIe Gen3, so if Gen4 is the source of your issues it might not help.

Could you ask ASUS’ Support to provide a Beta BIOS in which the user can actively control the PCIe generations?
(Basically like on any other current AM4 motherboard, usually in the AMD PBS UEFI sub menu).

This might take some time but is definitely worth a shot, so that you don’t waste any time during the 14-day grace period before you have to send a motherboard back.

If they are properly packaged and shipped, I don’t have an issue with “refurbished” units; here a human being likely spent some time checking the unit’s functionality more in depth than the OEM in China, Taiwan or, lately more and more, Vietnam.