ASRock Rack X470D4U2-2T

I have some good news, which should make reproducing (and validating after fixing) the issue a lot easier for Asrock Rack! It seems like Passmark have updated their MemTest86 product from version 8.2, which didn’t fully support Ryzen 3000, to version 8.3, which does fully support Ryzen 3000 (they forgot to put it in the changelog though).

This is very interesting, as MemTest86 Pro (not the Free version) supports ECC Injection:

“ECC injection: Enabled/Disabled (Pro version only) - if ECC detection/correction is supported/enabled and ECC injection is supported by the system, this option enables/disables injection of ECC errors to simulate how the system responds to real ECC errors. ECC errors are injected at the start of each individual test. If ECC injection is successful, the details of the ECC error shall be reported and displayed on screen as if an actual ECC error was detected.

Note: Although ECC injection may be supported by your hardware, it may be locked by the BIOS. Some BIOS may allow you to unlock the ECC injection feature in the BIOS setup.”

And Asrock Rack did very well in that regard, as there is an option in the BIOS called “Disable Memory Error Injection”:

[image]

After setting this BIOS setting to false and enabling “ECC Injection” in MemTest86:

[image]

I’ve run MemTest86 and it reproduces the issue perfectly:

[image]

As you can see, it successfully injects ECC errors, but doesn’t detect them, which is exactly the same as I was seeing when trying to trigger memory errors using unstable settings.
https://www.passmark.com/forum/memtest86/5984-how-do-you-verify-ecc-error-injection-working

I am also very curious whether this is only a Ryzen 3000 issue, as the motherboard was initially designed for Ryzen 1000 and Ryzen 2000 CPUs alone. Perhaps ECC does work for those older CPUs. Unfortunately I don’t have such a CPU to try this on (feel free to send me one for testing).

I’ve forwarded this info to Asrock Rack…

I reached out to support, but figured I’d ask whether anyone else has seen this. I got a new motherboard but it won’t POST.

Depending on whether I clear the CMOS or not, I’m getting a 61 or a b6 in Dr. Debug. Per support/faq.asp?id=334
I see 61 is a chipset initialization and b6 is not documented.

I’m running a Ryzen 9 3900X and I’m not sure what the BIOS version is, since it will not POST. general/productdetail.asp?Model=X470D4U2-2T#CPU indicates that the 3900X is only supported since P3.10

I’ve tried two sets of memory from the QVL list at general/productdetail.asp?Model=X470D4U2-2T#Memory : first 128GB of Samsung M391A4G43MB1-CTDQ,
then, suspecting a BIOS issue, 16 GB of Crucial CT16G4DFD8266.C16FD1. Both behave the same way.

I’m also unable to IPMI in at the part where the POST is failing perhaps because that is not initialized until later.

Is this a defective board? What are my next steps?

I don’t completely understand this sentence, but your IPMI should work and be reachable, even when your server is turned off. No need for a working POST…
If you can get into your IPMI, you can check and update your BIOS from there and get support for your Ryzen 3900x.

Here is how I got it working

Had a similar problem. First make sure your ad blockers are off, and that IPMI isn’t connected to a different VLAN or something like that.

Next, in the IPMI Settings, select the Keep NIC Link Up setting. Without that checked, I always missed the POST screens.

From there, open the remote control section, tell the unit to reboot, and fingers crossed that lets you see the POST screens and get into the BIOS like it did for me.

The link and activity lights of the IPMI port just blink slowly, about every 3/4ths of a second. At no point is a DHCP request made, and no new MAC addresses show up in the switch’s ARP cache. Is there a setting in the BIOS that is required to enable IPMI initially, or is it always on for IPv4 by default?

  1. Make sure DHCP is working on your router
  2. Perform a CMOS Reset on your motherboard
  3. Attach only the IPMI network (using a working network cable :wink: ) and power cable
  4. Turn on the system

If the IPMI then doesn’t get a DHCP-assigned IP, then I’m afraid you have a DOA board.
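Once the IPMI does pick up an address, a quick way to confirm the BMC is answering (without needing a working POST) is to try a plain TCP connection to its web interface. This is only a minimal sketch: the address used in the usage note is a made-up example, and port 443 is an assumption about where the BMC web UI listens, so adjust both to your setup.

```python
import socket


def bmc_port_open(host: str, port: int = 443, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    `port` defaults to 443 on the assumption that the BMC serves its web
    interface over HTTPS; pass another port if yours differs.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable host names.
        return False
```

Usage would be something like `bmc_port_open("192.168.1.50")`, with the address taken from your router’s DHCP lease table. A `False` result only says that nothing accepted the connection; it does not distinguish a dead BMC from a wrong address or a blocked port.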

Bad news I’m afraid… I’ve received a response from Asrock Rack, with “official statement” from AMD on this, regarding ECC on this mobo (and AM4 in general):

Dear Mastakilla,

So many thanks for you detail experience.
We will share this information to RD 

However we got AMD official respond today

  1. AM4 support ECC function
  2. AM4 does not support ECC error reporting function

Here is the conclusion:
AM4 platform CPU (Ryzen 1000,2000,3000 series) can all support ECC correction, but not ECC report function

Best regards,
Kevin Hsiueh
Asrock Rack Incorporation

To which I responded:

Hi Kevin,

Thanks for getting back to me!

That is very unfortunate news…

Does this mean that the sensors for “DRAM ECC Error A1/A2/B1/B2” in the IPMI Event Log are unused and always will remain empty, even if memory errors do occur?
Do you know why these sensors then exist on this board? Were they simply copied over from an existing Intel /TR4 / Epyc Board, without testing them? Or were they added explicitly, but weren’t you aware of this missing feature (and also didn’t test it)?

Kind regards,

Mastakilla

And their response:

Dear Mastakilla,

According to AMD, X470 is desktop MB, and our QT won’t test ECC report function on desktop MB.
We follow AMD POR to writes specification.
In order to prevent misunderstanding, we will also remove ”DRAM ECC Error A1/A2/B1/B2” in the IPMI Event Log”.
Thanks for doing so many test and kind remind, and we will pay more attention on similar case in the future.

Best regards,
Kevin Hsiueh
Asrock Rack Incorporation

So no ECC reporting is supported…

Not entirely sure of this, but doesn’t this mean that:

  • there is no way to know for sure ECC is actually doing something or to validate that it actually works (even for Asrock Rack or AMD themselves).
  • there is no way to know if your memory is stable or not (ECC might be correcting errors all the time without you knowing about it). This is especially relevant if you want to overclock it.

I’m also not entirely sure all of this is true. Wendell told me he knew about people who reported logged error corrections on Ryzen. Perhaps AMD / Asrock Rack told me this to stop asking annoying questions about it? I certainly hope so :wink: (please prove me wrong)
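For anyone on Linux (Proxmox, for example) who wants to check whether corrected errors reach the OS at all, the kernel’s EDAC subsystem exposes per-memory-controller counters in sysfs. Below is a minimal sketch that reads them, assuming the EDAC driver for the platform is loaded. Whether this board ever populates these counters is exactly the open question here, so an all-zero result proves nothing either way.

```python
from pathlib import Path


def read_edac_counts(edac_root: str = "/sys/devices/system/edac/mc") -> dict:
    """Map each memory controller (mc0, mc1, ...) to (corrected, uncorrected) error counts.

    Returns an empty dict when the directory does not exist, e.g. when no
    EDAC driver is loaded or the system is not running Linux.
    """
    counts = {}
    root = Path(edac_root)
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers whose counters are missing or unreadable
        counts[mc.name] = (ce, ue)
    return counts


if __name__ == "__main__":
    for name, (ce, ue) in read_edac_counts().items():
        print(f"{name}: corrected={ce} uncorrected={ue}")
```

A non-zero corrected count after a MemTest86 injection run, or while running deliberately unstable memory settings, would be the kind of evidence that contradicts the “no ECC reporting” statement.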

AM4 most definitely supports ECC memory. It is a requirement for Ryzen Pro support, which both the 1Gb and 10Gb versions of this board offer.

I’m smelling BS. Hopefully ASRock is not trying to get away with feature removal.

Normally with the dedicated IPMI port you can get rid of this problem.

Following up on my previous post: the problem was a short between the motherboard and the case. I removed the board from the case, unplugged everything but the power button from the header, and was able to get into IPMI. (The IPMI activity light would flash for 1/10th of a second when power was first applied and then never again. For anyone reading this later: you should see a blinking IPMI light between the PCIe slots.)

There was still a problem with the CPU, but that was probably just the thermal paste from taking the fan on and off to try different RAM combinations. After cleaning the CPU and heatsink with rubbing alcohol and reapplying thermal paste, I can POST!

I’m now installing Proxmox from an NFS-mounted ISO and the motherboard sees all 128GB of memory. Direct support from ASRock has, well, rocked.

Using a Ryzen 5 2600 I was seeing corrected ECC errors in Memtest86 for one of my two sticks of RAM (which I then RMA’d). This was with X470D4U2, BMC 1.60, BIOS P1.50. I never tried injecting errors, though.

Did you also see them in the IPMI Event Log? (if you didn’t check yet, could you perhaps have a look?)

I’ve been looking at this board as an upgrade for my home server, and I’m wondering if anyone can tell me how PCIe bifurcation behaves with both slots filled.
The board spec says x4/4/4/4 with an automatic switch to x8 when slot 4 is filled. Is it able to do x4/4+x4/4 or x4/4+x8 when the second slot is filled?

The corrected ECC errors are not in the IPMI Event Log. The only things in my log are “Timestamp Clock Synch - Asserted”. A few corrected ECC errors were listed in my ESXi logs, though. They looked like:

2019-09-27T15:06:53.331Z cpu0:2097725)MCA: 136: CE Poll G0 Bf Sd42040000000011b A40000011966b480 Mc000000000000000 P242cd6880/10 Memory error, read
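To make a line like that a bit easier to read, here is a rough sketch that decodes the architectural flag bits of the machine-check status word. It assumes that the token starting with `S` is the raw 64-bit MCi_STATUS register value; the log itself doesn’t say so, so treat that as a guess. Only the upper flag bits that are common to x86 MCA are decoded, not the vendor-specific error code.

```python
import re

# Architectural x86 MCi_STATUS flag bits (bit positions within the 64-bit register).
STATUS_BITS = {
    "VAL": 63,    # register contains a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was NOT corrected by hardware
    "EN": 60,     # error reporting was enabled
    "MISCV": 59,  # MISC register holds valid data
    "ADDRV": 58,  # ADDR register holds a valid address
    "PCC": 57,    # processor context corrupt
}


def decode_status(line: str) -> dict:
    """Find 'S' followed by 16 hex digits in a log line and decode its flag bits.

    Assumption: that token is the raw MCi_STATUS value.
    """
    m = re.search(r"\bS([0-9a-fA-F]{16})\b", line)
    if m is None:
        raise ValueError("no S<16 hex digits> field found")
    status = int(m.group(1), 16)
    flags = {name: bool((status >> bit) & 1) for name, bit in STATUS_BITS.items()}
    flags["raw"] = status
    return flags
```

For the line above, this gives VAL, OVER, EN and ADDRV set, with UC clear. A clear UC bit is consistent with the “CE” (corrected error) tag in the log, and OVER being set would suggest that more than one error had occurred by the time the bank was polled.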

That’s interesting…

At https://www.ixsystems.com/community/threads/freenas-build-with-10gbe-and-ryzen.77752/page-3 someone with broken DIMMs recently replied that he also had MCA errors (in FreeBSD). His MCA errors only concerned the CPU cache, though (although replacing his faulty memory did fix them). Your MCA errors clearly state “Memory error”, however, so it seems they were somehow reported to the OS…

Could it be that these came from the infinity fabric then? Because infinity fabric memory errors are reported even when using non-ECC memory…
If they are really coming from the memory itself though, then this seems to indicate that your memory errors somehow did get reported :astonished:

Did you perhaps make a screenshot / picture of when Memtest86 reported these “corrected memory errors” on your Ryzen 2600? Did it clearly say that it was a “corrected memory error”? (I never saw this myself, so I have no clue what it would look like)

Hello everyone!

I’ve recently bought this motherboard for a Proxmox server.
The trouble is that I have installed a Dell PERC H330 flashed to an IT-mode HBA using a guide from ServeTheHome: Google >flash H330< and you will find it (sorry, but I’m not allowed to post links)

The HBA can:

  1. Show boot message and find all 8 HDDs on startup.
  2. If I set CSM OpROM to UEFI only, it can display itself in the MB Advanced options.

But for the life of me, I cannot get the card or its drives to show up anywhere in Proxmox, which makes it useless.

How have you set up your BIOS to let the HBA card and its drives show up in the OS?

Server specs:
X470D4U2-2T (BMC fw 1.70.00, BIOS fw P3.30)
Ryzen 7 3700x
2x 16GB Kingston ECC (on the supported RAM list of the MB)
1x 1TB WD Red SSD for VMs
2x 120GB Kingston SSD ZFS mirror for Proxmox
2x 120GB Kingston SSD ZFS mirror for Freenas
Dell PERC H330 flashed to Dell HBA330 IT mode fw 16.17.00.05
8x 4TB WD Red

EDIT:

  • I have installed the HBA in the bottom PCIe slot, named “PCIE4” in the manual, but it only shows in the MB BIOS when CSM PCIE6 is set to UEFI.
  • I have also installed a 4-port Intel NIC in PCIE6 that worked fine in another PC; that doesn’t show up anywhere either.

Hope I’m in the right place.

Turns out there is a BIOS setting that needed to be changed:

Advanced -> AMD PBS -> PCIe x16/2x8/4x4 Switch

It was set to 1x16, but if changed to 2x8, both cards and all 8 HDDs showed up in Proxmox.
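For anyone who wants to verify this from the Proxmox shell after changing the setting, here is a small sketch that lists the storage and network controllers the kernel has enumerated, by reading the PCI class codes from sysfs. It assumes a Linux host with the standard `/sys/bus/pci/devices` layout; the function name is just something made up for this example.

```python
from pathlib import Path

# PCI base class codes (top byte of the 24-bit class code).
BASE_CLASSES = {0x01: "mass storage", 0x02: "network"}


def list_pci_devices(sysfs_root: str = "/sys/bus/pci/devices") -> list:
    """Return (PCI address, base-class name) for each storage or network device found."""
    found = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return found  # not Linux, or sysfs not mounted
    for dev in sorted(root.iterdir()):
        try:
            # The 'class' file holds a value such as 0x010700.
            class_code = int((dev / "class").read_text().strip(), 16)
        except (OSError, ValueError):
            continue
        base = class_code >> 16
        if base in BASE_CLASSES:
            found.append((dev.name, BASE_CLASSES[base]))
    return found


if __name__ == "__main__":
    for address, kind in list_pci_devices():
        print(address, kind)
```

If the HBA and the NIC are both missing from the output with the switch set to 1x16 and both appear with 2x8, that matches what I saw.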

I had the same issue when I installed my 3700X. I had to manually set 2x8 so all three PCIe slots would work.

I’m really interested in using this motherboard with a 3700X, but for live performance, in my collection of 1U server cases.

I’m confident there’s plenty of IPC and acceptable latency, but hope to get opinions on heat, build quality and stability.
Won’t be using GFX Card, but the ASPEED chip for meager 2D monitoring.

Love reading posts from folks here. Way above my pay grade, plus I get to learn.

Thanks for any advice.

For a 3700X in a 1U case, you might want to look at the Dynatron L3. I bought a couple directly from Dynatron via their website, and got the AM4 brackets to go with them.