ASRock Rack X470D4U2-2T

Hey nx2l - I have tried to update the BIOS to 3.2, however it doesn’t actually come up with the BIOS number in the webportal, it’s just blank there. However I think it also shipped with 3.2 anyway as it was built in October and has a X470D4U22T P3.20 sticker on the board.

I have the same CPU, similar RAM, same mobo. Just a different PSU (which it’s also pretty sensitive to, apparently). Here’s what I did to make it work:

If that doesn’t work for you, then I think you have a problem somewhere (like your PSU perhaps?)

Hey Mastakilla, thanks for the response! I was actually following your steps to get as far as I did. It’s a pity that the board is so sensitive to the PSU. I literally can’t even get into the BIOS at the moment (no signal when plugged into VGA and ‘No Signal’ through KVM). I have a PSU coming on Wednesday which I’m not too hopeful about. I’ve also not had any beeps, and the LEDs for the POST codes on the board have never come on, so I think it’s either a duff board or the PSU! Will update when I get any further!

  • Make sure you have at least BIOS 3.20
  • Inside the case: Only install Mobo, CPU (+ cooler), RAM and PSU.
  • Outside the case: Unplug everything except power and IPMI network
  • Reset CMOS
  • Boot using the IPMI

If you still get no signal, contact Asrock Rack and perhaps try a different PSU.

Hey Mastakilla! I’ve just tried as you suggested but I’m still getting no signal. I think it might be the PSU, as it does say PSU error in the sys info. I’ll wait till Wednesday and try the different PSU and will go from there!

After a lot of testing I’ve now finally mastered RAM overclocking in a way that lets me vary between stable and unstable settings. The trick is lowering the voltage to get instability, as tightening the timings or increasing the frequency too much will mostly cause it to stop booting instead of becoming unstable.

After figuring this out I’ve done a lot of testing. I’ve tested from hardly bootable to slightly unstable using MemTest86, memtester (on Fedora Rawhide with kernel 5.4.0.0.rc3 and 5.4.0.2) and prime95/aida64_bench/Ryzen_Master_test (on a fully updated Windows 10 Pro, first with amd_software_1.09.27.1033.zip and later with amd_chipset_software_1.11.22.454.zip chipset drivers).

To give you an idea of the testing I’ve done, here is an Excel I’ve created to keep track of things:

In the meantime I’ve had millions of memory errors (in total) in very varied conditions. It seems almost impossible to me that there was not a single single-bit or two-bit error among all these millions of errors.

But… Unfortunately I couldn’t find any report of a corrected or logged memory error in either the IPMI Event Log, the Linux edac-util or the Windows Event Viewer (even though all of these report ECC to be active and correctly configured - see my posts above).

Now I know that doesn’t mean that no memory error-corrections have happened, but that is only half of what ECC functionality is. Reporting / logging these memory error-corrections is at least as important as the actual correcting itself (How else can you know your RAM is dying or is unstable? That’s like having a RAID5 which doesn’t notify you that one of your disks is dead :stuck_out_tongue: ).

So it seems to me that ECC is not working on this motherboard with a Ryzen 3000 CPU (I don’t have the older Ryzen CPUs for testing).

I’ve reported this to Asrock Rack and they’ve sent me the following response:

Dear Mastakilla,

Due to X470 belongs to desktop series
It’s not like server MB has native support of ECC report.
We are checking with RD and AMD if X470 can support ECC report.
We will reply to you ASAP

Best regards,
Kevin
Asrock Rack Incorporation

I’ve replied to this with:

Hi Kevin,

Thanks a lot for looking into this! That is greatly appreciated…

I understand that the X470 is indeed a desktop chipset. Also all AM4 CPUs don’t have officially validated ECC support by AMD (although AMD confirmed that it wasn’t disabled).
So you could argue that non-validated half-working (not reporting / logging) ECC support is acceptable. And I also agree with that, for consumer brands like Asrock, Asus, MSI, etc.
However, if a brand like Asrock Rack or SuperMicro creates a X470 motherboard with “Supports 4x DDR4 ECC and non-ECC UDIMM, max. 128 GB” in the specifications and if the IPMI Event Log contains sensors for “DRAM ECC Error A1/A2/B1/B2”, then people (like myself) will assume that it is actually working and validated. In that case, I don’t think that it is acceptable for it not to work 100%, as people buying these brands, actually are expecting it to fully work. I don’t think that is a reputation or name you are looking for, as a brand called “Asrock Rack” :slight_smile:

Please let me know if there is anything else I can do to assist.

Kind regards,

Mastakilla

The response from Asrock Rack seems to admit that it currently does not fully support ECC, however, it could also just mean that Kevin is not sure about it… So I’m hoping for a decent response from their R&D.

It would be nice if someone could try some testing with a Ryzen 1000 or Ryzen 2000, to see if ECC works with those CPUs…

Awesome work. I have a 2600X in my desktop. What would be the quickest route to validation?

@Mastakilla

Just wanted to express my gratitude for your work! I would have had to do similar testing but currently I’m in “various-stuff-has-to-be-done-within-two-weeks” mode with little to no free time for further going down to X470D4U rabbit hole.

Awesome work. Just a quick reminder: From your findings, we cannot know if there is ECC functionality at all.

What your work shows is that when instability up to the point of non-correctable errors is reached, no such errors are reported. Thus, we can infer that corrected errors are not shown either. Therefore, you are surely correct in saying that ECC reporting does not work.

However, lacking reporting, we also cannot be sure that potentially correctable errors have ever been corrected. Possibly, ECC RAM does “nothing at all” ™ on this board/chipset/CPU family, so maybe the question is not if ECC is “not working 100%”, but: “does it work at all”?

In order to verify basic ECC functionality, one could try to use rowhammer to show instability - if ECC is at least basically functional, a rowhammer attack should show no effect with ECC, but indeed should cause errors on non-ECC memory.

I cannot try, because I am still waiting for my board (and I did not buy ECC RAM in the first place).

I wondered why no other company apart from Asrock Rack offers a server board for Zen2, because with ECC support even on desktop CPUs and cheap prices for 6-16 core CPUs this platform seems like the ideal candidate for SOHO servers. Probably, you have now found an explanation…

For those who want to test ECC themselves:

If you’re still in testing-phase then the best would be to do a multiboot of some OS, like

All using UEFI.

On Linux I’ve installed memtester and edac-utils packages (yum install …). On Windows Prime95, AMD Ryzen Master, Aida64.

I used mainly AMD Ryzen Master on Windows, for changing the frequency and timings. And the BIOS for changing the voltage (not sure if that’s working on Ryzen Master). See my posts from earlier on how to change the voltage in the BIOS.

Then just play with it and be prepared to do MANY CMOS resets :stuck_out_tongue:

First I increased the frequency till it stopped booting (1533MHz).
Then I tightened the main timings till it stopped booting.
Finally I lowered the voltage till I got errors in memtest86. Those settings I then used for testing on Linux and Windows as well…

In my case frequency and timing changes hardly caused memory errors (when going too far it just stopped booting). So after a while, for many changes of the frequency and timings I didn’t even bother testing with MemTest86 anymore. If it booted, I just tried pushing it further…
When lowering the voltage I changed the value in steps of 4 or 8 in the beginning (it’s quite fine-grained), but I did test each change with MemTest86 (not always a full run).

In your case this might be totally different though… So just try a bit :wink:

When you get errors in MemTest86, then you can run

  • Windows:
    Prime95 (usually that crashed / rebooted my Windows after a while).
    See some post of mine above for which event you should look for in Event Viewer.
  • Linux:
    swapoff -a (to disable the swap)
    memtester 30g (to stresstest the memory, leaving 2GB for the OS)
    edac-util -v (in another window, to check for logged / reported errors)
  • And I also regularly check the IPMI Event Log for “DRAM ECC Error A1/A2/B1/B2” (errors from MemTest86 can only show up here).
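
For the Linux part, the counters that edac-util reports can also be read straight from sysfs, which makes it easy to take a snapshot before and after a memtester run and compare. This is only a minimal sketch: it assumes the kernel EDAC driver is loaded and exposes the usual `ce_count` / `ue_count` files under `/sys/devices/system/edac/mc`, and the function names `read_edac_counts` and `new_errors` are made up for this example.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce": int|None, "ue": int|None}} from EDAC sysfs."""
    base = Path(root)
    if not base.is_dir():
        # No EDAC tree at all: driver not loaded or platform not supported.
        return {}
    counts = {}
    for mc in sorted(base.glob("mc[0-9]*")):
        entry = {}
        for key, name in (("ce", "ce_count"), ("ue", "ue_count")):
            counter = mc / name
            # A missing file means this counter is not exposed by the driver.
            entry[key] = int(counter.read_text().strip()) if counter.is_file() else None
        counts[mc.name] = entry
    return counts


def new_errors(before, after):
    """Return the per-controller increase in corrected/uncorrected counts."""
    delta = {}
    for mc, now in after.items():
        old = before.get(mc, {})
        delta[mc] = {k: (now[k] or 0) - (old.get(k) or 0) for k in ("ce", "ue")}
    return delta


if __name__ == "__main__":
    snapshot = read_edac_counts()
    if not snapshot:
        print("No EDAC memory controllers found (is the EDAC driver loaded?)")
    for mc, c in snapshot.items():
        print(f"{mc}: corrected={c['ce']} uncorrected={c['ue']}")
```

Take one snapshot before starting memtester and one after; if memtester reports failures but `new_errors` shows zero everywhere, that matches the "errors happen but nothing gets logged" behaviour described in this thread.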

These sites were pretty useful for understanding the timings:
https://www.overclock.net/forum/18051-memory/381699-ram-timings-explained.html
https://www.techpowerup.com/review/amd-ryzen-memory-tweaking-overclocking-guide/2.html

For those wondering if they should “downgrade” from the test version L3.31 to the final 3.30 that got released yesterday, I’ve received the following answer from Asrock Rack:

Dear Mastakilla,

If everything is good so far then we won’t suggest to update BIOS.

Beside the newer version contain more bug fix.

Best regards,
Kevin Hsiueh
Asrock Rack Incorporation

I did upgrade my IPMI from 1.60 to 1.70.

Same, I also stayed on 3.31 but upgraded BMC.

I’ve just received most parts for my new desktop (the Asrock Rack system I was testing my ECC on previously will become my NAS). This is a MSI MEG Unify x570 mobo + Ryzen 3900x.

I did a quick test (on Windows 10 only) with the ECC memory from my NAS on this MSI mobo as well:

  • MSI does run with the ECC memory
  • But it doesn’t support the ECC functions at all. All programs that previously reported functioning ECC memory on the Asrock Rack, say there is no Error Correction on the MSI. Aida64 is the most precise and says “ECC: Supported, Disabled”
  • It also (logically) didn’t report any memory errors after running prime95 with unstable memory settings
  • Anandtech actually reported the same in their MSI x570 Godlike review (which should be similar to the Unify and Ace)

Although it seems like this result is even worse, I actually think it is better to have it disabled than to have it enabled but not working (actually pretending to have it).
Asus (Pro WS X570-ACE for example) and Gigabyte (Aorus Pro for example) say some of their boards have full ECC support. Asus says “depending on the CPU”, but nowhere specifies which CPUs. Gigabyte says Ryzen 3000 and Ryzen 2000 Pro (which is weird, because according to AMD there is no difference in ECC capability between Pro and non-Pro). Anyway, I don’t have an Asus or Gigabyte board, so I can’t test those…

Would be nice if someone could… :)
With all the knowledge I’ve gathered so far (and shared in this thread), it was less than a day’s work.

The information relevant to us end customers here regarding the “Pro” suffix is merely that APUs (AM4 Ryzen CPUs with built-in graphics) without the “Pro” at the end don’t work in ECC mode at all.

This doesn’t affect CPU-only Ryzen SKUs (so far), but maybe AMD is pulling an Intel sometime in the future (I hope not).

I have some good news, which should make reproducing (and validating after fixing) the issue a lot easier for Asrock Rack! It seems like Passmark have updated their MemTest86 product from version 8.2, which didn’t fully support Ryzen 3000, to version 8.3, which does fully support Ryzen 3000 (they forgot to put it in the changelog though).

This is very interesting, as MemTest86 Pro (not the Free version) supports ECC Injection:

“ECC injection: Enabled/Disabled (Pro version only) - if ECC detection/correction is supported/enabled and ECC injection is supported by the system this option enables/disables injection of ECC errors to simulate how the system responds to real ECC errors. ECC errors are injected at the start of each individual test. If ECC injection is successful the details of the ECC error shall be reported and displayed on screen as if an actual ECC error was detected.

Notes: Although ECC injection may be supported by your hardware, it may be locked by the BIOS. Some BIOS may allow you to unlock the ECC injection feature in the BIOS setup.”

And Asrock Rack did do very well on that regard, as there is an option in the BIOS called “Disable Memory Error Injection”:

[image]

After setting this BIOS setting to false and enabling “ECC Injection” in MemTest86:

[image]

I’ve run MemTest86 and it reproduces the issue perfectly:

[image]

As you can see, it successfully injects ECC errors, but doesn’t detect them, which is exactly the same as I was seeing when trying to trigger memory errors using unstable settings.
https://www.passmark.com/forum/memtest86/5984-how-do-you-verify-ecc-error-injection-working

Also I am very curious if this is only a Ryzen 3000 issue, as the motherboard was initially designed for Ryzen 1000 and Ryzen 2000 CPUs alone. Perhaps ECC does work for those older CPUs. Unfortunately I don’t have such a CPU to try this on (feel free to send me one for testing).

I’ve forwarded this info to Asrock Rack…

I reached out to support, but figured I’d ask whether anyone else has seen this. I got a new motherboard but it won’t POST.

Depending on whether I’m clearing the CMOS or not, I’m getting a 61 or a b6 in Dr. Debug. Per support/faq.asp?id=334
I see 61 is a chipset initialization and b6 is not documented.

I’m running a Ryzen 9 3900X and I’m not sure what the BIOS is since it will not POST. general/productdetail.asp?Model=X470D4U2-2T#CPU indicates that 3900x is only supported since P3.10

I’ve tried two sets of memory from the QVL List at general/productdetail.asp?Model=X470D4U2-2T#Memory first 128GB of Samsung M391A4G43MB1-CTDQ
then suspecting a bios issue, 16 GB of Crucial CT16G4DFD8266.C16FD1. Both are acting the same way.

I’m also unable to IPMI in at the part where the POST is failing perhaps because that is not initialized until later.

Is this a defective board? What are my next steps?

I don’t completely understand this sentence, but your IPMI should work and be reachable, even when your server is turned off. No need for a working POST…
If you can get into your IPMI, you can check and update your BIOS from there and get support for your Ryzen 3900x.

Here is how I got it working

Had a similar problem. First make sure your ad blockers are off, and that the IPMI isn’t connected to a different VLAN or something like that.

Next in the IPMI Settings, select the Keep NIC Link Up setting. Without that checked for me I always missed the post screens.

From there open the remote control section, tell the unit to reboot, and fingers crossed that lets you see the POST screens and get into the BIOS like it did for me.

The link and activity lights of the IPMI port just blink slowly, about every 3/4 of a second. At no point is a DHCP request made, and no new MAC addresses show up in the switch’s ARP cache. Is there a setting in the BIOS that is required to enable IPMI initially, or is it always on for IPv4 by default?

  1. Make sure DHCP is working on your router
  2. Perform a CMOS Reset on your motherboard
  3. Attach only the IPMI network (using a working network cable :wink: ) and power cable
  4. Turn on the system

If the IPMI then doesn’t get a DHCP-assigned IP, then I’m afraid you have a DOA board.
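
If the IPMI does show up in your router’s DHCP lease table, a quick way to confirm that the BMC is really answering (before blaming the browser or ad blockers) is a plain TCP connect test against its web interface. This is just a rough sketch: it assumes the web interface listens on the standard ports 80 and/or 443, and the helper names `tcp_port_open` and `probe_bmc` are made up for this example.

```python
import socket


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


def probe_bmc(host, ports=(80, 443), timeout=2.0):
    """Return {port: reachable} for the assumed web-interface ports."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}
```

Call it with whatever address your router handed out, for example `probe_bmc("192.168.1.50")` (that address is only a placeholder). If every port comes back False even though the lease exists, check the cable and VLAN setup before assuming the board is dead.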