Ryzen 5700X ECC reporting


Thank you @KeithMyers and @aBav.Normie-Pleb for your comments!

Also, thank you @aBav.Normie-Pleb for your time and commitment, and thank you everyone who has commented so far! I really appreciate your insights and help.

@aBav.Normie-Pleb
I was thinking of ordering a 5750G PRO as well, because after all, it is hard to point fingers at ASRock directly, even if there is only a small chance that this may be due to the 5700X. However, I am still more or less convinced that the root of this problem should be the MB, especially after seeing the reports on this thread of just about every 5000 series CPU working properly.

I just want to comment that I have two ASRock Rack server boards, EPYCD8 and ROMED8-2T, and both support and report ECC errors.

I assume just these X470/X570 based boards are the ones they dropped the ball on.


Keep in mind OP is looking at the consumer boards, not the Rack division boards. I imagine there’s at least some technical crossover internally, but they may very well operate independently.

I think @aBav.Normie-Pleb has been testing with ASUS, and all the other ASRock success comments in this thread appear to be Rack division boards.


Thank you @KeithMyers for the additional information.

Yes, as @Quension wrote, I am currently looking at the consumer boards. To be fair, I originally considered the ASRock Rack X470/X570 boards, but after reading their reviews I found that they had tons of bugs and undesirable features: buggy and seldom-updated BIOS versions, slow I/O because some of the ports are substantially slower, bad placement of the RAM slots with too little space between them and the CPU cooler, bad RAM speeds, etc. Therefore, in the end I went with a regular X570 board, as lots of users wrote that they regret buying the ASRock Rack boards and that, knowing what they know now, they would rather have bought a normal X570 consumer board.

Out of the X570 consumer boards, the Taichi seemed to fulfill my requirements the best: 8 SATA ports, 3 NVMe slots, advertised (cough) ECC (cough) support and so on. It was only because of the last item (“guaranteed” and marketed support for ECC RAM) that I decided on this particular board. I thought that if they are brave enough to market the board as “rock-solid” and “server-grade”, with an additional ECC support section on the product’s webpage, then they should either really support said functions or be ready to be questioned when said functions do not work. However, judging by my experience with ASRock support so far, it seems I was wrong. :slight_smile:

You are correct about the ASRock Rack X470/X570 server boards. They too have big ECC issues, along with a myriad of other issues.


I made the distinction because post 4 reported success with one of those boards. Searching the X570D4U thread, I see similar comments of ECC error reporting working. Whatever issues ASRock may or may not have, it doesn’t seem to generalize to a particular chipset — but the Rack boards have BMCs with another reporting path.

Searching more broadly, I see one report of ECC errors working on the X570 Taichi, although it would have been a few BIOS versions ago by now:


Yes, honestly, that was one of the posts that I took “inspiration from” to choose the x570 Taichi, because everything seemed to work fine. That is also why I am puzzled that the function does not seem to work now.

Sorry for the late reply, but I was out of town and therefore have not been able to make any substantial progress.

@aBav.Normie-Pleb
I apologize for spamming you, but did you perhaps manage to find out anything with the spare parts you had in the meantime?

The parts weren’t exactly “spare” so I needed to shuffle things around to free them up.

If nothing extra comes up I should be able to get to it this upcoming week.


Got a first look at my ASRock X570 Taichi with UEFI P5.01 (AGESA 1208, a version older than the ASUS ProArt X570-CREATOR WIFI’s UEFI with AGESA 120A).

  • Ryzen 7 PRO 4750G
  • 2 x 16 GB Samsung ECC UDIMM DDR4-2400@2933, 1.20 V
  • With the current UEFI version the PFEH option is gone from the UEFI :frowning:

ECC error reporting also seems to be working correctly; regular memory overclocking, without additional “heat treatment”, was enough to get errors reported.

I can try a 3700X next. I don’t mean to be a cheapskate, but I ran out of bundled “shitty” thermal compounds and only have very expensive stuff left (Thermal Grizzly Kryonaut Extreme, works great even at 270 W PPT on AM4). I hope to find some more in some old cooler’s packaging.

My current Vermeer (5900X, 5950X) systems with ASUS ProArt X570-CREATOR WIFI motherboards have that expensive stuff on them which I don’t want to just throw away to harvest the CPUs to put them into the Taichi.


Currently continuing with the 3700X

  • 2 x 16 GB Samsung ECC UDIMM DDR4-2400@3200, 1.20 V

So far no errors reported in MemTest (which I find HIGHLY suspicious). I’ll let it run overnight, and if there are still none reported after its default 4 passes, I’ll get the hot-air station.

What air temperature should be relatively safe, 150 C? (I would keep the nozzle a few centimeters away and constantly move it along the space between the two memory modules.)


Thank you very much @aBav.Normie-Pleb for your time and efforts!

It is really weird that so far every CPU worked fine on the ASUS motherboards but ASRock seems to be a lot more nitpicky. The absence of PFEH is honestly also quite strange.

I am currently thinking of either getting an ASUS MB (which will not be easy as I would like 8 SATA ports) or a 5750G PRO (which is also a bit risky as ASRock support could not manage to get it to report ECC errors).

Regarding the heat gun, I started from the very minimum temperature (which was 60 C in my case) and continuously checked the temperature in MemTest. Once the RAM hit 60 C, I slowly increased the heat gun temperature. Also, be wary of the position of the HDD/SSD if you have any connected to the MB (for MemTest you should not need any), as those are more sensitive to heat.

  • The 3700X with the heavily overclocked Samsung ECC UDIMM DDR4-2400@3200, 1.20 V didn’t report any errors even after 4 complete passes.

  • Then took the hot-air station and heated the memory up until it reached 75C (from around 30C), still nothing and the system still worked completely fine.

  • Got new respect for Multi-bit ECC on AMD systems

  • Will try the 5750G next, my gut feeling says the Tech Support is incompetent (I don’t care if they don’t have the time or funding to actually test stuff, then they should outright say it instead of gaslighting you)

  • If ASRock isn’t willing to fix this and you want to switch motherboards so that you are not constrained in your CPU choice, the ASUS Pro WS X570-ACE has 8 SATA ports from the X570 chipset: 4 via “normal” SATA ports and 4 via its U.2 port, which you can control via the UEFI (switch between PCIe for an NVMe SSD or 4 x SATA to use with a U.2/SFF-8643-to-4x-SATA breakout cable).


Thank you for the detailed answers @aBav.Normie-Pleb!

I am dumbfounded that you essentially got the same results with the 3700X as I got with the 5700X (which is no logged errors).

Also, thank you very much for having a look at the 5750G PRO too, even in advance! And thank you for the MB recommendation as well. I was honestly unaware that you could control it via UEFI… Which is a bummer, as that MB is essentially the same price as the X570 Taichi in my country… Should have gone with that, I guess. :slight_smile:

Well, I guess we will see how the 5750G PRO goes. I would not bet anything on ASRock Support doing their job correctly at this point. I know this is just pointless venting on my part, but I am still baffled by how bad that support is; it is the worst of any support team that I have communicated with so far. In all honesty, I am still not convinced they even gave it a proper go. Even though I explained to them multiple times, with links, screenshots, etc., that they should overclock the RAM and then run MemTest to see if the errors get logged, they still only reported that they did not see anything and that Windows and Linux report Multi-Bit ECC working, so everything should be working properly.
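
For what it is worth, "ECC is reported as enabled" and "ECC errors actually reach the OS" are two different things on Linux as well. As a minimal sketch, assuming the kernel's EDAC driver is loaded and exposes its usual sysfs counters (`ce_count` for corrected and `ue_count` for uncorrected errors per memory controller), something like the following could be used to see whether any errors are ever logged. The function name `read_edac_counts` is made up for illustration, and nobody in this thread has confirmed what these counters show on the Taichi:

```python
from pathlib import Path


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller_name: (corrected, uncorrected)} from EDAC sysfs.

    Assumption: the standard Linux EDAC layout, with one mcN directory per
    memory controller containing ce_count and ue_count text files.
    """
    counts = {}
    root = Path(edac_root)
    if not root.is_dir():
        # EDAC driver not loaded, or not running on Linux at all.
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            corrected = int((mc / "ce_count").read_text().strip())
            uncorrected = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            # Skip controllers whose counters are missing or unreadable.
            continue
        counts[mc.name] = (corrected, uncorrected)
    return counts


if __name__ == "__main__":
    found = read_edac_counts()
    if not found:
        print("No EDAC memory controllers found.")
    for name, (corrected, uncorrected) in found.items():
        print(f"{name}: corrected={corrected} uncorrected={uncorrected}")
```

If the counters stay at zero while the RAM is overclocked far enough to be unstable, that would suggest errors are not being reported rather than not occurring, which is the same suspicion raised by the MemTest runs in this thread.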

  • Unfortunately I had a brain fart and mixed up my remaining Ryzen 9 PRO 3900 and the 5750G. I could test the 3900 PRO on the Taichi; the 5750G is in my passive ASUS system, whose heat-pipe-to-case-heatsink design is a pain to take apart.

  • But I’d bet that the 4750G and 5750G behave the same.

  • I overclocked the 2 x 16 GB Samsung ECC UDIMM DDR4-2400@3400, 1.20 V, am surprised how well that old memory clocks

STILL NO REPORTED ERRORS, which is extremely unlikely compared to the 4750G’s results, where the same memory modules reported errors at DDR4-2933.

Based on your tests with the 5700X and my tests with the 4750G, 3700X and PRO 3900, it looks like ASRock messed up AMD’s firmware implementation for ECC error reporting in the parts that are used for chiplet-based CPUs (Matisse Ryzen 3000, Vermeer Ryzen 5000). Both Ryzen generations use the same IO-Die, which handles the system’s memory.

  • AM4 APUs (Ryzen PRO 4x50G/5x50G) are monolithic designs that use a different firmware module, these seem to currently work on the Taichi with ECC error reporting to the operating system.

  • Note: On non-PRO APUs AMD made the product segmentation decision to disable ECC support; otherwise every Zen2/3 CPU has ECC enabled from AMD’s side, and it’s up to the motherboard manufacturers to not mess it up.
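
The pattern so far can be laid out as a small table in code. This is only a sketch that restates the results posted in this thread for the X570 Taichi up to this point; the `design` labels and the helper name `reporting_by_design` are illustrative shorthand, not anything from ASRock or AMD:

```python
from collections import defaultdict

# (CPU, design, were ECC errors reported in MemTest on the X570 Taichi?)
# Taken from the posts in this thread; "chiplet" CPUs share the same IO-Die,
# while the APU is a monolithic design.
RESULTS = [
    ("Ryzen 7 PRO 4750G", "monolithic", True),
    ("Ryzen 7 3700X", "chiplet", False),
    ("Ryzen 9 PRO 3900", "chiplet", False),
    ("Ryzen 7 5700X", "chiplet", False),
]


def reporting_by_design(results):
    """Group the observed outcomes (True/False) by CPU design."""
    grouped = defaultdict(set)
    for _cpu, design, reported in results:
        grouped[design].add(reported)
    return dict(grouped)


summary = reporting_by_design(RESULTS)
for design, outcomes in summary.items():
    print(design, sorted(outcomes))
```

Each design ends up with a single outcome, which is what makes the chiplet-versus-monolithic explanation plausible, although four data points are far from proof.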

Maybe you can relay that the 3900 PRO’s ECC error reporting is also broken.

The fix for that will likely fix the issues with your 5700X as well, since as mentioned these use the same IO-Die.

You should also mention that ECC error reporting works on ASUS motherboards with the very same CPUs, maybe that would add a little pressure to fix this.

@wendell

I know that ASRock has been a sponsor but are you able to shoot them a message so that this might actually get fixed? Their AGESA 120A UEFI update is also quite late.


Thank you @aBav.Normie-Pleb for the detailed summary, as well as your continuous help!

Yes, it is indeed shocking that the same CPUs work just fine on ASUS motherboards while failing to work properly on ASRock MBs.

I will once again contact them via e-mail (although to this day they still have not answered my two-month-old e-mail) and try to have them actually spend a few minutes working on this issue instead of pointing fingers at other things.

Hello Everyone,

just wanted to give a quick update. In the end, I ordered a 5750G PRO, which should arrive tomorrow. I will do my best to test it by the weekend.

(Of course, I did not receive any sort of reply from ASRock to my e-mail mentioning that the problem is indeed with the motherboard since everything works on ASUS motherboards.)


Please also give an update if the 5750G is correctly reporting ECC corrected errors like the 4750G I tested on the Taichi did.

I would have a guilty conscience if it didn’t since I had stated that the 5750G should work exactly like the 4750G.


Of course, I will. :slight_smile:

In the meantime, I swapped the CPU. I could only run Memtest for about an hour with the RAM @ 3800MHz and will have to shut it down soon, because of a lightning storm nearby. Sadly, no ECC reports so far.

I will give this a more thorough try tomorrow because currently I am on BIOS version P4.40. Tomorrow, I will update it to 5.x and run MemTest for longer periods of time.

Do not have a guilty conscience :slight_smile: I still appreciate your helpful intentions, even if the 5750G PRO will not report the ECC errors.

Also, a silly question that popped up in my head in the meantime: the free version of Memtest86 also reports ECC errors by default, right? In that case, I will download the new 10.5 version to give it a go.

@aBav.Normie-Pleb @KeithMyers @Log @Quension @Exard3k @Mach3.2 @GeorgePatches

Update
(I apologize for spamming you by tagging everyone, but I thought it would only be natural to notify everyone of the outcome of my tests.)

Conclusion: ECC error reporting works on the X570 Taichi with a 5750G PRO CPU on BIOS version 5.03 BETA by turning PFEH OFF

Yesterday, I was unable to get ECC errors to show up in Memtest Pro v10.3 @3800 MHz, so today I gave it another try after upgrading both Memtest to 10.5 (Free) and the BIOS version. I remembered that ASRock support said that they were able to see the PFEH option in the BIOS, so I searched for the exact e-mail in which they mentioned this and looked for the attached BIOS file that I had requested from them waaaaay back in March 2023. I then saw that they had sent a file with BIOS v5.03 BETA, which is currently NOT listed on their website. I downloaded it and upgraded my BIOS version, and “lo and behold”, the PFEH option appeared in the BIOS menu. Then I turned PFEH OFF and ran Memtest86 v10.5 Free, this time with the RAM @3866 MHz. Although there were no errors in the beginning, after about 2.5 hours an ECC error appeared on the screen, with another error following it shortly (see the image below).

Thank you so much everyone for all the help in the last few months, and especially @aBav.Normie-Pleb for the thorough tests and recommendation! Now I just need a pair of reliable SSDs to replace my Kingston A400s that have been reporting errors for a while now :smiling_face_with_tear: but that is beside the point of this post.

Also, I am genuinely shocked and baffled that ASRock support has not been able to do this in the last 4+ months… (They, too, had an X570 Taichi with a 5650G PRO but told me they could not confirm whether ECC works or not, and instead blamed my RAM sticks and just about anything that is not the Taichi.) Terrible support service; I am never going to buy anything from them again.

Also, one quick thing that caught my attention when I ran Memtest, @aBav.Normie-Pleb: v10.5 now seems to show the actual values (i.e. 3866 MHz) instead of the stock values.
