Anybody seen Samsung/Micron/Hynix 32 GB DDR4-3200 ECC UDIMMs in the wild?

Hi @igoodman. Can you confirm that the ones you have working are the MTA18ASF4G72AZ-3G2B1? Do you have them in slots A2 and B2? Thanks!

I can also confirm that these work fine on the ASUS Pro WS X570-ACE.

Cannot recommend the Micron 32 GB DDR4-3200 ECC UDIMMs enough.

Just tested two of them on an ASRock X570 Taichi with a 4750G APU (iGPU enabled): set to 1.35 V, the system has been prime95-stable for 3 h at DDR4-4000 with the Infinity Fabric clock at 2000 MHz.

The system then went to sleep (whoopsie, forgot to disable S3 - Suspend to RAM) and it woke up just fine and continued its prime95 run without any errors or crashed workers.

All PRO/virtualization features remain enabled; I only test in a configuration that could actually be deployed for my use cases, and I use prime95 with AVX et al. to test worst-case load.

*Note: I did slap some heat spreaders on the bare modules.

Beautiful result. At what timings though?

What brand/type heat spreaders are those?

Unfortunately just loose “Auto-Timings”, was very pleasantly surprised by this though.

Don’t have much experience tweaking timings manually. Is there an easy way to get the suggested settings from these DRAM calculator tools into the UEFI without having to type in all the values manually?

I would have to swap in a non-PRO SKU first; unfortunately AMD seems to block the installation of the Ryzen Master software (though not the use of an already installed copy) on PRO SKUs :frowning:

That previous statement is unfortunately outdated; the current Ryzen Master version also refuses to launch if it was installed with a non-PRO SKU and now detects a PRO SKU :frowning:

The only way to OC AM4 Renoir right now seems to be to lower the voltage (offset) so that it can hold higher frequencies in multi-core loads within the same power limits.

Single-core boost is limited to 4.4 GHz even if T (Ctrl) is below 60 degrees Celsius.

PBO settings in the UEFI are also missing with the PRO SKU.

Dickish move by AMD, display warnings but don’t pull an Intel and go actively blocking features! :angry:

The heat spreaders I used:

  • They are made of aluminium, the only copper part is the little heat pipe.

  • Sadly they are a bit too large so you cannot populate slots directly next to each other.

  • Threw away the bundled garbage thermal pads and replaced them with Alphacool Eisschicht 0.5 mm pads with 14 W/mK (for the joy of it).

Hi. Very helpful thread! I have been trying to decide on a motherboard and memory for my Ryzen 9 5950X workstation build, and this thread convinced me to go with the ASUS Pro WS X570-ACE and two sticks of Micron DDR4-3200 ECC 32 GB (MTA18ASF4G72AZ-3G2B1) for a total of 64 GB.

My understanding is that this ASUS workstation board handles ECC well: it not only accepts the modules but actually runs them in ECC mode and reports both one-bit and two-bit errors to the OS. Right? Does the ASRock X570 Taichi do that as well? That seems to be another successful pairing in this thread.

I’m hoping to OC the memory a little bit, maybe to 3600, but we will see what that does to the timings. Looks like @aBav.Normie-Pleb was able to OC them even further, but I don’t want to get heat spreaders unless I have to, and I don’t want to bump the voltage if I can avoid it. Were your “loose auto-timings” around stock? Any chance I can maintain 22-22-22 @ 3600? :thinking: Will have to benchmark, I guess.
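For comparing timings across speeds, it can help to convert CAS cycles into nanoseconds. Below is a minimal sketch of that arithmetic, assuming the usual DDR relationship (real clock is half the MT/s rating); the function names are made up for illustration and it says nothing about whether a given module will actually hold those timings.

```python
def memory_clock_mhz(transfer_rate_mts: float) -> float:
    """DDR transfers data twice per clock, so the clock is half the MT/s rating."""
    return transfer_rate_mts / 2.0


def cas_latency_ns(cas_cycles: int, transfer_rate_mts: float) -> float:
    """CAS latency in nanoseconds: cycles divided by the clock in MHz, times 1000."""
    return cas_cycles / memory_clock_mhz(transfer_rate_mts) * 1000.0


if __name__ == "__main__":
    # CL22 at the speeds mentioned in this thread.
    for rate in (3200, 3600, 4000):
        print(f"DDR4-{rate} CL22: clock {memory_clock_mhz(rate):.0f} MHz, "
              f"CAS latency {cas_latency_ns(22, rate):.2f} ns")
```

Keeping the same cycle count at a higher transfer rate means a shorter absolute latency, which is why holding 22-22-22 at 3600 is a tighter setting than 22-22-22 at 3200.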

The CPU is on its way. Going to order the remaining parts in a few days. Should I consider some other combination of mobo and RAM? I’m not going for extreme overclocking. I just want to see if I can nudge it somewhat past the stock ratings without too much trouble.

These appear to have been listed in November. The 3200MHz version of the Samsung sticks: https://www.ebay.com/itm/324389420237

It’s priced pretty much the same as Kingston’s KSM32ED8/32ME (using Micron’s rev. E) on Provantage, but I wonder if the Samsung DIMMs (seems to be 2nd gen/A-die?) overclock as well as these probably do?

Users of the ASRock X570 Taichi and the ASUS Pro WS X570-ACE, can you please confirm whether there is a setting for PFEH (Platform First Error Handling) in the UEFI? This is the setting that governs whether or not 1-bit errors are properly reported to the OS. Unfortunately the vast majority of non-server AMD boards hide this setting and, infuriatingly, also set it to enabled.

It is supposed to be under AMD CBS → Zen common options, but you might need to dig around the AMD CBS menu tree to find it if it has been moved.

PS:
If PFEH is enabled, corrected errors will NEVER reach the kernel, only uncorrectable errors. I have no idea why this feature exists, seems stupid to me.

The only benefit of it that I could find is that reporting corrected errors slows down the system, because issuing the MCE interrupt is very slow; if you have a bunch of corrected errors happening all the time, PFEH avoids that slowdown. Which is nonsensical: if you have a steady stream of corrected errors flowing in, there is a HW issue and the machine should be shut down for inspection ASAP.

Unfortunately the Platform-first Error Handling option dis- and reappears from BIOS update to BIOS update :frowning:

The current BIOSes for these motherboards with the latest AGESA V2 1200 don’t expose this option (ASRock X570 Taichi/P4.00/5950X and ASUS Pro WS X570-ACE/3204/3700X).

Source: I got both motherboards running with ECC memory and just checked…

Well, isn’t that a pain in the ass.
Back when Anandtech did their review, the option was present.

OK, I have shot an email at Asrock, asking them to unhide the PFEH setting.
Hopefully they understand my request correctly.

I have not bothered to message Asus. From my experience, when it comes to seriously technical questions like this, their technical support is about as useful as having a pickle up my ass.

Yeah, I understand your frustration; this is how the AMD CBS menu currently looks with BIOS 3204:

(Also looked around for it in other (sub) menus, unfortunately missing)

What do you guys recommend to stress test overclocked ECC memory?

I normally use OCCT, but I’m not sure how well it will work for ECC memory. It seems like unstable OCs can be stable with ECC memory because it is correcting itself, but of course ideally we’d want stability with no ECC corrections.

Like, someone on Reddit overclocked their ECC memory and it took 5x longer to boot into Windows because of all the errors it needed to correct :rofl:

I bought the Karhu RAM Test; it is Windows-only, but it works. Otherwise, running Linpack under Linux with a problem size large enough to allocate almost all of the RAM is also pretty good.
You can grab a hacked version of Linpack that runs on AMD here: ryzen_cycle
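To pick a problem size that fills most of the RAM, the usual back-of-the-envelope estimate is that Linpack's main matrix holds N × N double-precision values at 8 bytes each. A rough sketch of that calculation follows; the 90% fraction and the alignment to a multiple of 8 are my own assumptions, not anything specific to the linked build.

```python
import math


def linpack_problem_size(total_ram_gib: float, fraction: float = 0.9,
                         align: int = 8) -> int:
    """Largest N (rounded down to a multiple of `align`) such that an
    N x N matrix of 8-byte doubles fits in `fraction` of the RAM."""
    bytes_available = total_ram_gib * 2**30 * fraction
    n = int(math.sqrt(bytes_available / 8))
    return n - (n % align)


if __name__ == "__main__":
    n = linpack_problem_size(64)
    print(f"Suggested N for 64 GiB: {n} ({8 * n * n / 2**30:.1f} GiB matrix)")
```

Leave enough headroom for the OS; if the box starts swapping, lower the fraction.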

Hi. I have long been interested in getting these sticks (Samsung M391A4G43MB1-CTD) for 256 GB of memory, and they are finally available near me. I can’t reliably find the Samsung M391A4G43AB1-CWE or the Micron MTA18ASF4G72AZ-3G2B1 at the usual retailers in my country, but I can get them imported or ordered for not much extra (in fact the Micron MTA18ASF4G72AZ-3G2B1 is just $156 in one shop, which is suspicious compared to the Samsung modules at about $190).

I am using a TR 2950X in an X399 mobo (MSI MEG Creation), but might upgrade soon to a TR 3970X.

What are the highest clocks you get stable with 2 DIMMs per channel (8 sticks total in the 2950X system, for a total of 256 GB)?

Initially I run memtest86+ for a few hours.

Then I boot Linux and use memtester 4.3.0 or 4.5.0 (from the Debian stable and testing packages respectively, for example). I launch multiple instances to speed up the process and make the memory controllers work harder, usually as many instances as there are cores (you can create a bash script to spawn all the instances and redirect their output to separate log files, or if you are lazy like me, just open 16 sub-terminals in the Terminator terminal emulator and start them all using its broadcast feature, or paste the command line manually into each terminal ;D ). I adjust the amount of memory used by each instance so that almost all memory is covered. Then I let it do its thing for at least 24 hours and watch for errors from each instance. If that passes, I do an extra pass of a few hours with twice as many instances (to also utilize all hardware threads and stress the MC even more).
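The "one instance per core, split the memory evenly" approach could be sketched roughly like this. It only prints the shell lines rather than launching anything, so the plan can be reviewed first. The helper names, the 4 GiB default reserve, and the log file naming are assumptions for illustration; the only memtester usage relied on is the basic `memtester <size>M <loops>` form.

```python
import os
import shlex


def plan_memtester(total_mib: int, instances: int,
                   reserve_mib: int = 4096, loops: int = 1) -> list[list[str]]:
    """Split (total - reserve) MiB evenly across `instances` memtester runs."""
    usable_mib = total_mib - reserve_mib
    if instances < 1 or usable_mib < instances:
        raise ValueError("not enough memory left for the requested instances")
    per_instance_mib = usable_mib // instances
    # memtester takes the amount to test (M suffix = MiB) and a loop count.
    return [["memtester", f"{per_instance_mib}M", str(loops)]
            for _ in range(instances)]


def detect_total_mib() -> int:
    """Physical RAM in MiB via sysconf (works on Linux)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE") // 2**20


if __name__ == "__main__":
    total = detect_total_mib()
    instances = os.cpu_count() or 1
    # Leave roughly 10% of RAM for the OS; adjust to taste.
    plan = plan_memtester(total, instances, reserve_mib=total // 10)
    for i, cmd in enumerate(plan):
        print(f"{shlex.join(cmd)} > memtester_{i:02d}.log 2>&1 &")
```

For the second pass, doubling `instances` gives one run per hardware thread on an SMT system.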

I’d test it with PassMark’s MemTest86, since it can also inject errors, so you can check that ECC is actually working even if the motherboard doesn’t have a proper BIOS-level logging feature for them (which is any Ryzen system, to my knowledge).

After a long while I got a brief response from Asrock:

After checking with our engineer, X570 didn’t support Platform First Error handling (PFEH).

Which is a confusing statement, given that the option to toggle PFEH did exist in an early FW version (https://youtu.be/hus0u-jX6nc?t=312).
Are they just wrong, or did the option to toggle it never actually do anything in the versions where it was present?

Have you ever actually observed an error getting logged in dmesg? Maybe error reporting is working on the latest FW despite the PFEH toggle missing. Another visible option that is likely to be relevant is “MCA error thresh enable”. If no errors are getting reported to the OS, maybe messing with that setting could be the key.
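Besides grepping dmesg, on Linux the EDAC subsystem exposes per-memory-controller corrected and uncorrected error counters in sysfs, which is an easy thing to poll during a stress run. A small sketch, assuming the EDAC driver for the platform is loaded (amd64_edac on AMD) and the standard `/sys/devices/system/edac/mc` layout; the function name is just for illustration.

```python
from pathlib import Path


def read_edac_counts(base: str = "/sys/devices/system/edac/mc") -> dict:
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}."""
    counts = {}
    root = Path(base)
    if not root.is_dir():
        return counts  # EDAC not available (driver not loaded or not Linux)
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None  # counter missing or unreadable
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    result = read_edac_counts()
    if not result:
        print("No EDAC memory controllers found; is the EDAC driver loaded?")
    for mc, c in result.items():
        print(f"{mc}: corrected={c['ce_count']} uncorrected={c['ue_count']}")
```

If the corrected count stays at zero even with a deliberately unstable OC, that would be consistent with corrected errors not reaching the kernel, though it does not prove it.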
