Ryzen 5 7600X and ECC

I have a Ryzen 5 7600X and an ASRock B650M PG RIPTIDE WIFI motherboard but am not sure if ECC RAM for a TrueNAS build makes sense. Does ECC RAM work properly with my motherboard and CPU on TrueNAS? I’m not even sure what model of ECC RAM to look for if everything is compatible. Any help is much appreciated!

trash in, trash out
ECC is necessary for data integrity

check the QVL

https://pg.asrock.com/mb/AMD/B650M%20PG%20Riptide%20WiFi/index.asp#MemoryRAP

Thanks for the information! I took a look at the QVL but I am unable to tell which models are ECC. Am I missing something or is it not clearly labeled? Is there a certain brand I should be looking for? I do not know much about ECC RAM.

honestly, me neither
google each and check against vendor

To my knowledge no one has reported working ECC support on ASRock boards, so it likely “works” in the sense that the modules run, but the ECC function itself is not utilized.

For what it’s worth, here’s someone reporting ECC correctable errors on an ASRock B550M Steel Legend at least.

I haven’t researched ECC on AM5 boards, I’m afraid.

Also see this thread.

One way to find ECC UDIMMs is to go to the manufacturers’ websites and do a search. For example:

There do not seem to be any sticks with official 6000 MT/s support yet. Crucial does mention 6400 MT/s ECC RDIMMs in one of their datasheets from May 2023, so maybe they’re on their way?

No, it’s not, and RDIMMs are not the same as UDIMMs.

Yeah, that QVL is huge and does not indicate ECC. So I’d suggest looking at which ECC UDIMMs are available in your location and then checking if they are on the QVL. There aren’t too many out there, so that should be easier. In any case:

  • Even if ECC does not work or is not supported by the board, ECC modules should still work in non-ECC mode.
  • I think QVLs are overrated. I would bet e.g. the Kingston ECC DIMMs are more likely to work well than a random gamer kit, even if only the latter is on the QVL. Especially with DDR5 and the AMD IMC, seeing kits rated 6200+ on the QVL makes me chuckle. Plenty of people have bought them and ended up with unstable EXPO/XMP settings.
  • Hynix dies work best for DDR5 (they run cooler and potentially faster) so I’d recommend those if you can find them. There are Hynix-brand 5600 MT/s ECC sticks out there, and Kingston has 4800/5200/5600 MT/s ECC sticks that use Hynix dies. Some other brands probably do too.

Looking at that ASRock board’s specs page:

“Supports DDR5 ECC/non-ECC, un-buffered memory up to 7200+(OC)*”

Several folks, including Wendell, have reported ECC UDIMM error correction and logging on a few different AM4 and AM5 ASRock boards. The B650M PG Riptide isn’t one that I’ve come across test results for but, as I’ve written in other similar threads, I’m comfortable assuming the checking and correcting part happens if ASRock indicates ECC support on a board.

None of ASRock’s desktop boards that I’ve checked indicate EPYC 4000 support, but ASRock Rack has three series of boards that fall under AMD’s full support mandate. So, if one doesn’t want to take desktop indications of ECC support at face value or do fault injection to confirm support, those would be the go-to within ASRock’s product lines. Unfortunately the B650D4U has a high failure rate, but perhaps the B650D4U3 is better.

The part of @lostinthesauce’s question I’m most curious about is the TrueNAS bit. Can’t speak to that as the NASes I admin are running another *nix.

Since DDR5 enables scrubbing and adds read CRC, in some ways entry-level DDR5 UDIMMs with only the on-die EC2 exceed EC4 DDR4 protections.

Yeah, 5600’s the highest I’ve seen. Nemix is another besides Crucial and Kingston.

Uhm… There are some generic assumptions stated at best, there’s no “this actually works…” but feel free to point me to it in the video. I’m also curious about the no 4 DIMMs on AM5, it works just fine if you run within spec however I guess it might possibly be a bit more unforgiving on low-end mobos with fewer PCB layers. There are a lot of users here running 4 DIMMs, not that it’s a great data point but it would be interesting to know where that conclusion comes from.

Just a quick note: it is not validated, and ECC is not the kind of thing you want to rely on without validation. Run the RAM at safe speeds, test it as much as you can, then make sure you POST on a regular basis (daily). That is the golden rule on all PC-based servers, and in this case it applies no matter which module you end up buying.

Thanks everyone for all the information! I was unable to find an ECC kit listed in the QVL so I went ahead and got the Kingston Fury Renegade Pro KF556R36RBK4-64 kit. I am not sure how to check that everything is working when the kit comes in, so if someone could point me in the right direction on how to run a test I would much appreciate it. I can use Ubuntu to run the tests instead of TrueNAS if that would be easier. Also, the kit I got has 4 DIMMs but I plan on using just 2 for this system and the other 2 for a future build as a backup NAS. Is that a bad idea?

The most common method I’ve encountered is masking a pin. There’s a couple other low cost options that might work and you could probably find a riser/interposer to mod.

Passmark supports only DDR4 and the Rowhammer injector isn’t fully open, so for more formal testing it might have to be something like MEI or a KT-5MU.

Breaking a matched quad into two pairs is just fine.

That’s the wrong kind of memory and won’t work; you don’t want ECC registered memory.

Here’s an alternative of the correct kind of memory that’s working:
Micron 32GB DDR5-4800 ECC UDIMM 2Rx8 CL40 | MTC20C2085S1EC48BR | Crucial.com (both 4800 and 5600 will do fine)

Yeah, as diizzy said, you need unregistered/unbuffered memory (UDIMMs). The RDIMMs you ordered won’t fit in the slot.

So under Linux there are a few things you can do to get hints about whether ECC is working or not:

# dmidecode | grep -A 6 "Memory Device"
Memory Device
	Array Handle: 0x0010
	Error Information Handle: 0x0016
	Total Width: 72 bits
	Data Width: 64 bits
	Size: 16 GB
	Form Factor: DIMM

Look for Total Width: 72 bits. This, as I understand it, is info queried from the UEFI which in turn gets it from the DIMM SPD. I.e. this only shows that the DIMMs themselves think they are ECC - it says nothing about whether ECC is actually active and working.
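To make that comparison less error-prone with several slots, here is a minimal sketch that reads dmidecode-style text on stdin and prints the two widths for every populated slot. The helper name ecc_widths is made up for illustration, and the same caveat applies: a wider total width only says the DIMM advertises ECC, not that ECC is active.

```shell
# ecc_widths: hypothetical helper, not part of dmidecode.
# Reads `dmidecode --type memory` style text on stdin and reports,
# for each populated slot, whether Total Width exceeds Data Width.
ecc_widths() {
    awk '
        /^Memory Device/ { total = ""; data = "" }
        /Total Width:/   { total = $3 }
        /Data Width:/    { data = $3 }
        /^[ \t]*Size:/ {
            if ($2 == "No") next   # "No Module Installed" means an empty slot
            verdict = (total + 0 > data + 0) ? "ECC-width DIMM" : "non-ECC width"
            printf "%s %s: total=%s data=%s -> %s\n", $2, $3, total, data, verdict
        }
    '
}
```

Usage would be something like: sudo dmidecode --type memory | ecc_widths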

# dmesg | grep -i EDAC
[    0.360677] EDAC MC: Ver: 3.0.0
[    2.517034] EDAC amd64: MCT channel count: 2
[    2.517829] EDAC MC0: Giving out device to module amd64_edac controller F17h_M60h: DEV 0000:00:18.3 (INTERRUPT)
[    2.518328] EDAC amd64: F17h_M60h detected (node 0).
[    2.518866] EDAC MC: UMC0 chip selects:
[    2.518867] EDAC amd64: MC: 0: 16384MB 1:     0MB
[    2.519258] EDAC amd64: MC: 2:     0MB 3:     0MB
[    2.519639] EDAC MC: UMC1 chip selects:
[    2.519640] EDAC amd64: MC: 0: 16384MB 1:     0MB
[    2.520003] EDAC amd64: MC: 2:     0MB 3:     0MB
[    2.520363] EDAC amd64: using x8 syndromes.

This shows that the kernel driver finds and can talk to the CPU memory controller EDAC device. As I understand it, it shows that the CPU supports ECC.
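If you want to script that check, here is a small sketch that looks for the "Giving out device to module" line shown in the output above. The helper name edac_registered is made up, and the match is based only on the log format in this post; other kernel versions may word it differently, and the line can also scroll out of the kernel ring buffer on a long-running system.

```shell
# edac_registered: hypothetical helper name.
# Reads kernel log text on stdin and succeeds (exit 0) if some EDAC
# memory controller (MC0, MC1, ...) was handed to a driver module.
edac_registered() {
    grep -Eq 'EDAC MC[0-9]+: Giving out device to module'
}
```

Usage would be something like: dmesg | edac_registered && echo "EDAC controller registered"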

# cat /sys/devices/system/edac/mc/mc0/rank0/dimm_edac_mode 
SECDED

This queries the kernel EDAC driver through the sysfs interface. The files here are dynamic/“virtual” and created by the driver. So when you cat the file the data is generated on the fly by the EDAC driver and should show the current state of the onboard memory controller. (Caveat: I haven’t actually looked into the code, but I have written drivers using sysfs myself which are in the mainline kernel so I know that this is a reasonable expectation.)

In this case it shows that ECC is enabled for the first rank (the first DIMM in this case, since they’re single rank DIMMs). SECDED = single-error correction and double-error detection.

I’m not sure this automatically means that you’ll get errors reported to the kernel though. Some UEFIs have a “Platform First Error Handling” setting that, if enabled, might cause errors to be handled silently. I’m not sure about this, but I’d look through the UEFI and make sure any such setting is disabled.
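One way to see whether anything does reach the kernel is to watch the EDAC error counters in the same sysfs tree. A minimal sketch, assuming the usual ce_count (corrected) and ue_count (uncorrected) files exist under each mcN directory; the helper name edac_counts and its optional directory argument are made up for illustration:

```shell
# edac_counts: hypothetical helper name.
# Prints corrected/uncorrected error counters for each EDAC memory
# controller. The optional argument overrides the sysfs base directory
# (useful for trying the function out against a copy).
edac_counts() {
    base="${1:-/sys/devices/system/edac/mc}"
    found=0
    for mc in "$base"/mc[0-9]*; do
        [ -r "$mc/ce_count" ] || continue   # skip unmatched glob or odd entries
        found=1
        printf '%s: corrected=%s uncorrected=%s\n' \
            "${mc##*/}" "$(cat "$mc/ce_count")" "$(cat "$mc/ue_count")"
    done
    [ "$found" -eq 1 ] || {
        echo "no EDAC memory controllers found under $base" >&2
        return 1
    }
}
```

Counters that stay at zero do not prove much on their own: healthy RAM should produce zero errors whether or not reporting works, which is why people resort to error injection or overclocking to force an error.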

With all this in place, here’s someone triggering ECC error messages on a Ryzen 5800X system by overclocking. I guess this would be the final proof that everything is working?

Thank you very much! I could not find any in that list; looks like I missed it. Cost aside, would it be better to get one 32GB stick or two 16GB sticks? As I understand it, DDR5 runs in dual channel on one stick, right? I’m leaning towards the 32GB stick since I could drop in another 32GB stick in the future for 64GB total. Is this a bad idea?

Thanks for letting me know! I don’t know anything about ECC memory or what the difference between RDIMM and UDIMM is. I will give the suggestions for testing ECC a try when the sticks diizzy pointed out come in.

The performance penalty for a single stick seems to be around 0-7% on compute-intensive tasks in general; games seem to suffer more, looking at single vs dual channel (1 vs 2 sticks) comparisons. I’d personally go for a single 32GB stick as it’s less painful to upgrade later on, but it’s up to you.

https://www.retailmenot.com/view/crucial.com - Might give you 10% off