ECC memory compatibility

Hi guys,

I apologise in advance if this is a stupid question, but I have been googling for quite a while and have found conflicting information.

I have a number of desktops, and I’m looking at using one to practice some VMware and Windows Server stuff to help me progress into the sysadmin space for work.

Basically I have two machines I could use. One is a Socket 1366 system with a six-core Xeon X5650 and 12 GB of RAM, and the other is an FX-8350 with 8 GB of RAM.

I know ECC memory is normally unsupported on consumer desktop systems, but I’ve seen some talk of it working on AMD systems and older Intel systems like my Socket 1366 board. I want to use ECC because there is a lot of cheap DDR3 on eBay which would be useful for the sort of work I’m planning to do.

My AMD motherboard is https://www.asus.com/au/Motherboards/M5A99FX_PRO_R20/specifications/

My Intel motherboard is https://www.gigabyte.com/Motherboard/GA-X58A-UD5-rev-10#sp

From what I can see, the Intel board doesn’t support ECC but the AMD board does, though I’m not sure whether it would work with my config.

http://dlcdnet.asus.com/pub/ASUS/mb/SocketAM3+/M5A99FX_PRO_R20/M5A_Series_DRAM_QVL_201406.pdf

These all look like non-ECC memory to me.

Any chance someone with experience can help clarify for me?

The thing is (in my experience) that when a manufacturer lists “ECC-RAM compatible” in their motherboard specifications, this doesn’t necessarily mean that the system will run in “ECC-mode”. In other words, the system will boot with ECC memory installed but it might not correct errors. Unless the board supports ECC memory in working ECC-mode, this compatibility won’t help you.

However, it is possible that this board actually does support working ECC and is not just compatible. A look in your board’s manual might give you some further insights. The manual for one of my boards, for example, says that “ECC will only work with a Xeon CPU” (it’s an Intel C232 board).

Since your CPU also supports ECC, chances are good that you might be successful with your AMD system. If you have the option to order some ECC-RAM to test it, I would simply try it out. Most shops will probably let you send it back, if you don’t want to keep the memory. :wink:

Validation, on the other hand, is a big can of worms. Usually the newer Linux kernels will report it in the logs, but if not you will need some other tools. What worked best for me was Passmark’s fork of memtest86 (the free edition). It will show a simple “Yes” or “No” right after booting the live CD. If you can overclock the memory on that board, you can also try that. If you go too high on the frequency, the memory will not be able to keep up and errors will occur, which ideally should be corrected and reported to the OS.

If that cheap eBay RAM is registered ECC memory, then it’s not going to work.

Thanks for the info. So you would suggest that I still need to test the memory even if it boots? Am I looking to create errors via overclocking, or is there some other way to ensure ECC is working?

The one I was looking at was unfortunately registered ECC. Looked up the differences and have a bit better understanding. Thanks for the help.

That’s always a good idea :wink:

Like I said, you can try to verify in many ways. Usually you are all good if the Linux kernel reports ECC enabled in the logs. Overclocking is a nice way of triggering real-life errors and the mentioned live-CD is also a possibility that worked well for me.

Yeah, because IIRC it’ll boot as long as the first 50 KB are not borked (or the whole stick is dead).

Subtle failures like a flipped bit somewhere don’t really surface until you actually run into them.

Always test the memory.

Thanks for all your help guys. It’s my first post on this board and everyone has been real helpful.

You can verify that your memory is detected with a Total Width of 72 bits rather than just 64 bits, which confirms ECC is working.

Quick example from a recent setup on an Asus WS board, albeit BSD rather than Linux:

[freenas ~]$ dmidecode -t 17
Handle 0x005C, DMI type 17, 40 bytes
Memory Device
    Array Handle: 0x0058
    Error Information Handle: Not Provided
    Total Width: 72 bits
    Data Width: 72 bits
    Size: 16384 MB
    Form Factor: RIMM
    Set: None
    Locator: DIMM_B1
    Bank Locator: NODE 1
    Type: DDR4
    Type Detail: Synchronous
    Speed: 2133 MHz
    Manufacturer: Micron
    Serial Number: 16EXXXXXX
    Asset Tag: DIMM_B1_AssetTag
    Part Number: 36ASF2G72PZ-2G1B1
    Rank: 2
    Configured Clock Speed: 2133 MHz
    Minimum Voltage: Unknown
    Maximum Voltage: Unknown
    Configured Voltage: Unknown

@comfreak please correct me if I’m wrong though (as I’m new to dealing with ECC in *nix) - but the above would confirm ECC is in ‘working’ mode?
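If there are many DIMMs, reading the widths by eye gets tedious. Below is a minimal Python sketch that pulls the Locator, Total Width and Data Width out of `dmidecode -t 17` text. The function name `parse_memory_devices` and the shortened `sample` string are made up for illustration; the field names are the ones visible in the output above. It only extracts the numbers and does not decide whether ECC is actually active.

```python
import re


def parse_memory_devices(dmidecode_text):
    """Return one dict per 'Memory Device' record with its locator and widths in bits."""
    devices = []
    current = None
    for raw in dmidecode_text.splitlines():
        line = raw.strip()
        if line == "Memory Device":
            # A new type 17 record starts here.
            current = {"locator": None, "total_width": None, "data_width": None}
            devices.append(current)
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Locator":
            current["locator"] = value
        elif key in ("Total Width", "Data Width"):
            match = re.match(r"(\d+) bits", value)
            # Values that are not "<number> bits" (e.g. "Unknown") stay None.
            if match:
                current[key.lower().replace(" ", "_")] = int(match.group(1))
    return devices


# Shortened sample built from fields shown in the post above (illustrative only).
sample = """Handle 0x005C, DMI type 17, 40 bytes
Memory Device
    Total Width: 72 bits
    Data Width: 72 bits
    Locator: DIMM_B1
    Bank Locator: NODE 1
"""

for device in parse_memory_devices(sample):
    print(device)
```

On a real machine the text would come from running `dmidecode -t 17` as root and passing its output to the function.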

I believe in Fedora you can do dmesg | grep ECC.
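The same filter can be sketched in Python if you want to script it. This is only an illustration: the names `filter_ecc_lines` and `read_kernel_log` are made up, and matching on "EDAC" as well as "ECC" is an assumption based on EDAC being the Linux kernel subsystem for memory error reporting. What the kernel actually prints depends on the board and driver.

```python
import re
import subprocess

# Case-insensitive match on "ECC" or "EDAC" anywhere in the line (an assumption
# about what relevant kernel messages contain, not a guaranteed format).
ECC_PATTERN = re.compile(r"ecc|edac", re.IGNORECASE)


def filter_ecc_lines(lines):
    """Keep only the log lines that mention ECC or EDAC."""
    return [line for line in lines if ECC_PATTERN.search(line)]


def read_kernel_log():
    """Return the kernel ring buffer as a list of lines (may need root on some systems)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout.splitlines()
```

Calling `filter_ecc_lines(read_kernel_log())` gives roughly what `dmesg | grep -i -E "ecc|edac"` would show. An empty result does not prove ECC is off, only that nothing matching was logged.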

That’s what I have read as well. On the other hand, I have a workstation (not server) board with a chipset (C232) that is designed for ECC support, a CPU (Xeon) that is designed for ECC as well, and RAM that the motherboard manufacturer lists as working ECC RAM, but I still get a different number there: 128 bits, even if I move the two modules into separate memory channels.

When I boot the mentioned live-CD I get “ECC: Yes”, so I assume it is working fine, but I haven’t verified it with the newer Linux kernels yet. Of course the live-CD could just do a simple “Is the data width > 64?” kind of check and call it ECC if true. Back when I built it, the then-current kernel wouldn’t print anything related to ECC.

What I am trying to say is that a width of something other than 72 bits doesn’t necessarily mean that it’s not working, but if you see 72 bits, then I think it’s safe to assume that ECC is working. Of course the best way of verifying this is always to generate errors and have error correction reports logged by the kernel, because just the fact that the extra bits are available doesn’t necessarily mean that the system will also correct errors.

Note to myself: Boot from a live Linux and check logs for ECC messages.
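Besides the logs, a Linux system with an EDAC driver loaded for the memory controller exposes error counters under `/sys/devices/system/edac/mc/`. Here is a minimal Python sketch that reads the corrected (`ce_count`) and uncorrected (`ue_count`) counters per controller. The function name `read_edac_counts` and its `edac_root` parameter are made up for illustration, and the sketch assumes the driver is actually loaded; if the directory is missing it simply returns nothing.

```python
from pathlib import Path


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int, "ue_count": int}} from EDAC sysfs."""
    counts = {}
    root = Path(edac_root)
    if not root.is_dir():
        # No EDAC driver loaded (or not Linux): nothing to report.
        return counts
    for mc_dir in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            counter = mc_dir / name
            if counter.is_file():
                entry[name] = int(counter.read_text().strip())
        counts[mc_dir.name] = entry
    return counts


if __name__ == "__main__":
    for controller, entry in read_edac_counts().items():
        print(controller, entry)
```

A `ce_count` that goes up while stressing or overclocking the memory would be the kind of evidence described above: errors happened and were corrected and reported.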

Great idea, thanks!!
