
ASRock X470 TaiChi, AMD Ryzen 7 3700X setup and ECC memory support

It’s my first post here so hello everyone!

I’m using my system for coding and deep learning.

Here is my setup:
CPU: AMD Ryzen 7 3700X
Motherboard: ASRock X470 TaiChi
GPU: Asus RTX 2070 Super Turbo 8GB Evo
Storage: Samsung 970 Pro 512GB M.2 (MZ-V7P512BW)

I’m looking for a 32GB (2x16GB) ECC memory kit that will run without problems on an ASRock X470 Taichi (non-Ultimate) with an AMD Ryzen 7 3700X.

The ASRock X470 Taichi spec says the motherboard supports ECC memory, at least with AMD Matisse (Zen 2, 3xxx) CPUs: [https://www.asrock.com/MB/AMD/X470%20Taichi/index.asp#Specification]

The memory section has the following annotation:

*For Ryzen Series CPUs (Picasso and Raven Ridge), ECC is only supported with PRO CPUs. Please refer to below table for DDR4 UDIMM maximum frequency support.

Is anyone running ECC memory with that setup?
Can anyone recommend a suitable ECC memory kit?

This looks remarkably like the system I’ve just put together…

  • CPU: AMD Ryzen 3700X
  • Motherboard: ASRock X470 Taichi
  • GPU: GT 710 (for testing)
  • Storage: Samsung 970 Pro 512GB M.2 (MZ-V7P512BW)

(…and the part you’re likely to be interested in…)

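The memory is two 16GB Samsung M391A2K43BB1-CRC unbuffered modules (the part number dmidecode reports further down). They aren’t specifically listed in any of ASRock’s QVL reports for the motherboard, but reports from other X470 Taichi owners suggested they work just fine. My testing under Windows confirms these reports.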

[Image not preserved. In it, “6” means “Multi-bit ECC”.]

So far, so good. However, my intention is to run Linux (Ubuntu) on the system full time. This is where I’ve run into some problems.

[email protected]:~$ dmesg | grep 'edac\|EDAC\|Linux version'
[    0.000000] Linux version 5.3.0-18-generic ([email protected]) (gcc version 9.2.1 20190909 (Ubuntu 9.2.1-8ubuntu1)) #19-Ubuntu SMP Tue Oct 8 20:14:06 UTC 2019 (Ubuntu 5.3.0-18.19-generic 5.3.1)
[    0.197969] EDAC MC: Ver: 3.0.0

If it were working properly, I’d expect to see something more like the results in this post.
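
For anyone who wants to script that check, here is a minimal sketch that classifies kernel log text. The function name `edac_status` is made up for illustration, and the two strings it searches for are simply the ones that appear in the dmesg output quoted in this thread, so other kernel versions may word them differently.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and print one of
#   active     the amd64 EDAC driver reported DRAM ECC as enabled
#   core-only  only the generic EDAC core announced itself
#   absent     no EDAC lines at all
edac_status() (
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'EDAC amd64: Node [0-9]*: DRAM ECC enabled'; then
        echo active
    elif printf '%s\n' "$log" | grep -q 'EDAC MC: Ver'; then
        echo core-only
    else
        echo absent
    fi
)

# Demo using the single EDAC line from the Ubuntu 5.3 log above.
printf '%s\n' '[    0.197969] EDAC MC: Ver: 3.0.0' | edac_status   # prints: core-only
```

On a live system the input would come from something like `sudo dmesg | edac_status`.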

Try lshw -c memory | grep ecc

and dmidecode -t memory.

[email protected]:~$ sudo lshw -c memory | grep ecc
       capabilities: ecc
       configuration: errordetection=multi-bit-ecc
[email protected]:~$ sudo dmidecode -t memory
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.2.1 present.
# SMBIOS implementations newer than version 3.2.0 are not
# fully supported by this version of dmidecode.

Handle 0x000E, DMI type 16, 23 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: Multi-bit ECC
        Maximum Capacity: 128 GB
        Error Information Handle: 0x000D
        Number Of Devices: 4

Handle 0x0016, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x000E
        Error Information Handle: 0x0015
        Total Width: Unknown
        Data Width: Unknown
        Size: No Module Installed
        Form Factor: Unknown
        Set: None
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL A
        Type: Unknown
        Type Detail: Unknown
        Speed: Unknown
        Manufacturer: Unknown
        Serial Number: Unknown
        Asset Tag: Not Specified
        Part Number: Unknown
        Rank: Unknown
        Configured Memory Speed: Unknown
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: Unknown

Handle 0x0018, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x000E
        Error Information Handle: 0x0017
        Total Width: 128 bits
        Data Width: 64 bits
        Size: 16384 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM 1
        Bank Locator: P0 CHANNEL A
        Type: DDR4
        Type Detail: Synchronous Unbuffered (Unregistered)
        Speed: 3334 MT/s
        Manufacturer: Samsung
        Serial Number: [serial]
        Asset Tag: Not Specified
        Part Number: M391A2K43BB1-CRC
        Rank: 2
        Configured Memory Speed: 3334 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V

Handle 0x001B, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x000E
        Error Information Handle: 0x001A
        Total Width: Unknown
        Data Width: Unknown
        Size: No Module Installed
        Form Factor: Unknown
        Set: None
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL B
        Type: Unknown
        Type Detail: Unknown
        Speed: Unknown
        Manufacturer: Unknown
        Serial Number: Unknown
        Asset Tag: Not Specified
        Part Number: Unknown
        Rank: Unknown
        Configured Memory Speed: Unknown
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: Unknown

Handle 0x001D, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x000E
        Error Information Handle: 0x001C
        Total Width: 128 bits
        Data Width: 64 bits
        Size: 16384 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM 1
        Bank Locator: P0 CHANNEL B
        Type: DDR4
        Type Detail: Synchronous Unbuffered (Unregistered)
        Speed: 3334 MT/s
        Manufacturer: Samsung
        Serial Number: [serial]
        Asset Tag: Not Specified
        Part Number: M391A2K43BB1-CRC
        Rank: 2
        Configured Memory Speed: 3334 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V

Does this mean the memory is physically detected as ECC but, with the lack of EDAC support, errors won’t actually be detected/corrected?
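
As an aside, the long dmidecode listing above boils down to a couple of lines. This is only a sketch: `dimm_summary` is a hypothetical name, and it assumes the field labels and their order (Size, Locator, Bank Locator, Part Number) match the dmidecode 3.2 output pasted above.

```shell
#!/bin/sh
# Hypothetical helper: condense `dmidecode -t memory` text read from stdin.
# Prints the array's error correction type, then one line per populated slot.
dimm_summary() {
    awk -F': *' '
        /^[ \t]*Error Correction Type:/ { print "ecc_type=" $2 }
        /^[ \t]*Size:/                  { size = $2 }
        /^[ \t]*Locator:/               { slot = $2 }
        /^[ \t]*Bank Locator:/          { bank = $2 }
        /^[ \t]*Part Number:/ {
            # Part Number comes after Size and the locators, so all are set here.
            if (size != "No Module Installed")
                print bank " " slot ": " size ", " $2
        }
    '
}
```

Usage would be `sudo dmidecode -t memory | dimm_summary`. It only restates what the firmware reports through SMBIOS; it says nothing about whether the kernel is handling ECC events.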

… … uuuuuuuuuhhhh

mmwhwhwhwh

If I’m following this discussion correctly, EDAC support for Zen 2 processors appears to be in the process of being accepted into the kernel, which almost certainly means it doesn’t exist in the kernel I’m using. That could also explain why other users with Zen/Zen+ CPUs get full EDAC output from dmesg: support was added for those processors a while back.

Somewhat related: proper temperature reporting for Zen 2 won’t be added until 5.4

https://www.phoronix.com/scan.php?page=news_item&px=Linux-5.4-Hwmon-Zen-2-Thermal
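
Since both items hinge on the kernel version, here is a small sketch for checking the running kernel against a minimum. `kernel_at_least` is a hypothetical helper, and the 5.4 threshold is just what this thread suggests. Distribution kernels can backport drivers, so a lower version number does not prove the support is missing.

```shell
#!/bin/sh
# Hypothetical helper: succeed if kernel release $2 (default: `uname -r`)
# is at least major.minor version $1. Only major and minor are compared,
# so release candidates such as 5.4.0-0.rc6 count as 5.4.
kernel_at_least() (
    want=$1
    have=${2:-$(uname -r)}
    want_major=${want%%.*}
    want_minor=${want#*.}
    want_minor=${want_minor%%[!0-9]*}
    have_major=${have%%.*}
    have_minor=${have#*.}
    have_minor=${have_minor%%[!0-9]*}
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
)

if kernel_at_least 5.4; then
    echo "kernel $(uname -r) is 5.4 or newer"
else
    echo "kernel $(uname -r) predates 5.4"
fi
```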

Makes sense. I know that my systems with ECC do show up correctly in the EDAC output. And those are Ryzen 3 1200s.

Looks like we were on the right track. This is from the current Fedora Rawhide live environment:

[[email protected] ~]$ dmesg | grep 'edac\|EDAC\|Linux version'
[    0.000000] Linux version 5.4.0-0.rc6.git2.1.fc32.x86_64 ([email protected]) (gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)) #1 SMP Thu Nov 7 16:31:36 UTC 2019
[    0.351252] EDAC MC: Ver: 3.0.0
[   21.617120] EDAC amd64: Node 0: DRAM ECC enabled.
[   21.617123] EDAC amd64: F17h_M70h detected (node 0).
[   21.617193] EDAC MC: UMC0 chip selects:
[   21.617195] EDAC amd64: MC: 0:     0MB 1:     0MB
[   21.617197] EDAC amd64: MC: 2:  8192MB 3:  8192MB
[   21.617201] EDAC MC: UMC1 chip selects:
[   21.617202] EDAC amd64: MC: 0:     0MB 1:     0MB
[   21.617204] EDAC amd64: MC: 2:  8192MB 3:  8192MB
[   21.617205] EDAC amd64: using x16 syndromes.
[   21.617206] EDAC amd64: MCT channel count: 2
[   21.617575] EDAC MC0: Giving out device to module amd64_edac controller F17h_M70h: DEV 0000:00:18.3 (INTERRUPT)
[   21.617689] EDAC PCI0: Giving out device to module amd64_edac controller EDAC PCI controller: DEV 0000:00:18.0 (POLLED)
[   21.617695] AMD64 EDAC driver v3.5.0
[[email protected] ~]$ sudo lshw -c memory | grep ecc
       capabilities: ecc
       configuration: errordetection=multi-bit-ecc
[[email protected] ~]$ sudo dmidecode -t memory
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 3.2.1 present.
# SMBIOS implementations newer than version 3.2.0 are not
# fully supported by this version of dmidecode.

Handle 0x000E, DMI type 16, 23 bytes
Physical Memory Array
        Location: System Board Or Motherboard
        Use: System Memory
        Error Correction Type: Multi-bit ECC
        Maximum Capacity: 128 GB
        Error Information Handle: 0x000D
        Number Of Devices: 4

Handle 0x0016, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x000E
        Error Information Handle: 0x0015
        Total Width: Unknown
        Data Width: Unknown
        Size: No Module Installed
        Form Factor: Unknown
        Set: None
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL A
        Type: Unknown
        Type Detail: Unknown
        Speed: Unknown
        Manufacturer: Unknown
        Serial Number: Unknown
        Asset Tag: Not Specified
        Part Number: Unknown
        Rank: Unknown
        Configured Memory Speed: Unknown
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: Unknown

Handle 0x0018, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x000E
        Error Information Handle: 0x0017
        Total Width: 128 bits
        Data Width: 64 bits
        Size: 16384 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM 1
        Bank Locator: P0 CHANNEL A
        Type: DDR4
        Type Detail: Synchronous Unbuffered (Unregistered)
        Speed: 3334 MT/s
        Manufacturer: Samsung
        Serial Number: [serial]
        Asset Tag: Not Specified
        Part Number: M391A2K43BB1-CRC
        Rank: 2
        Configured Memory Speed: 3334 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V

Handle 0x001B, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x000E
        Error Information Handle: 0x001A
        Total Width: Unknown
        Data Width: Unknown
        Size: No Module Installed
        Form Factor: Unknown
        Set: None
        Locator: DIMM 0
        Bank Locator: P0 CHANNEL B
        Type: Unknown
        Type Detail: Unknown
        Speed: Unknown
        Manufacturer: Unknown
        Serial Number: Unknown
        Asset Tag: Not Specified
        Part Number: Unknown
        Rank: Unknown
        Configured Memory Speed: Unknown
        Minimum Voltage: Unknown
        Maximum Voltage: Unknown
        Configured Voltage: Unknown

Handle 0x001D, DMI type 17, 40 bytes
Memory Device
        Array Handle: 0x000E
        Error Information Handle: 0x001C
        Total Width: 128 bits
        Data Width: 64 bits
        Size: 16384 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM 1
        Bank Locator: P0 CHANNEL B
        Type: DDR4
        Type Detail: Synchronous Unbuffered (Unregistered)
        Speed: 3334 MT/s
        Manufacturer: Samsung
        Serial Number: [serial]
        Asset Tag: Not Specified
        Part Number: M391A2K43BB1-CRC
        Rank: 2
        Configured Memory Speed: 3334 MT/s
        Minimum Voltage: 1.2 V
        Maximum Voltage: 1.2 V
        Configured Voltage: 1.2 V

Hopefully the formal release of 5.4 (and the uptake by various distributions) isn’t too far away. Now if only AMD hadn’t gimped the ability to intentionally corrupt memory, I might actually know if it was working properly.
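
In the meantime, once a kernel registers the memory controller, the EDAC core keeps per-controller error counters in sysfs, which at least makes it possible to watch for real corrected errors over time. A rough sketch, assuming the usual `/sys/devices/system/edac/mc` layout with `ce_count` and `ue_count` files; `edac_error_counts` is a made-up helper name, and the directory argument exists only so the function can be pointed at a different tree.

```shell
#!/bin/sh
# Hypothetical helper: print corrected/uncorrected error counts for every
# EDAC memory controller found under the given sysfs directory.
# Returns non-zero if no controller is registered there.
edac_error_counts() (
    root=${1:-/sys/devices/system/edac/mc}
    found=1
    for mc in "$root"/mc[0-9]*; do
        # An unmatched glob stays literal, so this also skips the no-match case.
        [ -r "$mc/ce_count" ] || continue
        found=0
        printf '%s corrected=%s uncorrected=%s\n' \
            "${mc##*/}" "$(cat "$mc/ce_count")" "$(cat "$mc/ue_count")"
    done
    [ "$found" -eq 0 ] || echo "no EDAC memory controller registered under $root" >&2
    return "$found"
)
```

Calling `edac_error_counts` with no argument reads the live sysfs tree. A zero count does not prove ECC is working; it only means no errors have been reported so far.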
