Testing In-band ECC with soldered RAM?

I have a LattePanda Sigma on order, but before I go and build a badass NAS out of it, I want to make sure in-band ECC is working.

There’s very little out there on ECC testing in general, let alone testing RAM where you can’t mess with the pins physically.

My only plan so far is to enable Linux kernel support so it can log errors (as per here: LattePanda Sigma ECC Testing Guide - ServeTheHome), then run a memory testing tool and… I guess some variant of rowhammer? I don’t even know if rowhammer is effective at all on DDR5; it’s been a long time since it surfaced as a problem, and mitigations have been put in place since.

Any thoughts?


MemTest86 offers ECC fault injection on paid versions if it’s not locked out. (Not to be confused with Memtest86+ 7.00, which has Ryzen support that won’t help with the Panda’s i5-1340P.) A more accessible software option is to push up a DDR overclock until it’s no longer stable and look for ECC correction logs.


Thanks, I appreciate it! I’ll give the paid version of Memtest a shot, even though their supported platforms don’t include mobile Raptor Lakes. But overclocking is probably going to be sufficient.

It’s not going so great with MemTest86. The Pro version thinks it can inject ECC errors and tries to do so, but without any results. The functionality is either locked out in the BIOS or just flat out nonexistent on mobile Raptor Lake.

Overclocking was also a bust:

The only thing you can change in the BIOS is the maximum RAM speed, not an actual value you can enforce. This does affect what MemTest86 reports (see 5186MT/s above), but the reported 19.9GB/s figure is closer to 2400MT/s, which happens to be the BIOS Auto value and also what HWiNFO64 reports in Windows.

I’ve been running the rowhammer test of MemTest for ~48 hours without any errors, ECC-corrected or otherwise.

Next step is to see about getting EDAC support going in Linux.

I did manage to get hold of that LattePanda doc that was partially quoted in an image on ServeTheHome. Might be useful to some people:

LattePanda Sigma ECC Testing Guide

Which type of ECC does LattePanda Sigma support?

LattePanda Sigma supports In-Band ECC. Different from traditional 72-bit ECC memory, In-Band ECC does not use additional memory chips to store ECC bits. Instead, it partitions the capacity in existing memory and conducts two memory operations to transfer data and ECC separately. This undoubtedly leads to some performance loss. However, for the dual-channel LPDDR5 6400MHz memory bandwidth of up to 102.4GB/s, it is acceptable to sacrifice some performance for reliability.
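As a rough sanity check on the 102.4GB/s figure, here is a minimal sketch of the usual peak-bandwidth arithmetic. It assumes the "6400MHz" in the guide means 6400 MT/s and that the two channels add up to a 128-bit bus; the bus width is not stated in the guide and is inferred here only because it makes the numbers agree. The function name is made up for this example.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


if __name__ == "__main__":
    # Assumption: 6400 MT/s over a 128-bit total bus (not stated in the guide).
    print(f"{peak_bandwidth_gbs(6400, 128):.1f} GB/s")
```

This is a theoretical ceiling only; a single-threaded benchmark will normally report much less.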

In-Band ECC is different from the newly added On Die ECC in the DDR5 specification. On Die ECC is only a basic guarantee of reliability for DDR5 high-density chips. It is a purely hardware implementation of 1-bit error correction and is completely invisible to the outside world. It cannot monitor or detect errors. In-Band ECC, like standard ECC, has the ability to correct 1-bit errors and detect 2-bit errors, and can also achieve error monitoring and memory scrubbing.

The following is a comparison table of standard ECC memory, In-Band ECC, and On Die ECC.

| | DDR4 ECC UDIMM | DDR5 On Die ECC | In-Band ECC |
|---|---|---|---|
| ECC Size | 64-bit data + 8-bit ECC | 128-bit data + 8-bit ECC | 512-bit data + 16-bit ECC |
| Correction Capability | Fix 1-bit, detect 2-bit | Fix 1-bit | Fix 1-bit, detect 2-bit |
| Correction Method | Memory controller handling, monitorable | Internal RAM die processing, not visible outside the die | Memory controller handling, monitorable |
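The "ECC Size" row can be turned into a storage overhead ratio with a few lines of arithmetic. This is only a sketch based on the bit counts in the table; the dictionary and function names are made up for this example.

```python
from fractions import Fraction

# (data bits, ECC bits) per protected unit, taken from the table above.
SCHEMES = {
    "DDR4 ECC UDIMM": (64, 8),
    "DDR5 On Die ECC": (128, 8),
    "In-Band ECC": (512, 16),
}


def ecc_overhead(data_bits: int, ecc_bits: int) -> Fraction:
    """ECC bits stored per data bit, as an exact fraction."""
    return Fraction(ecc_bits, data_bits)


if __name__ == "__main__":
    for name, (data_bits, ecc_bits) in SCHEMES.items():
        ratio = ecc_overhead(data_bits, ecc_bits)
        print(f"{name}: {ratio} of data size ({float(ratio):.3%})")
```

For In-Band ECC the ECC bits come out of existing capacity, so if the 16-per-512 ratio holds across the whole protected range, roughly 1/32 of the protected memory would be set aside for ECC.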

Validate whether ECC is truly functioning

As long as ECC is enabled in the BIOS, the reliability enhancement and corresponding impact on memory performance are already being realized. However, without the ability to monitor ECC errors within the system, it cannot be considered a complete EDAC system. This is where Intel is lacking in documentation, but we have discovered some methods through trial and error, at least for use with Linux.

  1. Enabling ECC functionality in the BIOS should be straightforward. If you are using an internal beta BIOS, ECC should be enabled by default.
    Advanced -> Power Management -> In-Band ECC Support [Enabled]

  2. When it comes to installing a Linux system, I usually opt for Ubuntu.

  3. Download the kernel source code version 6.1.x or higher. Modify drivers/edac/igen6_edac.c
    Change #define DID_ADL_SKU3 0x4621 to #define DID_ADL_SKU3 0xa707

  4. After compiling and installing the new kernel, upon restarting, you should see the following initialization message when running the command sudo dmesg | grep -i edac. If there were any memory errors detected, you would also be able to see the updated logs here.

[    3.604592] EDAC MC: Ver: 3.0.0
[    8.430566] caller igen6_probe+0x176/0x7d0 [igen6_edac] mapping multiple BARs
[    8.436042] EDAC MC0: Giving out device to module igen6_edac controller Intel_client_SoC MC#0: DEV 0000:00:00.0 (INTERRUPT)
[    8.442796] EDAC MC1: Giving out device to module igen6_edac controller Intel_client_SoC MC#1: DEV 0000:00:00.0 (INTERRUPT)
[    8.443092] EDAC igen6: v2.5
  5. In addition, you can also view the ce_count and ue_count counters under /sys/devices/system/edac/mc/mc*.
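Two of the steps above can be sketched in Python: checking which PCI device ID the host bridge at 0000:00:00.0 reports (to confirm 0xa707 is the right value to patch in on a given board), and reading ce_count and ue_count for every memory controller. This is a minimal sketch, assuming the standard sysfs `device` attribute for PCI devices and the EDAC path from the guide; the function names are made up for this example.

```python
from pathlib import Path


def host_bridge_device_id(pci_root="/sys/bus/pci/devices", address="0000:00:00.0"):
    """Return the PCI device ID of the host bridge as an int, or None if unreadable."""
    path = Path(pci_root) / address / "device"
    try:
        # The sysfs file holds a hex string such as "0xa707".
        return int(path.read_text().strip(), 16)
    except (OSError, ValueError):
        return None


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {"mc0": {"ce_count": int or None, "ue_count": int or None}, ...}."""
    counts = {}
    for mc_dir in sorted(Path(edac_root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc_dir / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None  # counter missing or not readable
        counts[mc_dir.name] = entry
    return counts


if __name__ == "__main__":
    device_id = host_bridge_device_id()
    print("host bridge device ID:", hex(device_id) if device_id is not None else "unknown")
    counts = read_edac_counts()
    if not counts:
        print("no EDAC memory controllers found (is igen6_edac loaded?)")
    for mc, entry in counts.items():
        print(f"{mc}: corrected={entry['ce_count']} uncorrected={entry['ue_count']}")
```

Running it before and after a long memory test gives a simple way to see whether any corrected errors were logged in between.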

Performance impact of In-Band ECC

On the surface, enabling In-Band ECC may seem to reduce memory bandwidth as it requires the transmission of both data and ECC. However, in reality, the performance loss depends on the specific application, as the memory controller has a cache and other optimization measures that Intel does not disclose to us.

For example, in terms of theoretical CPU performance testing, In-Band ECC has almost no impact on CINEBENCH R23, but there is a noticeable difference in performance in y-cruncher. It is recommended that users measure the performance impact of In-Band ECC on their own applications.

Lastly, it is important to note that ECC cannot solve bugs that exist in software itself, such as when Adobe Premiere Pro crashes suddenly during the editing of complex effects, which is generally due to a bug within Premiere Pro itself.

Other Questions

If you have any further questions, please feel free to send an email to [email protected].
