Building a Low-Idle Power NAS with ECC Memory

Hey,

Since my Synology NAS suffered a defective RAM stick that irreversibly corrupted the BTRFS file system, I’ve decided to build a new NAS using ECC memory. My main goal is to keep idle power consumption under 15 watts (without disks).

I mostly need the NAS for Time Machine backups and plan to run a few containers on it:

  • Jellyfin
  • Paperless
  • Home Assistant
  • TubeArchivist

I’m planning to install either TrueNAS Scale or Proxmox.

After some research, it feels like I’m chasing the holy grail here. Most posts I came across reference an old LGA 1151 Fujitsu board paired with an i3-9100 for low idle NAS setups with ECC RAM. But that combo is outdated, relatively slow, and nearly impossible to find on eBay. Just as I was about to give up, I found this build idling at just 14W!

So I’ve decided to start my own build.

What I already have:

  • 2× 10TB Toshiba HDD
  • 2× Samsung Evo 860 1TB SSD
  • 2× WD Blue 1TB NVMe SSD
  • Corsair 650W SFX power supply
  • Intel X710-DA2
  • Noctua NH-L12

What I still need to buy:

  • AMD Ryzen 5 Pro 5655G
  • 2× Mushkin Proline 32GB DDR4 ECC RAM
  • Jonsbo N4

But I’m unsure which motherboard to choose:

  • GIGABYTE B550I AORUS Pro AX (same as the one used in the build above)
  • Biostar B550MX/E Pro
  • ASUS TUF Gaming B550M-Plus

I’ve read that motherboards can significantly impact idle power consumption. Both the ASUS and GIGABYTE boards are currently available via Amazon Warehouse Deals (open box), so I’m considering ordering both and testing them out.

What’s your opinion? Am I overlooking anything? Would you approach this differently?

1 Like

Could you expand more on this? What happened?

When I built my NAS a while ago, there were not many desktop platforms that supported ECC memory (if any at all?), and I purchased many of the parts used (second-hand) to save money. Point being, at the time the overall system cost of going ECC was much higher. Next time I rebuild the NAS, I’ll probably go with ECC.

The RAM module must have begun failing within the last three months, since I have the data scrub configured to run every three months. On the day the NAS became unreachable, the power LED was solid blue and the disk LEDs were blinking; everything appeared normal. I tried to shut it down via the power button, but nothing happened. After a hard reset, the system would no longer boot. I removed all disks, with no luck. Then I took out the Synology RAM module, leaving only the aftermarket Kingston module installed, and the NAS booted successfully.

I don’t have many details because Synology’s monitoring and logging sucks. There was only one log entry saying the volume had gone into read-only mode.

Examining the kernel logs, I found numerous Btrfs errors and integrity-check failures on both mirrored disks, with files marked as unrecoverable. I tested both drives; they’re fine, and the controller seems healthy, since it still serves other disks.
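In case it helps anyone in the same boat: a quick way to gauge the damage is to count the Btrfs error messages in the kernel log. Here is a minimal Python sketch of that idea; it assumes a systemd system with journalctl available, and the message patterns are only a starting point since the exact wording varies between kernel versions.

```python
#!/usr/bin/env python3
"""Tally Btrfs error messages from the kernel log.

Assumes a systemd system where `journalctl` is available. The exact
wording of Btrfs messages differs between kernel versions, so treat
these patterns as a starting point rather than an exhaustive list.
"""
import re
import subprocess
from collections import Counter

# Kernel messages only (-k); `-o cat` drops the journal metadata prefix.
log = subprocess.run(
    ["journalctl", "-k", "-o", "cat"],
    capture_output=True, text=True, check=True,
).stdout

patterns = {
    "checksum failures": re.compile(r"BTRFS (?:warning|error).*csum failed", re.I),
    "device error counters": re.compile(r"BTRFS.*bdev .+ errs:", re.I),
    "unrecoverable (scrub)": re.compile(r"unable to fixup", re.I),
}

counts = Counter()
for line in log.splitlines():
    for name, pattern in patterns.items():
        if pattern.search(line):
            counts[name] += 1

for name in patterns:
    print(f"{name}: {counts[name]}")
```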

After some Googling, my best guess is that corrupt data or checksums were written from the faulty RAM back to the disks for weeks. As a result, anything modified in the last three months might be corrupted. Fortunately, most of that data is either backed up or is a backup itself.

With proper logging, monitoring, and alerting, this issue could have been detected earlier. And if the system had used ECC RAM, it would have flagged that something was wrong as soon as the module started to fail.
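To put the monitoring-and-alerting part into practice on the next build: on Linux, the EDAC subsystem exposes corrected/uncorrected ECC error counters under /sys/devices/system/edac once a driver has bound to the memory controller. Here is a rough polling sketch; the alert() function is just a placeholder you would wire up to mail or a push notification.

```python
#!/usr/bin/env python3
"""Minimal ECC error-count watcher using the Linux EDAC sysfs interface.

ce_count = corrected errors (the early warning you want),
ue_count = uncorrected errors (time to stop trusting the machine).
"""
import glob
import time

POLL_SECONDS = 300  # check every 5 minutes


def read_counts():
    counts = {}
    for mc in glob.glob("/sys/devices/system/edac/mc/mc*"):
        ce = int(open(f"{mc}/ce_count").read())
        ue = int(open(f"{mc}/ue_count").read())
        counts[mc] = (ce, ue)
    return counts


def alert(message):
    # Placeholder: send mail, push notification, etc.
    print(f"ALERT: {message}")


last = read_counts()
if not last:
    raise SystemExit("No EDAC memory controllers found - is ECC active?")

while True:
    time.sleep(POLL_SECONDS)
    current = read_counts()
    for mc, (ce, ue) in current.items():
        old_ce, old_ue = last.get(mc, (0, 0))
        if ce > old_ce:
            alert(f"{mc}: corrected ECC errors rose from {old_ce} to {ce}")
        if ue > old_ue:
            alert(f"{mc}: UNCORRECTED ECC errors rose from {old_ue} to {ue}")
    last = current
```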

Once bitten, twice shy… my next NAS will use ECC RAM.

3 Likes

Good to hear you didn’t lose much. I’ve heard stories about BTRFS, especially with Synology and their SHR “hack” of BTRFS. I haven’t seen anything happen to my own BTRFS setup so far, but everything is kept safe, mirrored, and replicated.

There are a lot of moving parts, from the CPU and RAM down to the controller, firmware, cables, backplanes, and ultimately the drives. Everything in the chain can be a problem.
ECC memory is awesome because your OS tells you about errors. So at least you can trust your CPU.

ECC UDIMMs have become quite popular lately. It isn’t as hard to find them as it used to be. “Desktop CPU servers” made it into the enterprise after all. And that’s good news for us NAS/homelab guys too :slight_smile:

I wish you only uneventful and boring scrubs from now on :slight_smile:

1 Like

Just keep in mind that AM4 is more or less dead at this point and that ECC support is much more spotty than on AM5.

That is true and got me thinking.
I had a look at the EPYC 4124P or 4244P combined with an ASUS Pro B650M-CT-CSM.

Pro:

  • Upgradability / Future Proof
  • A bit more performance

Con:

  • EPYC 4005 was just announced, which should be more energy efficient. Seems like a bad time to buy an EPYC 4004
  • Oh boy, DDR5 ECC RAM is expensive!
  • I don’t need the performance gain
  • ITX boards are way too expensive and only have 2 SATA ports, so I’d need to use µATX
  • This system should run for at least 6 years; after that I’ll probably replace it anyway
  • Unknown idle power consumption

I still lean toward the AM4 solution. I get more RAM and enough CPU power for less money.

1 Like

Expect way more. Ryzen/EPYC 4000 is great for performance per watt, but outside of some (by today’s standards outdated) CPUs, they have a rather high idle power floor. Intel has always had the edge there with power management.
That’s where “Ryzen H” or the 35W TDP Ryzen 8000 Phoenix parts (seemingly the successors to the Ryzen 5600G/5700G) come in to fill the gap; they’re better suited to small form factor, low-power systems.
But they also have fewer PCIe lanes. So there’s less potential stuff to power as well.

Both are 65W TDP CPUs. And the EPYC 4004 is basically Ryzen 7000, and of questionable value for a homeserver in terms of price. It may have lower idle because it’s only a single CCD, but the IO die is the same for all Zen 4/5 Ryzens, which seems to be the main factor for idle power. And “half the cores = half the power” was never really the case.

It all depends on the PCIe lanes available from the board and CPU. PCIe 3/4 drives and networking will serve you well for the next 7 years. PCIe 5 on a homeserver, even long-term, will only be needed if you want an x16 200/400Gbit NIC (which you don’t). On-board 10 or 25Gbit networking is nice and can save precious lanes/slots. And RJ45 vs. SFP for your network is a thing that can change in the future (as I am experiencing right now). So that’s where “future-proof” makes sense: on-board NICs and what the x16 slot might be used for at some point.

Juggling the available slots/lanes for NVMe, networking, and other stuff is the real challenge with a consumer-CPU server. Finding the right board for that is the search for the holy grail for all of us.

I mean… what do you need in the future? The ASUS B650 board has one x16 slot where you can upgrade networking, get 16x SATA, or add 4x NVMe. The x4 slot could take a 10Gbit NIC, one additional NVMe drive, or additional SATA ports.
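To make the lane juggling concrete, here is a tiny back-of-envelope tally in Python. The slot widths are assumptions based only on the layout described above (one CPU x16 slot, one x4 slot, two M.2 slots), not taken from a spec sheet, so double-check the board manual before relying on it.

```python
# Back-of-envelope PCIe lane budget for a µATX B650 build.
# Slot widths below are assumptions based on the layout discussed above
# (one CPU x16, one x4, two M.2) - verify against the board manual.

slot_width = {
    "x16 CPU slot": 16,  # HBA/SATA card, quad-NVMe carrier, or faster NIC later
    "x4 slot":       4,  # 10Gbit NIC, one extra NVMe, or more SATA ports
    "M.2 slot #1":   4,
    "M.2 slot #2":   4,
}

# A hypothetical day-one loadout (the X710-DA2 is a PCIe x8 card).
planned = [
    ("Intel X710-DA2",  "x16 CPU slot", 8),
    ("WD Blue NVMe #1", "M.2 slot #1",  4),
    ("WD Blue NVMe #2", "M.2 slot #2",  4),
]

used = {slot: 0 for slot in slot_width}
for device, slot, lanes in planned:
    used[slot] += lanes

for slot, width in slot_width.items():
    print(f"{slot}: {used[slot]}/{width} lanes used, {width - used[slot]} free for later")
```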

My Zen 3 server with a 5900X, 2x 10Gbit NIC, 4x 32GB DIMMs, and 2x M.2: if I pull all my U.3 NVMe drives, send the HDDs into spindown, and subtract the IPMI/BMC power draw… 20-25W idle power. If I unplug the network cables or disable the 10Gbit NIC, also reducing load on the X570 chipset, it will be even lower (probably another 5-8W less).

With everything up and operating at normal speed, it goes to 70-80W idle and 150W under full load.

That’s “normal”. If you use stuff, the stuff needs power. The more stuff you have, the more power. A small server is often lower power because you just can’t put a lot of stuff in there. So it mainly depends on how much stuff you want/need.

And even 2x DIMMs vs. 4x DIMMs can make a 5-10W difference, depending on voltage and capacity.
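If someone is chasing a sub-15W idle target like the OP, it helps to write the budget down before buying. Every figure in this little tally is an assumed ballpark purely for illustration (none of it is measured), but it shows how quickly the NIC, chipset, and PSU losses eat into 15W:

```python
# Rough idle power budget for a build like the one planned above.
# Every number is an assumed ballpark for illustration only - measure
# at the wall with a power meter to get real figures.

idle_watts = {
    "Ryzen 5 PRO 5655G package (deep idle)": 8.0,
    "B550 chipset":                          4.0,
    "2x 32GB ECC UDIMM":                     3.0,
    "2x NVMe SSD (idle)":                    1.0,
    "2x SATA SSD (idle)":                    0.6,
    "Intel X710-DA2 (links up)":             4.0,
    "Fans":                                  1.0,
    "PSU conversion losses at low load":     3.0,
}

total = sum(idle_watts.values())
for part, watts in idle_watts.items():
    print(f"{part:40s} {watts:5.1f} W")
print(f"{'Estimated total (no HDDs)':40s} {total:5.1f} W")
```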

edit: the lowest-power homeserver is a laptop. It also comes with a built-in UPS and a display for troubleshooting :slight_smile: Or an SBC.

1 Like

LattePanda Sigma. I still have to get around to building my NAS but I do have all the parts for it.

It should do ECC. LattePanda couldn’t tell me with certainty that it does (they never tried hard enough to actually get it to log a detection or correction), and the controller doesn’t support ECC error injection, at least not as far as Memtest Pro is concerned. So it probably works: the kernel thinks it does, but I’m not aware of anyone ever logging anything, or even trying too hard.
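For what it’s worth, the cheapest sanity check is simply asking the kernel whether an EDAC memory-controller driver registered at all. A small sketch along those lines; note it only proves the driver loaded and sees a controller, not that correction actually works on this board:

```python
#!/usr/bin/env python3
"""Check whether Linux registered an EDAC memory controller.

A registered controller with a sensible mc_name suggests the kernel
believes ECC is active; it does not prove errors would actually be
detected or corrected on this particular board.
"""
import glob
import os

controllers = sorted(glob.glob("/sys/devices/system/edac/mc/mc[0-9]*"))
if not controllers:
    print("No EDAC memory controllers registered - ECC is likely not active.")
for mc in controllers:
    name = open(os.path.join(mc, "mc_name")).read().strip()
    ce = open(os.path.join(mc, "ce_count")).read().strip()
    ue = open(os.path.join(mc, "ue_count")).read().strip()
    print(f"{os.path.basename(mc)}: {name} (corrected={ce}, uncorrected={ue})")
```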

For PCIe expansion you’re limited to the M.2 port and Thunderbolt. It won’t break any speed records, but you did say low power. My rough plan is a GaN PSU, an IcyDock drive enclosure, a PCIe SATA controller (through the M.2 port), and a PCIe 10GbE NIC (through Thunderbolt).

I did some “preliminary research” some months ago, as I wanted to update my current homelab with something low power that also supported ECC. What I found was not good enough for my needs, so I didn’t pull the trigger, but it may be useful for others.

The only thing I found that allowed me to go significantly lower than AM4 in size and power was going for an Intel mini PC/SBC that supports IBECC (in-band ECC), which has also been discussed here on our forum. IBECC is a newish feature that lets the memory controller run non-ECC RAM as ECC by eating up some of the RAM space for checksums (whereas “real ECC” RAM sticks have extra RAM chips beyond what their capacity would require).
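The trade-off is easy to picture with a bit of arithmetic: a classic ECC UDIMM stores 72 bits for every 64 data bits (hence the extra chip per rank), while IBECC stores the check data in ordinary RAM and shrinks usable capacity instead. The IBECC fraction below is a placeholder assumption, since the actual reservation depends on the platform and how much of the memory you protect:

```python
# Where the ECC data lives: extra DRAM chips vs. a slice of normal RAM.
# The IBECC overhead fraction is a placeholder assumption - the real
# reservation depends on the platform and BIOS configuration.

capacity_gib = 32

# Classic (side-band) ECC UDIMM: 72 bits stored per 64 data bits.
extra_chips = 72 / 64 - 1  # 0.125 -> 12.5% more DRAM chips
print(f"ECC UDIMM: {extra_chips:.1%} extra chips, full {capacity_gib} GiB usable")

# In-band ECC: check data lives in ordinary RAM, so usable capacity shrinks.
ibecc_fraction = 1 / 32    # assumed fraction, platform-dependent
usable_gib = capacity_gib * (1 - ibecc_fraction)
print(f"IBECC: no extra chips, about {usable_gib:.1f} of {capacity_gib} GiB usable "
      f"(assuming {ibecc_fraction:.1%} reserved)")
```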

It seems some industrial PCs from ASRock have it, and the Odroid H4 has it too:
https://forum.odroid.com/viewtopic.php?f=171&t=48377

The Odroid H4 Plus does have pretty low idle power numbers, in the low single digits of watts with an Intel N97 or an Intel N305 processor.

While it has only 4 SATA ports, one of the BIOS images they offer lets you install an add-on board that splits the single M.2 slot into two x2 M.2 slots or four x1 M.2 slots, or a board with 4 Intel 2.5G Ethernet controllers, which I find very neat.
This is done by actually configuring the CPU’s PCIe lanes to be split in a different way; it isn’t bifurcation (a feature you can set up), nor is it using PCIe switches/repeaters.

This is similar to what other Chinese mini PCs like the CWWK x86-P5 do (here in bare form; there is also a version inside a case): they use the same hardware but don’t seem to expose the IBECC feature.