Recently I upgraded my home server running TrueNAS Core to save power. As you might have heard, power prices have gone up quite a bit in Europe. Unfortunately, my upgrade had the opposite effect.
I upgraded to the motherboard mentioned above and updated the BIOS to 3.90. For a CPU I chose the Ryzen 5 PRO 4650G for its ECC support. I’m also running 64 GB of ECC unbuffered RAM.
The system draws between the low 80s and 97 W (total system power, which fluctuates a lot). I’ve basically tried everything I know to bring the power consumption down, but nothing has had any effect on the total system draw.
What I tried so far:
Checked load-line calibration and confirmed it is running at level 5, which should result in less voltage overshoot on frequency changes.
I enabled ACPI power states (I may update the specific settings later).
I changed the socket power from Auto to 64 W and even to 35 W (CBS → NBIO) without any change in total power consumption. This is the most confusing thing so far.
I ran powerd to force the CPU into adaptive mode and keep the idle clocks down (powerd -v -a adaptive -b adaptive -i 80). It kept the clocks down, but not the power consumption: according to powerd the CPU clocked down to 1700 MHz and even 1400 MHz, again with no measurable change in power draw.
I also set the target P-state to 3 instead of 0. As far as I know, P-state 0 is the base clock and the higher-numbered P-states are lower-clock states.
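The clocks powerd reports can be cross-checked with FreeBSD’s own sysctls, which also show whether the deeper C-states are actually being used. A quick sketch for TrueNAS Core (these are the stock FreeBSD sysctl names; nothing here is specific to my board):

```shell
# Current CPU frequency and the frequency levels the driver knows about
sysctl dev.cpu.0.freq
sysctl dev.cpu.0.freq_levels

# Which C-states the CPU actually spends time in at idle
sysctl dev.cpu.0.cx_usage

# Allow the deepest supported C-state (persist via /etc/sysctl.conf)
sysctl hw.acpi.cpu.cx_lowest=Cmax
```

If cx_usage shows almost all residency in C1, the package never reaches its deeper sleep states, which would fit clocks dropping without the wall power following.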
Tomorrow I’m gonna swap the six hard drives for SSDs. I hope to save between 18 and 30 W with that.
I didn’t try a voltage offset. I heard that the motherboard will just add voltage back to deliver stable clocks; I might be wrong on that. Ideally, the system would clock down at idle to consume less power and ramp up under load.
I’m gonna update the thread with more settings that I’ve tested, including the exact paths in the BIOS.
If you have any suggestions please let me know and I will test them as soon as I can.
BTW I’m hosting some virtual machines on that server and I hope to run even more in the future. That’s why I would prefer not to underclock the CPU.
Aside from the motherboard and CPU, I also changed the power supply (a Seasonic something 80+ Gold) and swapped one RAM module from 8 GB to 16 GB (all are now 16 GB modules running at 1.2 V). Everything else (including the CPU cooler and fans) is the same.
I don’t think these changes have a big impact on power consumption.
So here is the list of BIOS settings related to CPU power. I probably messed something up.
Overclock Mode (Bus Speed): Auto
CPU Frequency and Voltage Change (VID): Auto
SOC/Uncore OC Mode: Disabled
CLDO VDDP Voltage: Auto
CLDO VDDP CCD Voltage: Auto
External Voltage Settings and Load Line Calibration:
CPU Core Voltage: Auto
CPU Load Line Calibration: Auto (which is level 5)
CPU VDDCR_SOC Voltage: Auto
CPU VDDCR_SOC Load Line Calibration: Auto (also level 5)
VPPM: Auto (2.5 V)
CPU VDD 1.8 Voltage: Auto (1.8V)
PSS Support: Enabled
PPC Adjustment: PState2
Deep Sleep: Enabled in S5
ACP Power Gating: Enabled
Adjust Vddcr Vddfull Mode: Auto
Adjust Vddcr Socfull Mode: Auto
CPU Core Count Control: Auto
CPU Common Options:
Core Performance Boost: Auto
Global C-State Control: Enabled
Custom PState0: Auto
NBIO Common Options:
SMU Common Options:
System Configuration AM4: Auto (was 35W prior to this post)
Default clocks are 3753 MHz at 1.218 V.
I also noticed that powerd by default shows a wanted frequency of 7400 MHz. This also happens in adaptive mode (except when I raise the idle load threshold above 50% in powerd). I guess powerd really wants to push the CPU to its limits.
Very interesting indeed, especially the motherboard and PCIe devices part. I will test some things next weekend.
Today I installed Windows and fiddled with Ryzen Master. In Ryzen Master the setting for CO (I assume Curve Optimizer) shows NaN. However, Ryzen Master reports SOC power consumption at 1–5 W and CPU frequencies between 200 and 600 MHz at idle. My power meter for whole-system consumption basically stayed the same. So I conclude that quite a bit of power goes to the motherboard and PSU. Also, powerd seems to misreport, and I have to check with other tools.
So the next things that I’ll try are:
Disable every unneeded device in the BIOS
Swap the M.2 SSD into another slot (it prevents me from entering the BIOS anyway)
I’m sorry I wasn’t able to do further testing last weekend. However, since the power consumption in Windows with Ryzen Master was basically identical, I don’t think the problem is with my BIOS settings or TrueNAS.
Therefore I conclude that the power consumption is mainly caused by the motherboard and power supply. Server mATX motherboards, e.g. from ASRock, are too expensive for me. The power supply, on the other hand, could yield some improvement. Currently it’s a Seasonic SSP-400ES2.
I assume the power supply has poor efficiency at the current load, so I will do some tests to verify that. My plan is to connect a bunch of fans to a lab power supply; I can measure how many amps the fans draw and therefore how much power they use. After that I’m going to connect them (only the fans) to the PSU and measure the power drawn at the outlet. That way I can estimate how efficient the power supply is.
I did some testing. As mentioned before, I connected some fans to a lab power supply to see how many amps they draw. After that I connected them to the PSU and measured the power drawn at the wall.
With a load of 1.45 A, I measured 26 W at the wall: 12 V * 1.45 A / 26 W ≈ 0.67
With a load of 2.1 A, I measured 35 W at the wall → ≈ 0.72
At low loads, this power supply is more efficient than I expected. Maybe there is some room for improvement, but getting reliable information isn’t easy, as mentioned in the video.
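The efficiency figures above are just DC output power over wall power, assuming the fans present a pure 12 V load. A tiny sketch of the arithmetic:

```shell
# Efficiency = DC power delivered (volts * amps) / AC power measured at the wall
psu_efficiency() {
    awk -v v="$1" -v a="$2" -v w="$3" 'BEGIN { printf "%.2f\n", v * a / w }'
}

psu_efficiency 12 1.45 26   # first measurement  → 0.67
psu_efficiency 12 2.1  35   # second measurement → 0.72
```

Note this ignores the small losses inside the fans’ own drive electronics, so the true PSU efficiency is a touch higher than these numbers.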
@DerSpinner I wonder if you’ve managed to find a way through this.
What you might want to check is the status of ASPM: dmesg | grep -i aspm
And also what your PCIe devices support in terms of it: lspci -vvv
Internets say that ASPM support on some motherboards is flaky under Linux (some BIOSes don’t properly advertise it, so the kernel decides to turn it off to be on the conservative side).
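To avoid scrolling through the full `lspci -vvv` dump, the ASPM-relevant lines can be filtered per device. A small sketch (the NIC in the example input is made up for illustration):

```shell
# Print each PCIe device header followed by its ASPM-related lines
aspm_summary() {
    awk '/^[0-9a-f]/ { dev = $0 }               # remember the current device line
         /ASPM/      { print dev; print "  " $0 }'
}

# Real usage: lspci -vvv | aspm_summary
# Illustrative input (hypothetical device):
printf '%s\n' \
    '01:00.0 Ethernet controller: Example NIC' \
    '    LnkCap: Speed 5GT/s, ASPM L0s L1' \
    '    LnkCtl: ASPM Disabled' \
    | aspm_summary
```

LnkCap tells you what the device is capable of; LnkCtl tells you what the kernel actually enabled, and the two disagreeing is exactly the flaky-BIOS case described above.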
I wonder if changing the mobo got you into this issue.
This CPU seems capable of idling below 30 W on consumer ASRock boards (different chipset though). See the doc mentioned in the video linked above, line 238:
P.S: I was actually just about to pull the trigger on a setup that’s very similar to yours (4650G / X570M Pro4 / 128G of ECC UDIMMs / few SSDs plus 4-wide array under TrueNAS, eying the expansion and addition of 10GE in the future) and was researching on the power consumption and the status of ASPM support in such a setup.
80 W of idle is waaay over what I was hoping for… I wanted something like 30 W.
Keep in mind that memory modules as well as storage always use power. Most SSDs have nice power states and can switch in milliseconds, but especially with ZFS, the drives always wake up for the next TXG (every 5 seconds by default). Four sticks of 32 GB DDR4 is not a trivial number of watts when you’re talking about a 30 W target. Memory is always active, and the more capacity, the more power.
And then there are the NICs…it all adds up with Fans and PSU efficiency…2 Watts here, 5 Watts there.
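The 5-second TXG interval mentioned above is tunable if you want drives to sit in their low-power states longer, at the cost of batching async writes more coarsely (more data at risk between syncs). A hedged sketch, using the current OpenZFS tunable names:

```shell
# FreeBSD / TrueNAS Core: seconds between transaction group syncs (default 5)
sysctl vfs.zfs.txg.timeout          # inspect current value
sysctl vfs.zfs.txg.timeout=30       # stretch syncs to 30 s for this boot

# Linux / TrueNAS SCALE equivalent (OpenZFS module parameter)
cat /sys/module/zfs/parameters/zfs_txg_timeout
echo 30 > /sys/module/zfs/parameters/zfs_txg_timeout
```

This only delays periodic wakeups for idle-ish pools; any sync write or memory-pressure flush still forces a TXG early.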
Fully kitted out homeserver with drives and shit. Pretty much what I’m getting too. 120W under load, 85W idle.
@Exard3k You’re making good points! Nevertheless, I want those sweet memories. This is gonna be an all-in-one server and not a dedicated storage box, and sometimes I’m gonna spin up some pretty memory-hungry labs on there (so it’s not just to give space to ARC).
As for the ZFS specifics, the current strategy is:
to have different zpools for different stuff: one SSD-based pool for VM/container storage (and their DBs and caches), another HDD-based pool for storing Linux ISOs, and yet another (SSD-based) for the downloaders of those ISOs. All this should keep the rust from spinning most of the time. Atm I’m actually planning to even let the drives spin down.
SLOG for ZIL
(maybe) Special vdev for metadata
To cut down on the sheer number of SSDs required, and considering that the SSDs for the boot zpool, ZIL, Special vdev, and VM/container storage should have PLP, I’m gonna use Kingston DC1500M drives (or similar), which also support NVMe namespaces. I’m planning to break each one into multiple namespaces and hand each to the respective zpool. This will let me get away with just two SSDs (albeit not the most modest at idle).
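For anyone curious, carving a drive into namespaces is done with nvme-cli. A rough sketch of the flow; the sizes are placeholders, and the controller ID must be read from id-ctrl for your actual drive:

```shell
# Check namespace support: "nn" = max namespaces, "cntlid" = controller ID
nvme id-ctrl /dev/nvme0 | grep -Ei 'cntlid|^nn'

# Create a namespace (size/capacity in blocks; 97656250 * 512 B ≈ 50 GB, placeholder)
nvme create-ns /dev/nvme0 --nsze=97656250 --ncap=97656250 --flbas=0

# Attach the new namespace (id 2 here) to controller 0, then rescan
nvme attach-ns /dev/nvme0 --namespace-id=2 --controllers=0
nvme ns-rescan /dev/nvme0
```

Each attached namespace then shows up as its own block device (e.g. nvme0n2) that a zpool can consume directly.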
For the NICs, if you get one that supports ASPM, it shouldn’t be that bad actually.
For the fans, I’m planning to use an AIO (Arctic Freezer II), since the price is the same as the best-performing air coolers, but it also doubles as two intake case fans, which together with the drive-cage fan should create some positive pressure and keep the dust out. We’ll see how all of this pans out inside the Node 804, but I have high hopes.
It’s all flash now.
Yes, I am aware of the lack of ASPM support on the ConnectX. At this point I am happy about the performance. I may go back to 10G at a later stage.
No ECC RAM - the AMD 5700G doesn’t support it.
I checked dmesg, and ASPM seems to be disabled. I also checked ASPM support with lspci; it seems that not all PCIe devices support it.
Should I risk enabling it by using: pcie_aspm=force
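For reference, pcie_aspm=force is a Linux kernel command-line parameter, so it applies to SCALE rather than Core. If you go that route, a sketch of setting it (GRUB example; forcing ASPM on devices that mis-advertise it can make them drop off the bus, so test with a keyboard and monitor attached):

```shell
# In /etc/default/grub, add the parameters to the kernel command line, e.g.:
#   GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=force pcie_aspm.policy=powersave"
# then regenerate the config and reboot:
update-grub && reboot

# After reboot, check which ASPM policy is active ([brackets] mark the current one):
cat /sys/module/pcie_aspm/parameters/policy
```

The milder pcie_aspm.policy=powersave on its own (without force) is worth trying first, since it only deepens ASPM on links the kernel already considers safe.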
Currently the system is idling between 54 and 62 watts. This includes a UPS (which was included in all tests). I guess you could get lower with a more “server”-style motherboard, although the amount of memory might interfere with that. When I bought the parts, the X570 server boards from ASRock were a bit too expensive for me.
I haven’t underclocked and undervolted the GPU yet because I’m afraid of going too low and losing video output. Is it possible to go down to 50 MHz and 0.2 V?