A cost-effective Intel W680 ECC server: repurposing an HP Z2 G9 motherboard

(Apologies for the URLs being formatted like code in this post; I am getting an error when linking normally.)

I was wondering whether anyone had tried to repurpose a W680-based workstation motherboard from an HP Z2 G9 Tower, Lenovo ThinkStation P3 Tower, or Dell Precision 3660. I got no response, so I decided to go ahead and do it myself.

I managed to buy a second-hand HP Z2 G9 W680 motherboard for less than €100, and over the last 3 months I have gained enough experience with it to share with the community. Considering that all other W680 motherboards cost upwards of USD 500 at this point, this turns out to be a fairly attractive way to get a modern Intel system with low idle power consumption, PCIe Gen 5, and, most importantly, ECC memory support. Interestingly, I saw some recent eBay listings that sold those motherboards for as little as USD 20-40!

The HP Z2 G9 system spec is here, motherboard details on page 17: https://h20195.www2.hp.com/v2/getpdf.aspx/c08109687.pdf
The spec of the motherboard itself can be looked up in the Service Manual here, page 51: https://kaas.hpcloud.hp.com/pdf-public/pdf_5451169_en-US-1.pdf

I originally had it running with a 12600K and have since upgraded to a 13600K (more on that later).

There are a few challenges running it, compared to your run-of-the-mill products:

1. PSU

HP, Dell, and Lenovo all use proprietary PSUs, which at least in HP's case actually conform to ATX12VO. That means the PSU delivers only 12V to the board, which then generates the 5V supply itself. It also means that off-the-shelf PSU options are very limited, especially because the board expects +12VSB (12V standby voltage) rather than the standard 5V standby. There are workarounds available (search for them), but I resorted to using an HP power supply.

For the record, while the HP Z2 G9 can be configured with 3 different PSUs, you can also use PSUs from other systems. Notably, the PSU must have a 7-pin PWRCMD connector, as opposed to a 6-pin one. That alone is not enough, however, as not all PSUs have enough cables wired to that connector. My conclusion at this point is that any PSU with black cabling and a 7-pin PWRCMD connector will work with this system. I am using an HP G5 450W PSU, which has Gold certification (Platinum ones are also available).

Also worth noting is that there isn’t anything proprietary about those PSUs outside of the connectors: all of the cabling is standard ATX. In fact, before they switched to using all-black cabling, all of the cable colors used matched the ATX standard.

2. Fans

Another proprietary solution is HP's fan connectors. They are compatible with regular PWM fans, but the HP CPU fan comes with 5 pins instead of the regular 4. Pins 1-4 are the same as in the standard; pin 5 is crucial to get the system to NOT report a fan incompatibility upon POST.

HP's fans come in 65W and 125W TDP versions. What I have found so far is that to emulate the 125W fan, you need to connect pins 3 and 5. Shorting pins 3, 4, and 5 together stops the system from reporting an error when the fan is missing entirely. More on that later.

The system will also report a missing chassis fan, CHFAN2 (P9 in manual). This one is a regular, 4-pin one. The solution here is to short pins 3 and 4.

The problem, unfortunately, is that HP's fan implementation is not exposed to the operating system, so lm-sensors is not able to control the fans. That wouldn't be so bad if not for the fact that HP ties the fan speed to CPU usage, not temperature. And even that wouldn't be a problem if the speed weren't tied to the usage of a single core, which results in the fans blasting as soon as the system sees even modest utilization, around 3-5%. See this thread on the HP forums for details: https://h30434.www3.hp.com/t5/Business-PCs-Workstations-and-Point-of-Sale-Systems/HP-Z2-G9-CPU-Fan-starts-spinning-at-full-speed-even-if-only/m-p/8935953#M44658

With that in mind, you need to resort to an external fan controller to take control back. You'll also need to short pins 3, 4, and 5, as explained above, to avoid the POST complaints. Some recommended options are Corsair and NZXT controllers, which luckily come with Linux kernel support and can be controlled with liquidctl. More here: https://github.com/liquidctl/liquidctl?tab=readme-ov-file#supported-devices
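
Once the fans hang off an external controller, a temperature-based curve replaces HP's usage-based behaviour. Below is a minimal sketch of such a curve in Python. The curve points and the "fan1" channel name are assumptions for illustration (channel names depend on the controller), and the sketch only builds the `liquidctl set <channel> speed <percentage>` command rather than running it.

```python
from bisect import bisect_right

# Hypothetical curve: (CPU temperature in °C, fan duty in %). Tune to taste.
CURVE = [(40, 25), (60, 40), (75, 70), (85, 100)]


def duty_for(temp_c, curve=CURVE):
    """Linearly interpolate the fan duty (%) for a given temperature."""
    temps = [t for t, _ in curve]
    if temp_c <= temps[0]:
        return curve[0][1]
    if temp_c >= temps[-1]:
        return curve[-1][1]
    # Find the curve segment that contains temp_c.
    i = bisect_right(temps, temp_c)
    (t0, d0), (t1, d1) = curve[i - 1], curve[i]
    return round(d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0))


def liquidctl_command(duty, channel="fan1"):
    """Build (but do not run) the liquidctl call for a fixed duty.

    "fan1" is an assumed channel name; check what your controller exposes.
    """
    return ["liquidctl", "set", channel, "speed", str(duty)]


if __name__ == "__main__":
    for temp in (35, 60, 80, 90):
        print(temp, "->", " ".join(liquidctl_command(duty_for(temp))))
```

The temperature itself would come from whatever sensor source you trust; the sketch deliberately leaves that part out.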

3. Motherboard and its non-standard size

All of these HP/Dell/Lenovo motherboards come in non-standard sizes. They are Micro-ATX in length, but exceptionally wide. This has both pros and cons:

  • You most likely can’t use them in regular ATX cases. I don’t have a problem with that, since I have it installed in custom IKEA shelving.

  • Excellent IOMMU grouping. Pretty much every single device has its own group. No ACS overrides needed. No issues in passing GPU, Mellanox SRIOV VFs, or SATA controller to the VMs.

  • 3 x NVMe PCIe4 x4 slots + 1 x M.2 PCIe3 x1

  • HP Flex IOv2 connector. That means you can get an extra 10Gbit NIC or install a relatively inexpensive HP 3UU05AA Thunderbolt 3 extension card (which takes up PCI slot 4 as well).

  • I219-LM Gigabit Ethernet with vPro support.

  • Extra USB connectors on the other end of the board

  • Only 4 PCI slots. Slot 1 is PCIe5 x16. The rest are PCIe3 x1, x4, x4. A 2.5-width GPU card will leave you with a single PCI slot 4 free — unless you resort to some creative PCI ribbon risers.

  • 4 SATA ports. You need a “P160 HP cable” to power the disks, as the PSU, being ATX12VO, does not come with SATA power cabling.

  • There’s an onboard USB header, used to connect the Media Card reader, but it is non-standard, so you can’t use it to connect the Corsair/NZXT fan controller.
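
The IOMMU grouping mentioned in the list above can be checked on any Linux system by walking sysfs. This is a minimal sketch, assuming the IOMMU is enabled in firmware and in the kernel so that `/sys/kernel/iommu_groups` is populated; the function name is my own.

```python
from pathlib import Path


def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group number: sorted list of PCI addresses} from sysfs."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        # IOMMU disabled or not supported: nothing to report.
        return groups
    for group in base.iterdir():
        devices = group / "devices"
        if group.name.isdigit() and devices.is_dir():
            groups[int(group.name)] = sorted(d.name for d in devices.iterdir())
    return groups


if __name__ == "__main__":
    for number, devices in sorted(iommu_groups().items()):
        flag = "" if len(devices) == 1 else "  <- shared group"
        print(f"group {number:3d}: {', '.join(devices)}{flag}")
```

A device that sits alone in its group can be passed to a VM without dragging its neighbours along, which is what makes ACS overrides unnecessary here.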

4. Issues running it

13th gen “PCA not fully compatible” error

After upgrading to the 13th gen CPU, I am seeing a “PCA not fully compatible” error. I have no clue what’s causing this. The fan is an unlikely culprit, since you can disconnect it altogether and the SKU error still shows, followed by the “missing fan” error. The PSU is also unlikely, since AFAIK there’s no way for the motherboard to know the model of the PSU connected. There is another revision of the motherboard, which is said to have been released to fix some 13th/14th gen compatibility issues, but HP themselves have not confirmed that, and all of the posts on their forums asking for help with the very same issue were left with no response from HP.

Unfortunately, the system will not boot past this error unless you acknowledge it with a key press. I did not notice any other compatibility issues with the 13th gen compared to the 12th gen. It goes without saying that the firmware is up to date.

Power consumption

With no external devices connected and the system idling, with 64GB (2 x 32GB) of ECC RAM and 2 x Lexar NM710 1TB SSDs, the power consumption measured at the wall is 4.5-6W, which is astonishing (with only a 1W difference between the 12600K and the 13600K). It could potentially get even a bit better with a Platinum PSU. AMD clearly cannot compete here at all. For the record, the system would reach C10 states with the 12600K and C8 states with the 13600K. I am not sure whether the latter is expected; probably not.

Unfortunately, some serious problems start as soon as I connect PCIe devices. Specifically, with the AMD 6600 XT GPU connected to slot 1, the system will no longer enter deeper C-states, stopping at C2. I believe this is because the PCIe5 slot 1 is connected directly to the CPU, which apparently limits the C-states to C2, as explained here: https://mattgadient.com/7-watts-idle-on-intel-12th-13th-gen-the-foundation-for-building-a-low-power-server-nas/. So while the GPU is very efficient, idling at only 4W, the system’s consumption jumps to 40-55W. Combined with my Mellanox 4 LX, some USB devices, and 4 x 2.5" idling HDDs, the total idle consumption is 55-70W. This is, unfortunately, on par with, if not more than, a similarly configured idling AM5 system built from regular, off-the-shelf components.
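
To put the idle figures above in perspective for a machine that runs around the clock, here is a small back-of-the-envelope calculation. It only converts the wattages quoted in this post into yearly energy and assumes continuous idling; no electricity price is assumed.

```python
HOURS_PER_YEAR = 24 * 365


def annual_kwh(idle_watts):
    """Energy used in a year of continuous idling, in kWh."""
    return idle_watts * HOURS_PER_YEAR / 1000


# Idle figures measured at the wall, taken from this post: (low, high) in watts.
SCENARIOS = {
    "no add-in cards": (4.5, 6),
    "with GPU in slot 1": (40, 55),
    "with GPU, NIC, USB devices, HDDs": (55, 70),
}

for name, (low, high) in SCENARIOS.items():
    print(f"{name}: {annual_kwh(low):.0f}-{annual_kwh(high):.0f} kWh/year")
```

That works out to roughly 39-53 kWh per year for the bare system versus 482-613 kWh per year for the fully populated one, which is why losing the deep C-states matters so much for an always-on server.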

It’s worth noting that all of the devices have ASPM enabled, as reported by lspci. I am left without ideas here. I asked on the HP forums, but I doubt I will get any help: https://h30434.www3.hp.com/t5/Desktop-Hardware-and-Upgrade-Questions/Z2-G9-unable-to-reach-C-state-above-2-with-GPU-installed-in/m-p/9042319

Notably, HP’s specs suggest that a similarly specced system, an i9 12900 with an NVIDIA GPU, should idle at 22-24W in Windows. This is despite NVIDIA GPUs actually being worse than AMD ones at idle. I have not tested the system with Windows yet. It would be interesting to see whether I can confirm those numbers, especially in light of the suspicion that occupying the CPU-attached PCIe slot enforces the C2 limit.
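
For anyone wanting to repeat the ASPM check, the sketch below pulls the per-device ASPM setting out of `lspci -vv` output. It assumes the usual `LnkCtl: ASPM ...;` line format printed by pciutils and that lspci was run as root (otherwise the capability lines are hidden). The sample text is made up for illustration and is not output from this system.

```python
import re

# Illustrative sample only; not captured from the board discussed here.
SAMPLE = """\
01:00.0 Ethernet controller: Example NIC
\t\tLnkCtl:\tASPM L1 Enabled; RCB 64 bytes, LnkDisable- CommClk+
\t\tLnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
03:00.0 VGA compatible controller: Example GPU
\t\tLnkCtl:\tASPM Disabled; RCB 64 bytes, LnkDisable- CommClk+
"""


def aspm_status(lspci_vv_text):
    """Map each PCI device address to its LnkCtl ASPM setting."""
    status = {}
    device = None
    for line in lspci_vv_text.splitlines():
        # Device header lines start in column 0; detail lines are indented.
        if line and not line[0].isspace():
            device = line.split()[0]
        match = re.search(r"LnkCtl:\s*ASPM ([^;]+);", line)
        if match and device:
            status[device] = match.group(1).strip()
    return status


if __name__ == "__main__":
    for address, setting in aspm_status(SAMPLE).items():
        print(address, setting)
```

To use it for real, replace `SAMPLE` with the text captured from `sudo lspci -vv`. Keep in mind that ASPM being enabled on every link is necessary for deep package C-states but, as this post shows, not sufficient.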

RAM issues with Intel X710 NIC

Another weird issue happens with an Intel X710 installed in PCI slot 4 in place of the Mellanox. I figured I could reduce the overall power consumption by using the Intel NIC, which is reported to be more power efficient. Unfortunately, with the Intel NIC installed, the system POSTs with half of the memory, i.e. 32GB in my case. It is a very odd issue, and I would need someone to try to replicate it; otherwise it may sound like the motherboard is somehow broken. That said, it’s interesting that I do not get this issue with the Mellanox installed.

Turbo speeds

I am convinced the motherboard limits the socket to 125W and, as a result, the CPU will hardly ever get close to the Turbo speeds, even if the thermals permit it. This is with the High-Performance Mode enabled in BIOS.
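
One way to test that suspicion on Linux is to read the package power limits that the firmware programmed, via the powercap sysfs interface. This is a minimal sketch, assuming the intel_rapl powercap driver is loaded and that the package zone is `intel-rapl:0`; the function name is my own, and the board may enforce other limits that do not show up here.

```python
from pathlib import Path


def rapl_limits(zone="/sys/class/powercap/intel-rapl:0"):
    """Return {constraint name: limit in watts} for one RAPL zone."""
    limits = {}
    base = Path(zone)
    if not base.is_dir():
        # Driver not loaded or zone missing: nothing to report.
        return limits
    for name_file in sorted(base.glob("constraint_*_name")):
        limit_file = base / name_file.name.replace("_name", "_power_limit_uw")
        try:
            name = name_file.read_text().strip()
            # sysfs reports microwatts; convert to watts.
            limits[name] = int(limit_file.read_text()) / 1_000_000
        except (OSError, ValueError):
            continue
    return limits


if __name__ == "__main__":
    for name, watts in rapl_limits().items():
        print(f"{name}: {watts:.0f} W")
```

If the `long_term` and `short_term` constraints both come back at 125W, that would be consistent with the behaviour described above.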

Undervolting/Overvolting

This being a proprietary system means you cannot undervolt the CPU, which could otherwise reduce the power consumption significantly, as is often the case with Intel CPUs.

Summary

All things considered, whether this is a viable platform depends on whether the issues above are fixable. Having to manually press “Enter” to boot with a 13th gen CPU can be worked around by sticking with a 12th gen CPU. However, if you want to use it with extra PCIe devices, you need to consider the idle power consumption issues, which make this system lose its advantages over the AMD AM5 platform. Other than that, the system has been very stable, not having crashed once in the last 3 months.

All things considered, if you can get ahold of an off-the-shelf W680 motherboard at a reasonable price, I wouldn’t bother getting your hands dirty with this approach.


That is a very impressive idle power consumption when it can hit those high C-states

ASUS has an ATX IoT/Industrial motherboard with R680E for ~250, not exactly a screaming deal but not too terribly bad for getting PCIe 5.0 and DDR5 ECC support

That’s the one I initially wanted to buy, but for some reason it costs at least twice as much in the EU, compared to the US. And, as of recently, the price in the US is also over 300.

EDIT: Oh, sorry, thought you meant W680-based Asus WS W680-ACE

This was the one I was thinking of:

Now that I’m looking at some of the retailer prices for it, it’s not that much less expensive than the “full” W680 board, the Pro WS W680-ACE, which has the extra ability to overclock memory compared to the R680E.

Well, not only the overclock ability, but pretty sure R680E disables undervolting, too, which Asus W680 does come with.


ahh good point

This is an excellent resource.

I think it’s important to repurpose these cast offs so they don’t become e-waste.

Thank you for the write up!
