Asrock X570d4u, x570d4u-2L2T discussion thread

Hi all, also a first-time poster here! Thanks for all the info in this thread. After following along, a little while ago I decided to go for the X570D4U-2L2T for my three-node Ceph/Kubernetes cluster. It may be useful for others to know that the following hardware combination works really well:

  • Asrock X570D4U-2L2T Motherboard
  • AMD Ryzen 5900X CPU
  • 4x Kingston KSM32ED8/32ME (Total 128G ECC)
  • 4x Seagate FireCuda 530 1TB PCIe 4.0 NVMe SSD
  • Delock PCIe 4.0 x8 to 2x x4 bifurcation card
  • 2x Samsung 870 EVO 250G SATA SSD
  • NVIDIA/Mellanox MCX631102AN-ADAT ConnectX-6 Lx (PCIe 4.0 x8 / 2x SFP28 25GbE)
  • Silverstone SST-NJ450-SXL 450W Fanless SFX PSU
  • Noctua NH-U9S CPU Cooler
  • Noctua NF-A12x25-PWM, NF-A12x15-PWM, NF-A9-PWM case fans
  • Cerberus MicroATX case

In the BIOS I configured ECO mode. The two SATA SSDs are used for boot and are configured in software RAID (not in the BIOS, but using mdadm). The bifurcation card is in the top x16 slot and the Mellanox 25GbE card is in the bottom slot, which means the top one runs at x8. If I remember right, I configured it as x4x4x8 in the BIOS. This way three NVMe drives are attached directly to the CPU and are used for Ceph, while the fourth (sharing bandwidth with other devices through the chipset) is used for temporary storage.

I remember that right after installation I had a couple of crashes. The chipset was really hot, but the hardware was not built into the case yet, so there was no cooling going on, and having read experiences here I blamed that. Since assembling everything with fans I have never had a hang or crash; things are stable.

For anyone out there running Proxmox (or, I suppose, any Debian-powered OS), a question:

Can you recommend any software for controlling the fan RPMs that lets me get the speed lower than 20 percent of full speed on some of the case fans?

I've built my server into a Supermicro CSE-835, and all my drives are low-powered SSDs.

I've got active cooling on both the CPU and VRM; they idle in the low 30s and low 40s respectively. RAM is in the mid-to-low 30s at idle as well. The VRM is the hottest thing in the system, from what I can tell, and it's got about 40 C of thermal headroom before I need to start worrying about it.

Problem: The overpowered Supermicro front intake fans are loud, even at their current 20 percent setting. I suspect these fans (which are meant to cool passively cooled Xeons, etc.) are entirely overkill for my setup at 20 percent of their max RPM. I'd like to test them at 15 percent or even 10 percent, but the IPMI's manual fan control interface won't let me go lower than 20 percent.

The person I bought this rig from was using Unraid, and set the fan speeds using an add-on Unraid module. I don't see anything like that for Proxmox, though fancontrol (which I have never used) is available in the repo.

The seller told me he used the Unraid module to set the fans to their lowest speed, which achieved an acceptably low volume for working near the server in an office room and kept everything cool enough. And he was using spinning HDDs, which would have run hotter than my setup.

Is fancontrol the way to go? If it is, is there a good tutorial somewhere?

Or should I be looking at something else?

EDIT (2022 04 23 @ 2040 US Central Time): I've just realized that I should probably be using ipmitool for this, but up-thread it looks like that's not working for some people.

Has anyone actually gotten it to work?
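
Before trying to push the fans below 20 percent, it can help to log what the BMC actually reports for each fan. Here is a minimal Python sketch, assuming `ipmitool` is installed and that `ipmitool sdr type Fan` prints pipe-separated rows like the sample below. The sensor names and readings in `SAMPLE` are made up for illustration, and the exact layout may differ between BMC firmware versions. It only reads sensors; it does not change any fan duty.

```python
import subprocess

# Hypothetical sample of `ipmitool sdr type Fan` output (names/values made up).
SAMPLE = """\
FAN1             | 41h | ok  | 29.1 | 1200 RPM
FAN2             | 42h | ok  | 29.2 | 3400 RPM
FAN3             | 43h | ns  | 29.3 | No Reading
"""

def parse_fan_rpms(text):
    """Return {sensor_name: rpm} for rows whose last field ends in 'RPM'."""
    rpms = {}
    for line in text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 5 or not fields[-1].endswith("RPM"):
            continue  # skip blank lines and sensors with no reading
        try:
            rpms[fields[0]] = int(float(fields[-1].split()[0]))
        except ValueError:
            continue  # reading was not numeric
    return rpms

def read_fan_rpms():
    """Query the local BMC; typically needs root and a loaded IPMI driver."""
    out = subprocess.run(
        ["ipmitool", "sdr", "type", "Fan"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_fan_rpms(out)

if __name__ == "__main__":
    print(parse_fan_rpms(SAMPLE))
```

Calling `read_fan_rpms()` before and after each change to the manual fan setting would show whether the BMC really applied it.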

(I hate messing with the fans. I still don't understand what the difference is between open loop and closed loop tables. I'm using the open loop for the CPU and have everything else on manual, but I have no idea if that's actually the way to go or not w/r/t open vs. closed.)

Hello there! If you were successful with this BIOS could you share the file here? Asrock support has been unresponsive to me and I am running into the same exact issues.

I'm looking forward to this one. I'm experiencing both these issues. System Inventory doesn't work, and I can't access the BIOS at all without going through the virtual remote display thing.

I need to reach out to them about the power management. The board can't see my dual PSUs at all. They're defaulting to both being active, with an uneven load split between them, and an alarm goes off if I try to run it with one PSU only.

New Beta BIOS for X570D4U-2L2T:

There's a new "Beta Zone" tab in the downloads section for this board.
https://www.asrockrack.com/general/productdetail.asp?Model=X570D4U-2L2T#Download
Not sure if it's there for the D4U, but I'd suspect so, since it's the 1.5x beta BIOS for Ryzen 5800X3D support. @Marco1, this beta BIOS may or may not address your issues. I haven't found release notes for it yet.

No BMC update.

The latest release of the BIOS is 1.40. I'm optimistic that the released 1.5x version will also include updates to the System Inventory/BIOS access from within the IPMI.

I still need to ask them why the board can't talk to my dual PSUs' power management controller thing.

Unrelated: I've also asked support what the maximum sustained (24/7) wattage load is on a single fan header. I'm running almost all my fans in manual mode, so I'm a bit concerned about that. I'll report back when I hear something.

I contacted ASRock Rack about this issue but it was ignored.
Is there anyone who can solve this problem?

This has to do with whether you're booting up in Legacy BIOS mode or UEFI mode.

For example, some of the older RAID cards expect legacy BIOS mode, and the keyboard shortcut won't work when you're in UEFI mode.

Proxmox needs me to be in UEFI mode, so I can't directly access the RAID setup utility.

Options:

  1. Switch to Legacy BIOS, try to config at startup, and switch back.
  2. Boot into Windows and use a software config tool there, if possible.
  3. Stick the card in another computer that's running legacy BIOS and config it there.

I know there are some people in this thread that upgraded their CPU from a 3000 series to a 5000 series.

I've got a Ryzen 9 5900X on my desk, ready to swap in to replace my Ryzen 7 3700X, without experiencing any sort of catastrophic explosion. :wink:

What's the recommended way to do this to avoid any issues and make sure the server turns back on when I'm done?

I've never swapped a processor on a set-up system before.

Proxmox is already installed, if that matters.

EDIT: Well, that all seems to have worked. I forgot to reset the BIOS first, so the first time around it didn't see half my RAM, but after a couple of BIOS resets it's back to normal. It also took my minor RAM overclock (DDR4-3200) fine, though I still need to run a memory test to make sure all is well.

Something changed with my LSI HBA cards, though. Both of them are still recognized by Proxmox, which still sees all the drives plugged into them, but only ONE of them is listed when the system boots (the part that lists all the detected HBAs and drives and prompts you to enter setup).

Aside from that, they seem to still work as they did before, but I don't like not understanding what happened.

Re: Precision Boost Overdrive:

For power consumption reasons, I'm considering disabling this. I went and found the settings page in the BIOS, but I don't understand what it's showing me. It's not disabled, or set to Auto, but rather Advanced.

It looks like this:

I haven't set any of this, and those settings are there after two Reset to Defaults commands.

Does anyone else's look like this? Is it some sort of optimized Asrock Rack settings thing?

Can't say anything about swapping, just don't try to hot-plug the CPU :smiley: . I don't think there will be a problem. But check the BIOS in any case. I never trusted BIOS "auto detect". The 5900X runs fine with both the board and Proxmox.

I sometimes see weird SATA BIOS detection too, but in practice, everything's running just fine.

I never touched the overclocking stuff other than 1600MHz FCLK+memory and ECO mode. I have to maintain my sub-100W power target :slight_smile:


Thanks, @Exard3k. Reassurance is always appreciated; I have no idea what I'm doing.

Running Prime95 on a Windows boot disk now.
CPU holding steady at 69 C - 70 C in the IPMI sensors window, even with the tower cooler installed and blowing air left to right instead of front to back. (I've got the proper mounts on order to fix this, but.)

In Windows, Hwinfo isn't detecting the RAM sticks properly. Proxmox knows there's 128 GB. Windows 10 knows there's 128 GB. Hwinfo knows there's 128 GB.

But Hwinfo doesn't actually show any memory sticks in the dropdown menu where you can go to look at detailed info.

The BIOS, OTOH, does.

My 3700X was rock solid as far as things being detected properly, so I'm a bit annoyed because I don't know if this is an actual config problem or some sort of harmless bug in the BIOS or Hwinfo.

EDIT: Okay. Now I'm concerned. Memtest86+ only saw 64 GB of RAM, even though the BIOS sees it all, and so do Windows and Manjaro (Arch) Linux.

Help?

EDIT 2: So... I'm not in EFI mode. What.

EDIT 3: Okay, I switched to another ISO to run Memtest86, one that supports the UEFI version of the test software.

With the Ryzen 9 5900X installed, attempting to run Memtest86+:

  1. When RAM is set to 3200MT/s, I immediately get ECC errors.
  2. I took it down to 2933MT/s, no more errors.
  3. But: It can't pull the SPD data from the RAM, so it can't display data like RAM temperature.
  4. Additionally, Memtest86+ only found 16 CPUs, of which it's using 8. Unless that's a hard limit of the free version, something's wrong there.

The Ryzen 3700X would run the test for 24 hours at 3200MT/s.

tl;dr things have been less stable and more weird since installing the Ryzen 5900X. I'm kind of hating it right now.

EDIT 4: Grabbed the paid version of Memtest86.

  1. It detects the proper number of cores/threads.
  2. It still can't read the RAM temp or other info from the RAM.
  3. The BIOS reports the CPU temp is about 10 C hotter than what Memtest86+ reports. Nice to know what's going on. lol.
  4. I'm going to let it run overnight, but don't expect any errors.

@Exard3k, where do I go to adjust the timings themselves? You mentioned a while ago that you were using the JEDEC standard ones to good effect. Also, do you see the same errors collecting SPD data from yours if you run Memtest86+? (Free or paid are both the same on this...)

The CL22 timings are the intended timings for this RAM according to the spec sheet. If the BIOS selected anything else, that's bonkers.

Wait, wait, wait. Is that 100w for the whole system, or just the CPU? :stuck_out_tongue:

What are you doing currently to limit power while maintaining performance? I remember discussions in this thread several months ago about the 5900X specifically, but it's been long enough that I'm curious what you ended up going with long term.

So far, all I've done is set ECO mode.
I'm really not super comfortable trying to tweak anything else without someone else's input. I don't understand a lot of this yet...

Idle/low load power for the whole system. I get ~130W at the wall with everything running, from 10GbE and HDDs to CPU doing heavy lifting. But most of the time, the system runs at 80-90W. Right now I'm using a GPU in the x16 slot because I need a gaming-capable VM until I can get myself a new daily driver (probably AM5), which increases power draw by quite a bit (140-250W). But that's just for a couple of months.

ECO mode. Limits PPT (package power tracking) to 88W maximum instead of 142W. Really is free real estate unless you need that last 10-15% of single-thread performance or all-core scientific compute.

And I'm running my memory at the default 2667 MT/s. I don't see improvements from 3200 MT/s in my usage, as my VMs and servers don't benefit from higher bandwidth.
I get MCE memory errors while using 3200, but these have a pattern and a fixed duration between them. All are 1-bit correctable ones. This seems to be a BIOS/firmware thing more than CPU or memory, as I've seen other people experiencing this as well. May be exclusive to Vermeer chips. I just stick to 2667, as everything is just fine and it also uses a bit less power.
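
To put the 2667 vs. 3200 choice in numbers, here is a rough sketch of theoretical peak memory bandwidth. It assumes dual-channel DDR4 with a 64-bit (8-byte) data bus per channel; real-world throughput is lower, and the function name is only illustrative.

```python
def peak_bandwidth_gbs(mt_per_s, channels=2, bus_bytes=8):
    """Theoretical peak in GB/s: transfers/s x bytes per transfer x channels."""
    return mt_per_s * 1e6 * bus_bytes * channels / 1e9

for speed in (2667, 2933, 3200):
    print(f"DDR4-{speed}: {peak_bandwidth_gbs(speed):.1f} GB/s")

# Relative gain of 3200 over 2667 on paper
gain = peak_bandwidth_gbs(3200) / peak_bandwidth_gbs(2667) - 1
print(f"3200 vs 2667: +{gain:.0%}")
```

On paper that is roughly a 20 percent difference, which only matters for workloads that are actually bandwidth-bound.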


Cool. I'll have to check idle power when I'm sitting in (almost completely unconfigured) Proxmox; that's as close to idle as I know how to get right now. Oddly enough, the CPU runs hotter in the BIOS than it does when idling in Proxmox.

I've got it in ECO Mode now, which I'm coming to understand is mostly just a preset for Precision Boost Overdrive. (I love how all these things tie together and depend on each other and none of this is explained in the manual.)

Where do you go in the BIOS to see the actual timings (22-22-22)? I want to make sure mine is set up right and didn't auto set to something strange.

Have you considered locking it to its base clock? Or undervolting? The wizards of Ryzen 5000 power usage seem to swear by undervolting, but it is confusing and scary. :stuck_out_tongue:

I'm in hour 12 of a Memtest86+ run (pass 4), and right now it's drawing ... 239.7 watts (2 amps). That seems ... high, but I don't know how much load it's actually putting the CPU under; it's using 12 of the 24 available threads. Temp is stable at 55 C, though, so at least there's that.

EDIT:
It just occurred to me to pull all the front hot swap drives (except the pair of VM store SSDs; I need that to be considered part of base power consumption).

  • I'm running 16 SAS2 2.5" SSDs (Sandisk Olympus from 2014).
  • Um. Wow. With all the drives off, and Memtest86+ still running, power consumption is now stable at 167 W.
  • So: My SSDs, at idle, use a total of 73 W (4.56 W each). I'm going to break out a separate thread for this, but that is about 4.5x more than what I expected for each of them.
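
The arithmetic behind that estimate, as a quick sketch using only the two wall readings quoted above (the variable names are just for illustration, and the difference lumps in any PSU and backplane losses along with the drives themselves):

```python
total_with_drives_w = 239.7   # wall reading with all 16 SSDs inserted
total_without_w = 167.0       # wall reading with the 16 SSDs pulled
n_drives = 16

drives_w = total_with_drives_w - total_without_w
per_drive_w = drives_w / n_drives
print(f"All drives: {drives_w:.1f} W, per drive: {per_drive_w:.2f} W")
```

The unrounded difference is 72.7 W, or about 4.54 W per drive; rounding the total up to 73 W first gives the 4.56 W figure.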

(Yes, this is hotter than the Ryzen 7 3700X was, but I think that's unavoidable. I also need to fix the orientation of the Noctua cooler I have in there so it's not at a right angle to case airflow. That might help a bit.)

Do you have access to Memtest86+? If so, would you mind checking to see if you can read the SPD data? It should be available if you ask for information on the RAM modules. Right now, it can't read mine.

I'm on a business trip and only have access to a laptop right now. Proxmox and IPMI are inaccessible from WAN. But a full Memtest86 run was planned anyway.

The only thing I've done is change the CPU governor in Proxmox. I tested powersave, which sets clocks to 2200MHz, but I always end up with conservative on all my systems :slight_smile:
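
For anyone who wants to try the same thing, here is a minimal sketch of reading and switching the governor through the standard Linux cpufreq sysfs files. It assumes the kernel exposes `/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor` (which governors are offered depends on the cpufreq driver in use), and the function names are hypothetical. Writing needs root, and the setting does not persist across reboots.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")

def governor_files(root=CPU_ROOT):
    """Return the scaling_governor file of every CPU that exposes cpufreq."""
    return sorted(root.glob("cpu[0-9]*/cpufreq/scaling_governor"))

def current_governors(root=CPU_ROOT):
    """Map 'cpuN' -> current governor name."""
    return {p.parent.parent.name: p.read_text().strip()
            for p in governor_files(root)}

def set_governor(name, root=CPU_ROOT):
    """Write the governor to every CPU (needs root on a real system)."""
    for p in governor_files(root):
        available = p.with_name("scaling_available_governors")
        if available.exists() and name not in available.read_text().split():
            raise ValueError(f"{name!r} not offered for {p.parent.parent.name}")
        p.write_text(name)

if __name__ == "__main__":
    print(current_governors())
```

For example, `set_governor("conservative")` followed by `current_governors()` should show the new value on every core if the driver offers that governor.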

AMD Overclocking -> DRAM timings and frequency -> the timings become visible, along with the frequency option, after you change Auto to Enabled. I tested a bit, but didn't really do much, so I stuck with Auto. I haven't touched the BIOS in two months or so, and that was only to make the changes necessary after I installed the GPU.


Thanks. Hope your trip goes well. Looking forward to your test results. I really appreciate you taking the time to run and share them.

I'm almost 15 hours into a Memtest86 run, on pass 4, test 12, and just saw my first ECC error with it clocked to 2933 MT/s.

I'm paranoid I somehow bent a pin on the CPU, but that doesn't make any sense. It should be failing more consistently if I damaged something.

I don't really understand why the 5900X is so much less stable than the 3700X w/r/t memory speeds, but once the test is done, I'm going to run it again with the default DRAM timings and frequency. If that doesn't work, I'm giving up and going back to the 3700X.

That is why we use ECC. Errors happen and we pay for things to be automatically detected and corrected. If it isn't periodic and persistent I see no problem, quite the opposite.

Probably some deviation on the firmware between Matisse/Renoir and Vermeer architecture.


Update: The test passed with no ECC errors at 2666 MT/s.

I don't really think that, for my use case, I'm going to see a night-and-day difference (or, likely, even a measurable difference) in normal operations from jumping from 2666 to 3200. Like you did, I think I'll just leave it at the default timings.

Thanks for the reassurance re: the ECC error.

I'm curious if you're seeing the SPD errors as well, that is, the inability of Memtest86 to pull any SPD data from the RAM at all, including temps. What's odd is that Windows and the BIOS have no issues getting these temps, but they could be reading from different sensors. I don't know.

Update re: Official Fan Header Wattage Information for X570d4u and X570d4u-2l2t

Asrock Rack US tech support got me the following data from their engineering team in Taiwan.


Current Rate: 4 amp
Dielectric voltage: 250V AC for one minute
Insulation Resistance: 1000 M-Ohms (?) per minute (?)
Contact Resistance: 20 m-Ohms (?) max
Operation Temperature: -25 C to 85 C

I will admit I don't fully understand all of this. I'm not sure what a dielectric voltage is. I think it is referring to this: Dielectric Strength - an overview | ScienceDirect Topics. (So, the amount of voltage needed to cause "dielectric breakdown"?)

If I'm understanding the "current rate" correctly, it means that one 12V fan header on the board should be able to handle up to 4 amps, right?

  • So, 12V at 4A = 48 watts per header?
  • Then, if I've got 3x0.58A 12V fans, that means their total max power is (12V x 0.58A x 3 fans = 20.88 watts).
  • Conclusion: One header with a splitter cable rated for enough watts should be fine and not catch on fire.

Does this seem reasonable?

EDIT (2022 06 10): I've confirmed with Asrock Rack that it is 4A per header.
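
The header budget above can be checked with a few lines of Python. This is only a sketch: it takes the 4 A per-header rating and the 0.58 A fan rating from this post at face value, treats the fan figure as its maximum draw (an assumption), and the function name is made up.

```python
HEADER_VOLTS = 12.0
HEADER_MAX_AMPS = 4.0   # per-header rating quoted by Asrock Rack support

def header_budget(fan_amps):
    """Return (total_amps, total_watts, fits) for fans sharing one header."""
    total_amps = sum(fan_amps)
    total_watts = HEADER_VOLTS * total_amps
    return total_amps, total_watts, total_amps <= HEADER_MAX_AMPS

amps, watts, fits = header_budget([0.58, 0.58, 0.58])
limit_w = HEADER_VOLTS * HEADER_MAX_AMPS
print(f"{amps:.2f} A, {watts:.2f} W of {limit_w:.0f} W, fits: {fits}")
```

Three such fans come to 1.74 A and 20.88 W against a 48 W (12 V x 4 A) header limit, so there is a fair amount of margin on paper.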

I find myself interested in bifurcating the x16 slot so I can run both my HBAs off of the x16 slot and free up the onboard x8 slot for something else.

  1. Is this going to hurt the bandwidth of my SAS SSDs connected to the HBAs?
  2. What PCIe 4.0 x16 to 2x8 bifurcator (?) should I get?
  3. The manual is a bit confusing on all this. What happens to the other two slots if I do this?
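
On question 1, a back-of-the-envelope comparison may help. The sketch below assumes the 16 SAS2 SSDs are split evenly across the two HBAs (8 each), that each HBA ends up with an x8 link after bifurcation, and that every drive saturates its 6 Gb/s link at once, which is a worst case. The HBAs' PCIe generation isn't stated in this thread, so both 2.0 and 3.0 are shown; the names are illustrative.

```python
# Usable per-lane throughput in GB/s after line encoding
# (PCIe 2.0 uses 8b/10b; PCIe 3.0 and 4.0 use 128b/130b).
PCIE_LANE_GBS = {
    2: 5.0 * 8 / 10 / 8,
    3: 8.0 * 128 / 130 / 8,
    4: 16.0 * 128 / 130 / 8,
}
SAS2_DRIVE_GBS = 6.0 * 8 / 10 / 8   # 6 Gb/s SAS2 link with 8b/10b = 0.6 GB/s

def slot_gbs(gen, lanes):
    """Approximate one-direction bandwidth of a PCIe link."""
    return PCIE_LANE_GBS[gen] * lanes

def drives_gbs(n_drives):
    """Combined link bandwidth of n SAS2 drives running flat out."""
    return SAS2_DRIVE_GBS * n_drives

need = drives_gbs(8)   # assumption: 8 drives per HBA
for gen in (2, 3):
    have = slot_gbs(gen, 8)
    print(f"PCIe {gen}.0 x8: {have:.2f} GB/s vs {need:.2f} GB/s of drive links")
```

Under these assumptions a PCIe 3.0 x8 HBA has headroom over eight SAS2 drives, while a PCIe 2.0 x8 card would already be the bottleneck regardless of which slot it sits in.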

Well. I'm back with new problems, and an increasing conviction that putting a 5900X in this thing wasn't worth the aggravation.

I had to reseat the CPU because it came out of the socket when I tried to rotate the cooler to fix an airflow issue, and now the BIOS (and thus, the rest of the server) isn't seeing anything in bank B. So, half my RAM is not showing up.

The bank B temperature sensors are working, but I have no idea if those would work even if no RAM was plugged into those slots.

I've reset the BIOS and even pulled the CMOS battery to no effect.

This actually happened the first time I put the 5900X in the socket, but it fixed itself after putting the BIOS back to factory defaults.

Any ideas? @Exard3k, did you run into this? I am ... aggravated.
