Overkill Threadripper

TR: Adding excess to excess (Excess² ?)

I am wrapping up my substantially overbuilt 9985WX build and thought I would share.

Background

I am the AI Security Principal (aka ‘expert’) for a Fortune 500, and a fair amount of the work I do is not permitted to run on the corporate network and/or cloud providers (think AI red teaming type activities). As a result, I build my own systems to perform these activities.

My current machine died when I decided to reroute a water cooling hose ‘real quick’ while I was waiting on a dataset to download in prep for swapping in a 5090. Let’s just say the process didn’t go well: one of the quick connects stuck open and sprayed EVERYTHING with GL50 (glycol-based coolant) while it was running. Motherboard… DEAD. Processor… dead. Most other parts were OK, but they are old enough that I’m saving them for a build for my son.

I also play games on a 5K2K monitor (a demanding resolution to push at higher frame rates), competitively crack passwords (HashMob / CMIYC / etc.), and am a water cooling enthusiast who competes on 3DMark (currently in the top 100 of several leaderboards and the top 10 of one :+1:).

FYI: I have 256GB of DDR4-3600 (8 x 32GB sticks) that came out of this build that I could part with VERY cheaply. If anyone has a need, send me a DM and I’m sure I can set you up. :slight_smile:

Sweet, you got work to pay for your rig?

Sadly no, this is out of pocket, but it should pay for itself over time through work bonuses if it helps me meet goals… and it’s one hell of a gaming rig.

To keep this from being a sprawling mess, I’m using collapsible sections below, so if you see the following, click the triangles:

This is a description of a section

Tada magic I tell you :magic_wand:

Details would be here


The Build

Parts

Main PC:

This is the actual PC itself with some of the parts of the water cooling loop

Component Details
| Part | Model | Notes |
| --- | --- | --- |
| Case | SilverStone RM-52 | Product Link |
| Power Supply | SilverStone HELA 2050R Platinum | Product Link. On a dedicated 20A 240V port on my UPS |
| Motherboard | Asus Pro WS WRX90E-SAGE SE | Product Link |
| Processor | AMD Threadripper PRO 9985WX | Product Link |
| Memory | V-Color DDR5 OC R-DIMM | Part #: 46942730354855. Product Link |
| GPU | Asus Astral 5090 OC | Product Link |
| Storage | 6 x 4TB Samsung 9100 NVMe M.2 | Product Link |
| Storage ‘Adapter’ | IcyDock ExpressSlot Slide | 2 x Part # MB204MP-B. Product Link |
| Network | Nvidia Mellanox ConnectX-4 (2 x 100Gb) | Part #: MCX416A-CCAT. Specs. End of life, already had this on hand |
Cooling Components

Note: Cooling related components will be covered in more detail in the next section.

| Part | Model | Notes |
| --- | --- | --- |
| CPU Waterblock | Watercool HEATKILLER IV Pro | Product Link |
| GPU Waterblock | Alphacool Core | Product Link |
| Radiators | Alphacool NexXxoS HPE-45 (360) | 2 radiators. Product Link |
| Radiator Fans | Sanyo Denki SanAce 120 | 6 x Model 9RA1212P4G0011, 120mm @ 4500 RPM (130 CFM, 12mm H2O). Product Link |
| Exhaust Fan | Sanyo Denki SanAce 140 | 1 x 9WL1412P5G001, 140mm x 51mm @ 7500 RPM (318 CFM, 66mm H2O). Product Link |
| Fan Controller | Alphacool ES Guardian | 2x. Product Link |
| Flow/Temp Sensor | Alphacool ES flow and temperature | Product Link. Connected to the CPU fan header to shut down the PC in case the cooler unit has a failure |

Cooling Unit

This is the unit that houses the majority of the watercooling loop components

Water cooling Unit Component Details
| Part | Model | Notes |
| --- | --- | --- |
| Case | SilverStone RM-52 | Product Link |
| Power Supply | SilverStone Gemini 900A Gold | Product Link. On a PDU fed by a 30A 240V port on my UPS |
| Radiator: Front | Alphacool NexXxoS Monsta 180mm Dual | Product Link |
| Radiators: Mid & Rear | Alphacool NexXxoS XT45 Full Copper 180mm Dual | Product Link |
| Radiator Fans | Orion Fans OD180APL Series | 4 x Model OD180APL-12HB, 180mm x 65mm @ 3375 RPM (460 CFM, 28mm H2O). Product Link |
| Exhaust Fan | Sanyo Denki SanAce 140 | 1 x 9WL1412P5G001, 140mm x 51mm @ 7500 RPM (318 CFM, 66mm H2O) |
| Pump Tops | Alphacool Eisdecke D5 Dual Brass Top | 2x. Product Link |
| Pumps | Alphacool VPP Apex Pump | 4x. Product Link. These are D5 footprint, but 12V optimized |
| Reservoir | AquaComputer aqualis ECO 450 ml | Product Link |
| System Monitor | AquaComputer aquaero 6 PRO | Product Info |
| Flow / Temp Sensors | AquaComputer High Flow 2 | 2x. Product Link |
| Power Button | APIELE 22mm Latching Switch 12V | Product Link |

Misc components

Components used in the build that don’t fit elsewhere

Miscellaneous components
| Part | Model | Notes |
| --- | --- | --- |
| Thermal Paste | Thermal Grizzly Duronaut | Product Link |
| Watercooling Fittings | Alphacool HF Metal G1/4 | Product Link. 90 degree: Product Link |
| Watercooling Quick Connects | Alphacool ES Quick Release | Female: Product Link. Male: Product Link. Bulkhead: Product Link |
| Fluid | Alphacool ES Liquid GL50 | Product Link |
| Power connector for 3A+ fans | Spectrum IC3 (60A rated) | Male: Product Link. Female: Product Link |
| Air guide | Custom bent polycarbonate | Product Link |

ASSEMBLY

Main PC

Assembly notes

The main PC went together the way a typical rack-mount chassis would. This chassis was repurposed and already had the radiators and fans (listed above); otherwise I would likely have skipped these radiators and put in another pair of the Orion 180mm fans for even higher airflow over the RAM and 100G NIC.

NOTES:

  • All storage is in PCIe-slot-based hot-swap NVMe cards
    • 4 x 4TB Samsung 9100 NVMe for Linux (mdadm RAID-10, far2 layout) for 8TB usable (see the sketch after this list)
    • 1 x 4TB 9100 used for VM storage
    • 1 x 4TB 9100 for a Windows 11 dual boot
  • The fans took too much power for the mainboard headers, hence the two fan controllers in the front (bottom of image), one per radiator
  • There are quick connects allowing me to remove the CPU and/or GPU without draining the loop
  • The rear (top) fan is using the IC3 connector that was soldered onto a PCIe extension cable (12V & ground pins, rated at 75W per pin)
  • The second radiator is under the fan controllers
  • The hot glue on the fan connectors is pretty common on non-serviceable server parts, and I did it here to ensure secure connections. I can’t think of the last time a fan of this class failed.
  • There are no pumps or reservoirs in the PC case; external fluid flow is required
    • I connected the flow meter to the CPU fan header so the system will shut down if it is not receiving flow (simulates a CPU fan failure)
  • This is the air guide I made to force air over the RAM
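
For reference, here is a minimal sketch of how an array like that can be created with mdadm; the device names below are placeholders, not my actual drives, so double-check everything before running it.

```python
# Minimal sketch: build the mdadm command for a 4-drive RAID-10 using the
# "far 2" layout described above. Device names are placeholders only.
import subprocess

DEVICES = ["/dev/nvme0n1", "/dev/nvme1n1", "/dev/nvme2n1", "/dev/nvme3n1"]
DRIVE_TB = 4

cmd = [
    "mdadm", "--create", "/dev/md0",
    "--level=10",                        # RAID-10
    "--layout=f2",                       # "far" layout with 2 copies of each block
    f"--raid-devices={len(DEVICES)}",
    *DEVICES,
]

# Two copies of the data across four 4TB drives -> 8TB usable, as noted above.
print(f"Usable capacity: {len(DEVICES) * DRIVE_TB / 2:.0f} TB")
print("Would run:", " ".join(cmd))
# subprocess.run(cmd, check=True)   # uncomment to actually create the array (destructive!)
```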

Water Cooling Chassis

Assembly notes

This is where I went a little crazy. I decided that if two 360 rads work, then adding three more 360mm x 180mm rads would be better, and while I was at it: PC fans didn’t flow enough air for how I wanted to set it up, so let’s go bigger.

I wanted this to be resilient to single points of failure, so I used a redundant power supply and designed the loop in a way that no single component failure (barring a leak) should cause a cooling failure.

Fans

The fans I used (listed in the parts section above) push 460 CFM each and are rated at 50.4W each. Using the IC3 connectors, I was able to get 3 fan connectors per 8-pin PCIe extension.

There are four of these fans, plus one of the same 140mm fan used in the main chassis, for a theoretical max airflow of 2,158 CFM :wind_face:. When these ramp up, it is no joke; they move as much air as two of the Magnum backpack blowers from Stihl. Here is a short of me testing the wiring; I’m not directly in the airflow, but you can hear the wind whipping around my office.
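
If you want to sanity-check that number, the math is just the manufacturer ratings from the parts tables above:

```python
# Back-of-the-napkin airflow and power math using the ratings listed above.
orion_180mm_cfm = 460       # Orion OD180APL-12HB (4 of these in the cooling unit)
sanace_140mm_cfm = 318      # Sanyo Denki 9WL1412P5G001 (1 x 140mm exhaust)
orion_180mm_watts = 50.4

total_cfm = 4 * orion_180mm_cfm + 1 * sanace_140mm_cfm
print(f"Theoretical max airflow: {total_cfm} CFM")            # -> 2158 CFM

# Three IC3-connected fans share one 8-pin PCIe extension.
print(f"Load per extension: {3 * orion_180mm_watts:.1f} W")   # -> 151.2 W
```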

Radiators

As listed above, there are 3 radiators in this chassis, all of the 180mm x 360mm class. To protect my fingers, as well as clothing and loose objects, from the fan blades of destruction, I sandwiched the fans between rads. These ARE in a parallel configuration, but with flow in excess of 200 L/h and enough airflow that it may qualify as a self-propelled craft :dash:, the lower flow resistance outweighed the stacked-radiator inefficiencies, which are mostly theoretical in this setup…

Pumps

There are 2 dual-pump assemblies (2 x VPP D5-variant pumps each).

  1. The first pair of pumps is fed by the 450ml reservoir
  2. Their output feeds the rads
  3. The output of the rads feeds the second pair of pumps, which ‘boost’ the flow rate before going to the front bulkhead ‘out’ fitting
  4. After going through the PC itself, the fluid is returned to the ‘in’ bulkhead that feeds the reservoir. The PC flow order is:
    a. In bulkhead → Radiator 1 → Radiator 2 → CPU → GPU → Out bulkhead

Control

This system is relatively dumb; give it power and it does its thing. To have visibility into how everything is functioning, the aquaero 6 monitors temps and flow rates:

The Sum of its parts:

Pulling it all together, the end product is tightly packed but it all fits. I even cut out unneeded grille areas of the case:

All together now

Installed setup

When you stack everything together, it is a pretty clean package. In retrospect, I would swap the in/out bulkheads in one of the units to avoid crossing the tubes between the units, but they were a pain to get tightly installed and I can live with the current setup:


How do you think with all that noise?

Noise Management

I have installed the machine in a server closet about 50ft from my office. In that room I have built a custom sound-deadening 25U enclosure that weighs 1,200lb empty. This rack receives air-conditioned house air at ~72F/22C and vents hot air into the attic, which has two 10in AC Infinity fans (~1,000 CFM each at max) ducted to the top of the rack enclosure, pulling hot air from the rack. These are the fans, with an older version of the controller unit. Let me know if there is any interest in more details here.


Wrapping Up

I’ll follow up soon with some performance data, BIOS info, etc., but I’ll say I’m stable:

  • PBO all-core of 5.2 GHz (all 64 cores / 128 threads)
  • PBO single-core of 5.65 GHz
  • I’m in several 3DMark top-100 lists already, with no hardware mods to the 5090 or sub-ambient cooling

Let me know if you have any questions or have any specific tests you would like me to run.

Brian

6 Likes

Performance Numbers

Note: All tests were performed with the standard cooling setup (no chillers / ACs / sub-ambient cooling). There are a couple of the 3DMark tests where I ramped the attic fans to 100% and opened the rack door to improve airflow, but everything else is in my 24/7 daily cooling setup.

BIOS SETUP

These are the BIOS settings used for the setup, including for all benchmark results. Pages not shown are at defaults and/or are irrelevant to performance.

BIOS Screenshots

Main

AI Tweaker Main Page

Memory Timings

PBO setup

PCI-E Setup

SMU Common Options

Workstation Benchmarks (Linux)

In Process

Operating System Setup
                  -`                     brian@desktop
                 .o+`                    -------------
                `ooo/                    OS: Arch Linux x86_64
               `+oooo:                   Kernel: Linux 6.15.10-native_amd-xanmod1-1-bore
              `+oooooo:                  Uptime: 1 hour, 22 mins
              -+oooooo+:                 Packages: 1501 (pacman)
            `/:-:++oooo+:                Shell: zsh 5.9
           `/++++/+++++++:               Display (LG Electronics 44"): 5120x2160 @ 100 Hz in 44" [External]
          `/++++++++++++++:              DE: GNOME 48.4
         `/+++ooooooooooooo/`            WM: Mutter (Wayland)
        ./ooosssso++osssssso+`           WM Theme: adw-gtk3-dark
       .oossssso-````/ossssss+`          Theme: adw-gtk3-dark [GTK2/3/4]
      -osssssso.      :ssssssso.         Icons: Adwaita [GTK2/3/4]
     :osssssss/        osssso+++.        Font: Adwaita Sans (11pt) [GTK2/3/4]
    /ossssssss/        +ssssooo/-        Cursor: Adwaita (24px)
  `/ossssso+/:-        -:/+osssso+-      Terminal: terminator 3.13.7
 `+sso+:-`                 `.-/+oso:     Terminal Font: MesloLGS NF (10pt)
`++:.                           `-/+/    CPU: AMD Ryzen TR PRO  64-Cores (128) @ 5.65 GHz
.`                                 `/    GPU: NVIDIA GeForce RTX 5090 [Discrete]
                                         Memory: 7.10 GiB / 503.03 GiB (1%)
                                         Swap: 0 B / 32.00 GiB (0%)
                                         Disk (/): 1.09 TiB / 7.21 TiB (15%) - ext4
                                         Local IP (enp193s0f0np0): 10.0.1.126/24
                                         Locale: en_US.UTF-8

Gaming Benchmarks (Windows)

3D Mark Results

Click results for details

Overall, I’m very impressed that I’m getting results this high from a 64 core workstation chip in gaming workloads. I did NOT enable gaming mode in the BIOS.

1 Like

Out of curiosity, which QDCs did this?

I’m a big fan of the 9RA fans; I got a bunch of the 140mm ones as intakes on my relatively restrictive radiator setup to help the actual radiator fans keep positive pressure.

They look aggressive and back it up with performance:

2 Likes

QC fitting: It was one of the ones linked; it seemed like a fluke, as the O-ring seal slipped out of its groove and wedged the fitting partially open. That said, I was being an idiot and disconnected a fitting that had active flow through it… It worked previously, but it isn’t a good practice. :person_facepalming:

1 Like

Ooof! I came home one day and found my first water-cooled machine had burst a hose with similar results; glad it didn’t burn the house down!

Stoked to see your build.

Question: What is the PBO CO offset and SP rating for the core(s) hitting 5.65GHz?

1 Like

I’m running some benchmarks right now, but I will add BIOS info to the performance post above this weekend.

2 Likes

Wow, what a sweet setup! How are your RDIMM temps?

@freecableguy FYI: Added BIOS settings to the Performance Numbers Post

Running through some Phoronix Test Suite benchmarks today.

@aatchison Memory temps are really good after installing the ‘Air Guide’ to direct air over the modules. I was running a crypto benchmark when I captured the below.

*Ignore the CPU fan speed; this is a flow meter that outputs an RPM value, and there is no fan on the CPU.

1 Like

Added Windows (Gaming benchmarks) to Performance Numbers Post

You can leave AI Overclock Tuner on Auto as the values manually specified are the auto settings.

Recommend setting Performance Bias to None, unless you want the board to randomly select a performance bias from the list provided. If you do test and find a desirable profile, set it manually.

All those memory timings on Auto… big OOF. This is a no-no. Use a tool to read current values and manually specify these in BIOS. DO NOT EVER LEAVE MEMORY TIMINGS ON AUTO.

On the perf bias, that makes sense.

@freecableguy As to the memory timings, secondary timings are not spec’d by V-Color. I can generate a report of the secondary timings in the BIOS and manually set them, but how would that differ from Auto? (serious question)

Here is a report from the BIOS on the secondary timings:

My primary timings are tighter than spec (CL40-40-40-40-96) vs (CL52-52-52-52-103)

Vcolor Specs

Memory Capacity: 512GB (64GBx8)

| SKU (1) | SKU (2) | Speed | CAS Latency | Voltage | Native |
| --- | --- | --- | --- | --- | --- |
| TR564G64D452 | TR564G64D452O | 6400 MT/s (PC5-51200) | CL52-52-52-103 | 1.1V | 4800 MT/s |

Also, I have the Infinity Fabric at 2100; I haven’t tried bumping it higher yet. Every BIOS change that impacts timings takes ~20 min to make with the way training is set up on this board.

Most of the secondary and tertiary timings from V-Color have about 100% margin (i.e., twice as great as they need to be). Depending on the kit, primary timing ratings are quite loose as well. The timings provided are designed to ensure 100% compatibility and should not be used for performance tuning purposes.

Short answer: training is taking long because you’ve left all those timings up to the board to configure from scratch every time it boots by selecting auto.

See here for more information:

Those timings are horrendous. I would invest some time in tightening those up. Yes, I understand it’s trial-and-error and training takes time, but a lot of good can come from investing some effort. I’ll give you a reference for consideration; however, keep in mind the double-sided 16Gb-IC DIMMs shown here are half the density of the 64GB sticks you’re using, so you likely will not get this tight as you’re using 32Gb memory ICs.

Proper tuning produces the following results:

You may need to set VDD_11_S3 / MC Voltage (this is the IMC line voltage on the I/O-die side, analogous to memory VDD) as high as 1.4V, which is not a problem. I use 1.3V VDD/VDDQ/S3, but then again my memory is rated for DDR5-7200, so these are cherry-picked Hynix A-die ICs.

I saw your memory load temperatures above and you have plenty of headroom. You can push these sticks as high as about 65C and still achieve Refresh Intervals approaching (or meeting) the max of 65535. Refresh Interval has a HUGE impact on not only bandwidth but also latency.
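
To put a rough number on that, here is a quick conversion, assuming the Refresh Interval field counts memory-clock cycles (that part is an assumption on my end):

```python
# Rough conversion of the 65535 Refresh Interval figure, assuming the BIOS
# field counts memory-clock cycles.
memclk_mhz = 3200             # DDR5-6400 -> 3200 MHz memory clock
trefi_cycles = 65535          # max Refresh Interval value

trefi_us = trefi_cycles / memclk_mhz          # MHz == cycles per microsecond
print(f"tREFI ~ {trefi_us:.1f} us between refresh commands")  # ~20.5 us

# For comparison, stock DDR5 tREFI is 3.9 us, so maxing the interval means the
# DIMMs spend far less time tied up servicing refresh commands.
```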

EDIT: One final thought… you should be running SOC at 2133Mhz with memory at DDR5-6400. You want a 2/3 ratio:

DDR5-6400 → 3200MHz (Double Data Rate) / 3 = 1066MHz

1066MHz * 2 = 2133MHz

Push IF just a little higher to 2133 and you will see memory latency make a healthy drop. (Higher SOC frequency may require a small bump in voltage. I use 1.15V for 2133MHz with the memory timings provided above.)
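
The same arithmetic as a tiny helper, if you want to plug in other memory speeds:

```python
# The 2/3 ratio arithmetic above, generalized to any DDR5 speed.
def target_fclk_mhz(ddr_mts: int) -> float:
    memclk = ddr_mts / 2      # double data rate: DDR5-6400 -> 3200 MHz memory clock
    return memclk * 2 / 3     # 2/3 ratio -> FCLK / SOC target

print(target_fclk_mhz(6400))  # -> 2133.3, i.e. run FCLK/SOC at 2133 MHz
```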

1 Like

Thanks for the info; agreed, the timings are loose. I will take a look after I complete the benchmark suite so I have a baseline.

My old machine was a TR 3000 series and I was at 1:1 bus timings: 3600 mem (DDR) / 2 = 1800 (bus) : 1800 (IF).

I don’t think 2133 should be an issue on 9000 series.

That was 1:1, as memory at DDR4-3600 is in actuality running at 1800MHz. DDR means two data transfers per clock: one on the rising edge and one on the falling edge.

Yep, 1:1 bus timings (the more accurate way to put it), or 2:1 when accounting for the DDR 2x from transferring on both edges (rise and fall). Updated the previous post.

1 Like

@freecableguy: What would you estimate a cold-boot POST time to be in your setup? e.g.:

  • PC has had wall power but is off
  • Press Power button
  • How long until you see a video signal and complete POST?
1 Like

Less than 30 seconds.

1 Like

Linux Asus Pro WS WRX90E-SAGE SE users:

Has anyone managed to get their memory temps working in lm_sensors? I followed the typical steps, but can’t seem to find the correct bus with the data…

REF: lm_sensors - ArchWiki
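
In the meantime, here is a minimal sketch that just walks /sys/class/hwmon and prints every temperature input the kernel currently exposes, which at least shows whether any DIMM temperature driver (spd5118 / jc42, for example; not confirmed for this board) has bound to a device:

```python
# Minimal sketch: list every temperature sensor the kernel currently exposes,
# to check whether a DIMM temperature driver has bound to anything.
from pathlib import Path

for hwmon in sorted(Path("/sys/class/hwmon").glob("hwmon*")):
    name = (hwmon / "name").read_text().strip()
    for temp in sorted(hwmon.glob("temp*_input")):
        label_file = hwmon / temp.name.replace("_input", "_label")
        label = label_file.read_text().strip() if label_file.exists() else temp.name
        print(f"{name:>14} {label:>20}: {int(temp.read_text()) / 1000:.1f} C")
```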