Have my RTX 3090... now what? Build advice for PyTorch

Hi Level1Techs,

I ordered an RTX 3090 Vision OC back in January and was planning to shoehorn it into a very old build of mine… the only one I had (a Core i7 2600K based system). While I was playing Cyberpunk and swapping SSDs it decided it’d had enough, and I got the dreaded red VGA_LED on the mobo… I tried everything I could think of to get back into the system, but I’d had it since 2012 and it was a wonder it made it even this far… So I now need to build a system from scratch (which makes sense, since an RTX 3090 in a Core i7 2600K system would have been severely bottlenecked).

Here’s what I have so far:

  • RTX 3090 Gigabyte Vision OC
  • Corsair HX750 750W (thinking of replacing this with something new and higher wattage)
  • Cooler Master HAF XB Evo chassis (small Noctua fan underneath near the SSD cage for exhaust)
  • GTX 1060 MSI 6GB (likely going to sell or make a gaming PC off it for GF)
  • 1TB Samsung NVME 970 Evo
  • 1TB Samsung Sata SSD 980
  • 2 random old SSD sata drives… might use as scratch disks or for gaming

Here’s my PCPartPicker list so far… no RAM chosen yet, and I was really hoping for recommendations on that… as I do not know Ryzen’s sweet spot

PCPartPicker Part List: https://pcpartpicker.com/list/JHjNwz

CPU: AMD Ryzen 9 5950X 3.4 GHz 16-Core Processor  ($999.00 @ Amazon) 
CPU Cooler: Noctua NH-D15 CHROMAX.BLACK 82.52 CFM CPU Cooler  ($109.95 @ Amazon) 
Motherboard: Gigabyte B550 VISION D ATX AM4 Motherboard  ($388.72 @ Amazon) 
Storage: Samsung 970 Evo 1 TB M.2-2280 NVME Solid State Drive 
Video Card: Gigabyte GeForce RTX 3090 24 GB Vision OC Video Card 
Case: Cooler Master HAF XB EVO ATX Desktop Case  ($105.99 @ Amazon) 
Power Supply: Corsair HX 750 W 80+ Gold Certified Semi-modular ATX Power Supply  ($298.99 @ Amazon) 

That’s about it really. What am I using this system for? Well, the reason I grabbed the 24GB 3090 is that I do image processing in PyTorch… and the models I created were not fitting on my GTX 1060 6GB… and when I went to use AWS for training models it cost about $250 for 24 hours or something nuts… plus I like tinkering and wanted a machine to practice C++ CUDA programming on, and to swap out the SSD every once in a while to play Windows games (or give that new Steam Proton stuff a try)

My big decision is… do I go for Threadripper? The images I process right now are based on a MacBook Pro Retina screen… so a bit less than 4K in resolution… but in future I may want to experiment with other models in and around image and sound processing… Still, I figured TR is overkill if I am paying out of pocket… so I’ll just start with the 5950X

  • Budget. How much are you willing to spend?

On top of the 3090? another $2-3K or so… but flexible

  • Where do you live (what country, don’t post specific details), and what currency do you use?

Sweden, SEK

  • Is there a retailer you prefer?

Not really… they’re all expensive compared to USA

  • Do you need or already have peripherals? (this can add to costs)

Already got those

  • What will you be using your Glorious computer for? Gaming? Rendering? Mix of both? Or is this a home media PC or Steam Box?

Deep learning and general programming, running VMs in Ubuntu LXD

  • Do you overclock or want to get into overclocking?

No, I plan to buy an over the top air cooler and maybe overclock in future… but this build is about stability under load… while training pytorch models

  • Do you plan on going for custom water-cooling now, or in the future?

never, air only

  • Operating System. Do you need a new one?

Ubuntu 20.04 LTS is the plan

  • What kind of settings do you like or what FPS do you want to play at?

1080p - 1440p

  • What resolution will you be playing at? //or would like to play at.

I play BF3, BF4 and Cyberpunk… in and around 1440p would be ideal

  • What kind of games do you like to play?

BF3, BF4, Cyberpunk… shooters mainly

  • What specific game will you be playing (if you really only play one)

Cyberpunk and whatever the next BF is

Thanks so much for looking!

Edit: I am torn between a B550 board and an X570 board… the idea being to run my 1060 as the display GPU and free the RTX 3090 for 100% computation… but that seems like quite an intense amount of power and cost… I’m still weighing whether I would do better NVMe and PCIe lane wise on something like this instead of B550

https://www.asus.com/Motherboards-Components/Motherboards/All-series/Pro-WS-X570-ACE/

and throw in some ECC RAM… but will it still support windows gaming?

also doing some research on this forum and eying one of these “Aorus Gen4 AIC” in perhaps an attempt to feed data to the GPU as quickly as possible… via some NVME RAID perhaps?

Since browsing the forum I’ve landed on NOT getting a Threadripper build… I am buying this out of my own pocket and it’s just not needed for my use right now… and if/when it IS needed… Zen 3 TR will be out… So now I just want to make sure I land on the right mobo and ECC RAM combo to give a good stable rig that runs Ubuntu 20.04, LXD, Juju, PyTorch (via conda) and once in a while lets me spin up Windows and Cyberpunk 2077 somehow (I need to look into virtual machine support in Windows)…

The big question also is… do I try to make the GTX 1060 my main display GPU… put it in PCIe slot 3… so that the 3090 is completely free to run DL/ML workloads fed off either an onboard NVMe… or possibly use a PCIe extender and slot in one of these 4x NVMe RAID cards… It seems I’d need wattage that starts to bump up the price significantly… Maybe I can get something where I kill the X server if/when I wanna go full-blown 3090 memory usage… or do I dare get an AMD APU??
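For the "1060 draws the desktop, 3090 does the compute" idea, the PyTorch side can be fairly small. The following is only a sketch: the helper names `list_gpus`, `pick_compute_gpu` and `device_string` are made up for illustration, and it assumes the simple rule "train on whichever visible GPU has the most memory". It falls back gracefully if PyTorch or CUDA is not installed.

```python
from typing import List, Optional, Tuple

# (index, name, total memory in bytes) for one visible CUDA device
GpuInfo = Tuple[int, str, int]


def list_gpus() -> List[GpuInfo]:
    """Ask PyTorch which GPUs it can see; returns [] if torch or CUDA is missing."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    gpus = []
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        gpus.append((i, props.name, props.total_memory))
    return gpus


def pick_compute_gpu(gpus: List[GpuInfo]) -> Optional[int]:
    """Return the index of the GPU with the most memory, or None if there are none."""
    if not gpus:
        return None
    return max(gpus, key=lambda g: g[2])[0]


def device_string(index: Optional[int]) -> str:
    """Build a string usable with torch.device(); CPU is the fallback."""
    return "cpu" if index is None else f"cuda:{index}"


# Hypothetical two-card layout matching this build (memory sizes are nominal).
example = [
    (0, "GeForce GTX 1060 6GB", 6 * 1024**3),
    (1, "GeForce RTX 3090", 24 * 1024**3),
]
print(device_string(pick_compute_gpu(example)))
```

On the real machine, `pick_compute_gpu(list_gpus())` would replace the hard-coded `example`. Setting the `CUDA_VISIBLE_DEVICES` environment variable before launching Python is another common way to hide the display card from a training job entirely.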

after much debate… I think I’m settling on this motherboard

https://www.asus.com/Motherboards-Components/Motherboards/All-series/Pro-WS-X570-ACE/

and now trying to decipher the best RAM setup… I get it I need unbuffered ECC for reliability… which will be critical on those long deep learning training sessions right? but now I am confused about speeds, ranks… seems there was debate about whether to get 4x16GB vs 2x32GB … so far I see I should likely max out the RAM to 64GB with ECC unbuffered… but unsure after that what brands to steer towards

any thoughts on RAM for this build would be well appreciated… I’ve never had a ryzen system nor had ECC

looks like this RAM would clear the cooler I am eyeing without a problem as well

@emcp Welcome to the forums. Your posts cover quite a bit of territory; it’s a little hard to know where to start. That may have something to do with the lack of response.

I’ll start by responding to one point that catches my attention. The “standard” answer regarding what RAM to use is to see the QVL (Qualified Vendors List) for the motherboard. And, indeed, the ASUS support page for the motherboard you mention does lead to a “Memory QVL report ECC” document. It only lists a small number of DIMM models; presumably these offer the best likelihood of compatible operation.

Interestingly, the QVL includes sets up to 4x32GB or 128GB total. You are in a better position to know whether this amount of RAM would benefit your work.

By the way, although I am a supporter of ECC RAM, I would not lay great stress on the need to use it for Machine Learning. Errors in RAM are fairly infrequent, and wouldn’t the nature of ML tend to minimize the effect of single-bit errors? Suppose you randomly changed a single bit in your training images (of, for example, kittens)? A single bit in every kitten image? Ten bits? The learned weights would likely be very similar.

It is probably possible to find a Worst-Case bit which would do serious damage if altered, but… compared to software errors which almost certainly remain in PyTorch, plus any design errors which (heaven forbid) you might introduce, it isn’t obvious to me that ECC memory is of major importance. Depends on how critical the use-case is, I suppose.
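To put rough numbers on the bit-flip argument, here is a small standard-library sketch. It assumes a hypothetical 64x64 8-bit grayscale "image" of random pixels and a single float32 "weight" of 0.5; neither comes from a real model. It shows that one flipped bit in image data barely moves the input, while the worst-case bit in a stored float can change it enormously.

```python
import random
import struct


def flip_bit(data: bytearray, byte_index: int, bit: int) -> None:
    """Flip one bit (0 = least significant) of one byte, in place."""
    data[byte_index] ^= 1 << bit


def mean_abs_diff(a: bytes, b: bytes) -> float:
    """Average absolute per-pixel difference between two equal-length buffers."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def flip_float32_bit(value: float, bit: int) -> float:
    """Flip one bit of the IEEE 754 single-precision encoding of value."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (out,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return out


rng = random.Random(0)
width, height = 64, 64  # hypothetical tiny grayscale image
original = bytearray(rng.randrange(256) for _ in range(width * height))

corrupted = bytearray(original)
flip_bit(corrupted, byte_index=1234, bit=7)  # top bit: largest change one pixel can take

changed_pixels = sum(x != y for x, y in zip(original, corrupted))
print(changed_pixels)                      # 1 pixel differs
print(mean_abs_diff(original, corrupted))  # 128 / 4096 = 0.03125

# A stored float is less forgiving: the lowest mantissa bit is harmless,
# but the top exponent bit turns 0.5 into 2**127.
print(flip_float32_bit(0.5, 0))
print(flip_float32_bit(0.5, 30))
```

The second half is the "worst-case bit" case: whether it matters in practice depends on where the flip lands and how long it survives, which this sketch does not try to model.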

Anyway, it sounds like you are planning a powerful computer. Congratulations, and good luck.

Thanks @Caped_Kibitzer, the reason I am a bit all over the place is… this is not for work and I have never had this much VRAM at my disposal. The use case is not critical at all at this stage and I just want to ensure good reliability over a long runtime (likely hours and hours of training, with Linux containers running in the background alongside the ML workload)

I agree that the ECC RAM is sort of irrelevant for my purposes… the vibe in the forums seemed like it was desirable for stability overall but… perhaps I just grab some solid quality RAM in a good speed for whatever chip I land on … that clears the air cooler…

great feedback! thank you

@emcp I do like ECC RAM in principle, but it really limits the available RAM choices; hence my doubts.

I took a hard look at PSU needs, expecting to tell you that 750W wasn’t enough… Of course, you mentioned that you are considering a more powerful replacement.

As for 750W, well maybe. NVidia recommends 750W for both RTX 3080 & 3090. There was a kerfuffle in about October where people were reporting sudden crashes during gaming with a 3080 or 3090. It turned out to be PSU-related, though the PSUs seemed to be adequate; a fair number of these were 750W. But NVidia released a new Windows driver that evidently reduced the peak momentary demand and this seemed to eliminate the problem. (Hopefully, the fix made it into the Linux driver.)

Given the history, 750W seems a trifle marginal to me. I wouldn’t discourage you from testing with a 750W PSU you have on hand, but if buying another, I would probably recommend 850W.
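A back-of-the-envelope way to compare the PSU sizes mentioned above is to add up sustained draw and see what fraction of each unit's rating it uses. All the wattages in this sketch are assumptions: 350 W is NVIDIA's reference board power for the RTX 3090 (a factory-overclocked card may draw more), 142 W is the stock package power limit commonly quoted for 105 W TDP Ryzen parts, and 75 W for everything else is a guess. The helper `psu_load` is hypothetical.

```python
def psu_load(components_w: dict, psu_w: int) -> float:
    """Fraction of the PSU's rated capacity used by the summed sustained draw."""
    return sum(components_w.values()) / psu_w


# Assumed sustained draw per component, in watts (see the caveats above).
build = {
    "gpu_rtx_3090": 350,
    "cpu_5950x": 142,
    "board_ram_drives_fans": 75,
}

total_w = sum(build.values())
print(f"estimated sustained draw: {total_w} W")

for psu_w in (750, 850, 1000):
    print(f"{psu_w} W PSU -> {psu_load(build, psu_w):.0%} of rating")
```

This only covers sustained load. The crashes described above were attributed to brief peaks in demand, which a simple sum like this does not capture, so the extra headroom of a larger unit is the point rather than the average figure.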

@emcp How 'bout that motherboard? Somehow I had not been aware of the ASUS Pro WS X570 Ace and I am quite impressed. @MisteryAngel Do you have any thoughts you would like to share, perhaps about whether the VRM is sturdy enough for a Ryzen 5950x?

The official support for ECC RAM, and for up to 128GB of RAM, seems like a remarkable strength of this motherboard. ECC support for X570 has generally been murky, and this is the only board I know of for which the manufacturer clearly states support for ECC functionality. Nice!

Plus there are three PCIe x16 (mechanical) slots, capable of X8/X8/X8 operation. Also nice! (The third slot uses chipset PCIe lanes, so throughput would be limited to PCIe 4.0 x4.)

But there are always tradeoffs… You seem to be doing your homework, so I suppose you have learned about some weaker points? Like:

  • It’s a “first generation” motherboard, so doesn’t incorporate any lessons learned, nor are VRMs beefed up for Ryzen 5000. (However, there may be no need since the power consumption doesn’t seem to be higher.)
  • Only one full speed M.2 slot; the second uses only 2x PCIe lanes, routed via the chipset. (Should still be faster than a SATA port, though.)
  • Four SATA Ports
  • Six USB Type-A (3.1 or 3.2) and one Type-C on the I/O panel seems just adequate.
  • It is unclear, but there are some “rumors” that the RealTek NIC is dedicated to an IPMI-like management interface. Perhaps only the Intel NIC is available for general use.
  • The User Manual is even skimpier than usual; the dearth of info is shameful.

None of that strikes me as being a showstopper, but it is best to be well informed, lest there be any nasty surprise.
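For a sense of scale on the slot and M.2 points above, theoretical link bandwidth is easy to work out from the per-lane transfer rate and the line encoding. The sketch below uses the published rates for PCIe 3.0 and 4.0 (8 and 16 GT/s per lane, both with 128b/130b encoding) and SATA III (6 Gb/s with 8b/10b encoding). The thread does not say which PCIe generation the two-lane M.2 slot runs at, so both are computed; real drives will land below these ceilings.

```python
# Per-lane transfer rate in GT/s for each PCIe generation.
PCIE_GT_PER_S = {3: 8.0, 4: 16.0}
PCIE_ENCODING = 128 / 130  # PCIe 3.0 and 4.0 both use 128b/130b encoding


def pcie_gb_per_s(gen: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth in GB/s for a PCIe link."""
    return PCIE_GT_PER_S[gen] * PCIE_ENCODING * lanes / 8


# SATA III: 6 Gb/s line rate, 8b/10b encoding, 8 bits per byte.
SATA3_GB_PER_S = 6.0 * (8 / 10) / 8

links = {
    "PCIe 4.0 x8 (GPU slot)": pcie_gb_per_s(4, 8),
    "PCIe 4.0 x4 (chipset uplink / full-speed M.2)": pcie_gb_per_s(4, 4),
    "PCIe 4.0 x2 (second M.2, if Gen4)": pcie_gb_per_s(4, 2),
    "PCIe 3.0 x2 (second M.2, if Gen3)": pcie_gb_per_s(3, 2),
    "SATA III": SATA3_GB_PER_S,
}

for name, gb_s in links.items():
    print(f"{name}: {gb_s:.2f} GB/s")
```

Even in the slower Gen3 case, two lanes come out at roughly three times the SATA ceiling, which matches the "still faster than a SATA port" remark.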

Generally, this motherboard impresses me as an appropriate fit with your other hardware and the planned workload.

You’ve got me wondering whether this might be a good motherboard for my next build; please share with us how it works out for you.

The VRM on that particular board should be good enough.
It’s not the most powerful VRM on AM4 today.
But it should be fine for a 5900X / 5950X.

Basic VRM specs off the top of my head.

  • pwm: ASP1405i in 6+2 phase mode. (no doublers).
  • powerstages: 12+2 powerstage design. IR3555 60A smart powerstages.

The board itself does come with a few minor downsides, which you already described.
So I would highly advise that the topic starter
double check whether those limitations could be an issue for him.
To me personally, it only having 4 SATA ports would be a deal breaker.

@emcp what are your actual goals with virtualization?
By that I mean: are you also looking into running Windows in a VM,
with PCIe passthrough for gaming?
Or do you just want to run a couple of VMs without passthrough?
This could be kinda important in regards to motherboard choices as well.

@Caped_Kibitzer

on the PSU, I def. will run something larger… and this anecdote sort of confirms it for me now that I need more watts… I’ll see how the Swedish suppliers can do but will start at around 1000W gold… I usually stick to corsair but maybe there will be other good ones. thank you.

Given this system is maybe the last AM4 generation… if it will hold the 5950X then I have no qualms about the downsides… I do not have more than 2 old SATA devices and store most things on blob storage… rotating data into the machine as it chews through computations. Thanks for double checking, and it’s a shame if that second NIC is only for IPMI… will have to find out and report back :)

@MisteryAngel thanks for the comments/insights. Regarding the virtualization question(s)

GPU passthrough would be “nice” to have, but I have not planned to require it at this time … currently I run anywhere from 8-10 small LXD based images at once… and it supports passthrough according to articles I find

There will be no Windows running in virtual machines… if I want to game I plan to boot off an entirely separate drive… mixing Linux and Windows via virtual machines is, to me, more work than it’s worth, and I can try gaming on Steam’s Linux thing once the machine is built

Thanks for the responses

There are far more qualified people that can lean in here so I’m not even going to try, but here is a thought… Have you thought about having 2 separate systems and streaming your games? I believe Steam Link is on Linux now but I haven’t experienced it myself. Leave out anything you don’t need from your main workstation, but use the GPU to stream to another device, be it a NUC desktop or a big-screened laptop or anything else, to offload from your work-centric hardware.

Interesting idea… in my house we have an ASUS TUF gaming laptop that’s not used much… I will see how that might go when I can try it out.

thank you

By coincidence, another new forum member just posted about an issue with their ASUS Pro WS X570 Ace. They seem to have run a couple of these motherboards without problems using 2 DIMMs, but encountered an odd issue after adding 2 more DIMMs to one of the boards. That thread might be of interest to you:
https://forum.level1techs.com/t/asus-x570-pro-ws-ace-realtek-nic-missing-on-boot/171263?u=caped_kibitzer

Well, what do you know? There are a fair number of posts on this forum re the X570 Ace motherboard. The “magnifying glass” icon in the forum toolbar opens a Search dialog, and search on “asus pro ws x570 ace” turns up a bunch of posts.

I haven’t reviewed these for useful information, and may leave that to you.

Parts are starting to trickle in… here’s where I landed, and thanks all for everyone’s advice

I haven’t purchased the RAM or CPU… but I am pretty settled on it… just a matter of supply and some last minute budgeting constraints

CPU: AMD Ryzen 9 5950X
CPU Cooler: Noctua NH-D15 CHROMAX.BLACK
Motherboard: ASUS Pro WS X570-ACE
RAM: 2x 32GB Kingston KSM32ED8/32ME ECC (from the QVL)
Storage: Samsung 970 Evo 1 TB NVMe
Video Card: Gigabyte GeForce RTX 3090 24 GB Vision OC
Video Card: MSI GTX 1060 6 GB (to draw the desktop and practice basic multi-GPU PyTorch)
Case: Cooler Master HAF XB EVO ATX Desktop Case
Power Supply: Corsair HX1000 1000W
