Help with new rig primarily for AI/ML applications and passthrough

Hey @all,

I am new to this forum but follow the YouTube channel, and after the friendly invitations in the videos I thought I'd try my luck here.

I am not a complete novice at building computers, but also not a pro. My last build was quite a number of years ago. I need a completely new rig, primarily for heavy-lifting deep learning NLP applications. The plan is to train and test locally and then burst compute to the cloud.

But I also need to use the rig for several systems (passthrough, potentially at the same time) and maybe some occasional graphics work / renders, but nothing too intense on that end, I guess.

Please note that NLP models usually take a significant amount of VRAM on the GPUs involved.
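To give a rough sense of the scale I mean (back-of-the-envelope only; the 16 bytes per parameter figure is the usual mixed-precision Adam rule of thumb, and the model sizes are just examples):

```python
# Rough training-memory estimate: in mixed precision with Adam,
# weights + gradients + optimizer states come to roughly 16 bytes
# per parameter, before activations and framework overhead.
def training_memory_gb(params: float, bytes_per_param: int = 16) -> float:
    return params * bytes_per_param / 1024**3

for name, params in [("BERT-large", 340e6), ("GPT-2 1.5B", 1.5e9)]:
    print(f"{name}: ~{training_memory_gb(params):.1f} GB before activations")
```

That is why I am eyeing the 24GB cards below.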

Here are the proposed components and the resulting questions, which I hope you can answer. I am planning a 4-screen work setup.

Processor: AMD Ryzen Threadripper PRO 3955WX, 16x 3.90GHz

MB: ASUS Pro WS WRX80E-Sage SE WIFI

Cooler: Fractal Design Celsius+ S36 Prisma

RAM:

either

  • Trident Z NEO F4-3600C16Q-64GTZNC (4x16GB)
  • gskill(dot)com /specification/165/326/1562840280/F4-3600C16Q-64GTZNC-Specification

or

  • F4-3600C18D-64GTZN (2x32GB)
  • gskill(dot)com /specification/165/326/1582265908/F4-3600C18D-64GTZN-Specification

SSD: something like the Samsung SSD 970 EVO Plus 2TB, M.2

HDD: anything really

Case: be quiet! Dark Base Pro 900

Power: be quiet! Dark Power Pro 12 1500W

GPUs: now this is where it gets interesting. I am planning for 3 GPUs: 1 system card and 2 for combined and/or individual passthrough.

Potentially something like this:

  • MSI GeForce RTX 3070 Suprim (system)
  • MSI GeForce RTX 3090 Ventus 3X 24G OC (2 of those)

Question 1:
Can the MB actually fit all 3 GPUs and potentially even the Hyper M.2 adapter card (4x M.2) that comes with the motherboard? That is pretty much my biggest worry right now. PCIe slots are about 20 mm apart, so 7 slots make for 6 “gaps”, which is roughly 12 cm of space. However, each graphics card is about 6 cm (3 slots) thick from what I can see in the specs. That basically rules out 3 cards and means 2 cards would fill pretty much all 7 slots? This can’t be right…? What am I missing? And if that is the case, what are my options?
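The same math in code form (the slot counts are my reading of the spec sheets, so treat them as assumptions):

```python
# Sanity check on the slot budget: the WRX80E-Sage has 7 PCIe slots,
# and each of these cards looks like a triple-slot cooler in the specs.
TOTAL_SLOTS = 7
cards = {
    "RTX 3070 Suprim": 3,
    "RTX 3090 Ventus 3X #1": 3,
    "RTX 3090 Ventus 3X #2": 3,
}
needed = sum(cards.values())
print(f"slots needed: {needed} / available: {TOTAL_SLOTS}")
print(f"shortfall: {needed - TOTAL_SLOTS} slot(s), before the Hyper M.2 card")
```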

Question 2: Worries about power. Is the Dark Power Pro 12 1500W enough?
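For context, my rough budget looks like this (CPU TDP and GPU board power are from the spec sheets; the 150 W for everything else is a pure guess):

```python
# Rough steady-state power budget; transient GPU spikes can be much higher.
loads_w = {
    "Threadripper PRO 3955WX (TDP)": 280,
    "2x RTX 3090 (350 W board power each)": 700,
    "RTX 3070 (board power)": 220,
    "board / RAM / SSDs / fans (guess)": 150,
}
total = sum(loads_w.values())
print(f"estimated draw: {total} W of 1500 W -> {total / 1500:.0%} load")
```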

Question 3: I’ve read several times now that passing through 2 identical cards can cause problems. However, the main system card would not be identical. Is that still a problem? Does anyone have experience with this?
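For background on why I ask: from what I’ve read, the usual vfio-pci ids=vendor:device option can’t tell two identical cards apart, and the workaround people mention is binding by PCI address instead. A sketch of what I think that looks like (assumes Linux with lspci and driverctl available; untested, so please correct me):

```python
# List NVIDIA GPU PCI addresses so the passthrough pair can be bound
# to vfio-pci by address (e.g. via driverctl) rather than by
# vendor:device ID, which would also catch the host card.
import subprocess

out = subprocess.run(["lspci", "-Dnn"], capture_output=True, text=True).stdout
for line in out.splitlines():
    if "NVIDIA" in line and ("VGA" in line or "3D controller" in line):
        address = line.split()[0]  # e.g. 0000:21:00.0
        # Print the override command; run it only for the two 3090s.
        print(f"driverctl set-override {address} vfio-pci")
```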

Question 4: I’ve listed two options for RAM here. Surprisingly, the first one (which has smaller modules: 4x16GB) is more expensive. I am no expert on RAM. Looking at the specs, it seems that the latency is higher on the 2x32GB modules. However, I am trying to maximize RAM in the machine if possible. What are the trade-offs here? And does either kit work with this system (amd(dot)com/en/products/ryzen-compatible-memory)?
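In case I am misreading the timings, here is the latency math as I understand it (first-word latency; the memory clock is half the transfer rate on DDR):

```python
# First-word latency in ns = CAS cycles / clock = CL * 2000 / (MT/s).
def first_word_latency_ns(cl: int, mt_per_s: int) -> float:
    return cl * 2000 / mt_per_s

for kit, cl in [("F4-3600C16Q-64GTZNC (4x16GB, CL16)", 16),
                ("F4-3600C18D-64GTZN (2x32GB, CL18)", 18)]:
    print(f"{kit}: {first_word_latency_ns(cl, 3600):.2f} ns")
```

So the CL16 kit is about 1.1 ns faster to first word, which seems small, but I don’t know how much it matters in practice.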

Question 5: Is there anything else in this system that isn’t going to work or will be problematic? Any better GPU options? (Note: I don’t want to get a TPU.)

Also, did I forget any components? And does the cooler need some kind of “paste” on the processor? (god… haven’t done this in a long time :roll_eyes:)

Question 6: Is the case large enough at 577 x 243 x 586 mm?

As mentioned in the intro, I am hoping to set up several different passthroughs for different things. Most will use just one GPU, while the main one will need 2 GPUs passed through for heavy AI lifting. But at most 3 systems (main + 2 passthrough systems) would be running at the same time.

Thanks so much for the help. Much appreciated!

You do need thermal paste :)
It looks like your cooler will have it pre-applied, though. (source: specifications at Celsius+ S36 Prisma — Fractal Design)

Thanks for the reply :smiley: :+1: That makes sense. Wonder if anyone knows about the other points, mostly whether 3 GPUs will fit on that MB?

I don’t know offhand, but I do know that the 3090s are three-slot cards. Look at the PCIe layout of the board. Threadripper has plenty of PCIe lanes, but access to them depends on how the board is laid out. Expect the best but plan for the worst.

If we assume that all three cards are 3-slot cards, then you would need to populate every third PCIe slot, which means you would need a board with at least 7 PCIe slots, and your case would need additional space at the bottom to accommodate the bottom card (its cooler will hang past the last slot). You would also need the correct lanes available in the first, fourth, and seventh slots.

Thanks for the response. I did a good bit more reading and decided I will have to go with a PCIe riser or install at least one of the cards vertically. So now I need a case that supports this. I’ve looked at a number of them, and the Corsair Obsidian 1000D looks like a potential candidate, with enough overall space inside: 2 cards on the board in the regular horizontal fashion, and then one card mounted horizontally above (on top of) the motherboard. I suppose I might have to buy a custom mount for it or do some custom modding to make this work.

I am seriously wondering how the people who use 4+ cards do it. Are they all using custom cases with PCIe risers / extensions? I’ve read in mining forums that some people run 6 or more cards in a system. Surely that must all be custom then?

They could be using an eATX motherboard in a full-tower case that supports it.

The problem is still a solid installation inside the tower. After looking more into what miners do, it is pretty much always a custom frame build where the GPUs are either mounted to or simply hung vertically from some kind of frame. Quality and stability range from very bad to extremely good custom builds.

I’ve done even more research on this. When building a rig that is going to be a solid machine learning powerhouse, the best option is actually one (or more) custom liquid cooling loops. That involves removing the stock fan cooler from each GPU and installing a custom-made water block (typically an acrylic or acetal top over a metal cold plate) instead. Several companies make them, and I even found ready-to-go GPUs with water blocks pre-installed; I think ASUS makes them in conjunction with a company called EK. You can also buy blocks from a number of companies, including Corsair for example. Here is an example of that.

Anyway, the point is that this pretty much changes the GPU from a 2-slot (double) or 3-slot (triple) PCIe component into a single-slot component. That in turn easily allows 4 cards to be mounted right next to each other. That is the biggest takeaway for me here.

So basically, what I want can be done, but not without substantial configuration and part modding. One danger is a pump failing while you have 3 or 4 cards running overnight. The best option (something I saw) seems to be 2 pumps in the same loop, in case one does fail.

I’ve never done a custom liquid loop, but from watching some videos it does look very doable.

I might update further on what I am going to do / find out.