I am planning to build a workstation for cybersecurity research and deep-learning development. I am contemplating dual Asus ROG STRIX RTX 3090s in SLI with a Threadripper 3960X. While I have gone through several videos online showing this setup, none of the builds installs an additional add-on card or states whether there is any clearance to fit a PCIe extension cable on the PCIe x16 slot covered by the 3090s.
Since I plan on adding an Aorus Gen4 AIC, what mobo do you guys recommend for such a setup?
I was sold on either:
ROG Zenith II Extreme Alpha TRX40: Issue: when installing dual 3090s in slots 1 and 3, slots 2 and 4 seem to be completely covered. No one discusses whether a PCIe x16 extension cable can be put in slot 4 with this setup.
ROG Strix TRX40-E Gaming: Issue: it seems the GPUs could be installed in slots 1 and 3 with the AIC in slot 2. The problem is that I wouldn’t be able to use SLI due to the added distance between the GPUs (slot 2 sitting in the middle).
What is your recommendation based on your experience?
With the Zenith: 2.9-slot graphics cards in slots 1 & 3 means the one in slot 1 won’t have enough room above the one in slot 3 for its fans to get air.
I have a STRIX 2080Ti-O11G (a 2.7-slot card). Slot 2 is completely covered, and slot 3 is too close for my comfort.
IMO, you will probably need water cooling or a lot of riser cables (I think most, if not all, won’t work with PCIe 4.0) unless you can find an XL-ATX board with just the right slot configuration and a big enough case.
Maybe Gigabyte TRX40 Aorus Xtreme? But even then I don’t think you could put something in slots 2 and 4.
Good day Tived. Thank you very much for your input. The main bottleneck in deep learning is the I/O to and from your storage. Some of my datasets can be as large as 8TB (when dealing with network data). I am trying to avoid water cooling and see if I can stick with stock cooling solutions. Please let me know if any other ideas come to mind.
Thank you very much for sharing your experience. Everything I have seen so far backs up your concerns. I am trying to stick with stock cooling solutions. And yes, an AIC or a RAID controller for attaching multiple SSDs is needed alongside the dual 3090s. Please let me know if you have any additional suggestions.
Thank you very much for your insight. I have been getting a lot of really good information on this topic from your videos “Building the ultimate DIY NAS on AMD” and “Is The RTX 3090 A Machine Learning MONSTER?”, as well as videos from Jay, Robey, Steve and Paul. In my case, I need to use an Aorus Gen4 AIC alongside dual 3090s in SLI to maximize the amount of high-speed storage in the system. For example, even filling M.2 slots 1, 2, 3 and the DIMM.2 with Samsung 980 Pro or Sabrent 4TB Rocket Q4 NVMe PCIe 4.0 M.2 2280 drives would give us a maximum of 5TB–20TB of storage.
In contrast, by using an AIC plus M.2 slots 1, 2 and the DIMM.2 we could potentially get a maximum of 24TB. The extra storage will come in very handy as we focus on generating federated deep-learning models trained on datasets as large as 8TB. Most data is collected using the Elastic Stack from dozens of VMs.
I would greatly appreciate your recommendations on setups with dual 3090s in SLI and an AIC in a third PCIe slot. Feel free to let me know your recommendations using stock cooling (is it even possible with the currently available mobos?) and water-cooling options.
Thanks for the great content you and your team generate. It gives a lot of insight to the scientific community. P.S. Would love to keep having AI benchmarks videos when possible.
@gonzalo How many M.2s do you want/need? A lot of motherboards share resources between PCIe slots and other board resources (e.g. SATA controllers), so just because the board has the physical slot doesn’t mean it will be available.
Do you need the 3090’s to run at x16?
The only thing I can think of that would work with stock cooling:
With the Aorus Xtreme you could get one 3090 at x16 in slot 1, one 3090 at x8 in slot 4, an AIC with 4 M.2s at x4 each in slot 3, and 4 M.2s on the motherboard (2 through the chipset, 2 direct to CPU). You would need a big case though, because the bottom 3090 would hang past the bottom edge of the XL-ATX board.
With water cooling your possibilities are almost endless. You can stack the cards close together and put your AIC in the middle (as long as its fan doesn’t get blocked).
You could also pick up 3090s with factory water cooling (AIO type or custom-loop type). Examples:
Thank you very much for the insightful recommendations!
I was aware that we might encounter the issue of running one GPU at x8 given the number of M.2 slots occupied plus the AIC. I have no idea how the lanes would be allocated given that the Threadripper 3960X has 88 PCIe lanes (I still haven’t read enough).
What would be your recommendation for a system running two 3090s in SLI at x16 (both) while maximizing high-speed storage capacity (i.e. an AIC or a few M.2 Samsung 980 Pros installed)?
I would go full custom loop, but that’s just me…
Either way you will have a hard time finding a case to pack all this into and keep it cool. Water-cooling 2x3090’s means you want at least 2x 240mm radiators. Water-cool the CPU too and that’s another 240mm radiator.
FYI: the 3960X, 3970X and 3990X all only have 64 lanes.
x8 to the chipset
x32 to your two 3090s
leaves a max of 6 M.2s direct to the CPU (if the motherboard doesn’t use some lanes for things like 10Gb NICs or USB).
Any remaining M.2s would run through the chipset (and take a performance hit).
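To put rough numbers on that lane budget, here is a quick back-of-the-envelope sketch. The 64-lane figure is the TRX40 CPU spec; the per-device allocations are assumptions for the particular dual-3090 layout discussed above, not a statement about any specific motherboard:

```python
# Rough PCIe Gen4 lane budget for a TRX40 CPU (3960X/3970X/3990X).
# Allocations are assumptions for the dual-3090 + NVMe layout above.
CPU_LANES = 64

chipset_link = 8            # x8 link to the TRX40 chipset
gpus = 2 * 16               # two 3090s at x16 each
remaining = CPU_LANES - chipset_link - gpus

m2_direct = remaining // 4  # each NVMe M.2 drive uses an x4 link
print(remaining, m2_direct)  # 24 lanes left -> 6 M.2 drives direct to CPU
```

Drop one GPU to x8 and you free another four M.2s’ worth of lanes, which is where the slot 1 x16 / slot 4 x8 configuration above comes from.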
You may find you don’t need the full x16 link for your workload. Things like mining can run fine on x1 links (for example), but you would probably have to test this to be sure.
The Threadripper Pro CPUs that are coming out will have more lanes, and the motherboards will probably reflect that with more connections (M.2 or PCIe slots), at the expense of CPU clock speed.
If you don’t want to watercool:
Gigabyte Aorus Xtreme or ASUS Zenith II Extreme Alpha
2x AIO type 3090 in slots 1 and 3, mount radiators in the case somewhere
2x AICs with 2 M.2 SSDs each (you’ll probably have to buy the 4x M.2 models and only fill them with 2 SSDs each), OR
2x PM1735’s or something similar. Higher capacity, higher cost, but enterprise grade.
run 4x m.2’s on the motherboard
On the Aorus Xtreme it would be 2x to CPU and 2x through chipset
On the Zenith II it would be 1x to CPU and 3x through the chipset; it uses x4 lanes for USB-C according to AnandTech.
That would give you 29.6TB of SSD if you used the 12.8TB PM1735’s and 1TB 980 Pros.
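For anyone checking the arithmetic, the 29.6TB figure breaks down like this (drive counts are the ones assumed in the config above; capacities in TB):

```python
# Capacity check for the air-cooled config sketched above:
# 2x Samsung PM1735 (12.8 TB each) in the PCIe slots,
# 4x Samsung 980 Pro (1 TB each) in the motherboard M.2 slots.
pm1735_tb = 2 * 12.8
sp980_tb = 4 * 1.0
total_tb = round(pm1735_tb + sp980_tb, 1)
print(total_tb)  # 29.6
```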
Not to throw even more decisions into the pot, but…
You could get some M.2 to SFF-8643 adapters and then run some higher-capacity U.2 drives like the PM1733; each one of the M.2 slots would then give you up to 15.36TB, bringing the total up to ~87TB of SSD.
Going further, you could run 2x U.2s on each of the x8 PCIe slots (not sure about the right adapter here) and get up to ~122.9TB of SSD at Gen4 NVMe speeds.
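Here is my reading of how those two totals fall out. The exact drive mix behind each number is an assumption on my part (the thread doesn’t spell it out), but the arithmetic matches:

```python
# Assumed drive mix behind the ~87 TB and ~122.9 TB totals above
# (my interpretation of the configs; capacities in TB).
PM1733 = 15.36   # U.2 drive, attached via M.2-to-SFF-8643 adapter
PM1735 = 12.8    # AIC form factor

# ~87 TB: four motherboard M.2 slots adapted to U.2 PM1733s,
# plus two PM1735 AICs kept in the PCIe slots.
total_87 = round(4 * PM1733 + 2 * PM1735, 2)

# ~122.9 TB: all eight x4 links (4 M.2 slots + 2 x8 slots carrying
# two U.2s each) populated with PM1733s.
total_123 = round(8 * PM1733, 2)
print(total_87, total_123)  # 87.04 122.88
```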
The enterprise SSDs are costly though…
If storage (space and speed) and GPU matter more for your workload than CPU clock speed, you may want to look at an EPYC-based or Threadripper PRO-based system. Both support 128 lanes of PCIe Gen4.
Thank you again for all your advice. All of this information is taking me down the right path.
I have been running 18 enterprise SATA SSDs using a MegaRAID SAS 9361-24i. While they have been quite useful, the setup occupies a lot of space, and I want to move to PCIe Gen4 and take advantage of the faster speeds. Storage (space and speed) and GPU are definitely more important for my workload than CPU clock speed. AI training is all done on the GPU, and now all pre-processing can potentially be migrated to the GPU as well with NVIDIA DALI. CPU cores will be repurposed for deploying multiple containers or VMs running small-scale data-analytics clusters (i.e. Elastic Stack, Spark).
One thing seems clear: whether I decide to use SSD AICs, keep my SAS 9361-24i with my current SSDs, or run 2x U.2s (SFF-8639) on each of the x8 PCIe slots (I doubt I will go this route), I will need water cooling, either a custom loop or 2x AIO-type 3090s.
Threadripper Pro seems to be the way to go for my particular application, as you mentioned; nevertheless, it seems we will need to wait for the mobos and for processor pricing.
One question on a specific recommendation, “2x AICs with 2 M.2 SSDs each (probably have to buy the 4x M.2 models and only fill with 2 SSDs each)”: why only 2 if there is space for 4?
Gigabyte does make a 2-slot blower 3090 (link here).
I would be hesitant to put them directly next to each other, or a long card right next to one, but you could put your SAS card next to it, or maybe find an add-in card that you can connect U.2s to (like this guy, but Gen4).
You can connect the SFF-8643 card to an Icy Dock hot-swap bay for M.2 or U.2 (also watch Wendell’s review of the U.2 version). That would leave plenty of room for the 2-slot 3090s to breathe on either the Gigabyte or the Asus board, and the 4x SSDs connected to the SFF-8643 cards would only take up a 5.25" bay. But I believe they are only PCIe Gen3.
With most AICs, if you put a 4x M.2 AIC into an electrically x8 slot, only 2 of the 4 bays will have a connection and function, unless you find one that does some PCIe-switching witchcraft on the AIC board. I’m pretty sure Supermicro makes some that can run 4 M.2s on an x8 slot, but they are Gen3; someone else might make a Gen4 version, but that is not something I’ve researched.
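The x8 limitation falls straight out of PCIe bifurcation: a passive quad-M.2 carrier has no switch, so the host just splits the slot’s lanes into fixed x4 groups, one per bay, and any bay beyond the available groups gets no link at all. A toy model of that (the function is illustrative, not any real tool):

```python
# Passive quad-M.2 carrier cards rely on the host bifurcating the slot
# into x4 links, one per bay. Without a PCIe switch there is no
# remapping: bays beyond the available x4 groups simply get no lanes.
def active_bays(slot_lanes: int, bays: int = 4, lanes_per_bay: int = 4) -> int:
    """Number of M.2 bays that get a link on a passive bifurcation card."""
    return min(bays, slot_lanes // lanes_per_bay)

print(active_bays(16))  # 4 -- a full x16 slot lights up all four bays
print(active_bays(8))   # 2 -- an electrically x8 slot: only two bays work
```

Cards with an on-board PCIe switch (the “witchcraft” variety) sidestep this by multiplexing all four drives over whatever uplink the slot provides.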
Now you’ve got me interested in the potential restrictions of AICs. I think you are correct about the SFF-8643 cards being PCIe Gen3 only; I haven’t seen anything on Gen4 yet. The good thing about these cards is that you can use different RAID modes with your SSDs. I get amazing performance with SAS, but it is drastically reduced when using SATA.
The more I read, the more convinced I am about going with water cooling; although there are still some things that make me think twice, mainly the maintenance. Plus, I have never built a water-cooling system in my life.
It just depends on your workload. For gaming, 3090s seem to be overkill, but in my case I really take advantage of the HBM2 or GDDR6X memory. In deep learning, the more layers you add to your model and the greater the batch size, the more memory training requires; otherwise you get out-of-memory errors. You can always use model parallelism or data parallelism to accelerate the process, and you do not need SLI for that. I will be exploring SLI for other applications and possibly in my research. NVLink is widely used for connecting multiple GPUs in server-grade systems.
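The batch-size/memory relationship is easy to see with a rough estimate. This is a deliberately crude model with made-up layer sizes (fp32 only, ignoring optimizer state, gradients and framework overhead), just to show why a bigger batch blows past a fixed VRAM budget:

```python
# Crude training-memory estimate: activation memory scales linearly
# with batch size, while parameter memory is fixed. All numbers are
# illustrative assumptions, not measurements of any real model.
BYTES_FP32 = 4

def approx_mem_gb(params: int, acts_per_sample: int, batch: int) -> float:
    """Very rough fp32 footprint in GB (weights + activations only)."""
    weights = params * BYTES_FP32
    activations = acts_per_sample * batch * BYTES_FP32
    return (weights + activations) / 1e9

# Hypothetical 100M-parameter model, 50M activation values per sample:
for batch in (8, 32, 128):
    print(batch, round(approx_mem_gb(100_000_000, 50_000_000, batch), 1))
# Footprint grows linearly with batch; past the card's VRAM you OOM.
```

This is why the 24GB on a 3090 matters for training even when the compute alone wouldn’t justify the card.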
Knock yourself out man, just genuinely curious whether SLI and NVLink even have any worthwhile applications now that PCIe 4.0 allows x8 + x8 with no slowdowns whatsoever (if the motherboard allows, of course).
There are some things that may not be relevant to your usecase:
We don’t really care about noise that much (the WS sits in a rack with much louder equipment), which is also why we avoided any kind of water cooling (that, and longevity)
Meshify C - the only case we found that has “good enough” airflow, still fits in a 19" rack, and barely fits the NH-U14…
Gigabyte Turbo 3090s - blower-style 2-slot cards (they help draw air out of the chassis!) with NVLink
TRX40 Creator - standard ATX (case compatibility), 10G Ethernet, “good-enough” VRM (until the fan dies), and we can still stuff in 2x PCIe extensions (well, 1, since we stuffed in an additional networking card because it was lying around…)
(Theoretically we could stuff in 2x 2-slot GPUs if need be.)
RAM runs XMP with no problems (well, don’t change the UEFI settings TOO much…)
Temperatures are under control even when both the CPU and GPUs are being hit (no throttling on the GPUs; the CPU does “throttle” (does not boost above what AMD promises) when it runs for ~40 min). From time to time I check what’s happening with VRM temps (the motherboard under Linux is a bit… special… so let me know if you need a starter guide for that).
So far we’re happy with it, though we were forced to make some compromises (mostly the MB, although it seems to work OK). We don’t really plan to overclock this, so I think we’re safe. Let me know if you need any more info.
One issue we had: one of the GPUs has a slight ticking noise from a fan at idle… but that goes away once the fans start spinning faster… We’ll warranty it if it dies…