Epyc Build

This is my first time actually building a workstation-class PC, and I am not sure how to figure out which workstation GPU is the best bang for the buck. I'm not really trying to play games on this PC; it will be specifically for rendering. I am also interested in AI/deep learning tasks, but that would be more of a side case. It will mainly be for 3D rendering.

I have been looking at the Radeon Pro Duo Polaris, at the Tesla and Titan GPUs, and at the Quadro P- and K-series cards. At first my main choice was something like a 6950 XT, since the graphics memory seemed good, but I heard that Nvidia is just better for rendering than the AMD Radeon cards, which brought me to the 3080 or 3090. However, since I'm already in that price range, would it be better to just get a workstation card, given that I won't be playing games on this PC?

I am really just looking for some guidance, or some websites or places to go, to figure out which GPUs I can use. I am looking for something with as much VRAM as possible, but not at 4k prices lol. I would like to spend no more than 1k, preferably 600-800 bucks. This will be a dual Epyc 7551 build. Thanks for your help.

I am a novice at AI stuff myself, but:

  • if you want AI and ML, then do not bother with any AMD GPU; you want Nvidia. This is not optional. CUDA is pretty much the way to go, and trying to shoehorn premade AI workflows into AMD GPU compatibility is a giant headache.
    edit: it seems AMD support may have changed in the time since I last tried this; see details here: https://forum.level1techs.com/t/mi25-stable-diffusions-100-hidden-beast/194172/176

  • before you start building something, you need to think hard about your form factor and your overall system format; is this going to be running in a desktop PC case or a server rack of some kind?

  • how much VRAM do you actually need? This is incredibly important because you cannot just upgrade your VRAM later; you must buy GPU(s) with a sufficient amount from the start (see the quick check sketched after this list)

  • are you planning to use NVLink to access larger amounts of VRAM?

  • what workloads are you actually planning to attempt?
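For the CUDA and VRAM questions above, a quick sanity check from Python can help; this is just a minimal sketch on my part, assuming a PyTorch build with CUDA support:

```python
# Minimal sketch: list the CUDA GPUs PyTorch can see and how much VRAM each one has.
# Assumes a PyTorch build with CUDA support; AMD/CPU-only setups will report no devices.
import torch

if not torch.cuda.is_available():
    print("No usable CUDA device found.")
else:
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")
```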

I have been futzing around with some light AI stuff for the past months, mostly just running models found online such as Stable Diffusion (GitHub - AUTOMATIC1111/stable-diffusion-webui) and language models (GitHub - oobabooga/text-generation-webui).

Notably, I have not been training models; do you plan to do that?

For Stable Diffusion, I did really well with a single Nvidia RTX A4500 graphics card. This card has 20GB of VRAM and performance on par with an RTX 3070. Notably, it has a blower-style thermal solution that ejects heat out the rear of the computer, and it has relatively low power draw at around 200 W through a single 8-pin PSU connector. Note that for Stable Diffusion alone, this card is actually overkill; the best budget card is the RTX 3060 12GB, which has just enough VRAM to do well on mid-sized image generation batches and costs a measly $350.
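To give a concrete sense of what a 12GB-class card handles, here is a minimal sketch using the diffusers library in half precision; the checkpoint name and prompt are just example placeholders, not something specific to this thread:

```python
# Minimal sketch: load Stable Diffusion in fp16 so it fits comfortably in ~12GB of VRAM.
# Assumes torch + diffusers are installed; the model ID is only an example checkpoint.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # example checkpoint, swap in whatever you use
    torch_dtype=torch.float16,          # half precision roughly halves VRAM use
).to("cuda")

image = pipe("a rendered test scene, studio lighting").images[0]
image.save("test.png")
```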

I wanted to move up to language models, which require more VRAM, and I wanted to splurge a little bit. At the moment, the best bang-for-your-buck configuration seems to be 2x RTX 3090 graphics cards in an NVLink configuration. This gives you a total of 48GB of VRAM (24GB each); note that you might have to write the code to utilize it yourself, though.

Unfortunately, almost all 3090s are gamer cards with a multi-fan thermal configuration that dumps hot air all over the interior of your PC case, and they pull 350W each; some even require 3x 8-pin PSU cables each. Suffice it to say, these cards are hot and large and blow up your PC's power draw and thermals pretty badly. If you can utilize them, they are plenty powerful for the price (currently running about $760 each on eBay, plus $200 for an NVLink bridge if you can find one of the appropriate size), but you might end up suffering if your PC is not already designed to exhaust 700W of hot air.

I am still figuring this part out because, fool that I am, I have shoved two of them into a Meshify C PC case with a 1300W PSU. And that is not even getting started on the issues with overheating memory junctions, re-padding your GPU, and finding a pair of identical cards that can physically fit with your case and motherboard.
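On the "write the code yourself" point: in practice the common libraries can already shard a model across both cards without any NVLink-specific code. Here is a minimal sketch with Hugging Face transformers + accelerate; the model name is only an example, not something from this thread:

```python
# Minimal sketch: spread a large language model across two 24GB cards.
# Assumes transformers + accelerate + torch are installed; the model ID is only an example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-6.7b"  # example model; pick whatever you actually run
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",   # accelerate splits the layers across every visible GPU
)

inputs = tokenizer("Hello from a dual-3090 box,", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```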

My suggestion is to closely re-evaluate your needs, your budget, and your tolerance for thermal headaches.

If you have a lot of money to blow, you can just get two RTX A5000s and have the same amount of VRAM over NVLink, but with lower power draw and a blower design that ejects all the heat out the back of the case. No headaches there.

If you have even more money to blow, you can just get a single RTX A6000 or 6000 Ada, which has 48GB VRAM in a low power blower form factor.

If you have less money to blow but can tolerate potentially butchering your PC case's thermal configuration, you can attempt 2x RTX 3090s and deal with the power and heat later. Note that imposing a power limit with nvidia-smi helps a lot here.
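For reference, the limit is set per GPU; here is a small sketch of applying it from Python by shelling out to nvidia-smi. It assumes nvidia-smi is on your PATH and that you run with sufficient privileges, and the 280 W figure is just an example, not a recommendation:

```python
# Minimal sketch: cap each GPU's power draw by calling nvidia-smi.
# Assumes nvidia-smi is on PATH and the script runs with admin/root rights.
# The limit does not persist across reboots unless you re-apply it.
import subprocess

POWER_LIMIT_W = 280  # example value for a 3090; tune to your own thermal budget

def set_power_limit(gpu_index: int, watts: int) -> None:
    # -i selects the GPU, -pl sets its power limit in watts
    subprocess.run(["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)], check=True)

for idx in (0, 1):  # two 3090s in this example
    set_power_limit(idx, POWER_LIMIT_W)
```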

If you like the above option but have a little more wisdom, you can scrounge eBay for the blower-style RTX 3090s that were briefly made by both Gigabyte and Asus but have since been discontinued; these should negate the thermal issues by blowing the hot air out the rear of the case.

If you cannot afford any of the above, then consider re-evaluating your plans and needs. A single RTX 3090 will probably get you pretty far but might impose limitations if you want to scale up; still, it might be your best option for getting started quickly at ~$760 on eBay.

And after you have thought about all that, think again about whether it is even worth building a physical bare-metal system in the first place; you can rent time on AWS or similar, with access to the most powerful GPUs on the planet, for a fraction of the money you will spend building a comparable system yourself.

BTW most of the higher wattage solutions rely on a low ambient air temperature. The room with the PC will probably need air conditioning while you are running your ML tasks. This either means adding a small air conditioner, or if you have central air, possibly adding a booster to the duct that dumps air into the room with the PC.

@MikeGrok that is a really great point. When I got into this, I just assumed that I would be running the AC pretty much all summer from now on, usually once the room ambient temps get above 75F.

The gf has been asking "why is it so hot in here?" lately; please pray for me when she finally figures out that it's from the PC… :sweat_smile: :grimacing:

Yeah, it seemed like the dual 3090 would be the best solution. And yeah, I know heating/venting could be a problem, but I'm not worried about it; I plan on having this in my basement, with AC if needed, and I also have a 4U rack-mountable case. As I said before, it would really be for 3D rendering and things like that; AI would be kind of an offshoot because I find it interesting. Thanks for your help so far @gc71 @MikeGrok

I am in Alabama. Two years ago during the summer, we had 3 weeks where the nighttime low did not drop below 93F and the daytime high did not drop below 109F, peaking at 113F. I don't know where you are located, but sometimes just opening the window is not a good option. Other times it is a great option.

If heat and/or power, and maybe price, is a concern and you are considering dual 3090s, take a look at the 4090. I have heard that for a given compute level on ML tasks, it performs twice the work of a 3090. If you are still planning on dual 3090s, make sure that your ML task can actually run on a pair of GPUs, rather than only using one GPU at a time. You can still run VMs and give each VM a GPU, but it would be nice if the task were multi-GPU capable.
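A crude way to sanity-check the multi-GPU point is sketched below; this assumes PyTorch and only confirms that both cards are visible and usable together, it does not prove that a given workload scales across them:

```python
# Minimal sketch: confirm both GPUs are visible and can run a forward pass in parallel.
# This does not prove your ML task is multi-GPU capable; it only rules out basic
# setup problems (driver, CUDA build, a card not being detected).
import torch
import torch.nn as nn

print("GPUs visible:", torch.cuda.device_count())

model = nn.Linear(1024, 1024)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)   # simple data parallelism across all visible GPUs
model = model.cuda()

x = torch.randn(64, 1024, device="cuda")
print(model(x).shape)
```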

You could make it a 4th gen Xeon build and play with the AI accelerators that are on the CPU. Just an idea.

I appreciate the post, good point. I was going to get a single 3090 and, if need be, upgrade to two. As for cooling, I am up in Michigan, and I think I have that taken care of; yes, it can get hot up here, but even so, I have an air conditioner in the room if I need to cool it down.

The 4090 is an interesting idea for sure. I thought about it, but that's like 2.2-2.5k, whereas I could get two 3090s for like 1-1.4k. Besides, this is kind of a new build for me, so I have a general idea of what I need on the rendering side, just not on the AI side. A single 3090 would technically be plenty for the 3D rendering part; it's the AI stuff that has me wondering, so if a single 3090 is all I need, then cool, I won't get the second one. That's kind of my thinking about the GPUs so far.

I thought that since I was willing to spend the money it might be worth getting a Quadro, but after doing the research it didn't really seem like it, so that's why I came here to get other opinions. It's kind of a bummer, though; I wanted a reason to actually buy a Quadro card, and it looks like there really isn't one besides stability. :frowning: lol

It's too late for me; I already bought into the Epyc line, my friend.

What type of AI are you doing?

If it is Stable Diffusion based, that only requires about 11GB of VRAM for most tasks, including things like training your own models. I plan to get the 4070 (12GB) for my PC; plus, its idle power draw is only around 15 watts. The VRAM requirements dropped significantly about 1.5 months ago.
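Part of that drop is half precision plus the memory-saving toggles the libraries now expose; here is a minimal sketch with diffusers (the checkpoint name is only an example, and enable_model_cpu_offload needs accelerate installed):

```python
# Minimal sketch: the memory-saving switches that let Stable Diffusion fit in ~12GB or less.
# Assumes torch + diffusers (+ accelerate for cpu offload); the checkpoint is only an example.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)
pipe.enable_attention_slicing()     # trades a little speed for a large VRAM reduction
pipe.enable_model_cpu_offload()     # moves idle submodules to system RAM between steps

image = pipe("test prompt", height=768, width=768).images[0]
```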

I am currently using an M2 MacBook Air (24GB/2TB). It works, and it is the same speed as the 4070 at 512x512, which Apple optimized to run on their AI processors (AI processor use = +3.5 watts), but no one has modified that code for larger image sizes. The Apple AI processor is about 20 times faster than the GPU for AI tasks, but the AI images I generate only get interesting at higher resolutions, and there the 4070 will be 20x to 30x faster.

I also bought into the Epyc line. My CPU is the Epyc 9124, and as soon as I find the DDR5 RDIMMs I bought (or buy other ones), I will be able to post some benchmarks.

For now the main thing would be Stable Diffusion based things, yes. I would be interested in doing some ML for a chess engine called LC0 (Leela Chess Zero), among other things, but like I said, I'm kind of new to the AI/ML part of it. But yeah, those are the main things I'm interested in at the moment.
