Hi,
Per title, I’m taking care of one R750xa that runs LLM inference, no training (for now lol). Due to rack space/PDU capacity constraints, I cannot order a proper chonker to replace the R750xa, but I do have a budget to replace the two A40s that sit in it, and I have to use it up this year.
I would love some input on this from people more sane than me:
My plan would be to replace the 2x A40 with 2x RTX Pro 6000s, BUT that would not “exactly” fit the server's official power budget for GPUs (4x300W).
If I go the Server Edition route, the math on paper is simple: 2x600W = 4x300W. On top of that, the Server Editions have advertised “configurable power draw”, so if I set the power target to 50% I could get to approx. 300W/card.
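To make the paper math explicit, here is a rough Python sketch. It assumes the 4x300W figure is the whole GPU budget and only uses the wattages quoted above, not measured values. `fits_budget` and `power_limit_cmd` are names I made up for illustration; the only real tool involved is `nvidia-smi`, whose `-pl` option sets a power limit in watts. I have not verified that the card accepts a 300W limit.

```python
# Rough power-budget check. All wattages are the on-paper figures
# from this post, not measurements.
GPU_BUDGET_W = 4 * 300  # server's official GPU power budget (4x300W)


def fits_budget(cards: int, watts_per_card: int, budget_w: int = GPU_BUDGET_W) -> bool:
    """True if the combined card power stays within the GPU budget."""
    return cards * watts_per_card <= budget_w


def power_limit_cmd(gpu_index: int, watts: int) -> list[str]:
    """Build (but do not run) an nvidia-smi command that caps one GPU.

    Running it needs root, and the driver rejects values outside the
    card's allowed min/max range.
    """
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


if __name__ == "__main__":
    print("2x600W fits:", fits_budget(2, 600))  # at the limit, zero headroom
    print("2x300W fits:", fits_budget(2, 300))
    for idx in range(2):
        print(" ".join(power_limit_cmd(idx, 300)))
```

At full power the two cards land exactly on the budget with no headroom, which is why the configurable limit matters. Whether 300W is actually within the card's allowed range is something `nvidia-smi -q -d POWER` should show once the card is installed.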
Then there is the Max-Q blower-style card with a default limit of 300W, but having two blowers stacked is a no-go for me, and I kinda want to keep the NVLINK bridge there, though I guess I could live without it. The only proper deal breaker for me is putting blowers into a server, which I’m not entirely comfortable with.
It would be no problem for me to make or source a 2x 8-pin EPS → 1x 16-pin firehazard cable. As long as I don’t use two 8-pins from the same 12V branch, it should not be a problem (correct me if I’m wrong).
Due to company policies, I can only order Ada/Blackwell GPUs, so no 4xA100 for me…
So the question is: should I go for the 2x Server Edition cards w/ NVLINK, or get the two Max-Qs and put them in separate GPU cages?
Thanks for all the inputs!