Threadripper Pro 5995WX performance

Hi everybody,
I bought a PC based on the Threadripper Pro 5995WX, with an ASRock WRX80 Creator R2.0 motherboard and 256 GB (8 x 32 GB) of Corsair Vengeance LPX DDR4-3200 (PC4-25600) RAM. I use the PC for Ansys Fluent simulations. I have selected 4 NUMA nodes per socket and disabled hyperthreading.
When I run the simulations, I see something weird. As I increase the number of cores for the simulations, the CPU usage increases until it reaches 100% with 44 cores. If I increase the number of cores beyond that, the simulation time does not decrease but increases drastically. Any idea why this happens? Any suggestions on how to make use of all 64 cores?

I just saw this video from STH, and it might explain what you are seeing: your workload might be memory constrained.

They show some graphs for one of their benchmarks that also stagnates at around 40 cores. I’m not sure whether more memory would help or whether it is a bandwidth constraint, but your post reminded me of the video I just watched.

It may be that the problem you’re solving isn’t big enough to be parallelized efficiently across that many threads; the overhead of the solver communicating between threads becomes greater than the work being done per thread.
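
To make that concrete, here is a toy sketch of the idea, not a model of Fluent itself. It assumes the run time is a fixed amount of work split evenly across cores plus a communication cost that grows linearly with the core count. The function names and the numbers for `work` and `overhead_per_core` are made up purely for illustration.

```python
# Toy scaling model (an assumption, not measured Fluent behaviour):
# run time = evenly divided work + overhead that grows with core count.

def run_time(cores, work=100.0, overhead_per_core=0.1):
    """Hypothetical run time in arbitrary units for a given core count."""
    return work / cores + overhead_per_core * cores


def best_core_count(max_cores=64):
    """Core count with the lowest modelled run time, searched from 1 to max_cores."""
    return min(range(1, max_cores + 1), key=run_time)


if __name__ == "__main__":
    for cores in (8, 16, 32, 48, 64):
        print(f"{cores:2d} cores: {run_time(cores):.2f} time units")
    print("fastest core count in this toy model:", best_core_count())
```

With these made-up numbers the modelled run time bottoms out well below 64 cores and then climbs again, which is the same shape as what you describe. A larger mesh (more `work`) pushes the sweet spot to higher core counts.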

Another very likely candidate for the plateau is memory bandwidth. In my experience, and obviously depending on the solver, 2-4 cores per memory channel is what to target for peak performance.

Out of curiosity, do you see any improvement when running with 8 NUMA nodes?
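
A rough back-of-the-envelope sketch of that rule of thumb for this machine is below. It assumes each of the 8 DIMMs sits on its own channel, that every channel runs at the rated DDR4-3200 speed, and that a channel is 64 bits wide. These are theoretical peak figures, so real sustained bandwidth will be lower.

```python
# Theoretical peak memory bandwidth for 8 channels of DDR4-3200,
# and how much of it each core gets at different core counts.
# Assumptions: one DIMM per channel, rated speed, 64-bit (8-byte) channels.

CHANNELS = 8
TRANSFERS_PER_SECOND = 3200e6   # DDR4-3200 = 3200 mega-transfers per second
BYTES_PER_TRANSFER = 8          # 64-bit wide channel


def peak_bandwidth_gbs(channels=CHANNELS):
    """Theoretical peak bandwidth in GB/s across all channels."""
    return channels * TRANSFERS_PER_SECOND * BYTES_PER_TRANSFER / 1e9


def per_core_gbs(cores, channels=CHANNELS):
    """Theoretical bandwidth share per core in GB/s."""
    return peak_bandwidth_gbs(channels) / cores


def suggested_core_range(channels=CHANNELS, low=2, high=4):
    """Core counts implied by the 2-4 cores per channel rule of thumb."""
    return channels * low, channels * high


if __name__ == "__main__":
    print(f"peak: {peak_bandwidth_gbs():.1f} GB/s")
    for cores in (16, 32, 44, 64):
        print(f"{cores:2d} cores: {per_core_gbs(cores):.2f} GB/s per core")
    print("rule-of-thumb core range:", suggested_core_range())
```

With 8 channels, the 2-4 cores per channel guideline works out to 16 to 32 cores, so a plateau somewhere before 64 cores would not be surprising for a bandwidth-bound solver.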

Many thanks for the suggestions. Makes sense now.

The BIOS doesn’t allow me to set more than 4 NUMA nodes.

Ohh, I wasn’t thinking of NUMA control from the BIOS. I had my COMSOL thinking cap on and thought you could just run Fluent with the “-numasets 8” argument, but that is not the case.
