Hello everybody!
Short story: in my research lab, we managed to get a used AMD EPYC 7532 and a TYAN S8030 GM2NE board for our deep learning work.
We are currently running Pop!_OS 22.04 with the packages for CUDA, the CUDA toolkit, and so on.
The problem is that when we train a model, the code runs on both the CPU and the GPU (RTX 3090), but it is rather slow. As a reference we used a 12th-gen i7 (I don't remember the exact model) with an RTX 3070 Ti, and it trains the same model about 20% faster.
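To narrow this down, here is a minimal sketch of how we could measure whether the time goes into feeding data (CPU side) or into the training step itself. `profile_steps`, `load_batch` and `train_step` are hypothetical names, not part of our actual code, and the sketch assumes `train_step` blocks until its result is ready (for example by converting the loss to a Python float).

```python
import time


def profile_steps(load_batch, train_step, num_steps=20):
    """Return (load_seconds, step_seconds) summed over num_steps iterations.

    load_batch and train_step are placeholders for the real input pipeline
    and training step; train_step is assumed to block until it has finished.
    """
    load_seconds = 0.0
    step_seconds = 0.0
    for _ in range(num_steps):
        t0 = time.perf_counter()
        batch = load_batch()          # CPU side: reading and preprocessing
        t1 = time.perf_counter()
        train_step(batch)             # forward/backward pass
        t2 = time.perf_counter()
        load_seconds += t1 - t0
        step_seconds += t2 - t1
    return load_seconds, step_seconds


if __name__ == "__main__":
    # Dummy stand-ins so the sketch runs without any framework installed.
    def fake_load():
        time.sleep(0.01)
        return [0.0] * 8

    def fake_step(batch):
        return sum(batch)

    load_s, step_s = profile_steps(fake_load, fake_step, num_steps=5)
    print(f"data loading: {load_s:.3f} s, training step: {step_s:.3f} s")
```

If the data loading share dominates on the EPYC machine but not on the i7, that would point at the CPU or memory side rather than the GPU.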
Both machines use a Crucial P1 500 GB as the OS SSD.
So the questions are the following:
Is it necessary to compile the TF package to get the full power of the CPU?
A collaborator also suggested using ZenDNN to improve the overall throughput of the system, but I read that ZenDNN is worth using for the inference stage and not for training. Am I right?
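Related to the first question, a minimal sketch of how we could check what the installed TF build actually sees before deciding to recompile anything. It assumes the package is importable as `tensorflow`; `describe_tf_setup` is a hypothetical helper name, and it returns `None` when TF is not installed.

```python
def describe_tf_setup():
    """Summarize the installed TensorFlow build, or return None if it is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return {
        "version": tf.__version__,
        # True when the installed binary was built with CUDA support.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # Devices TensorFlow can actually use on this machine.
        "gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
        "cpus": [d.name for d in tf.config.list_physical_devices("CPU")],
    }


if __name__ == "__main__":
    info = describe_tf_setup()
    if info is None:
        print("TensorFlow is not installed in this environment")
    else:
        for key, value in info.items():
            print(f"{key}: {value}")
```

An empty `gpus` list here would mean the training is silently falling back to the CPU, which is a different problem from an unoptimized build.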
On the hardware side, the other difference between the systems is the RAM: the i7 has 32 GB of DDR5 (XPG) in dual channel, while the EPYC runs on 16 GB of DDR4 ECC (again, I don't remember the type) in single channel.
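To put the RAM difference in numbers, here is a rough sketch of the theoretical peak memory bandwidth (transfer rate × 8 bytes per 64-bit channel × number of channels). Since I don't remember the exact modules, the speeds below (DDR4-3200 and DDR5-4800) are assumptions for illustration only, and `peak_bandwidth_gb_s` is a hypothetical helper name.

```python
def peak_bandwidth_gb_s(mega_transfers_per_s, channels, bytes_per_transfer=8):
    """Theoretical peak bandwidth in GB/s for 64-bit (8-byte) memory channels."""
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


if __name__ == "__main__":
    # ASSUMED speeds: the real modules in both machines may be slower or faster.
    epyc_now = peak_bandwidth_gb_s(3200, channels=1)   # one DDR4 DIMM
    i7_ref = peak_bandwidth_gb_s(4800, channels=2)     # dual-channel DDR5
    epyc_full = peak_bandwidth_gb_s(3200, channels=8)  # all 8 channels populated

    print(f"EPYC, single channel: {epyc_now:.1f} GB/s")
    print(f"i7, dual channel:     {i7_ref:.1f} GB/s")
    print(f"EPYC, eight channels: {epyc_full:.1f} GB/s")
```

Under those assumed speeds the EPYC would have 25.6 GB/s available against 76.8 GB/s on the i7. The EPYC 7532 supports eight DDR4 memory channels, so with a single DIMM it is using only a fraction of what the platform can deliver.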
I hope someone can help us find the issue.
Greetings and happy holidays.