Hello, do you know why the same algorithm can take different times on different computers? One takes 11 sec per iteration (Xeon W-2223) and the other (Ryzen 7 2700) takes 31 sec per iteration. I have looked at single-core benchmarks and the two CPUs are practically the same.
The language is MATLAB and the algorithm is a Monte Carlo method. There are multiple operations on vectors and matrices, but both CPUs support AVX instructions. The MATLAB version is the latest, R2021a Update 1.
But that was over a year ago, and he says at the beginning that the problem was fixed … Also, I can’t reply because the thread is archived. Thank you very much; I hope someone else can contribute something.
You say they have similar single-threaded performance and that your algorithm is AVX accelerated — have you compared AVX benchmarks for the two?
E.g. my Xeon E3-1231 is faster in AVX accelerated FFTs than my Ryzen 5 1600, although the Ryzen is generally faster.
Your Xeon W-2223 is capable of using AVX-512, whereas the Ryzen 7 2700 is not, so if your algorithm can use AVX-512 that may explain it (maybe…). Note also that Zen+ CPUs like the 2700 execute 256-bit AVX2 operations as two 128-bit micro-ops, so their effective AVX throughput per core is lower even for plain AVX2 code.
I used MATLAB 2018b on Linux and I remember it was using OpenBLAS in the background for some matrix operations.
It’s not just the “MATLAB language”: a function call can cascade down through a waterfall of calls that ends up in OpenBLAS, and then it’s an OpenBLAS issue.
Why would the LAPACK/OpenBLAS code calculate things differently on the Xeon than on the Ryzen? It may be that the Ryzen is missing some capability that OpenBLAS takes advantage of.
-or-
Maybe MATLAB uses the Intel Math Kernel Library (MKL), which is optimized for Intel CPUs and may fall back to slower code paths on non-Intel hardware like the Ryzen.
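Rather than guessing which library is in play, you can ask MATLAB directly which BLAS and LAPACK it is linked against (these `version` options exist in current releases; the exact string returned will vary by release and platform):

```matlab
% Report the BLAS and LAPACK libraries MATLAB is actually using.
% On recent releases this typically names Intel MKL, e.g.
% 'Intel(R) Math Kernel Library Version ...' (exact text varies).
version('-blas')
version('-lapack')
```

Running this on both machines would tell you whether the two installations are even using the same math library before you start comparing instruction sets.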
But for MATLAB to use the AVX instructions, do they have to be requested explicitly in the code somehow, or does that happen automatically?
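You don’t request AVX explicitly. MATLAB’s built-in array operations dispatch into a precompiled BLAS (MKL on stock installs), which selects AVX/AVX2/AVX-512 kernels at runtime based on the CPU. What you control is writing vectorized code instead of element-by-element loops, so the work actually reaches those kernels. A minimal sketch (the matrix size is just an illustrative choice):

```matlab
% Vectorized built-ins hit the SIMD-optimized BLAS; interpreted loops don't.
n = 500;
A = rand(n);  B = rand(n);

tVec  = timeit(@() A*B);              % one call into the BLAS
tLoop = timeit(@() loopMultiply(A,B)); % interpreted element-wise loop

fprintf('vectorized: %.4f s   loop: %.4f s\n', tVec, tLoop);

function C = loopMultiply(A, B)
    % Naive triple-loop matrix multiply, for comparison only.
    n = size(A, 1);
    C = zeros(n);
    for i = 1:n
        for j = 1:n
            C(i,j) = sum(A(i,:) .* B(:,j).');
        end
    end
end
```

The vectorized call should be orders of magnitude faster, and it is the only version whose speed depends on which SIMD kernels the BLAS picked for your CPU.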
Is there a way to know which AVX instructions were actually used? For example, how do you know that AVX-512 was used and not AVX2?
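MATLAB itself doesn’t report this, but if it is linked against MKL, MKL’s verbose mode prints one line per BLAS/LAPACK call including the instruction set it dispatched to. A sketch for Linux, assuming `matlab` is on your PATH and your install uses MKL (check with `version('-blas')` inside MATLAB); the environment variable must be set before MATLAB starts:

```shell
# Each MKL verbose line names the ISA the library enabled, e.g.
# "Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors"
# versus AVX2 on the Ryzen. -batch requires R2019a or newer.
MKL_VERBOSE=1 matlab -batch "A = rand(2000); B = A*A;" 2>&1 | grep -i mkl
```

Comparing that output on the Xeon and the Ryzen would show directly whether the two machines are running different kernel paths for the same operations.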