It appears the new AOCL 4.1 libraries are slower than the MKL 2022 libraries, even when running on one of the brand-new Threadripper machines:
AOCL trails MKL by about 10% in this workload. It is wild that Intel can write demonstrably more efficient libraries for AMD hardware than AMD can; surely AMD has more insight into its own hardware than Intel does.
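
For anyone who wants to check a gap like this on their own machine, here is roughly how I would put a number on it. This is only a sketch, not the harness behind the result above: `best_time` and `percent_slower` are names I made up, and the workload is whatever callable you pass in, timed once in a process linked against each library.

```python
import time


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time (seconds) of fn() over several runs.

    Taking the minimum reduces noise from other processes on the machine.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def percent_slower(t_candidate, t_baseline):
    """How much longer the candidate run takes than the baseline, in percent."""
    return (t_candidate - t_baseline) / t_baseline * 100.0
```

Run `best_time` on the same workload in each build, then pass the two timings to `percent_slower` with the faster library as the baseline. Note that "10% behind" can mean 10% more time or 10% less throughput, and the two are not quite the same number, so it is worth saying which one you measured.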