Optimal SIMD/Vector array width


Current State of Vector/SIMD

ISA        Architecture   SIMD width   SIMD ISA
Intel 64   Skylake-X      512 bit      AVX-512
AMD64      Zen            128 bit      AVX
AMD64      Zen 2          256 bit      ?
Power      POWER9         128 bit      VSX
ARM        Cortex-A8      128 bit      NEON

According to @Methylzero, AVX on Zen is internally 128 bits wide, so a 256-bit AVX instruction is executed as two 128-bit halves.
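To make the widths above more concrete, here is a rough sketch of how many elements ("lanes") fit in one vector register at each width. The dictionary names and the `lanes` helper are made up for illustration, and the numbers only describe register capacity, not real-world throughput.

```python
# Lanes per vector register = register width in bits / element width in bits.
# Widths are taken from the table above; the names here are illustrative only.
SIMD_WIDTH_BITS = {
    "AVX-512 (Skylake-X)": 512,
    "AVX (Zen, internal width)": 128,
    "VSX (POWER9)": 128,
    "NEON (Cortex-A8)": 128,
}
ELEMENT_BITS = {"float32": 32, "float64": 64}


def lanes(simd_bits, element_bits):
    """Number of elements of the given size that fit in one vector register."""
    return simd_bits // element_bits


for name, width in SIMD_WIDTH_BITS.items():
    per_type = ", ".join(
        f"{lanes(width, bits)} x {dtype}" for dtype, bits in ELEMENT_BITS.items()
    )
    print(f"{name}: {per_type}")
```

So a 512-bit register holds 16 single-precision floats, while a 128-bit register holds 4.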


What is the theoretically optimal SIMD width for CPUs?
I would guess this varies based on use case, but I’d be curious what people’s thoughts on this are.


  • Use case (duh, not everything on CPU is SIMD heavy)
  • At some point, moving the data to a GPGPU or ASIC probably becomes more efficient
  • Large TDP difference compared to SISD operation, as with Intel’s AVX-512
  • CISC/RISC ISA making a difference?
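As a back-of-the-envelope illustration of the use case point, Amdahl's law gives an upper bound on the overall speedup when only part of a workload vectorizes. This is a simplified sketch: it assumes the vectorized part speeds up by exactly the lane count and ignores clock throttling, memory bandwidth and data movement costs. The function name and the example fractions are made up for illustration.

```python
def amdahl_speedup(vector_fraction, lanes):
    """Upper bound on overall speedup when `vector_fraction` of the runtime
    is sped up by a factor of `lanes` and the rest stays scalar."""
    return 1.0 / ((1.0 - vector_fraction) + vector_fraction / lanes)


# 4, 8 and 16 lanes correspond to float32 in 128-, 256- and 512-bit registers.
for fraction in (0.5, 0.9, 0.99):
    row = ", ".join(
        f"{n} lanes: {amdahl_speedup(fraction, n):.2f}x" for n in (4, 8, 16)
    )
    print(f"{fraction:.0%} vectorizable -> {row}")
```

Under these assumptions, a workload that is only half vectorizable cannot reach even a 2x speedup, however wide the SIMD unit gets, so wider vectors mostly pay off for code that is almost entirely SIMD.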

Why are you asking about this?

Partly based on a February Lounge post by @wendell, in response to me asking about the Talos II board he is testing: maybe the 128-bit SIMD width on POWER9 is a limitation? But Zen’s SIMD is also 128 bits wide…

and also information I’ve been thinking about from these two threads:

In the FMA thread, I asked this question specifically about Power, but knowledge on that is a bit more scarce, so I’m curious what information/discussion a more general question will turn up.