Question: has anyone tested Zen 5 with CPU or mixed CPU/GPU inference with LLMs? With the new AVX-512 updates in Zen 5, I'm wondering if there have been any speed improvements. I'm not expecting GPU-like speed, but has the speed come up from 2 tokens per second to maybe 5+ for many models? I would think the 9700X would be the best CPU for this, since it has only 1 CCD, and you could put 96GB of RAM in a machine to load models that even a 4090 could not fit.
Phoronix has 9600x/9700x benchmarks
The higher core count CPUs are only launching next week.
Yeah, there’s not gonna be a 2.5x improvement…
Thanks for the information. Yeah, I was not expecting performance to go up 2.5x, but those benchmarks are quite impressive.
If your main focus is LLMs, remember that CPU inference is heavily bottlenecked by memory bandwidth, and this won’t change with Zen 5 on consumer platforms (still dual-channel only).
It’ll be faster, sure, but don’t expect much more than a 20% uplift.
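To see why bandwidth dominates, here's a rough back-of-envelope sketch: during decode, each generated token requires streaming roughly the full set of quantized weights from RAM, so peak tokens/sec is about bandwidth divided by model size. The model sizes below are illustrative assumptions, and real throughput lands below this peak.

```python
def est_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed: one full weight read per token."""
    return bandwidth_gb_s / model_size_gb

# Dual-channel DDR5-5600 peak: 2 channels * 8 bytes * 5600 MT/s = 89.6 GB/s.
ddr5_dual = 2 * 8 * 5600 / 1000  # GB/s

# Approximate Q4-quantized weight sizes (illustrative, not exact):
for name, size_gb in [("7B Q4", 4.0), ("34B Q4", 20.0), ("70B Q4", 40.0)]:
    print(f"{name}: <= {est_tokens_per_sec(ddr5_dual, size_gb):.1f} tok/s")
```

By this estimate a ~40GB 70B Q4 model tops out around 2 tok/s on dual-channel DDR5 no matter how fast the cores are, which is why AVX-512 gains mostly show up in prompt processing rather than generation speed.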