I hope they can overcome this memory bandwidth limitation and achieve really fast speeds with consumer hardware.
It's not all about the hardware, it's the software. Nvidia has the CUDA toolkit that all AI systems can use or are built to use. AMD has ROCm, which is slower and not as plug-and-play, and Intel has IPEX, which almost nothing works with, but I'm helping Intel write hijack code so that when an AI program makes a CUDA call, Intel cards can handle it. So AMD could put out a card with 48 GB of VRAM, but without the software it would be pointless. And if you think $2k is a lot, don't look at this at $19k: NVIDIA Tesla A100 80G Deep Learning GPU Computing Graphics Card OEM SXM4 | eBay
Ollama does support AMD graphics, but that doesn't mean all the models will run on AMD cards. I do like how, when I use a larger model, it spills over into system memory as well. It would make sense to support other vendors.