I think the easiest, as well as the cheapest, option for the largest models right now is actually a 24-48GB VRAM GPU plus as much RAM as your motherboard will take. Regular desktop platforms, slow as they are (and slower still in this configuration), can fit as much as 256GB as 4x64GB, and MoE models are perfectly usable for a great many personal purposes even from the slower system RAM, as long as you offload all the non-routed blocks to the GPU.
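For what it's worth, a sketch of that split using llama.cpp's tensor-override flag (assumes a recent build with `--override-tensor`/`-ot` support; the model filename and the tensor-name regex here are illustrative and depend on the particular model's GGUF naming scheme):

```shell
# -ngl 99 pushes all layers to the GPU by default; the -ot override then
# pins the routed-expert FFN tensors back to system RAM, so the dense
# attention/shared blocks stay in VRAM while experts stream from RAM.
llama-server -m DeepSeek-R1-Q3_K_M.gguf -ngl 99 -ot ".ffn_.*_exps.=CPU"
```

Since only a handful of experts activate per token, the RAM-resident tensors cost far less bandwidth than the parameter count suggests.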
See this thread for what people have been trying. 256GB RAM + 24-48GB VRAM should allow you to run 671B models at ~3-bit quant, 235B models at 6-7-bit quant, and anything smaller at 8-bit quant or more.
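The arithmetic behind those figures is just parameters times bits per weight (a lower bound, since it ignores KV cache and activation overhead):

```python
# Back-of-envelope weights-only memory for a model at a given quant level.
# Real usage is higher: KV cache and activations add overhead on top.
def weight_gb(params_billions, bits_per_weight):
    """Approximate weight storage in GB (10^9 bytes)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

print(f"671B @ 3-bit:   {weight_gb(671, 3):.0f} GB")    # ~252 GB
print(f"235B @ 6.5-bit: {weight_gb(235, 6.5):.0f} GB")  # ~191 GB
```

Both land just under the 256GB RAM + 24-48GB VRAM ceiling, which is where those quant levels come from.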
Otherwise, pray they produce a version of the AI Max+ 395 board with a regular DDR5 memory controller and 4x (SO)DIMM slots. The higher quants of the largest open-weight models simply won’t fit in 128GB plus any cost-effective amount of VRAM.