I am looking to upgrade and add additional services to my homeserver - in 2024, what is the right allocation of my workloads across what hardware/software stack?
I currently have an Intel Skylake 6500 (old gaming PC), with no GPU running the following workloads in Linux Mint:
Linux MD RAID1 with 2 HDDs and an EXT4 filesystem
Local SMB/NFS shares for media sharing to desktop, HTPC etc
Syncthing (for phone sync, offsite backup)
Nextcloud (AIO running in VM for file sharing, notes, talk etc)
I have a few goals for an upgrade:
Keep idle power as low as possible (~70c/kWh here)
Upgrade to 2.5G ethernet
Migrate from old gaming PC case to a small form-factor case I could put on a shelf in my networking rack
Migrate to ZFS/BTRFS to protect against media bitrot
Allocate more cores to Nextcloud to increase responsiveness
Run Ollama with 7b/13b/35b sized models with an exposed API
(optional) 1080p game streaming to my HTPC?
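For scale on the first goal, a quick sketch of what continuous idle draw costs at that electricity rate (the rate and wattage figures below are my own illustrative assumptions):

```python
# Yearly cost of constant idle draw at ~70c/kWh (rate stated in the goals).
RATE_PER_KWH = 0.70        # assumption: 70 cents per kWh
HOURS_PER_YEAR = 24 * 365  # 8760 h

def yearly_cost(idle_watts: float) -> float:
    """Yearly energy cost of a constant draw of idle_watts watts."""
    kwh = idle_watts * HOURS_PER_YEAR / 1000
    return kwh * RATE_PER_KWH

print(round(yearly_cost(1), 2))   # ~6.13 per idle watt per year
print(round(yearly_cost(35), 2))  # ~214.62 for a box idling at 35 W
```

So every watt shaved off idle saves roughly six units of currency per year, which is why the idle figures below matter more than peak draw for a 24/7 box.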
With these goals, what is the right strategy and where should the workloads run?
12 core AM5 (EPYC 4464P?), and run everything on CPU?
6/8 core AM5 + dGPU for ollama/games?
8 core AM5 APU and use the iGPU or NPU for ollama?
Wait for better Zen5 APUs? AM5 Strix Point or Strix Halo?
I was thinking about the AM5 DeskMeet X600 as a starting platform, but I am open to suggestions or general concepts on how anyone would tackle this at the lowest price point and smallest form factor.
I guess this would be your best option.
Games wouldn't do that well without a proper dGPU, and no iGPU/NPU is really good enough for running models bigger than 7b. For 13b+ models you'd be better off running on the CPU than relying on the iGPU/NPU.
Forgetting gaming for the moment, is there any way to estimate the performance of CPU vs iGPU vs NPU for running different sizes of models? I'm trying to get my mind around why an iGPU or NPU would be worse for larger models.
7b/13b models run OK on CPU on my 5600G right now at ~5 tok/s, but I would ideally like 2-10x that performance.
Both CPU and iGPU are still limited by memory bandwidth, so you'd end up with almost the same performance from both, but with far more hurdles getting the iGPU to work properly and make use of UMA.
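A rough way to see why memory bandwidth dominates: generating each token streams the full set of weights from RAM, so tokens/s is bounded by bandwidth divided by model size. A sketch, using commonly quoted ballpark bandwidth figures (these are illustrative assumptions, not measurements):

```python
# Upper-bound estimate for token generation speed when memory-bandwidth-bound:
# each token reads every weight once, so tok/s <= bandwidth / model size.

def est_tokens_per_sec(params_billion: float, bytes_per_param: float,
                       bandwidth_gb_s: float) -> float:
    """Theoretical ceiling on tokens/s for a dense model streamed from RAM."""
    model_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / model_gb

# Assumptions: dual-channel DDR4-3200 ~50 GB/s, dual-channel DDR5 ~80 GB/s,
# 4-bit quantized weights ~0.5 bytes/param.
print(est_tokens_per_sec(7, 0.5, 50))   # 7b on DDR4:  ~14 tok/s ceiling
print(est_tokens_per_sec(13, 0.5, 50))  # 13b on DDR4: ~7.7 tok/s ceiling
print(est_tokens_per_sec(35, 0.5, 80))  # 35b on DDR5: ~4.6 tok/s ceiling
```

Real-world numbers land below these ceilings (which is consistent with the ~5 tok/s you see on the 5600G), and the ceiling is the same whether the CPU or the iGPU does the math, since they share the same memory bus.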
Since you already have a 5600G, you could just try running your favorite LLM on it and find out both how hard it is to set up and what speedup you get vs CPU inference.
The AM5 Ryzen 7900 (non-X) is still the sweet spot for an idle-friendly CPU with a good power profile that still packs a punch. Sadly it only supports unbuffered ECC memory, and the AM5 platform really sucks when it comes to PCIe lanes. Still, if it fits what you need it might be worth a look.
Thanks for this, this was really useful. I think I will just go with a fat CPU for the next couple of years until iGPUs and NPUs get more powerful and the software stacks stabilize - or until a 4 channel DDR consumer platform emerges.
From what I've heard about Nextcloud, it is clunky no matter how much hardware you throw at it. One efficient way of increasing its performance is to integrate it with a memcache.
Since you've mentioned NPUs: you can replace the Wi-Fi card in the A+E key M.2 slot with something like a Google Coral if you ever plan to use some kind of image or video pattern-matching processing.
Yes. Currently on home servers you can go one of three ways:
Low-power NUC(ish) or RPi. Great for a web server and simple applications; not that great if traffic is above 10-30 users, and for homelabbing it is just too easy to hit the ceiling. Power budget: 5-20 W idle, up to 50 W active.
Go older gen AM4 or Intel Gen 12+. The 5600G is a really cheap and power-efficient chip, and good enough for some VM usage. The 12100 and 12400 are nice too, as are the low-power 13th and 14th gen parts. Power budget: 25-50 W idle, up to 90 W active.
Go new with AM5. Not the best for power budget but really good performance; the higher-end Intel chips just draw too much power to be viable here at the moment. This is where you go if you want to go crazy with a hypervisor, VMs, and Docker containers. The 7900 is the top pick here simply because of its low idle draw (35 W) and max 95 W system draw. If you want to go lower, I would rather go with the 5750G Pro.
As for reusing your old 50 W+ gaming rig as a server: don't. And EPYC/Xeon chips are built for even power delivery at load, but have terrible idle figures.
So, that is why I hold the opinion that the 7900 is the sweet spot right now for a good homelab CPU. It just refuses to compromise on the power/perf/price triangle.
You didn't back up your claim that the "AM5 platform really sucks when it comes to PCIe lanes"; interestingly, it exceeds all the other platforms you've mentioned.
The RPi is a crap-tastic platform unless you're going for something that more or less runs entirely in memory; otherwise both Allwinner and Rockchip are much better options, but again irrelevant to what the OP is asking.
These are all legacy platforms with limitations, and again irrelevant to what the OP is asking about.
As much as you seem to want to hype the DeskMeet, there's no confirmed support (to my knowledge) for ECC beyond the fact that it boots (according to ASRock). There are also severe limitations when it comes to overall connectivity and storage expansion.
The AM5 EPYC series is not confirmed as compatible with the DeskMeet, and why bother with a new cooler for the 7900? The supplied one will do just fine unless there are physical clearance limitations (I haven't checked).
I would also go for Crucial (P5 Plus or better/newer) instead of Samsung, given their overall track record.
Which is again irrelevant, but still better than most of what you've mentioned. Also worth keeping in mind is that PCIe versions make a big difference, so there really isn't much need for more lanes if you pair hardware with the corresponding version.
There's hardly any hardware that can make use of PCIe gen 5, and there's still plenty of hardware that runs at PCIe gen 3 and some even gen 2, so with PCI Express lanes it's still pretty much quantity over quality.
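To put lanes vs versions in numbers, a small sketch using the commonly quoted per-lane throughput approximations (after encoding overhead; figures are ballpark, not from a spec table in this thread):

```python
# Approximate usable one-direction bandwidth per PCIe lane, in GB/s.
PER_LANE_GB_S = {2: 0.5, 3: 0.985, 4: 1.969, 5: 3.938}

def link_bandwidth(gen: int, lanes: int) -> float:
    """Approximate total one-direction bandwidth of a gen x lanes link."""
    return PER_LANE_GB_S[gen] * lanes

# Each generation roughly doubles per-lane throughput, so a gen4 x4 link
# moves about as much as gen3 x8:
print(link_bandwidth(4, 4))  # ~7.88 GB/s
print(link_bandwidth(3, 8))  # ~7.88 GB/s
```

This is the trade-off both sides are circling: a newer-gen slot can replace lane count, but only if the device actually runs at that generation; a gen 3 device in a gen 5 x4 slot still only gets gen 3 x4 bandwidth.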