Background
The goal is to set up a high-performance but low-power, high-availability cluster, with software such as Proxmox or XCP-ng to host and manage cluster operations.
What I came up with for the video was to use 3 mini PCs connected to each other via “high speed” thunderbolt, while using the 1 or 2 built-in 2.5 gigabit NICs for network-facing operations.
Each Mini PC idles at less than 9 watts; at full load the cluster is using less than 300 watts of electricity.
A downside of this is that often these mini PCs only have one or two network connections, and they usually top out at 2.5 gigabit.
In another video we covered M.2 cheat codes to give them faster network cards… but for a lot of home use cases… 2.5 gigabit is fine.
And more often than not people want something that doesn’t use a lot of electricity more than they want raw speed.
A NAS could be used for optional bulk storage… connected with iSCSI or a simple file share.
Let’s take a closer look at the setup.
Setup
The physical setup is pretty easy – I used some different Intel NUCs I have been collecting around the office to test for this kind of use case.
Install Proxmox and plan your IP addresses on each node. Connect the thunderbolt ports together. If you plan to use 3 nodes, make sure that each node actually has two thunderbolt ports and connect each node to each other node.
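For three nodes the cabling ends up as a ring, with each node having one cable to each of the other two. Which physical port the kernel names thunderbolt0 versus thunderbolt1 can vary per machine, but a pattern consistent with the IP table below is:
Nuc1 thunderbolt0 <──> thunderbolt1 Nuc2
Nuc2 thunderbolt0 <──> thunderbolt1 Nuc3
Nuc3 thunderbolt0 <──> thunderbolt1 Nuc1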
Security/Thunderbolt
The hardest part is to modprobe thunderbolt-net
and then configure thunderbolt for, essentially, no security (just to make our lives easier).
vi /etc/udev/rules.d/99-local.rules
and add this line:
ACTION=="add", SUBSYSTEM=="thunderbolt", ATTR{authorized}=="0", ATTR{authorized}="1"
That should be enough for thunderbolt0 to show up as a network interface. If you don’t see a thunderbolt0 interface, double-check your thunderbolt cables. Often USB-C to USB-C cables will “work” but not really for this use case – you want a cable that is actually rated for Thunderbolt.
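A quick check, assuming the kernel named the interfaces thunderbolt0 and thunderbolt1 as described above:
ip link show thunderbolt0
ip link show thunderbolt1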
For a three node cluster, we want to ensure that all of our nodes can see each other node.
Design Pattern for High Speed Interconnect
Two ports per node, direct-connected, three nodes – this pattern is very useful outside of Thunderbolt, too. Remember those cheap Intel Omni-Path cards I bought?
Thunderbolt is almost forgotten for networking purposes. It is very unoptimized, and the out-of-the-box ping latency on the Linux kernel leaves a lot to be desired. I don’t think anyone who knows what they are doing is testing this anymore, and thunderbolt-net does not (yet) seem to even support the AMD chipset solutions (though it can with some work – more on that in the future).
Okay… mini PCs and “anemic” 40 gigabit are too slow for you? Then what about speedy 100 gigabit? Intel Omni-Path cards are around $30 – put two of those in some regular desktop machines and apply the same pattern. They work fine with cheap QSFP fiber adapters for direct connections. Cool them well and the sky is the limit for an inexpensive way to scale your cluster without a central switch.
IP Addressing
Since Thunderbolt is a direct connect beast, that means each port essentially gets its own IP address.
Node | thunderbolt0 IP | thunderbolt1 IP
Nuc1 | 192.168.100.1   | 192.168.102.1
Nuc2 | 192.168.101.1   | 192.168.100.2
Nuc3 | 192.168.102.2   | 192.168.101.2
This assumes the subnet mask is 255.255.255.0 (a /24). If you’d rather do it with a /30 subnet for each port to show the class how talented you are at subnetting, or just to virtue signal, or to confuse newbs in the replies – yes, any IPs will work as long as you get the subnetting and netmasks right.
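As a sketch of what that looks like in practice, here is the relevant piece of /etc/network/interfaces on Nuc1 with the /24 addressing from the table (depending on when the link comes up, you may want allow-hotplug instead of auto):
auto thunderbolt0
iface thunderbolt0 inet static
    address 192.168.100.1/24

auto thunderbolt1
iface thunderbolt1 inet static
    address 192.168.102.1/24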
Proxmox also needs another trick to work properly, and that is getting hostname resolution for each node right.
Each NUC will need hosts entries that point the other 2 nodes’ names at the thunderbolt IPs it shares with them. And this is different on each node – not something pveproxy really expects to be a thing – but it does seem to work anyway.
What I mean is that if you are on the terminal for Nuc2 and ping nuc1 and nuc3, you should get 192.168.100.1 and 192.168.101.2 back as the IPs (and they should respond to pings).
This requires that you manually edit /etc/hosts
on each of the 3 nodes in the cluster and add the other two nodes’ IP Addresses.
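For example, on Nuc2 the two extra /etc/hosts lines would be (hostnames here match the node names used above; adjust to whatever yours are actually called):
192.168.100.1 nuc1
192.168.101.2 nuc3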
It is also very important that the thunderbolt cables are attached as shown in the video and that all of the ping commands work.
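A quick sanity check from Nuc2 would look like this; repeat the equivalent from the other two nodes:
ping -c 3 nuc1   # should answer from 192.168.100.1
ping -c 3 nuc3   # should answer from 192.168.101.2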
Setting up The Proxmox Cluster
With the network interfaces configured, it is essentially a textbook Proxmox cluster configuration from here:
https://pve.proxmox.com/wiki/Cluster_Manager
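If you prefer the CLI to the web UI, the textbook version boils down to something like this (the cluster name is just an example; the node names resolve to the thunderbolt IPs thanks to the /etc/hosts entries above):
# on Nuc1
pvecm create tb-cluster
# then on Nuc2 and Nuc3
pvecm add nuc1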
Can the cluster have more than 3 nodes?
The reason it is easy to run 2 or 3 nodes is that each node can connect directly to every other node using one or two ports. If you wanted to run 4 nodes, you’d need 3 fast connections on each node.
Why am I only getting 800-1200 megabytes per second?
Internally, our little NUCs share a 40 gigabit link for both 40 gigabit thunderbolt ports – at most we’d only ever see about 4 gigabytes/sec between nodes. Software limitations mean we typically see 10-15 gigabit per port, give or take, anyway.
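If you want to measure your own links, iperf3 between two nodes over the thunderbolt IPs is the simplest test (assuming iperf3 is installed on both, e.g. apt install iperf3):
# on Nuc1
iperf3 -s
# on Nuc2, pointed at Nuc1's thunderbolt IP
iperf3 -c 192.168.100.1 -P 4 -t 30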
The other problem is that a fair amount of bitrot is creeping into the thunderbolt-net driver, it seems. I’ve been working on some fixes around AMD PCIe tunneling, but nothing to share here just yet. If you’re itching for something to work on and you understand kernel drivers (or want to learn), this would be a pretty fantastic project for you, I think.
Would a thunderbolt NAS work for the cluster?
Not really, because there is no easy way for the 3 nodes of the cluster to share a single thunderbolt-attached storage device.