Thunderbolt 4 based ring network between Intel NUCs for Ceph storage on Proxmox

Hi All

I am working on a proof of concept for a micro hosting requirement. The client needs to host many small VMs, roughly 30-100 of them with 2GB of RAM and 30GB of storage each. The CPU and disk load generated by these VMs is very low.
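
Back-of-the-envelope, the top end of that is about 100 × 2GB = 200GB of RAM and 100 × 30GB = 3TB of storage for the whole cluster, before any replication overhead.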

They are looking to retire their two old Dell R540 servers due to very high datacentre power costs.
Rather than buying a single server to replace them and ending up with a single point of failure, we are thinking of going a homelab-ish route and using micro desktops as servers in a cluster. We can't seem to find clustering hardware that isn't total overkill for this requirement, or that isn't terribly expensive relative to the resources you get.

We are investigating the option of setting up a cluster of Intel NUCs like these ones: https://www.intel.com/content/www/u…nuc-12-pro-kit-nuc12wshi5/specifications.html

We like that it has a 2.5GbE NIC with vPro. We will likely tag public traffic onto a VLAN interface set up in Proxmox, and use that interface for the management LAN and backups to external storage.
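
Roughly what we have in mind for that port, as a sketch only (bridge name, VLAN ID and addresses below are placeholders):

  # /etc/network/interfaces (sketch; names, VLAN ID and addresses are placeholders)
  auto eno1
  iface eno1 inet manual

  auto vmbr0
  iface vmbr0 inet manual
      bridge-ports eno1
      bridge-stp off
      bridge-fd 0
      bridge-vlan-aware yes
      bridge-vids 2-4094

  # management on VLAN 10; public VM traffic gets tagged on the guests' bridge ports
  auto vmbr0.10
  iface vmbr0.10 inet static
      address 192.168.10.11/24
      gateway 192.168.10.1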

We are also wondering: since each NUC has two Thunderbolt 4 ports, would it be possible to build a 10-gig ring network with Thunderbolt cables and avoid having to buy expensive Thunderbolt-to-10GbE adaptors? This ring network would likely use OSPF, like in apaird's video here: Fully Routed Networks in Proxmox! Point-to-Point and Weird Cluster Configs Made Easy - YouTube
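
If the Thunderbolt ports really can be brought up as network interfaces, we imagine the routed ring looking something like this with FRR on each node (a rough sketch only; interface names, addresses and the router ID are placeholders):

  # /etc/frr/daemons: set ospfd=yes, then systemctl restart frr
  # /etc/frr/frr.conf on one node (sketch)
  interface en05
   ip ospf network point-to-point
  interface en06
   ip ospf network point-to-point
  router ospf
   ospf router-id 10.0.0.1
   network 10.0.0.0/16 area 0

Each node would also carry its own /32 on a loopback or dummy interface inside that range, so if one Thunderbolt link dies, OSPF simply routes the long way around the ring.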

We are looking at 3 or 5 nodes initially, up to a maximum of about 9 if the concept works very well. If we need more than that, we can probably justify buying a Supermicro 4-node server to replace it.

Each node would use a SATA SSD for boot and an M.2 SSD for VM data. We know that there is no disk redundancy, but the requirement can tolerate 5 minutes of downtime and a minute or two of data loss in the case of a node failure. We are wondering what would work best for storage.

We are wondering if it is viable to set up the M.2 SSDs in a Ceph cluster with 1 OSD per node. We will be using something decent for the M.2 SSD, but at 2.5Gbit or 10Gbit networking I don't see the SSD being the performance bottleneck. The shared storage should allow for migration and HA in the case of node failures or maintenance. I know general practice is to use at least 4 OSDs per node, but I am not certain of the thinking behind that. I have seen people using single-OSD nodes in their lab environments.
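
If that pans out, we imagine the Ceph side being roughly one OSD command per node plus a replicated pool, something like this (a sketch; the device path is a placeholder and 3/2 replication is just the default we would start from):

  # on each node, after pveceph install and creating the monitors/managers
  pveceph osd create /dev/nvme0n1

  # once, for the VM pool: 3 replicas, writable with 2 in service
  pveceph pool create vmdata --size 3 --min_size 2 --add_storages

With size 3 / min_size 2 the pool stays writable with one node down, which matches the failure tolerance described above.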

Is there anything less obvious that we may be missing, or is anyone using hardware other than Intel NUCs for a similar purpose?

Off the top of my head I don't believe it is possible. My understanding of Thunderbolt is that it is PCIe over a serial interface. That means there is no networking or host-to-host connectivity inherent in the standard. It is meant for host-to-device connections, where the device uses the serial connection to leverage the PCIe lanes on the host.

All this to say that it is a bit over my head, but at first glance this is a very non-standard use of Thunderbolt and there are other ways to go about it.

Have you looked into M.2 to 10G Ethernet adapters?

Also take a look at Minisforum devices. @wendell has been experimenting with adding connectivity to them.

Thunderbolt direct PC-to-PC connections are possible and a driver is available for Windows, so actually, maybe.

Actually, it does appear that the Linux kernel supports exposing Thunderbolt as a network interface and can be used to create ad-hoc networks. Cool.

Um, so that's all I've got. I knew it had been done on Windows, but I have never set one up on any platform.

https://christian.kellner.me/2018/05/24/thunderbolt-networking-on-linux/

Here is a guy that did it once on Linux, so in theory you should be able to build out the port on Proxmox if you can find the port ID.
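
From that post, the manual version looks like it would just be something like this (untested by me; the interface name may differ):

  modprobe thunderbolt-net                   # the XDomain networking driver
  boltctl list                               # inspect/authorize the connected peer (bolt package)
  ip link                                    # a thunderbolt0 interface should appear once both ends are up
  ip addr add 10.0.0.1/30 dev thunderbolt0   # pick your own addressing
  ip link set thunderbolt0 up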

I’ve looked at many such Thunderbolt-to-10GbE adapters. They’re bulky and much of the heft seems to be for heat management. One of those adapters is easily a quarter of the volume of the NUC itself.

Seems you really meant a ring topology. I wonder if a driver is available in Linux. Nothing is impossible; perhaps your firm can sponsor an open source initiative. Waiting to see some excitement. By the way, peer-to-peer networking through a Thunderbolt cable is surely possible in macOS, if I recall correctly.

That was an interesting read. I am curious about real-world performance, as it seems poor in his example.

Crazy, you learn something new every day.

So I got this loosely working under Proxmox 7.x with some Skull Canyon NUCs. I'm actually the dude apaird called out at the beginning of the video. That reference came from a chat on his Discord. I saw someone in the YT comments of a ServeTheHome video mention doing TB networking between two hosts, and that sent me right down the rabbit hole, asked him about it, etc.

On apaird's Discord, in the Proxmox chat, if you search for "thunderbolt networking" you should see some discussions of a few people trying this with different levels of success. The thinking at the time was that the Linux TB networking drivers were not super mature. I have not tried under Proxmox 8.x yet.

If you experiment with anything, please share the details!

Also, if wendell & apaird would do a collab, that would be epic.

Try this:
Thunderbolt is certainly capable of Ethernet networking.
Look for the “Thunderbolt™ Networking Bridging and Routing Instructional White Paper”
on thunderbolttechnology dot net (Intel).
It is circa 2014, but the basics still work.

I commented that I was working on this in Feb or March. Out of the box on Linux you can do about 10Gb. It mostly just works; you just have to load the modules and configure the interface.
I plan to release a guide that gets you to about 35Gb with lower CPU overhead. Soon™

I commented because I wondered if anyone else had done it or thought of it, and the answer seems to have been no.
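
Until the guide lands, the out-of-the-box 10Gb version is roughly just this (a sketch of the usual approach, not the guide itself; interface name and address are placeholders):

  # load the driver now and on every boot
  modprobe thunderbolt-net
  echo thunderbolt-net >> /etc/modules

  # /etc/network/interfaces snippet
  auto thunderbolt0
  iface thunderbolt0 inet static
      address 10.0.0.1/30
      mtu 65520    # big MTU helps throughput; drop this line if the driver rejects it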

I had thoughts of connecting machines in different rooms. But… $369.99 for a 10-meter optical Thunderbolt 3 cable :joy:

Have you already had time to write something down about using Thunderbolt networking?

@wendell - I’m looking forward to reading the guide as I’ve been thinking about moving my existing homelab hypervisor setup to this configuration (either NUCs or Ryzen Mini PCs) and leveraging TB4 for connectivity between the nodes.

+1 here, I have the mesh working (thanks to uAlex on the Proxmox forums).

Can you give us a hint on the approach? Is it something easy like using systemd.link to set the negotiation speed (I saw that in the man pages and wondered if it was that simple), or will this require a forked Thunderbolt driver?
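
For example, from the man page I was picturing something as small as a per-port .link file, along these lines (pure guesswork on my part; the Path value is a made-up placeholder):

  # /etc/systemd/network/00-thunderbolt0.link (guesswork; Path is machine-specific)
  [Match]
  Path=pci-0000:00:0d.2*
  [Link]
  Name=en05
  BitsPerSecond=20G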

I went down a rabbit hole on the AMD side of things. Sadly, thunderbolt-net does not autoconfigure on their PCIe tunneling side of things, and I'm not sure what the issue is.

On the Intel side, across a bunch of my devices, some work better than others? And some have a ping time randomly between 10 and 300. It shouldn't be like that.

What did you end up doing for yours, and what hardware do you have?

Thanks @wendell, I spent my vacation last week doing this (instead of gaming, lol): proxmox cluster proof of concept (github.com). Everything works as documented (I make no claims about best practices).

I also found there is a significant bug with IPv6 over thunderbolt-net on Proxmox/Debian 12
(ICMPv6 works fine, but services listening on :: won't answer; this isn't an obvious iptables issue or anything like that).

In terms of that bug, I reached out to the maintainers of thunderbolt-net to see if they had any thoughts about it; no conclusions on that yet. I happened to ask about speed too, and got the following:

“The link is (if you are using recent kernel on both sides) 2 x 20G but
the DMA hardware can only go < 20G. With Intel Alder Lake-P and beyond
it can go > 20G but it is more like ~26G not 40G.”

I don't know how to get my NUC-to-NUC connection to negotiate at more than 10G, though…
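
In case anyone wants to reproduce the IPv6 issue, the simplest check I have is along these lines (the ULA addresses are just examples):

  # node A (the iperf3 server listens on :: by default)
  iperf3 -s
  # node B, over the thunderbolt link
  ping -6 fd00::1          # ICMPv6 replies fine
  iperf3 -6 -c fd00::1     # the connection to the :: listener never answers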

Thunderbolt reserves some minimum amount of bandwidth (8 Gbps?) for video… would that apply to this use case and thus cap the theoretical maximum bandwidth for networking?

If they continue to reply to me I will first focus on the IPv6 issue, then ask how to configure thunderbolt-net to do more… will post back here if I learn anything.

FYI, I have now done iperf3 speed tests, and on Proxmox on my 13th-gen NUC I am hitting 26G, as the guy from Intel said I would. I would say that's the most we are going to get until they decide to change the DMA controllers in future revs…
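
For anyone wanting to repeat the numbers, it is just plain iperf3 across the link, roughly like this (address is a placeholder):

  # node A
  iperf3 -s
  # node B
  iperf3 -c 10.0.0.1 -t 30 -P 4    # a few parallel streams can help saturate the link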

Give me an idea of what a ‘ring’ would look like.
I've been able to build a 'thunderbolt' network using TB cabling between computer hosts on Windows.
Take a look at the original bridging and routing paper done in 2014 to think about what is possible:
Google ‘thunderbolt bridging and routing’ and you’ll find it.
At some point you might want to figure out how to 'switch' between Thunderbolt networks.
One option I've used is a single computer as a 'switch'; that is, you can install multiple TB cards in a single PC and do the same on other PCs to connect them together. Note that the Intel white paper above assumes two Thunderbolt ports per computer.
Another thought is to use the bridging option on a Microsoft networking adapter which would allow you to ‘piggyback’ on a 10G cable and network (with switching).
Another option is to use a Thunderbolt hub which will take one TB input and output the signal to 3 TB ports.
Something to think about.
Hope this helps.
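
On the Linux/Proxmox side, the rough equivalent of the 'one computer as a switch' idea would presumably be enslaving that machine's Thunderbolt interfaces to an ordinary bridge, something like this (a sketch; interface and bridge names are placeholders):

  # /etc/network/interfaces on the 'switch' node
  auto vmbr1
  iface vmbr1 inet static
      address 10.0.1.1/24
      bridge-ports thunderbolt0 thunderbolt1
      bridge-stp off
      bridge-fd 0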

Give me an idea of what a ‘ring’ would look like.

Like items 2 and 4 here: proxmox cluster proof of concept (github.com)

Note I have only tested this with TB4; I am not sure if that is required, as it uses XDomain, which is part of the TB4/USB4 spec.

It will also only work reliably with the patches currently making their way into the Linux kernel.

And yes, a hub topology is possible. I have not tested it, but the code maintainer shared this with me:

you could build a “network” like this too (assuming a hub with 3
downstream ports, one upstream):

 Host A ---> Hub ---> Host B
              +-----> Host C
              +-----> Host D