Running websites on off-grid Pi clusters (feat. Jeff Geerling)

ThatGuyB · February 26, 2022, 6:06am

The video bellow was my inspiration for this post. Well, it’s mostly a response to the video.

I am going to bully @geerlingguy a bit here, because my autism kicked in, and I can’t not talk about the lack of redundancy. I mean, does it count as bullying if you are criticizing ideas, not people?

I had an idea for an enterprise setup running on a Pi cluster, but my (proposed) implementation is completely different than Jeff’s. It may probably end up being way more expensive too, so take that into account as well. I will be proceeding with it, when I finish my most urgent priorities.

The setup Jeff did works, as proven in the video, but it is far from being reliable. First off, the Turing Pi 2 (TP2) is a single point of failure (SPOF). We, sysadmin folks, talk a lot about redundancy and keeping hardware (and software) running and we proud ourselves with our uptimes. Well, I now prefer frequent restarts and verify that updates don’t break stuff, as I get farther away from data centers and into the home server stuff, but that’s besides the point.

The TP2 (toilet paper 2?) has a lot of SPOFs that could go wrong:

single PSU
single Pi that takes care of the routing
single Pi that takes care of the storage (the ZFS pool NFS server)
single UPS
single network connection

Again, while not perfect, it gets the job done and given the same budget, you could probably figure something out with normal SBCs and a switch, but you would run into most of the same limitations.

How do we get around those limitations, you may ask yourself, curious creature. Well, the answer to this is “add more hardware” at the problem. Those are all hardware limitations.

So, what we can do is get another TP2 configured the same, right? Well… maybe. With 2x TP2s, if one of them fails completely, you still get the second one running. That would be basically a standby node and you would pay for 2x 4G plans and have redundant internet.

But there’s a catch. If the network connection between the 2 TP2s dies (like, if you have just 1 eth port connecting each other), then you may get into a situation where both believe they are the only one alive and try to be the active nodes. Even if we use the 2nd eth port in a LACP, balance-alb or active-standby configuration, you still have this risk, although the chances get significantly lower (maybe one of your switch chips on one of the TP2 board dies to trigger this event, but then only the main Pi CM4 will still be able to talk to the reverse proxy, not the rest of the nodes).

So, a workaround with the TP2 would be to get 2 switches (stackable or not, doesn’t matter) and connect each TP2 eth port to each one of the switches. If the switches are stackable, do a LACP or balance-alb, if they are not, just active-standby. Connect the switches together and connect a port from each switch to a router that has the 4G modem on it (or any internet connection). If you want redundancy, you may complicate the configurations with 2 routers, both connected to each switch. Either using keepalived and synced routing and firewall rules for Linux, or CARP for BSD. And you may connect a group of 1x TP2, switch and router to a UPS, and the second group to another UPS.

So now, we can have a k3s cluster split between 2 physical nodes. With 2 master nodes and 6 worker nodes, we can do quite a lot. We could even load balance them and if one TP2, switch, PSU or UPS dies, it’s no biggie, the other one will take care of it and launch additional containers. Because the TP2 build is self-contained, it doesn’t need resources from the other side.

But we’ve come pretty far and we didn’t even ask questions on the way. Is a TP2 good for such a situation? Shouldn’t we aim to do better? A TP2 should be around $200, with each SO-DIMM adapter for compute modules being around $10, bringing the cost to around $240, unless you use Jetsons.

With $35 CM4s x8 and TP2 x2, you raise the build price to a whopping $760. Not bad for a redundant cluster. I’m not taking the PSUs, UPSes, switches and routers into account, because those are shared components between Jeff’s modified workaround build, and my plan for a build.

We can get 2x Odroid HC4s for 2x $73 (so $148) and 6x Odroid N2+ for 6x $83 (so $498), bringing the total cost to $644. So around $100 cheaper, give or take, but instead of being solely dependent on the TP2, we have multiple boards that work together. The 2 HC4s can serve as the two NFS shares and master nodes, while the other 6 N2+ boards can be the workers. But the cost may be a bit offset by needing to buy switches with a few more ports than what you could get away with the TP2.

There are tradeoffs to this approach though. Jeff has a 2x 2U setup, but of which cases are likely way more expensive than my jank idea, but I guess the TP2s can work outside a case too, just like I plan for the SBCs. But neither would be very portable anymore.

For the software side of things, I will give it a deeper think when I actually build the thing, because right now I am uncertain if I should go with LXD or with any k8s stack. I prefer the classic way of managing services, so LXD makes a lot of sense to me, but even if I change my mind after the fact, I can run a k8s stack inside LXD, but not the other way, and I’d have to migrate the k8s containers into LXD then figure everything out. It’d be a bit of a headache, so it’s likely I’ll go with LXD.

But, one software that I would definitely change, at least in my own infrastructure, is to not make a SSH tunnel, but instead use wireguard, do some keepalive checks and if the tunnel is down, restart it. That way I can avoid doing TCP handshakes twice, for the SSH tunnel and for the communication between the reverse proxy and web servers. And speaking of reverse proxy, I’d go with HAProxy. I may use nginx as the web server, but as a reverse proxy, I don’t like it that much. The wg tunnel would obviously be running on the router.

Dutch_Master · February 26, 2022, 7:47am

That’s the beauty of OSS: if you don’t like the original idea, fork it and make your own

geerlingguy · February 26, 2022, 8:14pm

Indeed, that’s part of the reason I’m happy to provide the source for literally everything I do:

Turing Pi 2 cluster setup (K3s + 4G networking): https://github.com/geerlingguy/turing-pi-2-cluster
My Drupal website: https://github.com/geerlingguy/jeffgeerling-com
My website’s Docker image: Docker Hub

Though I do keep my actual site database and some of the infrastructure scripts private, for obvious reasons.

The crazy thing about this setup is I was hit by multiple DDoSes the day I posted that video, and some smaller scale ones after. The Turing Pi cluster itself was doing okay and handling the load—the VPS I run as the reverse proxy kept blowing up with over 2 Gbps of traffic hitting it. I sadly had to enable Cloudflare on the domain. Something I hadn’t done since moving my site to Drupal (PHP/MySQL) 8 years ago.

ThatGuyB · February 27, 2022, 1:07am

Thanks for sharing the links, Jeff. I hope you caught that most of my post had humor in it and it wasn’t really an attack.

I haven’t heard you mentioning much of the downsides in the video. I do however remember when you presented the TP2 in the original video. You presented the limitations so well in that video, but in this one, you just glanced over them. I mean, most people who know a bit about hardware will know that running the NFS server on the CM4 with access to the SATA ports means that if that Pi goes down, the whole cluster will be down.

I wanted to improve on your idea on the TP2 a bit. If you want some tips, always feel free to ask me on the forum, DM me and we’ll figure it out.

There are a few tradeoffs that I don’t believe I mentioned. If portability is a concern, some redundancy has to be sacrificed. But 2x 2U rack mounted TP2s with 2 switches and 2x UPSes and 2x SBC routers in 1U, would make for about 11 rack units. Although a bit heavy, given some wheels, it can still be “portable” if two people lift it up in a car, but it won’t be weather proof (I may have an idea for another project for that).

With the weird SBC designs I chose (Odroids), I cannot really rack mount them. Well, I can, but they would be janky. But with some 3D printing skills, we can find a solution to mounting the toasters and other boards in around 4U, so not really more space used than the TP2, with the quality being a bit lower (using plastic instead of metal), but with less weight to carry.

Other downsides include higher power consumption (more hardware) and more heat (because more hardware). Adding the space usage and the price that I already mentioned, and we now get an idea of why data centers cost so much. It not just doubles the price of the TP2 setup, you need more hardware to support all the networking and power (and in data centers, cooling too).

I am not planning on a portable setup, I plan on redundancy in a fixed place, but without blowing up the power budget. I want a data center in a room that can be both silent and not output much heat, so that anyone can replicate the setup and host their own services.

I am hoping that in 2 or 3 months, I will be able to start the project.

But I may be doing a side project at some point once I get my hands on a 4/5G portable hotspot, with a solar powered, portable server (or cluster, who knows), that may be a bit easier to carry than the 2U case and the giant UPS.

I may be trying this purely on IPv6 just to see the limits of what I can do, but if I can’t achieve it, I may use a VPS as a reverse proxy too. Also, I highly suggest you try a site to site VPN, as opposed to SSH tunneling. Wireguard is easier to set up, is lighter and faster, but IPSec is the standard used in enterprises and a bit harder to setup. Your choice. Just my $0.02. If I have to use a VPS as a reverse proxy, I’ll be using wireguard, as I have tested wg for an off-site surveillance camera footage and worked wonders.

That’s sad to hear. People can be real pricks. I wonder if they feel good about taking down websites or something.

We must do what we must. Hopefully it will calm down, if it hasn’t already.

system · November 27, 2022, 7:08pm

This topic was automatically closed 273 days after the last reply. New replies are no longer allowed.