Container Platform Recommendation?

Hey all,

I was at an event the other day that provided a lot of swag and discussed (among other things) containerization of services (I won't reveal the host of said event so as not to sway the answers).

They have their own product. I am not fully set on their route; however, I do see the benefits of containers over running all the services on bare metal (namely portability to ease migration, and easy restores if a single service gets screwed up).

However, a couple of the things I use require a lot of resources as it is, and I don't want to knee-cap the time it takes to do things (call it my stigma against virtualization, which may not really apply here). The things I run include:

  • Jenkins
  • GitLab
  • NextCloud
  • TvHeadEnd (LAN use only)
  • X2Go (Planned)

With that said, these are my requirements:

  1. Somewhat resource-bound dual-X5650 (total 12c/24t x 2.67 GHz)
  2. Debian host OS (open to switch eventually)
  3. Containers are tiny (bulky data stored outside the container); able to bind-mount host directories in
  4. Low I/O waste (don’t want to notice a difference in that half-hour Jenkins build)
  5. Host OS can easily access logs (for box-wide fail2ban)
  6. System resource overview (host and containers) - optional
  7. Gratis

The model I see is that the host OS would run only the SSH service, a database service (backing certain services), fail2ban, and the local (likely soon LAN) mail service. Everything else would be split across a few containers.

Is there a container solution you would recommend? I used LXC a tiny bit months ago, and it may be reasonable for the job (though I'm not sure if you can mount in directories from the host, which will be very important).

Thanks!

Containerization is not virtualization. They’re conceptually similar, but the workflows and best-practices are very different.

Somewhat resource-bound dual-X5650

Containerization won't help with that. The "more efficient", "high density" claims only matter if you're contending with duplication of resources. KVM virtual machines, for instance, carry a separate root filesystem, GNU userland, shell, and set of system tools for every virtual machine you run. A thousand identical Docker or LXC containers can share a single copy.

If your applications use memory, cores, and I/O, there's no getting around the resource use. Your applications use what they use. If you've got resource contention, you need to address that by using lighter apps or getting better hardware.

Containers are tiny

A basic Docker image is around 8 MB; basic LXC OS containers can be closer to 200 MB. GitLab's .deb package alone is 101 MB. Unless you're running hundreds of instances, container size isn't a serious concern.
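If you want to sanity-check those numbers on your own box, pulling a minimal base image and looking at what Docker reports is enough (sizes vary by release, so treat this as ballpark):

```bash
docker pull alpine:3.7       # a minimal base image, only a handful of MB
docker images alpine:3.7     # the SIZE column shows what it actually costs on disk
```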

Low I/O waste (don’t want to notice a difference in that half-hour Jenkins build)

Now here's an example where Docker can really help. Docker's build process uses filesystem overlays, and only changed layers get rebuilt. If your builds only change a few things, Docker only re-runs the parts that changed. This can bring your half-hour Jenkins builds down to 10 seconds… if your Jenkins jobs are configured to use Docker. If your build jobs are currently single-script iterative builds, then containerization doesn't offer you any benefit. The "differences in workflows and best-practices" thing is critical here.
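To make that concrete, here's a rough sketch of what layer caching looks like; the base image, packages, and build command are placeholders for whatever your Jenkins job actually needs:

```bash
# Sketch only: the expensive dependency layer is cached and only re-runs
# when its line changes; source edits only rebuild the layers below it.
cat > Dockerfile <<'EOF'
FROM debian:stretch-slim
RUN apt-get update && apt-get install -y openjdk-8-jdk git make
COPY src/ /src/
RUN make -C /src
EOF

docker build -t myproject .   # unchanged layers come straight from the cache
```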

It should also be noted that GitLab CE has native support for Docker-based workflows (CI jobs that run inside containers) as well.
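For reference, the GitLab CI side of that is just a file in the repo telling a Docker-executor runner which image to build in; a minimal sketch (image and script are placeholders):

```bash
cat > .gitlab-ci.yml <<'EOF'
# Each job runs inside the named Docker image on a Docker-based runner
build:
  image: debian:stretch-slim
  script:
    - ./build.sh
EOF
```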

Host OS can easily access logs

This is best handled by using a real logging solution, not bind mounts. Kubernetes, for instance, has log management features if you go that route.
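If you stay on plain Docker, one way to keep the host in the loop is to point Docker's log driver at the host journal, so journalctl (and anything watching it) sees container output. A rough sketch, assuming the journald driver suits your setup:

```bash
# As root: send container stdout/stderr to the host's systemd journal
cat > /etc/docker/daemon.json <<'EOF'
{
  "log-driver": "journald"
}
EOF
systemctl restart docker

# Per-container logs are then readable on the host, e.g.:
journalctl CONTAINER_NAME=gitlab
```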

Fail2ban is also kind of terrible (threading performance is garbage, start/stop times suck, et cetera). Also, the firewall configuration for a decent Docker setup can get stupidly complicated, and fail2ban doesn't handle it well by default.

System resource overview (host and containers)

Docker has a pretty wide set of tools for this: Cockpit is one of the simplest and prettiest, Kubernetes's tooling is some of the most powerful, and Swarm sits somewhere in between, plus a bunch of community tools.

LXC's management tooling is a little less robust, but the CLI might be all you need for managing containers. There's also libvirt integration, so you can use virt-manager and any other libvirt-capable tooling.
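Even without a dashboard, the stock CLIs already give a usable overview (container names here are placeholders):

```bash
docker stats --no-stream     # per-container CPU, memory, network and block I/O
lxc-info -n jenkins          # state, PID, CPU and memory use of an LXC container
systemd-cgtop                # host-wide cgroup resource view, containers included
```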

Personally, I think containerization is overrated.

If you're not going to containerize everything, you're simply complicating your workflow and making your life hell. It requires re-evaluating your application stack and use cases from the ground up. If you aren't willing to do that, or can't for "the boss says this needs to be done now" reasons, rethink your motivations for considering it.

If you still treat servers like pets (an easy way to tell: you still name them rather than giving them serial numbers), go LXC. The Docker philosophy is "cattle, not pets": you don't log in and fix containers, you shoot them in the head and redeploy fresh ones.

LXC is lighter than KVM, lets you keep the familiar "boxes on my LAN" mentality, and still gives you many of the simplification benefits by limiting each container to a specific role.


I imagine I am getting closer to naming my systems by serial number, as my "pet" names have been iterations on old hardware, much like the cold numbering of replacement cats (à la "Snowball II" from The Simpsons). Ultimately I am looking to treat the containers like cattle: if one gets troublesome, I redeploy a known-good backup.

My understanding with LXC is that the container filesystem is merely a directory on the host FS, so I don't see a need to bind-mount for, say, fail2ban; just include the full path. For the Android builds via Jenkins, the source code will need to be "mapped in" somehow, but I will figure that out. My biggest concern was that file processing would slow down (for builds), but that does not look to be the case here.
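For the "mapped in" part, it looks like an lxc.mount.entry line in the container config should do it; an untested sketch, with the paths and container name as placeholders:

```bash
# Bind-mount a host directory into the container; the target path is relative
# to the container's rootfs, and create=dir makes the mountpoint if missing
echo "lxc.mount.entry = /srv/android-src srv/src none bind,create=dir 0 0" \
    >> /var/lib/lxc/jenkins/config
```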

Since I don't see myself needing something as involved as Docker or Kubernetes, I believe I will roll forward with LXC. Many thanks.

Kubernetes automates a lot of what you are looking for. It's a container orchestration platform that works off of "desired state" configs. You say you want X webservers running; if one dies, it automatically spins up a new one. If a whole machine dies, all of the containers on that machine are spun up on another machine. You can have your "pet" service (i.e. a named GitLab service) on top of a "cattle" infrastructure (i.e. twenty-something containers spread across several machines). I really think it's the best of both worlds. It can even handle scaling based on load.
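As a rough illustration of the "desired state" idea (names and image are placeholders): you declare how many replicas you want and Kubernetes keeps that true, rescheduling containers when one dies or a node goes down.

```bash
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webserver
spec:
  replicas: 3                  # desired state: always keep three copies running
  selector:
    matchLabels: { app: webserver }
  template:
    metadata:
      labels: { app: webserver }
    spec:
      containers:
      - name: web
        image: nginx:stable
EOF
```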

I use LXD (a container manager built on LXC) for my home server stuff. Each LXD container works much like a separate VM, with its own init, IP address, and everything, but unlike a VM it is extremely light on resources. Works great, but you will want to run Ubuntu as the host OS, as that's where LXD is developed and best supported.
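For what it's worth, the host-directory question from the original post is simple with LXD too; roughly (container name and paths are placeholders):

```bash
lxc launch ubuntu:18.04 jenkins                  # create and start a container
lxc config device add jenkins srcdir disk \
    source=/srv/android-src path=/srv/src        # bind-mount a host directory into it
lxc exec jenkins -- bash                         # get a shell, just like a tiny VM
```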

I’ll only slightly add to the incredible answers already given.
I'm currently looking into Docker at work to speed up and simplify the setup of Linux servers for clients.
From my understanding so far, Docker is mainly useful if you need reproducible builds, a whole bunch of instances of the same thing, or want to ship a complicated, preconfigured environment to multiple machines.

If all you do is run a single server or two, set them up once, and be done with it, I don't see the benefit containers would provide. A major selling point (especially once you get into swarms etc.) is scalability and on-demand deployment; both are kind of useless if you have a single hardware server doing a predefined job you configure once for the foreseeable future.

I might be totally off though, as I've just started diving into the whole thing.

Personally, I used to run everything on bare metal, which is fine if you've configured it several times and are comfortable doing so. However, I've reinstalled the software in my homelab so many times because I screwed something up somewhere that this time, with a new home server, I decided to try running everything in containers. I must admit, I'm positively surprised, as much by the ease of working with them as by the resources the software uses.

If you deploy your containers using Compose or Portainer, it's possible to set lower and upper limits on the resources available to them. In docker-compose, you'd use the deploy tag. Also, GitLab, IIRC, easily uses 2 GB+ of RAM; this can be tweaked in its settings (I forget the exact file), but it can have consequences for stability and functionality if you set it too low.
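A sketch of those deploy limits (values and image are placeholders; depending on your Compose version you may need the --compatibility flag for the limits to apply outside swarm mode):

```bash
cat > docker-compose.yml <<'EOF'
version: "3.7"
services:
  gitlab:
    image: gitlab/gitlab-ce:latest
    deploy:
      resources:
        limits:
          cpus: "4"       # cap CPU so a heavy GitLab doesn't starve Jenkins builds
          memory: 4G
EOF

docker-compose --compatibility up -d
```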

Here I'd use Ubuntu. I used to run Debian, but there is a bit more legwork when it comes to configuring it compared to Ubuntu. Debian is good for learning, which is why I ran it; Ubuntu simply has more in its repos, and things here and there are already configured for you, which will save you some time.

Again, imhigh.today hits the mark.

Agree with imhigh.today

Most containers allow you to configure a volume or bind mount so the host can access things like logs live.
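For example, something along these lines (paths are placeholders, and the in-container log path varies per image):

```bash
# Expose a container's log directory on the host so fail2ban can watch it
docker run -d --name nextcloud \
    -v /srv/logs/nextcloud:/var/log/nextcloud \
    nextcloud:latest
```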

You can achieve this to some degree using the Portainer image for Docker. It's quick to set up and works OK for standard things; if deeper information is needed for specifics, it can be done another way. I enjoy using Portainer for a quick overview, but I never use it for deploying containers; those I do in the CLI.
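If anyone wants to try it, the stock way to run it is roughly this (check Portainer's docs for the current image and port):

```bash
# Portainer needs the Docker socket mounted in to see and manage local containers
docker volume create portainer_data
docker run -d --name portainer -p 9000:9000 \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v portainer_data:/data \
    portainer/portainer
```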