OS for AMD Servers running K8s

Hi All,

Hypothetical setup.

I have bought 10x AMD EPYC 7713 (Milan), 256 GB RAM, 12x Samsung 1 TB NVMe drives (RAIDZ2 pool?) and Mellanox ConnectX-4 NICs. I want to host these servers in a data center and run Kubernetes on them.
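
For the storage side, something like this is what I have in mind (just a sketch; the pool name and device paths are assumptions and will differ per system):

```bash
# Hypothetical 12-wide RAIDZ2 pool on one node; "tank" and the device
# names are placeholders, and ashift=12 assumes 4K-sector drives.
zpool create -o ashift=12 tank raidz2 \
  /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1  /dev/nvme3n1 \
  /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1  /dev/nvme7n1 \
  /dev/nvme8n1 /dev/nvme9n1 /dev/nvme10n1 /dev/nvme11n1
```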

I have watched a couple of videos on YouTube and I still don’t know which OS to use. For this hardware setup, TrueNAS seems like a good fit, but will it work in a production environment?

I hope I have provided all the information needed to answer my question. :slight_smile:

Single or dual-socket mainboard? Is that 256 GB of RAM per system or your total amount? Same for the NVMe drives and the NICs.

Btw, you’re contradicting yourself:

If you buy something, it’s no longer hypothetical :roll_eyes:

3 Likes

I’d just go Debian or Rocky. BSD with k8s is a bit strange. I wouldn’t do it, especially for production.

1 Like

Single-socket servers, and that RAM and those drives are per server.

Maybe I should have said “I want to buy”, but it doesn’t change the question. :slight_smile:

Thanks, having a look at those two options.

Have you picked out the K8s version you are going to use?

Their documentation should list the supported OSes and package versions.

1 Like

Not yet, but my concern is more about getting the full speed of the NVMe drives in RAIDZ2, and RDMA support over the network. It’s my understanding that you need a fairly recent kernel for that, but I am not 100% sure about this.
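
For what it’s worth, this is roughly the sanity check I’d run on any candidate OS (a sketch, assuming the ConnectX-4 uses the mlx5 driver):

```bash
uname -r                   # kernel version the distro ships
ls /sys/class/infiniband   # ConnectX-4 should show up as mlx5_0, mlx5_1, ...
rdma link show             # iproute2's rdma tool; lists RDMA link state
```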

I’ve been looking into openSUSE MicroOS + Kubic recently; they’re built for this kind of thing, but I haven’t gotten as far as trying them out.

They should ship with kernel 5.15.

But I’m confused: if you’re going to be using Kubernetes, why would you use ZFS, as opposed to running Ceph (via Rook) with one OSD per device?

Are you worried that 10 × 12 = 120 drives is too few to get good data distribution and performance out of Ceph?
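
To make that concrete, a minimal sketch of the one-OSD-per-device setup (this assumes the Rook operator and CRDs are already deployed in the rook-ceph namespace, and the Ceph image tag is a guess; check the current Rook examples first):

```bash
kubectl apply -f - <<EOF
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v17   # assumed tag, pick a current release
  dataDirHostPath: /var/lib/rook
  mon:
    count: 3
  storage:
    useAllNodes: true
    useAllDevices: true   # Rook turns every unused raw device into an OSD
EOF
```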

That would probably be the better route at this deployment size. Ceph definitely has some overhead, but at this scale, it’s nothing those systems can’t handle.

Also, I did not know about Ceph. :-p

Ceph is a whole rabbit hole in itself. I haven’t worked with it myself in ~6 years, but I could definitely point you in the right direction if you have any questions.

Since this will be a production setup, I would rather stick with what I know. If ZFS doesn’t give me the performance I expect, then I might have a look at Ceph. Thank you for the feedback anyway.

1 Like

The problem with ZFS is that it’s a local filesystem. Kubernetes clusters expect some sort of distributed block storage for persistent volumes, and that’s where the challenge comes in. Obviously there are solutions, but Ceph does the heavy lifting of merging all that storage together.

Well, technically, Kubernetes allows you to run with just local storage, but it’s not the most robust solution. Have you considered how you will manage high availability of storage?
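
To illustrate, here’s roughly what “just local storage” looks like: a PersistentVolume pinned to a single node (all names and paths below are made up). If that node goes down, the volume goes with it, which is the robustness problem I mean.

```bash
kubectl apply -f - <<EOF
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner   # no dynamic provisioning for local PVs
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-local-pv          # hypothetical name
spec:
  capacity:
    storage: 100Gi
  accessModes: ["ReadWriteOnce"]
  storageClassName: local-storage
  local:
    path: /tank/k8s/pv1           # hypothetical path on the ZFS pool
  nodeAffinity:                   # local PVs must be pinned to one node
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values: ["node01"]      # hypothetical node name
EOF
```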

1 Like

Ah, I think I understand what you are trying to say. We are going to use Rancher for PVCs.
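
So the apps would just claim storage along these lines (a sketch; the names are made up and the actual storage class depends on what Rancher provisions for us):

```bash
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data                  # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: my-storage-class  # placeholder for whatever Rancher sets up
  resources:
    requests:
      storage: 50Gi
EOF
```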

I’m not familiar with Rancher, so I won’t comment on it; just wanted to make sure you were all set in that regard.

I guess only time will tell. :slight_smile:

1 Like

Well, you have a plan, so that’s the first step, right? Hah

1 Like

My biggest concern is the networking side of things, and I don’t know yet where to post my question on this forum. I have joined another YouTuber’s Discord and will ask there as well.

Regarding the Mellanox gear, you could post in #networking and you might get a better response. I’m not familiar with that gear personally, so I can’t help there myself, but the networking category would likely alert people who are more experienced with it.

1 Like

Have you looked into OpenShift? Potentially on oVirt? It comes with a simple installer, a web dashboard, and a somewhat easy-to-configure Ceph cluster via OpenShift Container Storage. Then your OS install would simplify to either RHEV/oVirt nodes for an automated install or RHCOS for a manual install.
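
The installer flow is roughly this, from memory (the directory name is made up, and the install-config details depend entirely on your environment):

```bash
openshift-install create install-config --dir=cluster01   # interactive config wizard
openshift-install create cluster --dir=cluster01          # runs the actual install
```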