Proxmox Asymmetric Cluster Management

Hi everyone,

I know I am a little late to this, but better late than never!

So for ‘Devember’ I am going to be focusing on getting my Proxmox Asymmetric Cluster Management system up and running.

A few years ago I wrote a bash script (that was awful I admit) to automagically migrate a VM in Proxmox from one node to another when the system load increased. The idea being that you could have N number of low power virtualization nodes and one or two high power nodes.

One of the great advantages of virtualization is resource utilisation and instead of having several idling servers, you could have just one or two boxes hosting VMs.
Well, the question you have to ask is… is this as efficient as it could be? Is your web, email, PBX etc. going to be at high load 24/7? Does your virtualization cluster need all the hosts running at once? In the enterprise this would be crazy, but like many of us here I have a homelab, and it’s using power sitting idle most of the time. Power that I am paying for.

So what if there was a system that could scale the underlying processing power dependent on the VMs’ load?

This year I have dabbled with this project a few times but never committed to it. So now seems a good time to.

I am going to be using Python, my rationale being that it’s already packaged in many popular Linux distros. As a result I will be using proxmoxer to hook into the Proxmox API (I don’t feel like re-inventing the wheel with my own API hook… yet).
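For anyone curious, the hook-in itself is only a few lines with proxmoxer. This is just a sketch: the host and credentials are placeholders, and verify_ssl=False is purely a homelab-with-self-signed-certs thing.

```python
from proxmoxer import ProxmoxAPI

# Placeholder host/credentials - adjust for your own cluster.
proxmox = ProxmoxAPI("pve1.example.lan", user="root@pam",
                     password="secret", verify_ssl=False)

# List the nodes in the cluster and the VMs sitting on each one.
for node in proxmox.nodes.get():
    print(node["node"], node["status"])
    for vm in proxmox.nodes(node["node"]).qemu.get():
        print("  ", vm["vmid"], vm["name"], vm["status"])
```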

My goals are roughly as follows, in the order I feel they need to be achieved to get this working:

  1. Get status of the cluster and nodes.
  2. Get status of VMs in cluster.
  3. Add manual moving of VMs.
  4. Something to make suggestions on what needs to happen (i.e. which VMs should migrate and to which node).
  5. Make the above automagic. (and implement migration lock)
  6. Power on/off cluster nodes.

(I did say it was a rough list of goals)

Guess it’s time to get this project on the road.

So I’ve made a start, hooray.

Most of what I have done is comb through code I had already written and clean it up a little.
There are things I am unhappy with in the code, but it’s testing code that works and I’ve got time to refine it.

Anywho, here are some screenshots of it running. It’s only pulling basic cluster node stats at the minute, but I am settling into working with the API more than anything.

So the main menu:

The Node Information (after selecting a node):



I am not sure if I will actually commit to having anything under ‘Network Info’ or what I would display. Maybe just links, IPs and Speed?

Anywho, hopefully on the next update I’ll have VM info added and a breakdown of where VMs are in the cluster. (And also remember to minimise the IDE when taking terminal screenshots)

EDIT: To those who may be interested, you can see what sort of things can be pulled via the Proxmox API 2.0 here: https://pve.proxmox.com/pve-docs/api-viewer/index.html

This past week has been interesting. I plagued myself with indecision about how I want the program to function, so much of this post will be about my rationale and how I decided on functionality. If you don’t want to read this long-winded post, skip ahead to Changes.
TL/DR: I decided to use a database and make the system stateful.

Decisions

One of the things I really wanted to do with this program was keep it stateless, meaning all the information required for the program to operate (with the exception of the config file) would be pulled from the API.
This, however, produces a problem: part of the planned operation is to prevent auto migrations of specific VMs, or of VMs to/from a certain node.

The idea behind this functionality is to allow the operator to pin a VM to a node (if you have an application running in a VM that’s particularly sensitive, it may be beneficial to prevent it from moving) or even reserve a node in the cluster for the operator’s needs. To be clear, it won’t prevent manual migrations (either in the program via the API or in the WebUI), only the auto migrations triggered by the program.

“Wait, hold up… in your amazing 6 step plan, the migration lock is on step 5! What gives?”

I shifted to doing it now as I noticed it had a larger impact on the program’s data handling than initially thought, and I suspected it would require some kind of data management. Turns out, I was right.

So here is what I tried and why I settled on what I did… a database.

  1. Store the migration flag in cluster.
  2. Store the flag in a local file.
  3. THEY’RE IN THE DATABASE!

1- At first I looked at storing the flag within the cluster itself, which would allow the program to remain stateless. pvesh lets you set a config variable (and hence so does the API); the problem is that via the API you can’t apply a custom variable. Not a problem, I thought: I can store the migration flag in the node config’s description, as it takes a string value. However, I had doubts about how reliable this would be due to an issue I have experienced (and seen a few cases of dotted around the web): the config sometimes resets on a node reboot.
A planned function is to have the cluster turn nodes on and off, so I have played with the wake-on-lan config option (where you store a MAC address for a node that will accept a WOL packet), and occasionally it would reset upon a node reboot.

This then lead me to numero 2.

2- Store the flag locally. This sounds pretty simple on the surface, until you start picking apart what you need to store. I need to store the ID of the node/VM, whether it’s a VM or a node, and a migration state. As the API operates using JSON formatting, my immediate logic was to follow the same convention.
What I quickly realised however is that I was attempting to re-invent the wheel. I was building a JSON handling database and it was crap.

This then lead me to numero 3.

3- Yes it’s a database. I ultimately decided that a database was the best idea, however what kind of database was the next question. There were two things that I thought were needed for this to fit nicely into this program.

  • Light Weight.
  • Not require any additional server setup.

So a lightweight database was a functional requirement, as a full-blown relational database like MySQL would be 100% overkill (maybe not on a large cluster with hundreds of VMs, but that isn’t the target setup here). As I was essentially building a document-oriented database, I took to looking down this route.

The second requirement of no additional server was kind of limiting. A system like MongoDB would be pretty well suited to what I was doing, but it also requires additional setup. After some looking around I landed on TinyDB. This is a small document-oriented database written in Python, and ultimately it was what I had been trying to write myself.

Database

So after playing around, finding the pitfalls and looking into the full capabilities, I’ve got to say TinyDB has impressed me. I am slightly worried for future development, as their own “Why Use TinyDB?” page, under “Why Not Use TinyDB”, lists the following: access from multiple processes or threads.

This poses an issue with, say, a separate thread doing migration functions. However, I am hoping it won’t be a problem, as the system doesn’t have a lot of rapidly changing data. Additionally, the secondary threads will likely only be reading the database and not making changes.
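For the curious, the flag storage boils down to a couple of TinyDB calls. A rough sketch of what I mean (the record layout and file name here are just my own choices, nothing Proxmox-specific):

```python
from tinydb import TinyDB, Query

db = TinyDB("pacm.json")     # plain JSON file, no server needed
Flag = Query()

def set_migration_flag(kind, ident, allowed):
    """Store/refresh the migration flag for a node or VM ('kind' is 'node' or 'vm')."""
    db.upsert({"kind": kind, "id": ident, "migrate": allowed},
              (Flag.kind == kind) & (Flag.id == ident))

def migration_allowed(kind, ident):
    record = db.get((Flag.kind == kind) & (Flag.id == ident))
    return record["migrate"] if record else True   # default: migration allowed

set_migration_flag("vm", 100, False)     # pin VM 100 to its current node
print(migration_allowed("vm", 100))      # -> False
```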

Changes

0.0.2 (2020-11-12)

  • Cleanup of Code.
    • Applied proper Python formatting to some functions.
    • Added some more comments.
  • Removal of ‘Network Info’ in node info menu.
  • Added VM information querying.
  • Added Migration controls to VM and Node Menu.
  • Added TinyDB to store information and settings.

Screenshots

A lot of the changes have just been in how the program runs, so there aren’t many user-facing changes.

That said, the Main Menu does now have the “VM Information” and “Rebuild Database” options.

Screenshot 2020-11-12 at 20.02.45

The “Rebuild Database” option is only in here at the minute as certain changes that happen in the cluster aren’t reflected in the database yet. Plus if it messes up, it’s a nice option to have to essentially say “reset all”. Later, I plan to put a small confirmation prompt in front of this option though.

The VM Information menu is much like the node one, with the exception of the “Other Stat” displaying the node the VM resides on. “Toggle Migration” does what it says on the tin and toggles the migration flag. “Select VM” is me playing around with being able to reselect a VM without having to exit to the main menu and re-enter the VM menu. I will likely reflect this change in the Node Information menu as well.

Screenshot 2020-11-12 at 20.03.04

What’s Next?

With my data handling dilemmas out of the way (for now), I am hoping to make faster progress through the planned functionality. First off is to tackle the Manual Migration. I will likely push this as v0.0.3 without other changes.

Talk about shooting yourself in the foot.

Had quite a few distractions recently, but now that I have time to sit down and actually look at this… progress has been made.

At the end of my previous post I stated: “First off is to tackle the Manual Migration. I will likely push this as v0.0.3 without other changes.”

And I did. As a result, I didn’t post anything about it, as I felt it was a small change. At first I thought it would be a bigger step to integrate, as it was the point where I started POST(ing) data to the cluster rather than just GET(ting).
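The POST itself turned out to be a one-liner through proxmoxer against the migrate endpoint. A sketch of roughly what v0.0.3 does (node/VM names are placeholders; online=1 asks for a live migration):

```python
# Kick off a migration of VM 100 from pve1 to pve2 and keep the returned UPID
# so the task can be tracked later. Names here are placeholders.
upid = proxmox.nodes("pve1").qemu(100).migrate.post(target="pve2", online=1)
print("migration task started:", upid)
```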

But it did set about what I wanted to do in v0.0.4, which is what this post is all about.
Fresh off of pushing v0.0.3 I set out the following goals for v0.0.4

  • Move Migration Controls (and related) to their own section.
  • Add a migration status viewer.
  • Add a suggested migration node.
  • Add a migration lock on manual migration if a VM is in flight.

As it sat, the migration code lived within the VM menu section. It made sense in terms of execution, but knowing there would be other related stuff (and a lot of it), I knew it had to live somewhere else.

As a picture is worth a thousand words and a video is too big, a gif will have to do.
migration

So in the above it starts in the Main Menu over an option called ‘Migration Status’. This simply shows if any VM is in the process of migration.
Screenshot 2020-11-28 at 20.56.45
Nothing is moving.
Screenshot 2020-11-28 at 20.57.42
This shows the VM with ID 100 is in a migration. I will flesh this out at some point to display its source and target, for example source_node -> VM -> target_node.

Within the VM Menu is the ‘Manual Migration’ option that was introduced in v0.0.3. Hovering over the option now presents information about whether migration is available. If the VM is already undergoing a migration, trying to enter the menu will not work and the message below will state:
“Manual Migration Unavailable - Migration in progress.”
Otherwise it displays:
“Manual Migration Ready”

Within the migration menu, it displays target nodes. One will be marked with a (suggested) tag.
The program takes the node’s load average and divides it by the maximum number of CPU cores the node has. The reason is that, as nodes may have different core counts, their capacities differ. The resulting decimal value gives you sort of an aggregate CPU load average… or a CPU percentage average.
This could have been mitigated by using the CPU percentage usage; however, when querying the node I found it would occasionally report far lower CPU usage than was actually in use. The load average isn’t perfect either, as it is indeed an average over a time period: loads that have just been moved to the machine may not be reflected in the load average yet.
Unfortunately, both ways have their issues. Personally I think the route I chose makes the most sense, but maybe a hybrid approach would be preferable. For example, if two nodes are close in load average, take a look at CPU usage to make a final decision.
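In code the suggestion logic is pretty small. A sketch of the idea, assuming the loadavg and cpuinfo fields that the node status endpoint gives me on my cluster:

```python
def suggest_target_node(proxmox, exclude=None):
    """Return the online node with the lowest load average per core."""
    best_node, best_score = None, float("inf")
    for node in proxmox.nodes.get():
        name = node["node"]
        if node.get("status") != "online" or name == exclude:
            continue
        status = proxmox.nodes(name).status.get()
        load1 = float(status["loadavg"][0])      # 1-minute load average
        cores = status["cpuinfo"]["cpus"]        # logical core count
        score = load1 / cores                    # normalised load
        if score < best_score:
            best_node, best_score = name, score
    return best_node
```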

This update had quite a bit of time thrown at it, as I played quite a bit with the UPIDs of the cluster tasks and took some time determining how best to query the nodes for what they were up to.
To keep it brief, it turns out there was a really simple way to do it. In the tasks brought back there’s a tag called ‘saved’. If the saved state was 1, the task had completed (and an endtime (in epoch) would be present). If it was 0, the task was in progress. If it wasn’t present at all (which I found was possible), the task hadn’t started yet. In other words, a migration task was about to begin.
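So the in-flight check ends up being little more than walking the recent tasks and looking at that ‘saved’ value. A rough sketch (field names are as I see them coming back from the node tasks endpoint on my cluster):

```python
def vm_migration_in_progress(proxmox, node, vmid):
    """True if a qmigrate task for this VM is queued or still running on the node."""
    for task in proxmox.nodes(node).tasks.get():
        if task.get("type") != "qmigrate" or str(task.get("id")) != str(vmid):
            continue
        # saved == 1 -> finished (endtime present); saved == 0 -> running;
        # key missing -> queued but not started yet.
        if task.get("saved") != 1:
            return True
    return False
```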

Changes

0.0.4 (2020-11-28)

  • Moved Migration controls to separate file.
  • Added Migration Status Viewer.
  • Added Suggested Migration Node.
  • Manual Migration lock if VM is in flight.

0.0.3 (2020-11-12)

  • Added Manual Migration of VMs.
    • Accessed via the VM Menu.
  • Added Migration Toggle (stored in DB for persistence and for later use)

What’s Next?

Looking at my list of “rough goals”, I need to tackle steps towards automation, the first of which is periodic monitoring of the cluster and the VMs. After that, it will be automatically firing the migration code that has already been written.
This will involve some threading, which should be interesting.
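Nothing is written yet, but the shape I have in mind is a simple background polling thread along these lines (check_cluster here is a hypothetical callable wrapping the existing query code):

```python
import threading
import time

def start_monitor(check_cluster, interval=30):
    """Spawn a background thread that polls the cluster every `interval` seconds."""
    def loop():
        while True:
            check_cluster()       # query nodes/VMs and decide on auto-migrations
            time.sleep(interval)
    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```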

Let’s please not talk about shooting yourself in the foot. December has flown by for me.

Although I started this in November, the original idea was to get the program into a state where all the tedious little stuff was out of the way, so I could focus on the larger stuff in December… Well, we’re 19 days in and not a lot has happened.

I have, however, been working on one of my original objectives (sorry, nothing ready for prime time yet), which was to have the system be stateless (i.e. no database).
As it stands, pretty early on I decided to go down the route of using a DB (in this case TinyDB) as an ‘ease of use’ option. Now I am removing it.

What are you doing instead?
Back to one of my original plans of keeping the stuff in the nodes.
I had issues with setting custom flags in the configuration, but then I realised: why not just store this in the node/VM description?
This does require writing a string parser, something that can grab certain snippets out of the description, but that really isn’t a big task.
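The rough idea is to bury a small tag in the description and pull it back out with a regex. Something like this sketch; the pacm:key=value format is purely my own invention, not anything Proxmox defines:

```python
import re

FLAG_PATTERN = re.compile(r"pacm:(\w+)=(\S+)")

def read_flags(description):
    """Return {key: value} for every pacm:key=value snippet in a node/VM description."""
    return {key: value for key, value in FLAG_PATTERN.findall(description or "")}

# e.g. read_flags("Mail server\npacm:migrate=false") -> {'migrate': 'false'}
```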

So hopefully soon, once it’s in full swing (and I’ve added some error catches) I’ll have an actual update.
