ProxMox troubles

So... The office manager got the root login info for our servers from the CEO because he "needed them." We were starting up a new server, and the office manager nominated one of the employees who "knows all about clustering" to set it up.

I know, bad place to be. Everything could have been avoided, and it should have been, but it was not.

  • First off, our hosting provider does not support multicast traffic, so
    the initial sync failed.
  • Also, a bunch of config files were screwed up.
  • AND now the web GUI won't start.

I have done my googling, and I don't know what to do.

All our virtual machines are running; it is just the web GUI that is not working. "vzlist" shows all nodes running fine, and I can control them with "vzctl." I am afraid to reboot the server in case it has been damaged beyond repair. I want every node in the cluster removed from the cluster, so we can start from scratch (and do it right).

I admit this is over my head. I'm calling out to other Linux experts. What do I do? I would be willing to pay someone to fix it.

Paste the config files to a paste host, and link them. Hints about where in the files the changes were likely made would help too. If ProxMox spews out error messages when you try to launch the GUI, or if you can find error messages in a log, those would be helpful as well.
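
If it helps, something like the following would pull the likely-relevant lines out of a log without changing anything on the box. It is only a rough sketch: the function name and the search patterns are my own guesses, not anything ProxMox ships, so adjust them to what you actually see.

```shell
# Hypothetical helper (name and patterns are assumptions, not ProxMox tooling).
# Prints log lines that mention "pve" and also look like an error.
# Read-only: it only greps the file it is given.
pve_error_lines() {
    logfile=$1
    # First grep narrows to pve-related lines, second keeps the error-like ones.
    grep -i 'pve' "$logfile" | grep -iE 'fail|error|refused'
}
```

You would call it as `pve_error_lines /var/log/daemon.log | tail -n 50` and paste the result along with the config files.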

Also, ProxMox have their own forum. You may find greater expertise there, should you not find it here.

in /var/log/daemon.log:

/etc/pve/local/pve-ssl.key: failed to load local private key at /usr/share/perl5/PVE/HTTPServer.pm

The key is there, and has the correct permissions.
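
Correct permissions alone do not rule out an empty or truncated key file, which would also fail to load. A minimal read-only sanity check could look like the sketch below. The function name is made up for illustration, and the check only confirms that the file is readable, non-empty, and has a PEM private-key header; it does not prove the key is valid or that it matches the certificate.

```shell
# Hypothetical read-only check (function name is an assumption).
# Reports whether a key file is readable, non-empty and PEM-shaped.
key_looks_sane() {
    keyfile=$1
    [ -r "$keyfile" ] || { echo "unreadable"; return 1; }
    [ -s "$keyfile" ] || { echo "empty"; return 1; }
    # -e is needed because the pattern itself starts with dashes.
    grep -q -e '-----BEGIN .*PRIVATE KEY-----' "$keyfile" || { echo "not PEM"; return 1; }
    echo "ok"
}
```

For example, `key_looks_sane /etc/pve/local/pve-ssl.key` prints one of the four status words and never writes to the file.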

When running ls -l /etc/pve/:

Here is the single entry in /var/lib/pve-cluster/config.db:


This means I cannot simply copy all the VM data to the other host: this one has no record of them, so the new host won't either.

I also get a lot of these errors:

ipcc_send_rec failed: Connection refused

At this point we are in contact with the ProxMox developers themselves. They want about $720 to fix it. If anyone wants a few hundred $$$ and can guarantee a fix, now would be the time to speak up.

I'll gladly help you fix it, but I can't guarantee a fix, and we would likely both be spending a great deal of time. The time you'd spend troubleshooting might well be worth more than the money you'd spend to have it fixed by the pros… Although $750 is a lot of money, it sounds like a bargain for a guaranteed fix (provided they are not the types to format and reinstall).

On the other hand, any problem you learn to fix is a problem you may save $750 (or whatever the price) on the next time it arises.

I am currently waiting to hear back from the ProxMox guys. Depending on their response we could move forward. If we can get it fixed, fantastic. If we can get it fixed enough to pull data off, well, that's good enough too. Prices can be negotiated afterwards.

Assuming it does not go through, what information do you need in order to take a look? I can provide ssh login via email if that would help at all.
Do you have background experience with ProxMox or OpenVZ?

None. I just like solving problems, and mostly I have luck in doing so. I have extensive Linux experience, and have a reasonable amount of experience with other container and virtualisation environments.

Sure, if you like I can take a look. If you do, though, only give me read access. Don't give me root access. Unless I figure out exactly what the problem is, I don't want the right to change anything. That way neither of us gets in trouble, and if anything is done, you will know exactly what, and can set it back should it not work :)

I'll PM you my e-mail address. Then you can decide what to do. It is pretty late here, and I am typing this in bed. If I receive your e-mail before I fall asleep (or I wake from the ping on my phone), I'll take a look this evening. Otherwise, I'll check it in 6-8 hours. If ProxMox get back to you before I do, just disable the account and I'll know what's up ;)

if you pay me $300 i can fix this for you.

:>

step 1
Check permissions. If you ran the sync as root, most likely all the files are now owned by root. (I doubt this was running as root; it's wrong on so many levels to do so...)
step 2
Check the configs. If a config was synced over, change it to fit your server, not the one you synced from.
step 3
Check that this is your own SSL key, not one replicated over...
step 5
make it wurk.
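
For step 1, a read-only way to see which files ended up with an unexpected owner might look like the sketch below. The function name is made up, and which owner counts as "expected" on your box is an assumption you have to fill in yourself; nothing here changes ownership.

```shell
# Hypothetical helper (name is an assumption): list regular files under a
# directory that are NOT owned by the expected user. Read-only, changes nothing.
files_not_owned_by() {
    dir=$1
    owner=$2   # user name or numeric uid you expect to own the files
    find "$dir" -type f ! -user "$owner" -print
}
```

For example, `files_not_owned_by /some/config/dir someuser` prints nothing if every file is owned by `someuser`; any path it does print is worth comparing against a known-good install before touching it.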