Sysadmin Mega Thread

Even compared to yum, it's slower.

Yeah but speed isn’t a focus. Reliability is. Especially since most of the time updates are handled automatically with a satellite server.

Granted, we don’t have that, so it takes me a while to patch, but who cares, I’m still getting paid.

Well I have never known yum to be unreliable. And it doesn’t seem to actually improve anything.

Which other package managers allow rolling back specific transactions, or updating only the packages covered by a defined security advisory, filtered by severity level?
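
For what it’s worth, this is roughly what that looks like with dnf (the advisory ID below is made up, substitute one from your own errata feed):

```bash
# list the security advisories that apply to this host
dnf updateinfo list --security

# apply only the packages covered by one specific advisory
dnf upgrade --advisory=RHSA-2023:1234

# or apply everything at or above a given severity
dnf upgrade --sec-severity=Critical

# and if a transaction goes sideways, roll just that transaction back
dnf history undo last
```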

With uptime as a priority, we need more out of a package manager than just update or install and watch it go brrrrrr.

The only other one I know of with a similar feature set is zypper.

Nice if you don’t trust the maintainers to provide a coherent system, or if you’re writing shit proprietary code.

But really, who uses that?

Your system is either up to date or it’s not. Not sure why you’d want to selectively update something.

Why should a DNS server get feature updates to packages when all it is doing is serving DNS? If new package foo-1.1.0 requires a reboot to take effect, but I know I don’t need the latest and greatest and can get by with the backported patch foo-1.0.1 (which won’t require a reboot), then I’d prefer the patch, because it means less change.
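
As far as I understand it, this is exactly what dnf’s upgrade-minimal is for: it pulls in the lowest package version that still carries the fix instead of jumping to the newest feature release. Something like:

```bash
# upgrade only to the minimal package versions that resolve outstanding security advisories
dnf upgrade-minimal --security
```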

Because a new version could fix a bug, exploit or other flaw. A feature update doesn’t include only new features.

This idea of backporting bug and security fixes is the result of lazy software development and lazy sysadmin work.

This is what HA and failover are for. Well, they’re designed for unplanned failures, but using them for planned ones shouldn’t be a problem if everything works properly.

I don’t think it’s laziness; it’s just two different philosophies.

There is nothing lazy about wanting to keep a service running.

Let me give you an example of something that recently happened.

We had a production database host, which serviced around 50 clients. Replication and all that jazz.

The underlying hypervisor in the IaaS went down, and the failure mitigations moved the VM to a different host. Performance tanked all the while.

Thank goodness we had other VMs in place to maintain services.

However, what if that had been the DNS server or the IAM server? The chances might be low that both the primary and the failover land on the same underlying hypervisor, but is there any way to check for that? No, not really. How does one cover that scenario besides being reactive?

What if updates were being installed during something like that? The VM is toast and has to be rebuilt. We have this scripted with orchestration tools, but it still takes time.

Not everyone works with stateless/ephemeral data.

I guess the things I care about most are easy rollback of transactions in the event of failure, and a history log when working in a large, distributed, global team.
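
The history-log part is mostly covered by the transaction history, at least if the team is disciplined about it (the transaction ID below is just a placeholder):

```bash
# list past transactions with their IDs, dates, and actions
dnf history list

# show details for transaction 42: who ran it, the exact command line, and the packages touched
dnf history info 42
```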

ZFS snapshots
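
Something along these lines before patching, assuming the data you care about lives on a dataset like tank/data (names made up):

```bash
# take a cheap point-in-time snapshot before touching anything
zfs snapshot tank/data@pre-patch

# see what snapshots exist
zfs list -t snapshot

# if the patch run goes badly, roll the dataset back to the snapshot
zfs rollback tank/data@pre-patch
```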

Hey freq, I’ve been seeing some really funny issues and PRs on the zfs repo over the last month or so. Everything okay over there?

I’ve broken vendor software before by fully patching.

Used yum undo to roll it all back, then slowly installed patches until I found which ones broke it.

Had to exclude them for a while until there was a fix.
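
Roughly this, for anyone who hasn’t had to do it before (transaction ID and package name are placeholders):

```bash
# find the transaction that introduced the breakage
yum history list

# roll that one transaction back
yum history undo 42

# then keep the offending package out of future updates until the vendor ships a fix
yum update --exclude='brokenpkg*'
# or pin it persistently with an exclude=brokenpkg* line in /etc/yum.conf
```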

I’m pretty sure VMware ESXi has a feature to ensure critical backup systems are never on the same physical hardware at the same time. I can’t say for sure since I was not the one configuring it.

There are limits, though. I know the system treated our server blades as individual systems, even though if the power distribution board or the networking had failed, all the VM hosts would have gone down.

But then, that’s no different from a datacenter rack power failure or network switch failure, I suppose.

Edited: Found it, I think. VM-to-VM affinity rules (or anti-affinity). See: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.resmgmt.doc/GUID-94FCC204-115A-4918-9533-BFC588338ECB.html
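
On the “is there any way to check for that” question from earlier: if you have CLI access to vCenter, something like govc can at least show which physical host each VM is currently running on (VM names made up, and double-check the output format against your govc version):

```bash
# GOVC_URL / GOVC_USERNAME / GOVC_PASSWORD assumed to be set for your vCenter
# vm.info prints, among other things, the ESXi host each VM is currently on
govc vm.info dns01 dns02
```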

They pushed a not-ready-for-prime-time init system onto enterprise users, so yeah… (and no, systemd was not ready for prime time when it was first pushed into RHEL).

RedHat also has a long history of NIH-syndrome, so them building something new instead of working with an existing community shouldn’t exactly be surprising either. Once you have the money and market share to just supplant whatever already exists, then why not? It rather makes sense from a business perspective.

Every now and then there has to be a big change, or nothing ever changes.

As for enterprise customers and RHEL: if they’re that intent on avoiding change, they just never upgrade their RHEL version.

I know because as a programmer I was still having to build versions of software for RHEL 5 in 2018.

Yes, that’s how the enterprise “uses” start-ups. They get to try out the latest and greatest, and once a consensus appears to have been reached, or the kinks have been worked out, then the enterprise runs with it.

RHEL 5 is still in (extended) support even today, so as far as the enterprise is concerned it was still perfectly fine back then (depending on other factors, of course, like whether they were actually paying for said support, because if not… ouch). That support is ending soon, though, and any properly functioning enterprise will have migrated off it by the time support ends; if not, there will most likely be attention from upper management, and not the kind anyone wants.

I have two NAS servers in discrete locations, and I want some of my files synced between them automatically and securely. I’m currently using rsync and cron to do local backups, but I’m looking for the best way to do a comparable thing with two servers that are not on the same network. Any recommendations for the safest and most secure way to do this?
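
Right now the local job is basically just a cron entry wrapping rsync, something like this (paths and hostname made up). I could in principle point it at the second NAS over SSH, but I’m not sure that’s the smartest option:

```bash
# run from root's crontab (e.g. nightly); key-based SSH auth assumed
rsync -az --delete -e "ssh -i /root/.ssh/nas_sync" /tank/shared/ backup@nas2.example.com:/tank/shared/
```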

Syncthing maybe?
https://syncthing.net/
