Sysadmin Mega Thread

Even compared to yum, it's slower.

Yeah but speed isn’t a focus. Reliability is. Especially since most of the time updates are handled automatically with a satellite server.

Granted, we don’t have that, so it takes me a while to patch, but who cares, I’m still getting paid.

Well I have never known yum to be unreliable. And it doesn’t seem to actually improve anything.

Which other package managers allow rolling back specific transactions, or updating only the packages covered by a defined security advisory, filtered by severity level?
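
For what it’s worth, this is roughly what that looks like with dnf (the advisory ID below is made up, substitute one from your own errata feed):

```bash
# list the security advisories that apply to this host
dnf updateinfo list --security

# apply only the packages covered by one specific advisory
dnf upgrade --advisory=RHSA-2023:1234

# or apply everything at or above a given severity
dnf upgrade --sec-severity=Critical

# and if a transaction goes sideways, roll just that transaction back
dnf history undo last
```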

With uptime as a priority, we need more out of a package manager than just update or install and watch it go brrrrrr.

The only other one I know of with a similar feature set is zypper.

Nice if you don’t trust the maintainers to provide a coherent system, or if you’re writing shit proprietary code.

But really, who uses that?

Your system is either up to date or it’s not. Not sure why you’d want to selectively update something.

Why should a DNS server get feature updates to packages when all it is doing is serving DNS? If new package foo-1.1.0 requires a reboot to take effect, but I know I don’t need the latest and greatest and can get by with the backported patch foo-1.0.1 (which won’t require a reboot), then I’d prefer the patch, because it means less change.
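
As far as I understand it, this is exactly what dnf’s upgrade-minimal is for: it pulls in the lowest package version that still carries the fix instead of jumping to the newest feature release. Something like:

```bash
# upgrade only to the minimal package versions that resolve outstanding security advisories
dnf upgrade-minimal --security
```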

Because a new version could fix a bug, exploit or other flaw. A feature update doesn’t include only new features.

This idea of backporting bug and security fixes is the result of lazy software development and lazy sysadmin work.

This is what HA and failover are for. Well, they’re designed for unplanned failures, but using them for planned ones shouldn’t be a problem if everything works properly.

I don’t think it’s laziness; it’s just two different philosophies.

There is nothing lazy about wanting to keep a service running.

Let me give you an example of something that recently happened.

We had a production database host, which serviced around 50 clients. Replication and all that jazz.

The underlying hypervisor in the IaaS went down, and the failure mitigations moved the VM to a different host. Performance tanked all the while.

Thank goodness we had other VMs in place to maintain services.

However, what if that had been the DNS server or the IAM server? The chances might be low that both the primary and the failover land on the same underlying hypervisor, but is there any way to check for that? No, not really. How does one cover that scenario besides being reactive?

What if updates were being installed during something like that? The VM is toast and has to be rebuilt. We have this scripted with orchestration tools, but it still takes time.

Not everyone works with stateless/ephemeral data.

I guess the things I care about most are easy rollback of transactions in the event of failure, and a history log when working in a large, distributed, global team.
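
The history-log part is mostly covered by the transaction history, at least if the team is disciplined about it (the transaction ID below is just a placeholder):

```bash
# list past transactions with their IDs, dates, and actions
dnf history list

# show details for transaction 42: who ran it, the exact command line, and the packages touched
dnf history info 42
```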

ZFS snapshots
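
Something along these lines before patching, assuming the data you care about lives on a dataset like tank/data (names made up):

```bash
# take a cheap point-in-time snapshot before touching anything
zfs snapshot tank/data@pre-patch

# see what snapshots exist
zfs list -t snapshot

# if the patch run goes badly, roll the dataset back to the snapshot
zfs rollback tank/data@pre-patch
```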

Hey freq, I’ve been seeing some really funny issues and PRs on the zfs repo over the last month or so. Everything okay over there?

I’ve broken vendor software before by fully patching.

Used yum undo to roll it all back, then slowly installed patches until I found which ones broke it.

Had to exclude them for a while until there was a fix.
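
Roughly this, for anyone who hasn’t had to do it before (transaction ID and package name are placeholders):

```bash
# find the transaction that introduced the breakage
yum history list

# roll that one transaction back
yum history undo 42

# then keep the offending package out of future updates until the vendor ships a fix
yum update --exclude='brokenpkg*'
# or pin it persistently with an exclude=brokenpkg* line in /etc/yum.conf
```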

I’m pretty sure VMware ESXi has a feature to ensure critical backup systems are never on the same physical hardware at the same time. I can’t say for sure since I was not the one configuring it.

There are limits, though. I know the system treated our server blades as individual systems, even though if the power distribution board or the networking had failed, all the VM hosts would have gone down.

But then, that’s no different from a datacenter rack power failure or network switch failure, I suppose.

Edited: Found it, I think. VM-to-VM affinity rules (or anti-affinity). See: https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.resmgmt.doc/GUID-94FCC204-115A-4918-9533-BFC588338ECB.html
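
On the “is there any way to check for that” question from earlier: if you have CLI access to vCenter, something like govc can at least show which physical host each VM is currently running on (VM names made up, and double-check the output format against your govc version):

```bash
# GOVC_URL / GOVC_USERNAME / GOVC_PASSWORD assumed to be set for your vCenter
# vm.info prints, among other things, the ESXi host each VM is currently on
govc vm.info dns01 dns02
```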

They pushed a not-ready-for-prime-time init system onto enterprise users, so yeah… (and no, systemd was not ready for prime time when it was first pushed into RHEL).

RedHat also has a long history of NIH-syndrome, so them building something new instead of working with an existing community shouldn’t exactly be surprising either. Once you have the money and market share to just supplant whatever already exists, then why not? It rather makes sense from a business perspective.

Every now and then there has to be a big change, or nothing ever changes.

As for enterprise customers and RHEL: if they’re that intent on avoiding change, they just never upgrade their RHEL version.

I know because as a programmer I was still having to build versions of software for RHEL 5 in 2018.

Yes, that’s how the enterprise “uses” start-ups. They get to try out the latest and greatest, and once a consensus appears to have been reached, or the kinks have been worked out, then the enterprise runs with it.

RHEL 5 is still in (extended) support even today, so as far as the enterprise is concerned it was still perfectly fine back then (depending on other factors, of course, like whether they were actually paying for said support, because if not… ouch). That support is ending soon, though, and any properly functioning enterprise will have migrated off it by the time support ends; if not, there will most likely be attention from upper management, and not the kind anyone wants.

I have two NAS servers in discrete locations, and I want some of my files synced between them automatically and securely. I’m currently using rsync and cron to do local backups, but I’m looking for the best way to do a comparable thing with two servers that are not on the same network. Any recommendations for the safest and most secure way to do this?
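
Right now the local job is basically just a cron entry wrapping rsync, something like this (paths and hostname made up). I could in principle point it at the second NAS over SSH, but I’m not sure that’s the smartest option:

```bash
# run from root's crontab (e.g. nightly); key-based SSH auth assumed
rsync -az --delete -e "ssh -i /root/.ssh/nas_sync" /tank/shared/ backup@nas2.example.com:/tank/shared/
```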

Syncthing maybe?
https://syncthing.net/
