ThatGuyB's rants

I appreciate the insight, I might pin your comment, but mere mortals can’t afford to lose 3 to 5 times the capacity for redundancy (that’s what, 33% usable capacity and 20% respectively? and that’s not counting hot spares) for our home labs or even for small productions, so we’re trying to squeeze all we can out of raidz. We are aware resilvering tanks the performance and might kill some drives during the process, so we try to keep the pool small (that’s why I never recommend people go above 11 drives per vdev in their own setups, and then just do striped vdevs).
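
To put numbers on the 33% and 20% figures, here is a minimal sketch of the capacity arithmetic. It assumes identical drives and ignores raidz padding, metadata and hot spares; the raidz2 level for the 6 and 11 drive examples is my assumption for illustration, and `usable_fraction` is a made-up helper name.

```python
def usable_fraction(drives: int, redundancy: int) -> float:
    """Fraction of raw capacity left for data in a single vdev.

    drives:     total drives in the vdev
    redundancy: drives' worth of capacity spent on redundancy
                (n - 1 for an n-way mirror, 1/2/3 for raidz1/2/3)
    """
    if not 0 <= redundancy < drives:
        raise ValueError("redundancy must be smaller than the drive count")
    return (drives - redundancy) / drives


# Hypothetical layouts: (total drives, drives' worth of redundancy)
layouts = {
    "3-way mirror": (3, 2),
    "5-way mirror": (5, 4),
    "6-wide raidz2": (6, 2),
    "11-wide raidz2": (11, 2),
}

for name, (drives, redundancy) in layouts.items():
    print(f"{name}: {usable_fraction(drives, redundancy):.0%} usable")
```

Striping several such vdevs does not change the fraction, it only multiplies the raw capacity.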

Sure, when it comes to important large production boxes, going balls to the walls with mirrors makes tons of sense, where impacted performance loses you money (like if people can’t click that “add to cart” button quickly and place the order).

For the previous company I worked for, working with small budgets and making do with LACP (and balance-alb when running out of LACP groups) on 2x gigabit ports, we never went above 6 drives in raid6 (I wanted to put the cards in HBA mode and use ZFS, but got outmatched 2 to 1 by my colleagues - if we went ZFS, I would still do a 6 drive vdev and stripe 3 of them). Our small production was pretty snappy despite the somewhat underpowered backend.

Not sure if working for such a large company as iX has allowed you to see the craziness that happens in the low budget departments. It’s a fun world in itself, but virtually all companies that cut corners will eventually find themselves in a situation where they lost money because of it, so next time around, they will go with a saner config (like a 3-way mirror). The ones that didn’t, have an IT manager who either saw it happen before, or lied about seeing it just to get approval for a higher budget to avoid it entirely (our department didn’t have an IT manager; we used to, but he left the company and we restructured under a CTO with a developer background instead, so all decisions were kinda democratic there).


At the time IX systems was about 65 people.

I was only at ix systems for 3 weeks for a support job. I learned a bit, and gave them some ideas that completely changed the way they diagnose and replace potentially bad hardware. I also wrote a shell script to automate a boring and error prone hand data analysis task. I was more concerned about stopping them from doing stupid shit than being friendly.

I was let go after I left in the middle of the day to get my girlfriend (now wife) to the hospital. I had followed procedure, but my boss didn’t and lost face when looking for me. She had a hernia, the doc said if I had gotten her there 2 hours later it would have become life threatening. Her disability (quad amputee) and vanity (she wasn’t put together, literally and figuratively) prevented her from just calling an ambulance.

Unlike most people, I read the SAS spec cover to cover when it was initially released. I talked with their CEO about some issues in the FreeBSD driver that did not match spec and were costing them a lot of money. I also talked to him about some of the reasons behind designing SAS, and ways they could leverage that to get the software that they needed. They talked to LSI, who came back to them a week later and trained the senior staff on a software package related to that (I was not in the meeting and did not know the details). Their daily shipping costs for replacement hardware were around $12k. If they got the tools that they needed to match what the SAS spec required, their replacement hardware costs should drop to 20% and become much more convenient to customers. I know that they stopped all shipments right after that meeting.

They were trying to get into markets like lawyer offices, doctor offices etc that did not have machine rooms. The hardware they supply is noisy. I suggested that instead of investing ever more money in vibration isolation of hard drives and fan profiles, they just buy some acoustic enclosures, and gave a few examples that dropped sound levels by anywhere from 30 dB (4U, under $500) to 56 dB ($7000, available in oak, beech, and teak). They said it was ridiculous, but now they are providing that very item. It completely solves the issue that they were encountering, and frees up 5 full time staff who wanted to work on other projects.

I was supposed to be getting trained, but the person who was training me was on a different shift, and we only overlapped by 2 hours a day. I noticed that much of the staff was spending the majority of their time hand analyzing the logs. Since I had several hours a day of idle time, I wrote them a 3k shell script which machine analyzed the logs and output an html file. During one of the daily meetings they brought up one of the logs I had analyzed, and announced that the solution was to replace the faulty drive. I asked about the other faulty drive and the questionable drive. They ran my script, quickly found the faulty drives, and decided to replace the questionable drive too. My script should have reduced the time needed to analyze logs from 6 hours per day per person to hopefully less than 2 (including phone calls and arranging shipping). In 3 days the support department went from using lots of overtime and being run ragged to being caught up and having idle time. As the most junior member of the staff, I was let go.

2 weeks later I had a 6 month contract at double the hourly rate.

Updates to FreeNAS that were on track to be released in 3 years were released in 7 months.


Have any experience with SCC (SCAP Compliance Checker) and making a custom benchmark (XML) or custom OVAL content for it?


I don’t, but I have xml experience.

If you can show examples of a source and destination file I can write something that does that.


I won’t hijack ThatGuyB’s thread, but I’ll @ you in another thread, maybe in my ‘rant’ thread haha.


By the way, when making a stripe of mirrors, you can start with dual drive mirrors, then add drives to existing mirrors if your need for read speed increases, or add more mirrored vdevs as you need more space. The reason for a 3 drive mirror per vdev is so that the pool remains online and high performance even as drives fail, get swapped out, and are resilvered. Also, during writes the ZFS server knows which vdevs are busy and directs writes to different vdevs. If it is a stripe of mirrors, the pool is forced to perform writes to a degraded vdev, which can get messy. I think a JBOD (just a bunch of disks) where each disk is a vdev is a better strategy. Also, one of the vdevs in a pool may get frequently accessed data while the others are less frequently accessed. It is possible to increase the performance of a single vdev by either adding an SSD device to that single vdev, or just adding more rotational drives to that vdev.

on a 3 drive vdev during resilver:
drive being used for reads
drive being used for source to resilver
blank drive being filled with data.

on a 3 drive vdev after resilver:
drive being used for reads and writes
drive being used for reads and writes
drive being used for reads and writes

on a 5 drive vdev during resilver:
drive being used for reads
drive being used for reads
drive being used for reads
drive being used for source to resilver
blank drive being filled with data.

Unfortunately if a drive is going to fail, it usually fails while it is being the source drive during a resilver, hence the usefulness of more than 3 drives per mirrored set.

on a 3 drive vdev during resilver and second drive fails:
drive being used for source to resilver
DEAD: drive being used for source to resilver - during resilver drive fails
blank drive being filled with data.
Notice that there are now no drives available for reads. The entire pool may go offline until the resilver is complete, which may take more than 5 hours.

on a 5 drive vdev during resilver and the source drive fails:
drive being used for reads
drive being used for reads
drive being used for source to resilver
DEAD: drive being used for source to resilver - during resilver drive fails
blank drive being filled with data.

You can see why it is worth it to spend more money on more independent drives if the data needs to be high availability.
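
The role tables above can be sketched as a small counting function. This only mirrors the one-role-per-drive model used in the tables (one blank target, one copy source, the rest serving reads); it is a simplification for illustration, not a description of how ZFS actually schedules I/O, and `mirror_roles` is a hypothetical name.

```python
def mirror_roles(width: int, failed_during_resilver: int = 0) -> dict[str, int]:
    """Count drives per role in one mirror vdev while a replacement resilvers.

    Simplified model: one blank replacement being filled, one healthy drive
    acting as the copy source, every other healthy drive serving reads.
    """
    if width < 2:
        raise ValueError("a mirror needs at least 2 drives")
    # Healthy drives = everything except the blank replacement and the dead ones.
    healthy = width - 1 - failed_during_resilver
    if healthy < 0:
        raise ValueError("more failures than drives")
    source = 1 if healthy >= 1 else 0
    return {
        "readers": healthy - source,
        "source": source,
        "dead": failed_during_resilver,
        "blank": 1,
    }


for width in (3, 5):
    for failures in (0, 1):
        print(f"{width}-way, {failures} failure(s):", mirror_roles(width, failures))
```

In this model the 3-way mirror drops to zero readers after one failure mid-resilver, while the 5-way mirror still has two.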

It is a good idea to have at least one of the drives in each mirrored array on an independent disk shelf.

Also remember that redundancy is not backup; backup needs to occur independently of redundancy. It is usually a good idea for the backup server to request data from the storage server instead of the storage server pushing data to the backup server. In case of ransomware, if all of the data on the storage server gets compromised, you don’t want the backup server’s data to also become compromised. Also in case of dedupe, if the dedupe table becomes larger than system memory, you don’t want the data on the backup server to become unavailable too.


I agree thus far, until this point.

I don’t think ZFS dedup is worth the hassle for budget stuff (which is basically my expertise), which is why I want to get into solutions like restic or potentially bacula. If you have lots of RAM (again, big enterprise customers) then maybe, dedup can save you TB of data if you have a hypervisor with, say, windows VMs in the upper double digits (even if you have 10 VMs you potentially save 40GB * 10, so 400 GB in one shot if all VMs run the same version of windows - and if deduped enough it can also give a speed boost since the same DLLs are cached in ARC and don’t need to be read again).

But generally my zfs pools run on low RAM (ix doesn’t even look at you if you run ZFS on less than 8GB of RAM, or at least that was the case a few years ago and I run ZFS on devices with 4GB of RAM and also run other things on top, like nfs and iscsi and it’s still fine). What was the recommendation for dedup, like 1GB per TB of total (not usable) capacity?
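
I don't have the official number at hand, so here is only a back-of-the-envelope sketch of how such a rule of thumb can be derived. Both defaults are assumptions and not something from this thread: roughly 320 bytes of RAM per dedup table entry (a commonly quoted ballpark) and a 64 KiB average block size. `ddt_ram_gib` is a hypothetical helper.

```python
def ddt_ram_gib(unique_data_tib: float,
                avg_block_kib: int = 64,
                bytes_per_entry: int = 320) -> float:
    """Rough RAM needed to hold the whole dedup table in memory, in GiB.

    unique_data_tib: deduplicated (unique) data stored in the pool, in TiB
    avg_block_kib:   assumed average block size in KiB
    bytes_per_entry: assumed in-memory size of one dedup table entry
    """
    # One table entry per unique block.
    unique_blocks = unique_data_tib * 2**40 / (avg_block_kib * 1024)
    return unique_blocks * bytes_per_entry / 2**30


for tib in (1, 4, 16):
    print(f"{tib} TiB unique data -> ~{ddt_ram_gib(tib):.1f} GiB of RAM for the table")
```

Smaller blocks mean more entries per TiB, so VM images on small volblocksizes would push the estimate up quickly.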

And I want to rant about RAM. Why is RAM so dang expensive on consumer hardware compared to stuff like CPU? You pay $250 on 64GB of RAM and $200 on a CPU. It’s ridiculous. So is SSD storage if you want to go over 2TB per drive, even if you go with bare basic 2.5" drives, let alone nvme. You get to 8tb qlc stuff and suddenly you are paying $900 for a storageless build and $1000 on 4 drives !!! ..... ?!?

In the enterprise, it makes more sense. You spend like $2000 for RAM and 8 to $10k on a CPU. Consumer memory is too darn expensive. So is flash storage. And I know if you run gigabit you can just use HDDs, but for the energy saving, it makes tons of sense to get ssds. I’m planning a new build and I’m salty about how much I need to spend on storage (and I already need to spend more on some spinning rust for a dedicated backup server - I currently have copies of my important data on my main pc and on my NAS as a backup, but I want to move everything to the NAS and have a separate backup server).

I can afford it, but I’m not made of gold. I’d rather not spend as much money if I can help it, but I don’t trust the used market, unless what I buy is so cheap that it doesn’t matter if it dies or not.

I’m getting tired and I feel like I’m losing focus on the above every two sentences or so, so I’m going to stop here.


In other news, my threadripper system worked flawlessly yesterday, but today booting appeared to disable the network card (I suspect a kernel crash). Tried booting into an older kernel, still failed. Can’t figure it out and I’m too lazy to reflash hrmpf on a usb stick to see if there’s anything to fix, like the vfio script or dracut. And without a dgpu to troubleshoot (since my gpu is passed through, although when I had 2 gpus, both had the driver blacklisted and I didn’t have a tty before either, but never had this problem). I’ll probably buy a used 710 or something that sips power as a troubleshooting GPU (interestingly, can’t find any gt 1010 gpus around, which should be the cutdown version of the 1030, neither new nor used - people who sell 1030s for $70 must be crazy).


ZFS dedupe is crazy dangerous. IXSystems had a client who backed virtual machines on the zfs server with dedupe. They had over 60 to 1 compression until they filled up their ram with their dedupe table. The problem was that the computer was already the most high end computer on the market. They had to wait 5 months for intel to release a new server line that could hold more memory before they could read any data from that pool.

Have you seen the epyc 8004?
AMD is making an EPYC mini. It is 1/2 an EPYC using cut-down EPYC controller chips: 1P, with only up to 6 RAM channels and 96 PCIe lanes. The CPUs start at $409 for an 8 core, or $639 for a 16 core.

I am currently running the epyc 9124 which is the cheapest cpu which would light up that motherboard. For an extra $800 system cost I don’t have to worry about running out of ram channels, pcie channels, or sata ports, and I get ECC.


You do take regular snapshots right? Can’t you just roll the system back a day or so?


There was no update on the system, I only powered on to launch a windows VM, then powered it off. The change is not in software, but in hardware. I removed 2 pcie cards (a gpu and a usb controller). I probably got lucky on the first 2 bootups (one when I had side panels still open, a second time to test it is ok).

And no snapshots are taken on the threadripper, it’s a playbox that has no important data on it and storage is limited. I will troubleshoot it when I feel like it.
:man_shrugging:


This was among the most retarded things I’ve done in a while. All because of this:

@MikeGrok yes, a ZFS snapshot would have definitely saved me there, but I wouldn’t know the root cause. I was dumb and added a fstab entry without noauto for an iscsi lun that only shows up on-demand (whenever I run the iscsi login script).
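
For what it's worth, this kind of mistake can be caught with a tiny check before rebooting. A minimal sketch, assuming you keep a list of mount points that only exist on demand; the sample fstab, the wwn names and the `/mnt/lun*` paths are made up for illustration, and only the `noauto` option is checked.

```python
def mounts_missing_noauto(fstab_text: str, on_demand: set[str]) -> list[str]:
    """Return on-demand mount points whose fstab entry lacks 'noauto'."""
    flagged = []
    for raw in fstab_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        if len(fields) < 4:
            continue  # options column missing; skipped in this sketch
        mountpoint, options = fields[1], fields[3].split(",")
        if mountpoint in on_demand and "noauto" not in options:
            flagged.append(mountpoint)
    return flagged


# Hypothetical fstab contents, not the real file from this post.
SAMPLE_FSTAB = """\
# <device>  <mountpoint>  <type>  <options>  <dump>  <pass>
/dev/vg0/root                           /          ext4  defaults         0 1
/dev/disk/by-id/wwn-0x0000000000000000  /mnt/lun0  ext4  defaults         0 0
/dev/disk/by-id/wwn-0x0000000000000001  /mnt/lun1  ext4  defaults,noauto  0 0
"""

print(mounts_missing_noauto(SAMPLE_FSTAB, {"/mnt/lun0", "/mnt/lun1"}))
```

Pointing it at the real file would just be a matter of reading `/etc/fstab` and passing its text in.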

It took me like 5 minutes to find, it took longer to write a hrmpf iso to a usb (I should really make myself a ventoy usb and just slap the ISOs there, I used to use easy2boot ages ago, but I’d prefer open source if I can help it).

I obviously couldn’t find a log where mount of some mountpoints fail, because the system fails to boot properly, but I found in /var/log/socklog/kernel/current (which is how socklogd / svlogd logs stuff) an entry about iscsi not automatically logging in and then a bridge port going down, which made me look into fstab.

I didn’t even realize I edited that, that event was completely erased from my memory, I would still have mounted iscsi manually (although I might’ve used the wwn, as I remembered I got that working, so I don’t have to guess by the disk size). I kinda wish I maintained a changelog.


These past couple of days I’ve been messing with nixos microvm.nix. Seems promising if I’m planning to only run nixos. Probably something to look into in the future (or maybe sooner).

I’ve also messed with opennebula and the firecracker node, but I can’t figure out how this thing actually works. The weirdest part is that I’m getting stuck in a network message that the (to be) instantiated VM can’t get an IP / MAC lease from vmbr0. Which doesn’t make any sense, because it works just fine for the host. Setting this up is really not easy, unless you use the silly demo version miniONE (which runs everything on a single host).

I guess I’ll need to get the kvm host to work flawlessly (haven’t even attempted to test it), then troubleshoot firecracker. This shouldn’t be necessary, but I think I might be missing a step (probably NIC settings related and maybe storage backend).


Today I want to rant about USB. OMG, USB can suck @$$ sometimes. Here’s the deal. I’ve got an old Windows x86 tablet with a single micro-USB 2.0 port that’s OTG capable, also used for charging. My original setup with this was:

  • micro-USB B cable from tablet going to;
  • usb Y splitter, with an A female port and 2 A male ports (it was used to power an ancient 3G USB SIM modem to connect to the internet in the WinXP era and that thing needed more power than the mere .5A the standard USB A could provide);
  • 1x male A goes to a power brick;
  • 1x male A goes to a USB A female-to-female adapter;
  • inside the other female A port, plug in a USB stick, a keyboard, or a USB hub, powered or unpowered;

This setup works, but the tablet still gets discharged (albeit slower) when using a USB device, even with a stinking powered USB hub.

How to solve this? Well, buy an el-cheapo micro-USB OTG Y splitter: 1x micro-B female, 1x A female, 1x micro-B male. No matter the order you connect things to it, tablet does not charge at all, but it can see the devices attached. Plug this dinky adapter to an old android phone. Boom, works perfectly and charges.

Adapter failed. What other options do we have? Well, there’s USB type-C hubs that also have PD passthrough. Buy 2, to make sure (different brands and models). Originally used a charging adapter type-c female to micro-b male. With this one, the tablet neither showed devices nor charged. Got another micro-b adapter (different brand), still nothing. Test it with my newer(-ish) phone, works flawlessly and can connect usb sticks to it while charging.

Then I tried the hub with a type-c female to A male. Setup used micro-b cable to double A female to mc2fA to the hub. With the hub plugged to power, nothing can be detected. Without wall power, devices get recognized.

Ok, fine. Try using another adapter specifically soldered for this particular scenario ages ago (but had to discard the 5V brick and cut the wire). This one is kind of a Y-in splitter for data and power. Got female A to a male A that only passes data + another cable that used to lead to a 5V 5A brick (hard wired).

Guess what happens next.

  • micro-b male to female A cable;
  • Y splitter fA->2xmA;
  • 1x end to power brick;
  • other mA end to f2f adapter;
  • data usb A male cable from the other Y p&d (power and data) splitter coming in the f2f adapter;
  • usb A normal powered hub in;

No devices recognized, despite getting the power. It used to work when the p&d splitter had power coming in (well, it served more as a powered hub itself, it has 2 usb fA ports, but ignore that). It seems like a bare basic powered USB A hub doesn’t work if it doesn’t receive power from the USB data port. It works without wall power when plugged to a normal USB port (or through an OTG adapter), but when plugged to wall, but without power from USB, no devices get recognized on the PC (or tablet in this case).

Replace the powered A hub with the type-c hub that has USB PD (Power Delivery) passthrough. And this thing works. But I am now left with a lot of very jank adapters to make this work. Adapter-ception I’d say.

This is not portable at all, it’s a wire mess. It’s not a fire-hazard thankfully (as everything is soldered or tightly coupled), but I need a brick with 2 ports (A and C females). I might be able to do something like a “black-box cube” into which to throw all the adapters, but I’d still be left with a brick that has 2 cables coming out (one to the C PD hub, the other to the black-box), to finally combine into one cable going to the tablet.

Both C PD hubs that I ordered kinda work, but one is smaller than the other (and despite that, also has an audio out that doesn’t work on linux, lmao, so I’m using a FOSS USB audio out from thinkpenguin).


There’s no moral of the story (besides USB sucking), but a few big questions are left. Why does the powered USB A hub fail when connected only via data to a host, while the type-C “powered hub” (the one with USB C PD passthrough) works? And why didn’t the micro-b OTG splitter work for the tablet, when it worked for the phone (both are supposed to be OTG capable)?


This is why usb over wifi is a thing.


Just started using restic. I want to comment on the above quote though. With zfs snaps, I technically didn’t need deduplication, as all the data was there and what changed just gets modified.

I don’t modify files that are gigabytes in size. In classic rsync / tar backup methods, files that change from 34MB to 35MB now use 69MB (nice intended). But because of how copy-on-write works (or at least AFAIK), if just 1MB changed, the original blocks stay where they are, the changed data gets written to new blocks, and the snapshot only consumes 1MB, not 69.

Restic on the other hand doesn’t really use “snapshots” (although it uses this particular terminology for “backups at a point in time”), but has deduplication, meaning that, in a similar fashion, if the blocks of a file are the same for most of it, only the changed data will show up in the repo, but the file won’t be copied over. Not sure if backups are really sped up this way, but it sure saves on space.
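
To illustrate the idea, here is a toy sketch of block-level dedup for the 34 → 35 "MB" case, scaled down to 4 KiB blocks so it runs instantly. It uses fixed-size blocks and an in-memory dict as the "repo", which is a simplification I'm assuming for the example; restic itself uses content-defined chunking, so this is not how restic is implemented.

```python
import hashlib

BLOCK_SIZE = 4096  # scaled-down stand-in for "1MB"


def backup(data: bytes, repo: dict[str, bytes]) -> tuple[list[str], int]:
    """Store data in repo; return (list of block hashes, bytes newly added)."""
    snapshot, added = [], 0
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digest = hashlib.sha256(block).hexdigest()
        if digest not in repo:  # only blocks never seen before cost space
            repo[digest] = block
            added += len(block)
        snapshot.append(digest)
    return snapshot, added


def restore(snapshot: list[str], repo: dict[str, bytes]) -> bytes:
    """Rebuild the original bytes from a snapshot's block hashes."""
    return b"".join(repo[digest] for digest in snapshot)


# Version 1 has 34 distinct blocks; version 2 appends one more block.
v1 = b"".join(bytes([i]) * BLOCK_SIZE for i in range(34))
v2 = v1 + bytes([34]) * BLOCK_SIZE

repo: dict[str, bytes] = {}
snap1, added1 = backup(v1, repo)
snap2, added2 = backup(v2, repo)
print(f"first backup added {added1 // BLOCK_SIZE} blocks, second added {added2 // BLOCK_SIZE}")
```

The catch with fixed-size blocks is that inserting data in the middle of a file shifts every block after it, which is the problem content-defined chunking is meant to avoid.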

The reason I decided to go with restic and move away from rsync + zfs snapshots is because I wanted encrypted backups and was hoping for deduplication. I’m not enabling encryption and dedup on ZFS (I have my portable PC ZFS encrypted, but unlike that, my server should be able to reboot headless without waiting for a password prompt and I ain’t putting no password files in plain text on my server).

I do have ZFS compression on a dataset (went for insane zstd-19, 'cuz backup server), but the CPU gets absolutely tanked (poor amlogic s905x3). Apparently restic does support compression, so I guess I might as well just move the data to an uncompressed dataset and only have the backup server serve as NFS and be done with it (I know, push is not the best when it comes to security, but I’m hoping I can still leverage some zfs snapshots to prevent the kind of ransomware attacks that would normally destroy all data on a backup system).

With each host compressing its own data, then the backup server can just mostly idle (I hope). The fan on it wasn’t happy either (although it wasn’t going full blast, but just bursting when it needed some cooling).


Just want to quickly rant about people on the forum and IRL. There’s so many people that just see or hear something and are very quick to recommend something. But when I see that, I’m like “I don’t have enough information.”

Why? Maybe it’s part of my early childhood training on CCNA certification (like the most basic pleb-tier), where I got to always ask questions about the actual requirements.

I just feel like people will start recommending 8 core monsters to people that need to run only a nextcloud instance for 2 people, which can be done on a SBC. I don’t just feel it, similar scenarios happen all the time and more often than I would like.

Is there some kind of bias to recommend people what you’d want to buy, like some kinda projection? I find it really weird that people would just throw the kitchen sink at a problem, instead of trying to find a solution most appropriate to the problem. Well, the former takes less effort in a way.


Upgraded FreeBSD on my NAS to 14. I had more than 20 failed attempts, because of an unstable internet connection. And for some reason, freebsd doesn’t just cache and checksum what it downloads in the previous step, which is wild. At some point I was almost willing to plug it into my hotspot to see if I can upgrade it that way, but eventually I got a stable connection for long enough to finish upgrading.

Reminder that my WAN is a wifi connection. Sometimes the wifi drops packets (drops like wild, had like 69% failure in mtr to the gateway). And I can’t figure out for the life of me, why is it so unstable. It wasn’t like that for a couple of days. I can’t hear any appliances running, like a fridge or microwave, but it’s horrible. To be fair, the wifi is on the 2nd floor and my lab is on the 3rd and there’s quite some distance.

At some point I was wondering if it’s just the wifi implementation on Linux and if I should just do the jump to a BSD, or if it’s just the connection (TBH, I haven’t yet troubleshot the wifi, I haven’t enabled logging on the wpa_supplicant service). The plan was (and still kinda is) to switch to OpenBSD or FreeBSD (I’m biased towards the first for routers and firewalls and second for NAS’es), but when the setup works, it’s hard to justify tinkering (particularly on a router which I need to look stuff up, although I could do it on my phone, but :face_vomiting:).

In other news, an AP I had since November just died a few days ago after a power outage (it wasn’t on my UPS). It was a belkin ax3200 running openwrt. I was so happy with it though, but if the hardware is so crap that it can’t survive a literal turn off and turn back on again, then so be it. I did read a lot of bad reviews of belkin and linksys routers, but I didn’t believe they were that bad hardware wise.

I’m now looking for another wifi 6 AP that can run openwrt. I think I might go for the ubiquiti lite or LR stuff (I just read that the unifi 6 lite doesn’t support wifi 6 on 2.4ghz). But I don’t want snapshot stuff, I want something fully supported by openwrt. Apparently Banana Pi BPI-R3 is a supported router. And it looks like there’s a mini version with 2x 2.5gbps ports and wifi 6 (although it looks like it’s $125 for a bundle on amazon, eeww at the price and eeww amazon). And Netgear WAX202 is $100. Then, there’s also Asus RT-AX1800U / RT-AX53U (haven’t looked at prices yet, but I know these are old, might not be able to find them anymore).

I might just need to swallow it and buy something good (the belkin was $50 and it wasn’t even on offer - I wish I bought like 3 of them on black friday, but now I’m not even going to attempt to buy another one, despite its cheap price). That’ll put me at a very steep $300 to $450 for 3 routers. Uuuughhhh...... don’t wanna think about it. A pine64 a64+ is like $40, with about $15 for the wifi module, but it’s only wifi 5. Technically, there isn’t much benefit to wifi 6 (other than also running on 2.4ghz, but devices need to support it, otherwise they just run on wifi 4, but still good for longer range on newer devices). I’d also need a case (although I can just build one from scrap plastics I have around, with some standoff screws, just like I did my rockpro64 router).

The belkin was only a test sample, so to speak, I wanted to replace the current setup made of some junk netgear nighthawk something that don’t even allow manual DHCP IP assignments, for something that runs openwrt. But I didn’t want to buy 3 and realize that I can’t flash openwrt, so I only got 1 that only worked for less than 3 months.

IDK how the nighthawks would behave if I just slap them in bridge mode and use my own router for DHCP and DNS when it comes to the wifi switch-over (there’s 2 APs, one’s in “master mode” and another is piggy-backing it - I’d guess the wifi AP switching would be negotiated at the layer 2 depending on the signal, but I’ve no clue if something will break on them if they just become dumb APs, technically the slave one is just a dumb AP rn). I wish the CCNA had more technical details of how 802.11 standard works and not just focused on wifi security.

I’ll have to think really deep about this. Technically, I still need at least 1 AP, although 2 would be nice, for my own network behind my router (one for VPN’ed clients and one for direct connected ones). Right now, all my VPN’ed clients are behind the router and what isn’t is connected straight to the AP.

I realized in one of my posts and because of what I have to deal with at work, that people don’t know how to run monitoring software, or even what that is. It’s insanity.

Allowing your root fs to go to 100% (or even just /var/log) is unacceptable. You should get notifications when the storage is getting full and work on fixing it. Delete files, rotate logs, add storage, increase logical volume, anything! Sure, sometimes people do big OOPSIEs, like transferring an ISO to a folder that is part of root (like /home - linux sane configurations are another story) and filling it up before you have time to react, but that’s fine.

No, what I’m talking about is people going on for months with 99% utilization and one day waking up with their servers unresponsive, or some commands can’t run, or some processes crashed, because nobody checks some basic stats. It’s insane to check every server daily or even once a month if you have at least 10 (VMs or otherwise). But that’s all the more reason to have something that alerts you when something goes wrong.
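
Even something this small would catch the "99% for months" case. A minimal sketch using only the Python standard library; the 80% / 90% thresholds and the `check_usage` name are arbitrary choices of mine, and actually delivering the alert (mail, chat, a proper monitoring stack) is left out.

```python
import shutil


def check_usage(paths, warn_pct=80.0, crit_pct=90.0, usage=shutil.disk_usage):
    """Return (level, path, percent_used) for every path at or above warn_pct.

    'usage' defaults to shutil.disk_usage and can be swapped out for testing.
    """
    alerts = []
    for path in paths:
        total, used, _free = usage(path)
        if total == 0:
            continue  # pseudo-filesystems can report zero size
        pct = used * 100 / total
        if pct >= crit_pct:
            alerts.append(("CRITICAL", path, pct))
        elif pct >= warn_pct:
            alerts.append(("WARNING", path, pct))
    return alerts


# Paths to watch are an example; add /var/log, /home, etc. as needed.
for level, path, pct in check_usage(["/"]):
    print(f"{level}: {path} is {pct:.0f}% full")
```

Run from cron or a timer, it stays silent until a filesystem crosses a threshold, which is the whole point.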

Just like people don’t run backups, people also don’t run alerting software. This should change. And I can understand people with a homelab not having one (although they should), but when people with full-on prod don’t do it, something’s gone wrong with our information sharing system.

We need more guides on how people need to run a data center. I should pick up the slack and do something simple. There’s a lot of small timers going into the DC who have no clue what they’re doing (or have just enough knowledge to be dangerous). The people that run today’s homelabs are running tomorrow’s data centers. And given how awfully most of the internet’s infrastructure is run, no wonder we have so many problems to deal with. This needs to change.

I have a very strong feeling about homelabbers. In the enterprise, you’re not allowed to change things for the better, because downtime causes loss of profit, blah blah blah, you know the drill. But in the homelab, where you are supposed to learn and break stuff, I believe people should be running their homelabs as an example of how the enterprise should be running. But that’s clearly a very amateur job in most labs.

Using unreliable hardware and using smart software stacks to increase reliability is something that has been done in the enterprise since eons ago, but it hasn’t trickled down to the lab, because people don’t mind losing uptime, compared to the price of using multiple machines. However, I’d argue that multiple machines, if configured appropriately, are better than just one big box even for small time labbers.

I really need to start working on my lab’s software stack and do some good for humanity and show “how it’s done.” Of course, if someone else does it faster than me, that’s fine too, but if nobody does it, I will.

I’ll probably need to start a website “how to run a DC” or something. I don’t want people to be setting things wrong and then cry about it later. I also don’t want people to set things up and forget, without a proper watchdog covering their back and without a proper “plan B” (backups).

I feel personally ashamed that it’s the current year and people still don’t follow best bare minimum practices…


The D-Link MediaTek Filogic based boxes seem like a rather good option unless you need a tri-radio solution, and they are decently priced too, with OpenWrt support?

IDK how to look for that. The owrt page only shows the devices themselves.

That is how I found banana pi and rockpro64. I was thinking between Asus, Unifi, or going with the expensive banana pi bpi-r3 mini. Technically, SBCs would have a longer lifetime (given their specs), so I assume I can run it for longer, compared to routers that might get left behind by ever increasing storage and RAM requirements.

I don’t think I’m entirely opposed to d-link and tp-link stuff, but I need some time to think about the local network. I might be able to get away with just a dedicated router / dns / dhcp combo (like my own rockpro64, but probably something smaller, like a nanopi r4s, which is supported by openbsd) that can serve as the main head of the network and then use the netgear APs we have in bridge mode. I need to test that, before I switch the entire network.

TBH, I don’t like the netgear nighthawks we have because they keep dropping for some reason (at least that was my experience with them). But maybe that’s a small price to pay.