^^ I'm so being triggered by this.
Maybe I shouldn’t post this because I’d look stupid, but the internet is full of stupid and ad hominem, so here it goes.
The purpose of any troubleshooting is to help discover a problem: not “the” problem, not “all problems”, and certainly not to just fix the issue or issues. If you ran a bunch of things and don’t understand where things went wrong in the first place, don’t write the issue (or issues) off as fixed. Troubleshooting checklists are tools; their success or failure rates, however you choose to count them, are useless metrics.
If you ever watched House MD (apologies in advance to medical professionals worldwide), the thing where they semi-randomly pump a patient full of drugs, spray through a bunch of different treatments and do DDX (“differential diagnosis”) analyses: that’s what most troubleshooting is. Ideally, you learn something new after each step or action, or merely through observation and the passage of time, and use logic, reason and light statistical/probability analysis to figure out what’s going on and how best to get the desired outcome. It’s kind of like the scientific method: theorise, explore, prove or disprove partially or fully, hedge bets on further research, rinse and repeat.
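To make the “light statistical/probability analysis” bit concrete, here’s a minimal sketch of a differential-diagnosis style update in Python. The hypotheses, the priors and the likelihood numbers are all made up for illustration; the only real thing in it is Bayes’ rule (posterior is proportional to prior times likelihood).

```python
def update(beliefs, likelihoods):
    """One Bayes update: posterior is proportional to prior * P(observation | hypothesis)."""
    unnormalised = {h: beliefs[h] * likelihoods[h] for h in beliefs}
    total = sum(unnormalised.values())
    if total == 0:
        raise ValueError("observation is impossible under every hypothesis")
    return {h: p / total for h, p in unnormalised.items()}


# Hypothetical starting guesses for "users can't reach the intranet site".
beliefs = {"bad_cable": 0.5, "dns": 0.3, "firewall_rule": 0.2}

# Observation: pinging the server by IP address works.
# Made-up estimates of how likely that is under each hypothesis.
ping_by_ip_works = {"bad_cable": 0.05, "dns": 0.9, "firewall_rule": 0.6}

beliefs = update(beliefs, ping_by_ip_works)

# Print hypotheses from most to least likely after the observation.
for hypothesis, p in sorted(beliefs.items(), key=lambda kv: -kv[1]):
    print(f"{hypothesis}: {p:.2f}")
```

Nobody actually writes the numbers down while troubleshooting, but this is roughly what happens in your head: each observation shifts weight between theories, and you chase whichever theory is currently on top.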
With any complex computer system, sometimes you do get down to the fine details, sometimes you reboot everything daily (hi Windows Server), sometimes you add permanent monitoring and instrumentation (e.g. health checks and control planes for machines, or monitoring/observability for humans), and sometimes you raise your arms, say “not worth my time and effort and money” and engineer your way around the problem (use RAID storage, replicate a database, do backups, send a backend request twice and use whichever response comes first, do forward error correction, do TCP retransmits and so on).
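As an example of engineering around a problem instead of diagnosing it, here’s a rough sketch of the “send the request twice, take whichever answer comes first” trick using Python’s asyncio. `fake_backend`, `first_response` and the delays are hypothetical stand-ins, not any real service or library API.

```python
import asyncio


async def first_response(make_request, copies=2):
    """Fire `copies` identical requests and return whichever finishes first."""
    tasks = [asyncio.ensure_future(make_request(i)) for i in range(copies)]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # Cancel the slower copies and wait for the cancellations to settle.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    # Note: if the fastest copy failed, its exception is re-raised here.
    return done.pop().result()


# Hypothetical backend: replica 0 is having a slow day, replica 1 is healthy.
DELAYS = [0.2, 0.01]


async def fake_backend(replica):
    await asyncio.sleep(DELAYS[replica])
    return f"replica-{replica}"


result = asyncio.run(first_response(fake_backend))
print(result)
```

You pay for it with extra load on the backends, and it only makes sense for requests that are safe to run twice, but you never had to find out why replica 0 was slow.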
Sometimes you give up, admit defeat, and decide to live with the issue because you want to go on with your life.
If you really want to nerd out on reliability and are looking for a good reason to keep the toilet seat warm, read up on STPA (System-Theoretic Process Analysis) and predicting failures.
/rant
… now to practical considerations…
pfSense isn’t that complex to learn; given your Cisco networking background you’ll be fine. Cisco has some specific terminology, and so does pfSense. A lot can be done through the web UI. As long as you’re not afraid of tcpdump or Wireshark or reading the logs once in a while, you’ll be fine (basically standard troubleshooting applies, and you most likely won’t need it at all).
Underneath, it’s just a software router, and it’s (packet_in, run_code, packet_out) … you’ll do fine.
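If it helps to picture the (packet_in, run_code, packet_out) loop, here’s a toy version in Python. This is not pfSense’s code or config format: the `Rule` class, the `run_code` function, the first-match-wins evaluation and the default block are all assumptions of this sketch.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass
class Rule:
    action: str               # "pass" or "block"
    src: str                  # source network in CIDR notation
    dst_port: Optional[int]   # None matches any destination port


def run_code(packet, rules):
    """Return the packet if a rule passes it, or None if it gets dropped."""
    for rule in rules:
        src_matches = ip_address(packet["src"]) in ip_network(rule.src)
        port_matches = rule.dst_port in (None, packet["dst_port"])
        if src_matches and port_matches:
            # First matching rule decides (an assumption of this toy).
            return packet if rule.action == "pass" else None
    return None  # nothing matched: default block


# Hypothetical rule set: block one misbehaving host, allow HTTPS from the LAN.
rules = [
    Rule("block", "10.0.0.66/32", None),
    Rule("pass", "10.0.0.0/24", 443),
]

packet_in = {"src": "10.0.0.5", "dst_port": 443}
packet_out = run_code(packet_in, rules)
print(packet_out)
```

Everything else (NAT, VPNs, traffic shaping) is more code in the middle step, but the mental model stays the same.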
A good thing about pfSense is that it’s popular: lots of people use it and know the usual ways deployment scenarios like yours are handled. It’s a good choice for companies because finding support is easy (if you leave the company or go on vacation, they can find a vendor to fix things up).
Similar with Ubiquiti: say what you want about clicking through Java-served web UIs when configuring PoE, their hardware is not complete ‘s**t’ even though most of it is made in the same company as TP-Link… and lots of people know it, so it’s easy to get or swap out, and easy to find people who work with it daily to maintain it.
I’d stay away from Mikrotik unless you’re a network professional or enthusiast. They have tons of features but not nearly enough QA, and a way worse track record when it comes to regressions on updates. They’re cheap, which is great for homelabs you can power cycle, upgrade or downgrade whenever, but moving from one product to the next in their ecosystem there’s always more uncertainty when it comes to predicting performance and reliability.