I admit I don’t really understand this “degraded exponentially” business when you add more hosts.
It sounds very vague to me. In trying to understand you better, the impression I get is that your concerns are based on unresolved issues from years ago that you did not fully understand then (hence unresolved, obviously) and don’t remember well enough today, except that it didn’t work for you. I’ve been there myself, and I can’t deny you had a bad time back then, but my experience so far using Linux for routers has been a lot more positive: challenging at times, but positive overall. Judging by your description, it sounds like something was misconfigured for you back then. Was it HTB, was it the route cache size or route cache GC, or was it connection tracking? It’s hard to tell now, and the kernel networking stack has indeed changed a lot since 2015.
For example these days we have: https://www.kernel.org/doc/Documentation/networking/nf_flowtable.txt
This bypasses most of the routing and firewall work, and it likely would have resolved (or at least masked) whatever tuning problem you had back then. It doesn’t sound like you had driver/interrupt-balancing/chipset issues; otherwise your performance would probably have suffered at far fewer connections. This netfilter module is still relatively new; it wouldn’t have been available 5 years ago when you were testing.
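To give a rough idea of what enabling it looks like, here is a minimal nftables sketch based on the flowtable documentation; the table/flowtable names and the eth0/eth1 device names are made-up placeholders, and the exact `flow add` syntax depends on your nft version:

```
table inet ft_demo {
    flowtable ft {
        hook ingress priority 0
        devices = { eth0, eth1 }
    }
    chain forward {
        type filter hook forward priority 0; policy accept;
        # once a TCP/UDP flow is established, offload it to the flowtable,
        # skipping most of the per-packet netfilter/routing path
        ip protocol { tcp, udp } flow add @ft
    }
}
```

Load it with `nft -f` and established flows should start showing up in `nft list flowtables`.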
The issue I have with recommending “accelerators” is that they’re either too cheap when it comes to capabilities or too expensive in $$$.
If they’re cheap, they’re likely both not programmable and equipped with only a small amount of inflexible CAM for storing the FIB (I guess you could still call it a FIB even though there’s all kinds of junk in those entries). All it typically takes is one kid doing a bunch of DHT / P2P lookups or ramping up connections to start thrashing your CAM entries, at which point you start writing complicated firewall rules to limit them, … ugh.
On the other hand, if you buy enough of those expensive cards, you get sent a bunch of sad consulting engineers to go with them, and both the cards and the people will probably be hard to use effectively, … poor folks, god bless them.
There’s also no reason not to consider having a secondary x86 router and doing some kind of VRRP / moving VIPs between them, so that your network is still up while you’re upgrading one of the routers. … well, except that 2 machines cost twice the money. … but you also get to spread the load across 2 machines when both are working.
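For the VRRP part, a minimal keepalived sketch on the primary router might look like this; the instance name, interface, priorities, and the 192.0.2.1 VIP are all illustrative placeholders:

```
vrrp_instance WAN_VIP {
    state MASTER             # the secondary router uses state BACKUP
    interface eth0           # interface that carries the VRRP adverts (assumed name)
    virtual_router_id 51     # must match on both routers
    priority 150             # give the secondary a lower priority, e.g. 100
    advert_int 1
    virtual_ipaddress {
        192.0.2.1/24         # the VIP your LAN uses as its default gateway (example)
    }
}
```

When the MASTER stops advertising (crash, reboot, upgrade), the BACKUP takes over the VIP within a few advert intervals.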
In any case, … regardless of the chosen option, setting up a simulated test network should be easy enough these days thanks to network namespaces and ipvlan/macvlan, and it is worth doing for @dual_brot before they rip out the currently “working” router.
For example, one could easily grab a Linux laptop and write a shell script to create, say, 200 virtual NICs, each with its own MAC address (macvlan), assign them IPs, and then have e.g. nginx listening on all of them.
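A sketch of that script could look like the following; it only prints the `ip` commands so you can review them before piping to a root shell, and the parent NIC name and the 10.77.0.0/24 range are made-up examples:

```shell
#!/bin/sh
# gen_macvlan_cmds PARENT N: print the ip(8) commands that create N macvlan
# interfaces on PARENT (each gets its own MAC automatically) and give each
# an address in the assumed 10.77.0.0/24 test range.
gen_macvlan_cmds() {
  parent="$1"
  n="$2"
  i=1
  while [ "$i" -le "$n" ]; do
    echo "ip link add mv$i link $parent type macvlan mode bridge"
    echo "ip addr add 10.77.0.$i/24 dev mv$i"
    echo "ip link set mv$i up"
    i=$((i + 1))
  done
}

# Review the output first, then run for real with:
#   gen_macvlan_cmds eth0 200 | sudo sh
gen_macvlan_cmds eth0 3
```

From there, pointing nginx at the new addresses is just a matter of `listen` directives (or listening on a wildcard address).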
And then on the other side, you can have your load-generator Linux laptop: just set up another 200 network namespaces and run something like github.com/wg/wrk once per namespace against a random subset of those IPs. You’ll have thousands of real connection tuples on the router in no time, and you’ll be able to observe its behavior, including overall throughput.
One thing people don’t realize is that typical MacBook-Pro-level hardware you can just get from a computer store (or order online these days) can issue enough legitimate-looking requests to rival what, 10 years ago, would have made the news as a DoS attack.
Like: OMG! The RPM on our spammy blog is over 60,000! … oh, it’s just Bob testing, ah ok … 5 minutes later … oh damn it Bob, you’re running with multiple threads now, stop pointing loadtest traffic at prod. It’s that ridiculous.
My point is that you don’t really need to buy expensive “solutions” just to do a basic low-bandwidth (<10 Gbps) loadtest. (Well, you probably need a desktop with a free PCIe slot for >1 Gbps; there’s a shortage of Thunderbolt 10 Gbps NICs for some reason.)
There’s no reason not to do it, and whichever issues you resolve during the loadtest are issues you won’t have to resolve after stuffing the router into the “prod” environment. You can ask for help here on the forums, on IRC, or on the various mailing lists where developers hang out.
The thing I don’t know much about is pfSense. … I sort of gave up on it because I wanted to run my router virtualized on a KVM host with virtio NICs, and FreeBSD (and pfSense) had a bunch of issues with both the e1000 and virtio drivers; I couldn’t really get useful performance out of it even at home, and I ran out of time to debug. In contrast, Linux worked out of the box, so I just used that. Now my home router is just a plain old Linux box running Debian testing, not any kind of fancy routing distro.