I see pfSense can do multi-WAN load balancing (thanks to: https://216.244.80.51/t/multi-wan-firewall-router/55864), and I assume it would support multi-WAN failover, but can anyone with experience confirm this?
Here's my 2¢ as a pfSense user in production with 3 WAN links for over 3 years.
Failover is one of those things that is somewhat of an art.
The system has to be able to detect when a connection is down and trigger a failover.
In my experience, that detection is the problem. Typically what it does is ping the gateway IP. Sometimes the gateway IP responds to pings but there is no connectivity beyond it... so no failover gets triggered.
Or, if there is high latency, because someone is hogging all the bandwidth, you can trigger the failover unnecessarily... which is really bad because the failover links are usually slower than the primary link.
And every time you fail over, your TCP sessions get reset, people get disconnected from their services, etc.
And the worst scenario is an unnecessary failover 'storm' where it fails over and fails back, over and over.
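To illustrate the 'storm' problem, here is a minimal Python sketch of hysteresis: the link is only declared down after several bad probes in a row, and only declared up again after a longer run of good ones. This is not pfSense code; the class name, the thresholds, and the latency cutoff are all made-up assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkMonitor:
    """Hypothetical hysteresis-based link tracker (illustration only)."""
    down_after: int = 5             # consecutive bad probes before failing over
    up_after: int = 20              # consecutive good probes before failing back
    max_latency_ms: float = 500.0   # assumed cutoff; slower replies count as bad
    is_up: bool = True
    _streak: int = 0                # consecutive probes contradicting current state

    def record(self, latency_ms: Optional[float]) -> bool:
        """Feed one probe result (None means no reply); return the link state."""
        good = latency_ms is not None and latency_ms <= self.max_latency_ms
        if good == self.is_up:
            # Probe agrees with the current state, so the streak starts over.
            self._streak = 0
            return self.is_up
        self._streak += 1
        needed = self.down_after if self.is_up else self.up_after
        if self._streak >= needed:
            # Enough contradicting probes in a row: flip the state.
            self.is_up = not self.is_up
            self._streak = 0
        return self.is_up
```

Making `up_after` larger than `down_after` is the part that damps flapping: a link that has just failed has to stay healthy for a while before traffic moves back to it. It does nothing for the case where the gateway answers pings but has no upstream.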
So what I do here is manually change the default gateway whenever I have to fail over. It's a quick way of changing which WAN connection you are using without disabling the interface (which might be used for VPN, etc.).
And since I don't do failover, I don't do load balancing/sharing either. That's an even sketchier situation.
You answered exactly what I wanted to hear. I'm familiar with your description of failover, and it is an art to a degree, depending on how you want the firewall to react. I actually had to change the failover settings at one of our locations yesterday to get it to act the way we want it to. Basically, it can be done.
My plan is to build one for home use and use it for a year, get to know it as best I can and then replace our firewalls one by one at our locations.
Yeah, you can have it use a different IP for failover detection, but that can be problematic. I was using 4.2.2.2 for failover detection and I noticed it was down for a little while a couple of weeks ago. And since you should only use an IP address for failover detection, you can't have it check something like a round-robin DNS entry.
pfSense needs the ability to check a handful of IPs and do some fuzzy logic to decide when to trigger a failover. Until that happens (and I don't know if it has; I'm still on 2.2.2), I'm not going to risk it. The only problem is that I have to keep my finger on the trigger... I'm the Dutch boy with his finger in the dike.
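A rough Python sketch of the "handful of IPs" idea: treat the link as down only when most of the monitor targets fail, so one flaky target like 4.2.2.2 can't trigger a failover by itself. The function name, the `probe` callable, and the 75% threshold are assumptions for illustration, not anything pfSense provides.

```python
from typing import Callable, Iterable


def link_looks_down(targets: Iterable[str],
                    probe: Callable[[str], bool],
                    min_failed_fraction: float = 0.75) -> bool:
    """Return True only if enough monitor targets are unreachable.

    `probe` is a caller-supplied (hypothetical) function that returns True
    when a target answered through the WAN link being tested.
    """
    results = [bool(probe(t)) for t in targets]
    if not results:
        raise ValueError("need at least one monitor target")
    failed = sum(1 for ok in results if not ok)
    return failed / len(results) >= min_failed_fraction


# Fake probe results: only 4.2.2.2 is unreachable, so no failover.
# 203.0.113.1 is a documentation-range placeholder address.
reachable = {"4.2.2.2": False, "8.8.8.8": True, "203.0.113.1": True}
print(link_looks_down(reachable, reachable.get))  # prints False
```

The threshold is a judgment call: set it too low and a single dead target still causes a failover, set it at 100% and one target that stays reachable some other way hides a real outage.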
We had a somewhat more complicated setup. We are on an MPLS network with only T1 connections at some of our locations, and if the circuit hits capacity it drops packets, which was giving us false positives when trying to reach 8.8.8.8 or 4.2.2.2.
Yeah, you are choking the connection.
Gotta set up QoS/bandwidth shaping, or whatever.
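For anyone unfamiliar with the general idea behind bandwidth shaping, a token bucket is one common way to cap a flow so it can't saturate the circuit. The Python below is only a conceptual sketch with made-up names and numbers; it is not how pfSense's own traffic shaper is configured or implemented.

```python
class TokenBucket:
    """Illustrative token bucket; rate and burst values are assumptions."""

    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes   # start with a full bucket
        self.last = 0.0             # time of the last refill, in seconds

    def allow(self, packet_bytes: int, now: float) -> bool:
        """Return True if a packet of this size may be sent at time `now`."""
        elapsed = max(0.0, now - self.last)
        # Refill for the elapsed time, never beyond the burst capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False  # a real shaper would queue or drop the packet here
```

Leaving some headroom below the circuit's capacity is what keeps monitor pings from being dropped along with everything else.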
I need to do that on my system because I have idiots who have been using Dropbox and Amazon Cloud. I did throttle my wireless network, but I need something more sophisticated.
I'm not going to mess with it until I upgrade to the current version. Maybe this weekend as I'm doing some maintenance. But I hate fucking with anything in production.
Life of an IT manager.