RSTP blocking WAP port with multiple SSIDs to multiple VLANs

I’m facing a strange issue and I’m hoping someone with more knowledge can help me out (or just help me further debug).

I have some unifi switching/wireless hardware (specifically, in this case, a couple of unifi switches and a unifi WAP) paired with a pfsense router/firewall. The general topology is as follows:

[image: network topology diagram]

The VLANs are configured on pfsense and are isolated from each other (all traffic blocked between them in the firewall).

The issue I’m having is that when connected over wifi via ssid1 (and probably ssid2, although untested), heavy local traffic causes RSTP to trip and temporarily block the WAP port on switch 1. I can get full bandwidth through to the internet without issue from a wifi-connected device, but when doing something such as SMB, NFS, HTTP, etc. from a wifi-connected device to the local server, the unifi controller notes that the WAP port gets blocked due to RSTP discarding. Once the port is unblocked a few seconds later, another couple of MiB (at most) will get through, and then it immediately blocks again.

I was able to run iperf at the expected full bandwidth between the server and a wifi-connected device, so it’s clearly not all TCP traffic that causes this. But as soon as I use any typical network application between a wifi device and the server (as mentioned, SMB, NFS, HTTP, and others all seem to trigger it), RSTP discarding is triggered on the WAP port. I’ve enabled IGMPv3 and IGMP snooping on all the unifi networks/devices, but that didn’t seem to help.

Disabling RSTP on the WAP port on switch 1 sort of helps. When generating the problematic traffic with RSTP disabled, the WAP port will simply disconnect and quickly reconnect once or twice every second, severely reducing the bandwidth, but not blocking the connection entirely for extended periods of time like RSTP does.

Does anyone know what may be going on, or have any suggestions for further debugging?

Not to spoil the fun, but have you considered disabling RSTP?

Did you set up bridge priorities in any way?

Do you have any software bridges (other than the one built into the wap)?

Yeah, disabling RSTP sort of half solves the problem. The WAP will disconnect from the port for maybe like 200 ms once or twice a second, severely limiting the throughput, but at least there’s no continuous blocking.

I don’t have any bridge priorities set up, and to my knowledge I don’t have any explicit software bridges anywhere. I have lots of trunk ports, and my server has access to both VLANs; however, it has independent network interfaces for each VLAN, and only one app has access to the vlan99 network interface, which is pretty heavily firewalled off from everything else. I suppose pfsense natively bridges traffic even between VLANs, but again I have all IPv4+6 traffic blocked between the 2 VLANs in the pfsense firewall rules. Maybe that’s still passing L2 traffic that’s not IPv4/6 between the VLANs? I’m not sure. pfsense was definitely passing traffic between the VLANs before I enabled the IPv4+6 firewall rules to block it.
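
For further debugging, one thing that might help is to look at the BPDUs actually arriving on the WAP port (frames sent to 01:80:c2:00:00:00) and check which bridge ID is sending them and with what port role. Below is a rough sketch of a decoder for the fixed part of a config/RST BPDU, following the standard 802.1D layout. It assumes the raw BPDU bytes (starting right after the 3-byte LLC header `42 42 03`) have already been pulled out of a capture somehow, e.g. from a mirrored port; how to get that capture on this hardware is not covered here. The function names and the sample frame, including its MAC address, are made up for illustration.

```python
import struct

# Port role values carried in bits 2-3 of the RST BPDU flags byte.
PORT_ROLES = {0: "unknown", 1: "alternate/backup", 2: "root", 3: "designated"}


def format_bridge_id(raw: bytes) -> str:
    """Render an 8-byte bridge ID as 'priority/mac'.

    The 16-bit priority field is shown as-is, so it includes the
    system ID extension if the sender uses one.
    """
    priority = struct.unpack("!H", raw[:2])[0]
    mac = ":".join(f"{b:02x}" for b in raw[2:8])
    return f"{priority}/{mac}"


def parse_bpdu(bpdu: bytes) -> dict:
    """Decode the fixed fields of a configuration or RST BPDU.

    `bpdu` is assumed to start right after the LLC header (42 42 03).
    """
    if len(bpdu) < 35:
        raise ValueError("too short for a configuration/RST BPDU")
    proto_id, version, bpdu_type, flags = struct.unpack("!HBBB", bpdu[:5])
    if proto_id != 0:
        raise ValueError("not a spanning tree BPDU")
    return {
        "version": version,                      # 0 = STP, 2 = RSTP
        "type": bpdu_type,                       # 0x00 = config, 0x02 = RST
        "topology_change": bool(flags & 0x01),
        "proposal": bool(flags & 0x02),
        "port_role": PORT_ROLES[(flags >> 2) & 0x03],
        "learning": bool(flags & 0x10),
        "forwarding": bool(flags & 0x20),
        "agreement": bool(flags & 0x40),
        "root_id": format_bridge_id(bpdu[5:13]),
        "root_path_cost": struct.unpack("!I", bpdu[13:17])[0],
        "bridge_id": format_bridge_id(bpdu[17:25]),
        "port_id": struct.unpack("!H", bpdu[25:27])[0],
    }


# Hypothetical RST BPDU (made-up MAC), only to show the call.
SAMPLE = bytes.fromhex(
    "0000 02 02 3c"            # protocol id, version, type, flags
    "8000 aabbccddee01"        # root bridge ID
    "00000000"                 # root path cost
    "8000 aabbccddee01"        # sender bridge ID
    "8001"                     # port ID
    "0000 1400 0200 0f00"      # message age, max age, hello, forward delay
    "00"                       # version 1 length
)

if __name__ == "__main__":
    for key, value in parse_bpdu(SAMPLE).items():
        print(f"{key}: {value}")
```

If BPDUs with an unexpected bridge ID, or the switch’s own bridge ID, show up on the WAP port during the problem traffic, that would point toward something bridging or looping frames back, which is a guess to confirm rather than a diagnosis.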