I've been having all sorts of weird network problems lately where after a few days my internet connection becomes really unresponsive. Looking at the traffic graphs (on pfSense) I can see there's constant WAN traffic which is unaccounted for. It's usually around 8 Mbps, so not that high. In addition to that, my state table is at 60,000 or so. The states turned out to be from Samba. On my NAS the smbd process has high (70% or so) CPU usage, so I figured that was the problem.
If I restart the smbd service nothing changes. Restarting the server solves the problem, but it always returns. I haven't been able to get to the bottom of it and I don't really know where to start.
Anyway, last night I rebooted the server because my connection was flaking out. After that the states returned to normal but the connection was still acting weird. I could see the constant traffic on WAN so I had a look at it with Wireshark. There was a ton of NTP traffic from different servers so I figured it was some kind of DDoS. I reset the modem, which gave me a new IP, and things went back to normal. But when I looked at the traffic with Wireshark again I still saw a lot of NTP traffic (not as much, but still constant), except that it's all originating from pfSense. Nothing I do stops it. I've turned off the NTP service and blocked all NTP traffic in the firewall but it's still there and I can't figure it out. So far my connection is working fine but I have no idea what's going on.
Do you have the capacity to turn off DHCP and statically assign all IP addresses, then log all traffic for those IP addresses to make a better determination of what's generating the traffic? Do you think your Samba server could have been hit by an attacker and they have access to it now? Might want to check where the data is going from there.
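If you do end up logging per-IP traffic, something along these lines could total it up per host. This is only a rough sketch: the `src,dst,bytes` CSV layout, the function name `bytes_per_host`, and the sample rows are all made up for illustration, so adjust it to whatever your logging actually produces.

```python
import csv
import io
from collections import defaultdict


def bytes_per_host(csv_text):
    """Sum bytes sent and received per address from src,dst,bytes rows."""
    totals = defaultdict(lambda: {"sent": 0, "received": 0})
    for row in csv.DictReader(io.StringIO(csv_text)):
        size = int(row["bytes"])
        totals[row["src"]]["sent"] += size
        totals[row["dst"]]["received"] += size
    return dict(totals)


# Hypothetical log export; the addresses and sizes are invented.
sample_log = """src,dst,bytes
10.1.1.20,10.1.2.5,1500
10.1.2.5,10.1.1.20,60
10.1.1.20,10.1.2.6,1500
"""

totals = bytes_per_host(sample_log)
# List the heaviest senders first.
for host, t in sorted(totals.items(), key=lambda kv: kv[1]["sent"], reverse=True):
    print(f"{host:15s} sent={t['sent']:8d} received={t['received']:8d}")
```

With static addresses, each row in that listing maps to one known machine, which is the point of turning DHCP off first.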
You could try to QoS your upload speed to ~60% of its actual capacity. On my cable internet, if I completely saturate my upload it kills my download as well.
I doubt the Samba server is compromised. All the traffic (or at least the states; when I get home I'll look at the actual traffic) is between the server and some local VMs. It might be a problem specific to the Plex server, as that seems to trigger it. But if I turn off all the client devices the server still uses a load of CPU, and the states persist even after clearing them. It's always 30,000 or 60,000 states (more or less). If I stop the server the states go away, but as soon as I start it again, even with all the clients turned off, it starts back up. I'll have a look at the actual traffic tonight and see what's going on.
I had originally thought this was what was causing the problems with the internet. I thought it was too many states for it to handle, but it seems to be something to do with the NTP traffic. It looks like a DDoS attack, and when I reset the connection and got a new IP it stopped. But I'm still seeing NTP traffic originating from pfSense and I can't seem to stop it. It's the download that has all the extra traffic; the upload seems fine.
The problems seem to be unrelated, and I think I was only noticing all the states caused by Samba when the internet was flaking out because otherwise I wasn't really watching it.
This is the CPU usage of the smbd process. It sits between 25% and 90% when the system should be idle.
High number of states in pfsense
This is pretty much what they all look like, 10.1.1.20 is the server
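For anyone wanting to break a state table this size down by host pair rather than scrolling through it, here is a rough sketch. It assumes the states have been dumped as text lines shaped like `IFACE PROTO ADDR:PORT (-> or <-) ADDR:PORT STATE`, modelled on `pfctl -ss` output; the regex, the function name `count_states`, and the sample lines (including the 10.1.2.x VM addresses) are illustrative guesses, not taken from my actual table.

```python
import re
from collections import Counter

# Assumed line shape, modelled on `pfctl -ss` output:
#   IFACE PROTO ADDR:PORT (->|<-) ADDR:PORT STATE
STATE_RE = re.compile(
    r"^\S+\s+(?P<proto>\S+)\s+"
    r"(?P<left>\d+\.\d+\.\d+\.\d+):\d+\s+"
    r"(?P<arrow>->|<-)\s+"
    r"(?P<right>\d+\.\d+\.\d+\.\d+):\d+"
)


def count_states(lines):
    """Count states per (proto, host, host) group, ignoring direction."""
    counts = Counter()
    for line in lines:
        m = STATE_RE.match(line.strip())
        if not m:
            continue  # skip NAT'd, IPv6, or otherwise unexpected lines
        pair = tuple(sorted((m["left"], m["right"])))
        counts[(m["proto"], *pair)] += 1
    return counts


# Made-up sample lines; 10.1.1.20 is the server, 10.1.2.x stands in for the VMs.
sample = [
    "all tcp 10.1.1.20:445 <- 10.1.2.5:51234       ESTABLISHED:ESTABLISHED",
    "all tcp 10.1.2.5:51234 -> 10.1.1.20:445       ESTABLISHED:ESTABLISHED",
    "all tcp 10.1.1.20:445 <- 10.1.2.6:40000       TIME_WAIT:TIME_WAIT",
    "all udp 10.1.1.30:123 -> 203.0.113.7:123       MULTIPLE:SINGLE",
]

for (proto, a, b), n in count_states(sample).most_common():
    print(f"{n:6d}  {proto}  {a} <-> {b}")
```

If one or two host pairs account for nearly all of the 30,000 to 60,000 states, that narrows down which client to look at first.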
What does the Wireshark capture look like?
What file system are your hard drives formatted in?
NTFS is horrible on the cpu....hooooorrrrible
It's not a disk problem. The smbd process is flooring it whether or not something is accessing it, even if the clients are turned off.
Just humor me, what file system are the disks partitioned and formatted in?
btrfs, ext4, zfs. no ntfs
ZFS can be somewhat rough, not sure what goes on behind the curtains. I recently did a small VM test of FreeNAS, and basically it took a 4-core, 4.4 GHz, 8 GB RAM VM for a ride just copying ~10 GB. To the point where it slowed to a measly ~40mb/s on a Gbit network, all because of a CPU bottleneck.
This isn't a filesystem issue, it's clearly a problem with the Samba service.
Have you tracked via tcpdump/Wireshark what/who SMB is talking to?
Yeah it's connecting to local clients, it appears to be only having issues with the VMs which are on a different subnet. I don't see all these states for my laptop or anything else. And it still generates the states and CPU usage even if the clients are turned off.
Are any of the local clients running an antivirus?
Since they're on different subnets have you verified the routing table is configured properly? Has anything changed on the network prior to this happening?
Routing is working fine and even Samba is working. It seems to only happen with one or two of the clients, both of which are Ubuntu VMs. My laptop, which is also on a different subnet, doesn't have this problem. I might see what happens if I restart everything but don't turn on one of the VMs.
No, none of the Linux machines are running antivirus. I'll check the client devices out, but it seems to me like an issue with the server.
What I need to know is where to look to figure out what the server is doing so I can identify the issue.
So looking into this mysterious NTP traffic, I realised that it's not my router which is producing it. The IP address is similar, but on close inspection it's different from my WAN address. I believe this address belongs to the modem (not really a modem, but I'm not really sure what it is), which I have no control over. It's a public IP, and a port scan shows it has SSH open but nothing else. So it's probably broken and just spewing NTP traffic out everywhere for some reason. I don't know; I'll have to call the ISP, which is going to be super painful.
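Since I got caught out by two addresses that look alike, here is a minimal sketch of checking it exactly instead of by eye. It assumes the capture has already been reduced to `(source address, source port, destination port)` tuples; that layout, the function name `ntp_sources`, and the addresses (documentation ranges, not my real ones) are all placeholders.

```python
import ipaddress
from collections import Counter


def ntp_sources(packets, wan_ip):
    """Count port-123 packets per source and flag whether each source is the WAN address."""
    wan = ipaddress.ip_address(wan_ip)
    counts = Counter(
        ipaddress.ip_address(src)
        for src, sport, dport in packets
        if 123 in (sport, dport)  # NTP uses port 123
    )
    return [(str(ip), n, ip == wan) for ip, n in counts.most_common()]


# Invented sample: the two public addresses differ only in the last octet.
wan_address = "198.51.100.23"
packets = [
    ("198.51.100.32", 123, 123),
    ("198.51.100.32", 123, 123),
    ("198.51.100.32", 123, 123),
    ("198.51.100.23", 123, 123),
    ("10.1.1.20", 445, 51234),  # not NTP, ignored
]

for src, count, is_wan in ntp_sources(packets, wan_address):
    label = "WAN address" if is_wan else "NOT the WAN address"
    print(f"{src:15s} {count:5d} packets  ({label})")
```

Comparing parsed address objects rather than strings avoids misreading near-identical addresses, which is exactly what happened here.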