If packet loss was bloodloss I'd be dead

So, I got the server set up, file server in place, media server in place, rsync backups taken care of, a lot of network configuration on the router for a bunch of this shit, and thought I’d nixed my packet drop issues.

But here they are, still here.

So if anyone remembers the blog when I showed this:

It is now alive, and instead of running a VM for the game server I decided, why not just use this box for it instead? Maybe get a VPN set up for emulation multiplayer without having to dedicate some time, how cool would that shit be? This will be the machine used for all my game servers. (I’m going to a LAN in June, so hopefully I can knock out any kinks by then.)

And while the other server has some other alarms going (a friend of mine told me about netdata, which is where I’m getting all this info from), none of those are actual problems. They’re just telling me that my servers are 5-6 year old hardware.

If I run

netstat -s | grep retransmitted

I would initially get something like 1000 segments retransmitted after an uptime of an hour. I unplugged one of the ethernet cables and rebooted. I thought maybe there’s more configuration I needed, or having two IP addresses is screwing it up, but it’s still dropping packets.

Here are the segments retransmitted for my desktop, though, so I’m not sure there’s really that much of an issue.

    1847 segments retransmitted
    1299 fast retransmits

I checked my routes and it goes like

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         gatekeeper.leon 0.0.0.0         UG    100    0        0 enp3s0
default         gatekeeper.leon 0.0.0.0         UG    100    0        0 enp2s0
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 enp3s0
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 enp2s0
gatekeeper.leon 0.0.0.0         255.255.255.255 UH    100    0        0 enp3s0
gatekeeper.leon 0.0.0.0         255.255.255.255 UH    100    0        0 enp2s0

And I notice something weird: my domain destination is on a /32 instead of the /24 the computers are actually on (I’m a basic ho, I know), and the gateway is empty. I have an internet connection. I can SSH into everything. Netdata shows in browsers and Plex is still running on the other box. I’m not sure if this really is weird or not. It’s kinda new to me.

So I’ve gone into pfSense and changed the gateway IP of the DHCP server to 192.168.1.1. And that seems to have fixed it? I’m still getting a warning, but there’s no actual change to the gateway in the routing table. The download from steamcmd is going much faster than it used to: the last time I tried, it failed, and the time before that it took an actual hour; this time it completed in about 15 minutes. Both cables are in, link is up, and after about an hour I am down to

    27 segments retransmitted
    1 fast retransmits

The server itself is Ubuntu Server 18.04, running some 5000 series Xeons on a TYAN S7002 motherboard, whichever revision is the really stripped-down model. I have turned off the software firewall in Ubuntu for testing purposes. Thinking about maybe switching to CentOS or Fedora. But if I can actually just keep going with this, I’ll just run a VM for getting familiar with the Red Hat distros.

Two default routes? Pick one and go with it…

Eek, two adapters on the same subnet with the same subnet mask?

If you want to team/bond them together, do that. If you want to route them, then rip. Your setup is asking for route confusion in multiple ways.

1000 retransmits over how many? (Doesn’t sound like a big deal.) Give us a percentage.
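For anyone wanting that percentage without doing the division by hand, here is a rough sketch. It assumes a Linux box where /proc/net/snmp exposes the TCP counters `OutSegs` and `RetransSegs` (as far as I know, the same kernel counters `netstat -s` summarizes). The function name `retrans_pct` is made up for illustration.

```shell
#!/bin/sh
# Print TCP retransmits as a percentage of all segments sent.
# Expects text in /proc/net/snmp format on stdin. Columns are looked up
# by header name, so the position of the counters does not matter.
retrans_pct() {
    awk '
        # First "Tcp:" line is the header: remember each column index.
        $1 == "Tcp:" && !have_header {
            for (i = 2; i <= NF; i++) col[$i] = i
            have_header = 1
            next
        }
        # Second "Tcp:" line carries the values.
        $1 == "Tcp:" {
            out = $(col["OutSegs"])
            re  = $(col["RetransSegs"])
            if (out > 0)
                printf "%s of %s segments retransmitted (%.2f%%)\n", re, out, 100 * re / out
            else
                print "no segments sent yet"
        }
    '
}

# Usage on the server:
#   retrans_pct < /proc/net/snmp
```

The counters are cumulative since boot, so comparing two readings taken some time apart says more than a single snapshot.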

What does ip -s link tell you?

Having two default routes and two interfaces on the same subnet is fine, albeit possibly confusing for folks who don’t understand Linux routing internals. As long as you’re not trying to actively steer traffic while doing NAT or some such thing, it should work fine: only one of the duplicate routes will appear to be active (the latest one added).

You can use ip route show table all (if I recall the incantation correctly) to get the contents of routes outside of the main routing table, which may provide some clarity (or confusion, depending on how your day is going). This is because on Linux each network namespace can have multiple routing tables, and to keep things neat by default, lower-level routes exist in tables that aren’t the main table. There’s also ip rule, which basically determines which routing table packets should be using. You can read up more about this mechanism on lartc.
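If you just want a quick check for the "two default routes" situation, something like this sketch might help. It reads `ip route show` output on stdin and reports any metric that is shared by more than one default route. The function name `dup_defaults` is hypothetical, and it only looks at lines starting with `default`.

```shell
#!/bin/sh
# Flag default routes that share the same metric.
# Expects `ip route show` (or `ip route show table all`) output on stdin.
dup_defaults() {
    awk '
        $1 == "default" {
            dev = ""; metric = 0
            # Pick out the values that follow the "dev" and "metric" keywords.
            for (i = 2; i < NF; i++) {
                if ($i == "dev")    dev = $(i + 1)
                if ($i == "metric") metric = $(i + 1)
            }
            count[metric]++
            devs[metric] = devs[metric] " " dev
        }
        END {
            for (m in count)
                if (count[m] > 1)
                    printf "metric %s: %d default routes via%s\n", m, count[m], devs[m]
        }
    '
}

# Usage:
#   ip route show | dup_defaults
# To see which route the kernel actually picks for a given destination
# (1.1.1.1 is just an example address):
#   ip route get 1.1.1.1
```

No output means no two default routes share a metric.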

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    RX: bytes  packets  errors  dropped overrun mcast   
    65468      754      0       0       0       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    65468      754      0       0       0       0     
2: enp3s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 00:e0:81:b8:56:07 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast   
    5157727001 3566705  0       38215   0       38896   
    TX: bytes  packets  errors  dropped carrier collsns 
    53121843   519597   0       0       0       0       
3: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
    link/ether 00:e0:81:b8:56:08 brd ff:ff:ff:ff:ff:ff
    RX: bytes  packets  errors  dropped overrun mcast   
    10833872   99092    0       37847   0       38522   
    TX: bytes  packets  errors  dropped carrier collsns 
    493345     1976     0       0       0       0

default via 192.168.1.1 dev enp3s0 proto dhcp src 192.168.1.124 metric 100 
default via 192.168.1.1 dev enp2s0 proto dhcp src 192.168.1.123 metric 100 
192.168.1.0/24 dev enp3s0 proto kernel scope link src 192.168.1.124 
192.168.1.0/24 dev enp2s0 proto kernel scope link src 192.168.1.123 
192.168.1.1 dev enp3s0 proto dhcp scope link src 192.168.1.124 metric 100 
192.168.1.1 dev enp2s0 proto dhcp scope link src 192.168.1.123 metric 100 
broadcast 127.0.0.0 dev lo table local proto kernel scope link src 127.0.0.1 
local 127.0.0.0/8 dev lo table local proto kernel scope host src 127.0.0.1 
local 127.0.0.1 dev lo table local proto kernel scope host src 127.0.0.1 
broadcast 127.255.255.255 dev lo table local proto kernel scope link src 127.0.0.1 
broadcast 192.168.1.0 dev enp3s0 table local proto kernel scope link src 192.168.1.124 
broadcast 192.168.1.0 dev enp2s0 table local proto kernel scope link src 192.168.1.123 
local 192.168.1.123 dev enp2s0 table local proto kernel scope host src 192.168.1.123 
local 192.168.1.124 dev enp3s0 table local proto kernel scope host src 192.168.1.124 
broadcast 192.168.1.255 dev enp3s0 table local proto kernel scope link src 192.168.1.124 
broadcast 192.168.1.255 dev enp2s0 table local proto kernel scope link src 192.168.1.123 
local ::1 dev lo proto kernel metric 256 pref medium
fe80::/64 dev enp3s0 proto kernel metric 256 pref medium
fe80::/64 dev enp2s0 proto kernel metric 256 pref medium
local ::1 dev lo table local proto kernel metric 0 pref medium
local fe80::2e0:81ff:feb8:5607 dev enp3s0 table local proto kernel metric 0 pref medium
local fe80::2e0:81ff:feb8:5608 dev enp2s0 table local proto kernel metric 0 pref medium
ff00::/8 dev enp3s0 table local metric 256 pref medium
ff00::/8 dev enp2s0 table local metric 256 pref medium
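To put the RX dropped counters from `ip -s link` in proportion, here is a small sketch that turns them into a percentage per interface. It assumes the column order shown in the output above (bytes, packets, errors, dropped); `rx_drop_pct` is a made-up name.

```shell
#!/bin/sh
# Report RX drops as a percentage of RX packets for each interface.
# Expects `ip -s link` output on stdin.
rx_drop_pct() {
    awk '
        # Interface header, e.g. "2: enp3s0: <BROADCAST,...>"
        /^[0-9]+: / { iface = $2; sub(/:$/, "", iface) }
        # The line after "RX: bytes packets errors dropped ..." has the values.
        $1 == "RX:" { want = 1; next }
        want {
            want = 0
            if ($2 > 0)
                printf "%s: %s of %s RX packets dropped (%.2f%%)\n", iface, $4, $2, 100 * $4 / $2
        }
    '
}

# Usage:
#   ip -s link | rx_drop_pct
```

Fed the numbers above, this works out to about 1.07% on enp3s0 and 38.19% on enp2s0. On both NICs the dropped count sits close to the mcast count, which might mean these are mostly discarded multicast/broadcast frames rather than lost TCP data, but that is a guess.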

So the default route is correct. I’ll just clean up the rest of this and probably bond them together.

Thanks for the kick in the right direction, guys. I’m sure I can fix this now.
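For the bonding idea, a minimal sketch using plain iproute2 commands could look like the following. Everything here is an assumption rather than what was actually done on this box: the bond name `bond0`, active-backup mode, and the `miimon 100` link-check interval are illustrative choices, and `bond_cmds` is a hypothetical helper. By default it only prints the commands; it runs them only when `APPLY=1` is set (needs root, and will drop an SSH session that goes over either NIC).

```shell
#!/bin/sh
# Print the iproute2 commands for an active-backup bond, or run them when
# APPLY=1 is set in the environment.
# Usage: bond_cmds BOND NIC [NIC...]
_run() {
    if [ "${APPLY:-0}" = 1 ]; then
        "$@"          # really execute the command
    else
        echo "$*"     # dry run: just show it
    fi
}

bond_cmds() {
    bond=$1; shift
    _run ip link add "$bond" type bond mode active-backup miimon 100
    for nic in "$@"; do
        # A NIC has to be down before it can be enslaved to the bond.
        _run ip link set "$nic" down
        _run ip link set "$nic" master "$bond"
    done
    _run ip link set "$bond" up
}

# Dry run:
#   bond_cmds bond0 enp3s0 enp2s0
```

This setup does not survive a reboot and does not give the bond an address; on Ubuntu 18.04 the persistent version would presumably live in the netplan config instead.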

aka: it’s 50/50 whether or not it’s set up the way you intended, and it fails relatively silently (blank stare, no traffic moving).

Also, it potentially swizzles them as/if you re-enable/disable interfaces after boot, confounding things further as you try to debug.