Networking wizards please help

I’m having a really strange issue with tailscale and my ISP router.

I used to have my desktop run a tailscale exit node so I can VPN home when I’m out and about, and this worked fine for a while. Today I decided to finally set up a rock64 and move the exit node there so I don’t need to keep the desktop on all the time.

I installed tailscale on the rock64 exactly like I did on my desktop (both Arch, same config) and tried to connect with my phone. Then the internet connection went down for all the devices on the network.

LAN connections still work, the router itself can connect to the internet. I unplug the rock64, internet is back. :upside_down_face: wth. So I try to use my desktop as the exit node instead, internet still works, select rock64, internet down.

My setup is as follows:
Fiber comes in to the ISP router (Home Hub 4000), it has a 10G port and 4 1G ports. My desktop is connected to the 10G port, rock64 to 1G port, phone to the wifi.

What I tried:
Desktop uses rock64 as exit node: internet works
Swap the ports for desktop and rock64: now phone can use rock64 as exit, desktop will bring down internet
Plug both rock64 and desktop into 1G ports: both phone and desktop will break internet
Ports back to original, phone connected to cellular: everything works
Plug rock64 and desktop into an unmanaged switch and the switch into the router: on either the 1G or the 10G port, everything works.

at this point I can only conclude the router has some link-layer issue, but what is it?

BS magical nonsense - some property of traffic that rock64 sends is tripping up your home hub - we just need to figure out what exactly.

Are you masquerade-ing traffic outgoing from rock64?

If yes, can you also try resetting the IP TTL on outgoing packets to 30 (sometimes gets used for router detection, maybe the home hub is trying to be overly smart).

Are you on ipv4 only, dual stack, ds-lite, … ?

# Generated by iptables-save v1.8.9 on Mon Mar 27 20:15:10 2023
*nat
:PREROUTING ACCEPT [13364:3732388]
:INPUT ACCEPT [4190:1616051]
:OUTPUT ACCEPT [20380:1390835]
:POSTROUTING ACCEPT [20380:1390835]
:ts-postrouting - [0:0]
-A POSTROUTING -j ts-postrouting
-A ts-postrouting -m mark --mark 0x40000/0xff0000 -j MASQUERADE
COMMIT
# Completed on Mon Mar 27 20:15:10 2023
# Generated by iptables-save v1.8.9 on Mon Mar 27 20:15:10 2023
*filter
:INPUT ACCEPT [593817:111560235]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [1327843:1566369124]
:ts-forward - [0:0]
:ts-input - [0:0]
-A INPUT -j ts-input
-A FORWARD -j ts-forward
-A ts-forward -i tailscale0 -j MARK --set-xmark 0x40000/0xff0000
-A ts-forward -m mark --mark 0x40000/0xff0000 -j ACCEPT
-A ts-forward -s 100.64.0.0/10 -o tailscale0 -j DROP
-A ts-forward -o tailscale0 -j ACCEPT
-A ts-input -s 100.83.14.65/32 -i lo -j ACCEPT
-A ts-input -s 100.115.92.0/23 ! -i tailscale0 -j RETURN
-A ts-input -s 100.64.0.0/10 ! -i tailscale0 -j DROP
COMMIT
# Completed on Mon Mar 27 20:15:10 2023

tailscale did all the firewall configuration, all I did was enable ip_forwarding as instructed using sysctl.

I tried to set ttl to 30 with sudo sysctl -w net.ipv4.ip_default_ttl=30, didn’t seem to make a difference. My thinking was: if it’s the rock64 confusing the router, would it behave so that some ports let my desktop use the vpn while other ports let my phone use it?

That’s the default for packets originating on the machine. When rock64 is routing packets that originate elsewhere and come in from the tailscale interface on their way to its gateway, it decrements the TTL on those.

See this for iptables TTL stuff


And yes, it does look like ts-postrouting will rewrite the srcip to rock64 on packets marked 0x40000, which are whatever originates from the Tailscale network

yeah, the iptables stuff is going over my head. The iptables dump above should be all the machine has. Can you tell me what iptables command I need to use to set the ttl? thanks.
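For anyone else reading the dump: the rules use the value/mask pair 0x40000/0xff0000, and a packet matches when its mark ANDed with the mask equals the value. A minimal sketch of that comparison in plain shell arithmetic (the helper name mark_matches is made up for illustration, it is not part of iptables or tailscale):

```sh
# Hypothetical helper that mimics `-m mark --mark value/mask`.
# $1 = packet mark, $2 = rule value, $3 = rule mask
mark_matches() {
    # Keep only the bits selected by the mask, then compare to the value.
    [ $(( $1 & $3 )) -eq $(( $2 )) ]
}

# Bits outside the mask are ignored, so 0x40001 still matches.
for mark in 0x40000 0x40001 0x4000 0x0; do
    if mark_matches "$mark" 0x40000 0xff0000; then
        echo "$mark: matches, would be masqueraded"
    else
        echo "$mark: no match"
    fi
done
```

So only packets that ts-forward marked on their way in from tailscale0 reach the MASQUERADE rule; traffic from other LAN devices is not touched by it.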

iptables -t mangle -A FORWARD -j TTL --ttl-inc 1 (from stackexchange)

sadly did not seem to make a difference either
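Side note: --ttl-inc 1 only adds one to the TTL, it does not set it to 30 as suggested earlier. A rough sketch of the "set" variant follows, written as a dry run that just prints the rule. The interface name eth0 and the helper name ttl_rule are assumptions, so check the real interface with ip route show default first. No claim that this fixes the Home Hub issue, it only makes the earlier test possible.

```sh
# Assumptions: WAN_IF is the rock64's LAN-facing interface (eth0 is a guess),
# TTL_VALUE is the value suggested earlier in the thread.
WAN_IF="${WAN_IF:-eth0}"
TTL_VALUE="${TTL_VALUE:-30}"

# Hypothetical helper: prints the rule instead of applying it.
ttl_rule() {
    # The TTL target only exists in the mangle table. POSTROUTING sees both
    # forwarded and locally generated packets leaving the interface.
    echo "iptables -t mangle -A POSTROUTING -o $WAN_IF -j TTL --ttl-set $TTL_VALUE"
}

ttl_rule
```

To apply it, run the printed command with sudo. To undo it, run the same command with -D in place of -A.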

For starters, run traceroute 8.8.8.8 from an unrelated system (e.g. your desktop) on your LAN before and after you bring up the VPN (e.g. on the rock64).

yes, I did that, traceroute just shows “no reply”. I’m trying to gather packet captures on various machines right now. So far from the rock64 side it just looks like the ICMP requests are not getting any response.

Why are you running a traceroute from the rock64?

no response on either rock64 or desktop, I just have the rock64 packet cap for now

edit: now after dinner I can’t get the internet to cut out all of a sudden, how nice

Check MAC / ARP address of your default gateway IP before / after you lose connectivity. Determine if arping (Linux) or hardping (Windows) responds.
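A possible way to do that check on Linux, as a sketch only. The function names are made up, and the arping flags differ a little between the iputils and the standalone arping packages, so adjust as needed.

```sh
# Hypothetical helper: reads `ip route show default` output on stdin and
# prints "<gateway-ip> <interface>" for the first default route.
default_gw() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++) {
            if ($i == "via") gw = $(i + 1)
            if ($i == "dev") dev = $(i + 1)
        }
        print gw, dev
        exit
    }'
}

# Hypothetical helper: shows the neighbour (ARP) entry for the gateway and
# sends a few ARP probes. Defined here but not called automatically.
check_gateway() {
    set -- $(ip route show default | default_gw)
    ip neigh show "$1" dev "$2"        # note the lladdr (MAC) column
    sudo arping -c 3 -I "$2" "$1"      # flag spelling may vary by arping build
}
```

Run check_gateway once while the internet works and once while it is down, then compare the lladdr values. If the MAC for the gateway IP changes between the two runs, something else on the LAN is answering for the gateway address.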

you mean arping between my devices and/or the router? connections between local devices and the router are fine (I can access the router web page, for example)

In that case, you should get at least one traceroute / mtr response. You might have to back off on the security settings a bit if it’s set to block responses.

nope, not a single reply, even though I can ping my gateway, and if I do ip r get 209.51.188.116 it tells me it goes via the gateway

also I now have packet captures of both the rock64 and the desktop taken at the same time, and it’s super weird. I lined up both captures using an IGMP packet from an iPad, and when I go to the point where my desktop loses internet, both my desktop and the rock64 start to see TCP retransmissions/unseen segments/dup ACKs and just general weirdness. The IGMP packets from the iPad also start to show up in only one of the captures and not the other

I also see STP packets at regular intervals, could it be some issue caused by that?

Interesting about STP, do you have any software bridges e.g. for VMs? Can you disable STP on them?

… although, if you can access the home hub webui it’s probably not STP cutting you off.

the STP packets seem to be coming from the router. I don’t know much about STP, but I don’t think there are even any loops in the network, so why would it constantly send out packets…