Routing problem with mangle table on Ubuntu 16.04

My home router is just an older pc running the server version of xenial. Pertinent to the issue presented here, my wan interface is labeled enp6s0, my lan interface is labeled enp7s0.

I have installed openvpn on my router which connects as a client to an aws ec2 instance (also xenial server) running openvpn in server mode. I have no trouble routing ipv4 and ipv6 traffic from the router through the ec2 instance. I have no trouble using routing rules to select traffic from specific hosts on my internal network to route either through the ec2 instance or through my isp.

I do, however, run into trouble when I try to route traffic based on protocol and port number by marking packets in the mangle table. Specifically, the traffic routed through the ec2 instance successfully reaches its destination, and the response successfully makes its way back to my router but then never reaches my lan interface.

I normally use a script to start my ec2 instance, connect with openvpn and then set up my router's routing tables and rules, but the following are the essential commands:

STARTING EC2 INSTANCE AND CONNECTING:

clifford@router:~$ aws ec2 describe-instances --instance-ids $( < vpn.sluggishmail.com-instance-id) | grep PublicIpAddress
                "PublicIpAddress": "54.245.59.110",
clifford@router:~$ sudo openvpn --daemon --config /etc/openvpn/client.conf --remote 54.245.59.110 1194 udp

NETWORK INTERFACES AFTER VPN CONNECTION:

clifford@router:~$ ip address show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
   valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
   valid_lft forever preferred_lft forever

2: enp7s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 68:1c:a2:12:66:b5 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.3/16 brd 192.168.255.255 scope global enp7s0
   valid_lft forever preferred_lft forever
inet6 2600:XXXX:XXXX:XXXX::3/64 scope global
   valid_lft forever preferred_lft forever
inet6 fe80::6a1c:a2ff:fe12:66b5/64 scope link
   valid_lft forever preferred_lft forever

3: enp6s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 68:1c:a2:12:66:b6 brd ff:ff:ff:ff:ff:ff
inet 68.XXX.XXX.XXX/24 brd 68.XXX.XXX.255 scope global enp6s0
   valid_lft forever preferred_lft forever
inet6 fe80::6a1c:a2ff:fe12:66b6/64 scope link
   valid_lft forever preferred_lft forever

4: enp2s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:25:22:10:d0:7b brd ff:ff:ff:ff:ff:ff
inet 10.0.0.3/24 brd 10.0.0.255 scope global enp2s0
   valid_lft forever preferred_lft forever
inet6 fe80::225:22ff:fe10:d07b/64 scope link
   valid_lft forever preferred_lft forever

6: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 100
link/none
inet 10.8.0.6 peer 10.8.0.5/32 scope global tun0
   valid_lft forever preferred_lft forever
inet6 2600:XXXX:XXXX:XXXX::1000/64 scope global
   valid_lft forever preferred_lft forever

After manually setting up the routing tables (ISP traffic through the main routing table, VPN traffic through the custom AWS table) using the ip command, I have:

ROUTES:

clifford@router:~$ ip route show
default via 68.XXX.XXX.1 dev enp6s0
10.0.0.0/24 dev enp2s0  proto kernel  scope link  src 10.0.0.3
68.XXX.XXX.0/24 dev enp6s0  proto kernel  scope link  src 68.XXX.XXX.XXX
192.168.0.0/16 dev enp7s0  proto kernel  scope link  src 192.168.1.3

clifford@router:~$ ip route show table AWS
default via 10.8.0.5 dev tun0
10.8.0.1 via 10.8.0.5 dev tun0
10.8.0.5 dev tun0  scope link  src 10.8.0.6

Then MASQUERADEing traffic outbound through the VPN:

clifford@router:~$ sudo iptables -t nat -A POSTROUTING -o tun0 -j MASQUERADE

At this point, traffic still successfully travels through my ISP without touching the VPN. If I add the rule:

clifford@router:~$ sudo ip rule add from 192.168.1.12 table AWS

Traffic from host 192.168.1.12 successfully routes over the VPN. I have analogous rules and routes for ipv6 traffic, which also routes successfully. The reverse, setting up the default routing table to use the VPN and then exempting specific hosts to use the ISP, also works.

But I run into trouble when I try to route traffic on specific protocols and ports over the VPN.

Deleting:

clifford@router:~$ sudo ip rule del from 192.168.1.12

And marking tcp traffic on ports 80 and 443 from host 192.168.1.12:

clifford@router:~$ sudo iptables -t mangle -A PREROUTING -s 192.168.1.12 -m multiport -p tcp --dports 80,443 -j MARK --set-mark 1

And adding a rule to route marked traffic through the VPN table:

clifford@router:~$ sudo ip rule add fwmark 1 table AWS
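Since these steps normally live in a script, the two commands above can be sketched as a small shell helper. This is only a minimal sketch, not the actual script: the function name `print_fwmark_cmds` and its argument order are hypothetical, and it only prints the commands (a dry run) so they can be reviewed before being run with sudo.

```shell
# Hypothetical dry-run helper: prints the marking and policy-routing
# commands instead of executing them.
# Arguments: source host, comma-separated TCP ports, mark value, table name.
print_fwmark_cmds() {
    src=$1
    ports=$2
    mark=$3
    table=$4
    # Mark matching TCP packets as they arrive, before the routing decision.
    echo "iptables -t mangle -A PREROUTING -s $src -m multiport -p tcp --dports $ports -j MARK --set-mark $mark"
    # Send packets carrying that mark to the alternate routing table.
    echo "ip rule add fwmark $mark table $table"
}

print_fwmark_cmds 192.168.1.12 80,443 1 AWS
```

Printing rather than executing keeps the sketch safe to try on any machine; piping the output to `sudo sh` would apply it.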

At this point, traffic successfully routes over the VPN to its destination:

clifford@router:~$ sudo tcpdump -n -i enp7s0 tcp port 80
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp7s0, link-type EN10MB (Ethernet), capture size 262144 bytes
07:54:42.056497 IP 192.168.1.12.56414 > 52.38.255.144.80: Flags [S], seq 2097315203, win 64240, options [mss 1460,nop,wscale 8,nop,nop,sackOK], length 0

The response successfully routes all the way back to the address associated with my router's tun0 interface (10.8.0.6):

clifford@router:~$ sudo tcpdump -n -i tun0 tcp port 80
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on tun0, link-type RAW (Raw IP), capture size 262144 bytes
07:54:10.719152 IP 52.38.255.144.80 > 10.8.0.6.56395: Flags [S.], seq 1816310859, ack 2633952775, win 26883, options [mss 1198,nop,nop,sackOK,nop,wscale 7], length 0

But traffic is "dropped" there, never making it back to interface enp7s0 or the host. I put "dropped" in quotes because I don't mean there is an iptables rule that evaluates to -j DROP. Even if I make the filter table rules as liberal as possible (ACCEPT everything), the traffic still disappears.

I have been playing with this for two weekends now, and I am still not able to remedy the problem. If anyone made it all the way through this post, I thank you. And if you can spot my mistake, I would appreciate if you would point it out.


If anyone is interested, I found a way to solve my problem. Well, I am not sure whether it is a solution or a workaround that just masks my routing problem. But after some more digging, I stumbled across:

IMPORTANT: We received a report that MASQ and SNAT at least collide with marking packets. Rusty Russell explains it in this posting. Turn off the reverse path filter to make it work properly.

Linux Advanced Routing & Traffic Control HOWTO / Chapter 11. Netfilter & iproute - marking packets

Testing here reveals that the route filtering and mark don't play well
together. Try:

# for f in /proc/sys/net/ipv4/conf/*/rp_filter; do echo 0 > $f; done
# echo 1 > /proc/sys/net/ipv4/route/flush

Rusty Russell: routeing, SNAT, MASQ, and fwmark

To test this out, I set up openvpn and the routing tables as above with the following rules:

clifford@router:~$ sudo iptables -t mangle -A PREROUTING -s 192.168.1.12 -p icmp -j MARK --set-mark 1

clifford@router:~$ sudo ip rule add fwmark 1 table AWS priority 10

And then setting up some logging:

echo 1 > /proc/sys/net/ipv4/conf/all/log_martians

to show traffic dropped by the rp_filter. And to show how far the traffic actually gets before being dropped:

iptables -t raw -A PREROUTING -i tun0 -p icmp -j TRACE
iptables -t raw -A PREROUTING -i enp7s0 -p icmp ! -s 192.168.1.14 -j TRACE

(192.168.1.14 is a host that produces a lot of icmp traffic that is a pain to sift through but isn't relevant for this post.)

With that in place, a simple ping from host 192.168.1.12 to google is predictably dropped:

ping -4 -n 1 google.com

Pinging google.com [72.195.166.57] with 32 bytes of data:
    Request timed out.

Ping statistics for 72.195.166.57:
    Packets: Sent = 1, Received = 0, Lost = 1 (100% loss)

which is logged on the router as:

clifford@router:~$ tail -f -n 0 /var/log/syslog
May 10 10:24:14 router kernel: [76133.452124] TRACE: raw:PREROUTING:policy:3 IN=tun0 OUT= MAC= SRC=72.195.166.57 DST=10.8.0.6 LEN=60 TOS=0x00 PREC=0x00 TTL=45 ID=42399 PROTO=ICMP TYPE=0 CODE=0 ID=1 SEQ=17
May 10 10:24:14 router kernel: [76133.452132] TRACE: mangle:PREROUTING:policy:2 IN=tun0 OUT= MAC= SRC=72.195.166.57 DST=10.8.0.6 LEN=60 TOS=0x00 PREC=0x00 TTL=45 ID=42399 PROTO=ICMP TYPE=0 CODE=0 ID=1 SEQ=17
May 10 10:24:14 router kernel: [76133.452136] IPv4: martian source 192.168.1.12 from 72.195.166.57, on dev tun0

So it is the reverse route filtering that is dropping traffic.
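To pick the martian reports out of a busy log, a small filter helps. This is a minimal sketch, assuming each report occupies a single line in the log file and has the form `martian source A from B, on dev IFACE`; the function name `martian_devs` is hypothetical, and it reads log text on standard input.

```shell
# Hypothetical helper: count martian-source reports per interface.
# Reads kernel log text on stdin; assumes one report per line.
martian_devs() {
    # Keep only the interface name from each martian report,
    # then count how often each interface appears.
    sed -n 's/.*martian source .* on dev \([^ ]*\).*/\1/p' | sort | uniq -c
}
```

It could be run as `martian_devs < /var/log/syslog` while log_martians is enabled, to see which interface the rp_filter is rejecting packets on.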

Loosening the policy (rp_filter = 2) results in traffic going through:

root@router:~# echo 2 > /proc/sys/net/ipv4/conf/tun0/rp_filter

And the subsequent ping attempt:

Pinging google.com [70.186.10.23] with 32 bytes of data:
Reply from 70.186.10.23: bytes=32 time=92ms TTL=42

Ping statistics for 70.186.10.23:
    Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 92ms, Maximum = 92ms, Average = 92ms

Oddly enough, turning off the policy altogether:

root@router:~# echo 0 > /proc/sys/net/ipv4/conf/all/rp_filter

doesn't work, which I don't really understand.

Besides setting 0 in /proc/sys/net/ipv4/conf/all/rp_filter, you should also set it on the interface itself (for example /proc/sys/net/ipv4/conf/tun0/rp_filter), since the kernel uses the higher of the two values configured in all and in the specific interface.
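The "highest value wins" rule can be checked with a few lines of shell. This is a minimal sketch under stated assumptions: the function name `effective_rp_filter` is hypothetical, and the optional second argument (an alternate directory in place of /proc/sys/net/ipv4/conf) exists only so the logic can be tried without touching the live settings.

```shell
# Hypothetical helper: print the rp_filter value that applies to an
# interface, i.e. the larger of conf/all and conf/<iface>.
# Optional second argument: directory to read instead of /proc (assumption).
effective_rp_filter() {
    iface=$1
    base=${2:-/proc/sys/net/ipv4/conf}
    all=$(cat "$base/all/rp_filter")
    dev=$(cat "$base/$iface/rp_filter")
    if [ "$all" -gt "$dev" ]; then
        echo "$all"
    else
        echo "$dev"
    fi
}
```

Running `effective_rp_filter tun0` after writing 0 to all/rp_filter would still print 1 if tun0/rp_filter is 1, which matches the behaviour described above: strict filtering stays in force on tun0 until the per-interface value is changed too.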

3 year necro