Using NGINX reverse-proxy to make certain Docker-containers only reachable over WireGuard

drafput · July 24, 2021, 8:56pm

Good day, everybody.

For the last couple of days I’ve been trying to consolidate my servers and make my whole self-hosted infrastructure more portable. I would love to accomplish this using just docker(-compose).

I’ve already put all of my services into one big docker-compose file, with all services behind an NGINX reverse-proxy. To reduce the potential attack-surface, I would like to make my more sensitive services (password-manager, webmail, media-server, etc.) only reachable through a WireGuard VPN.

I’ve come as far as configuring a WireGuard container (linuxserver/docker-wireguard) and connecting to it. Where I’ve been failing is in trying to get the NGINX-container to listen on both the ethernet and wireguard interfaces.

This is a heavily redacted version of the docker-compose.yml:

services:
  nginx:
    image: nginx:mainline-alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - "/certs:/etc/ssl/cert"
      - "/static:/usr/share/nginx/html"
      - "/log:/var/log/nginx"
    networks:
      - public
      - private
      - wireguard
    
  wireguard:
    image: linuxserver/wireguard
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    volumes:
      - "/config:/config"
      - "/lib/modules:/lib/modules"
    ports:
      - "51820:51820/udp"
    sysctls:
      - net.ipv4.conf.all.src_valid_mark=1
    networks:
      - wireguard
    
  public-container:
    ...
    networks:
      - public
    
  private-container:
    ...
    networks:
      - private

And this is what I was trying in the NGINX config:

server {
    listen <eth0-ipv4-address>:443 ssl;
    listen [<eth0-ipv6-address>]:443 ssl;

    server_name public.com;

    include /etc/nginx/public-proxy.conf;
}

server {
    listen <wg0-ipv4-address>:443 ssl;
    listen [<wg0-ipv6-address>]:443 ssl;

    server_name private.com;

    include /etc/nginx/public-proxy.conf;
    include /etc/nginx/private-proxy.conf;
}

But it doesn’t seem to be that easy. I’ve also tried setting network_mode: "service:wireguard" on the NGINX container, but that makes the proxy unreachable on the eth0 interface.

Any suggestions or pointers would be dearly appreciated. Should there be any way to do this, I will probably also need some help with the NGINX configuration and getting the DNS-situation figured out.

I thank you in advance!

zhnu · July 25, 2021, 2:10pm

Sorry, I had to think about it some more; I misread the first time.
What I think will work is to create two NGINX containers. The first would serve “public.com” and keep the current compose definition. The second would serve “private.com” and use network_mode: "service:wireguard", since all of its traffic will then be routed through the WireGuard container. You wouldn’t need the ports directive on that container, because the traffic will arrive through the WireGuard UDP port.

  nginx-public:
    image: nginx:mainline-alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - "/certs:/etc/ssl/cert"
      - "/static:/usr/share/nginx/html"
      - "/log:/var/log/nginx"
    networks:
      - public
  nginx-private:
    image: nginx:mainline-alpine
    network_mode: "service:wireguard"
    volumes:
      - "/certs:/etc/ssl/cert"
      - "/static:/usr/share/nginx/html"
      - "/log:/var/log/nginx"
    networks:
      - private
      - wireguard

If NGINX somehow isn’t resolving the correct IP for the private-container, you can tell NGINX to use the Docker DNS: resolver 127.0.0.11 ipv6=off;

drafput · July 28, 2021, 3:43pm

Thank you for the suggestion, @zhnu! While what you are suggesting definitely works (your solution is what I initially pieced together using my google-fu), I was hoping there was a more “elegant” solution using only a single NGINX container. I’ll definitely keep it as my plan B though.

So please indulge me as I continue on this quest. In my continued testing I bumped into some things I’m unable to explain, and I hoped you could help me shed some light on them.

Observation #1: When connected through WireGuard, traceroute shows only 2 hops.

The first being the WireGuard endpoint of the server and the second being the public address of the server. This led me to assume that Docker automatically added the necessary iptables-rules to route packets between its internal networks, as well as the host’s network.

Based on that assumption, I would expect to see the internal address of the WireGuard container (something in the form of 172.x.0.[2…254]) appear in the access log of NGINX. If that were the case, I could use an allow-directive in my NGINX configuration to restrict my private services to requests coming from the docker network my WireGuard container is on. But this is what I got instead:

That is the address of the default gateway on a different docker network, namely the one my certbot container is on. It always seems to come from the last network that is created by docker-compose.

Observation #2: When connected through WireGuard, nslookup correctly resolves docker’s internal hostnames, but browsers don’t.

While nslookup resolves the correct address of the NGINX container on the internal Docker network, I get a DNS error in both Firefox and Edge, both with DoH disabled. I can however reach the actual address resolved by nslookup in both of those browsers.

I was testing this because I eventually wanted to have the docker DNS resolve to the internal IP addresses of the containers.

Observation #3: NGINX becomes unreachable when I remove the HTTP and HTTPS rules from ufw.

It was at this point that I remembered that Docker overrides the iptables rules created by ufw anyway, so I went and removed all ufw allow-rules apart from those for SSH. This made NGINX unreachable when connected through WireGuard. It was still perfectly reachable without WireGuard, albeit only over IPv4. (IPv6 has always been a bit of a headache when using Docker, so I was anticipating having to drop support for it.)

Seeing it all typed out like this, it does seem like a bit of a soup, but any insight into any of these observations would be highly appreciated.

Novasty · July 28, 2021, 4:42pm

I am still trying to visualize the environment while I am typing this, so forgive me if I get something wrong.

Wouldn’t a split DNS + Reverse proxy be a preferable option?

So how it would work is the reverse proxy only serves content that is meant to be public while all your private services would not be mapped to the proxy.

You would then create an internal DNS server to do all your internal pointing while forwarding any public queries to an external DNS server (ISP or some other).

When you set up your WireGuard tunnel, add in the DNS option, so the machine knows to use the internal DNS and redirect any private traffic you need through the tunnel.

drafput · July 28, 2021, 6:33pm

Okay, so I went back to the solution with 2 separate NGINX instances as suggested by @zhnu and stumbled into a small problem. See, when you define “network_mode” on a container, you can’t also define “networks:”. To get it to work, I’ve had to do the following:

nginx-public:
  image: nginx:mainline-alpine
  ports:
    - "80:80"
    - "443:443"
  volumes:
    ...
  networks:
    - public

nginx-private:
  image: nginx:mainline-alpine
  network_mode: "service:wireguard"
  volumes:
    ...

wireguard:
  image: linuxserver/wireguard
  ports: 
    - 51820:51820/udp
  volumes:
    ...
  networks:
    - public
    - private

See how I’ve had to define the networks on the WireGuard container? Just a clarification should anyone come across this topic in the future.

Now, what @Novasty is suggesting sounds more like the complete solution I am looking for. I’ll make a few diagrams to aid in communication since words alone don’t seem to get it across completely.

drafput · July 28, 2021, 8:20pm

Voila, diagrams! Hope they help.

[Diagram: public]

So this first one shows everything that should be reachable from the public web in green. These are all the static pages served by NGINX itself, as well as the public (read: non-admin) pages of some upstream web apps.

[Diagram: private]

This second diagram shows everything that should be reachable ONLY through the WireGuard connection in red. That would be the rsync and Bitwarden servers, as well as the admin pages of those same upstream web apps.

Novasty · July 28, 2021, 8:34pm

Map everything through the NGINX Proxy and whitelist internal and wireguard IPs for access to admin panels and private services.

Something to get you started:

drafput · July 28, 2021, 10:01pm

The thing there is that all of the addresses inside of docker are dynamically allocated. And as I already noted in my second post, even if you define a static IP for the WireGuard container, that isn’t necessarily the address NGINX sees. So should I whitelist the entire 172.16.0.0/12 range? Because that feels like a bad idea… Is that a bad idea?

Edit: So apparently 172.16.0.0/12 is reserved for internal use. Is that still universally honored?

Edit2: Well, it definitely works! Many thanks to @Novasty for his input. Now if someone could give me some reassurance as to the security of this solution, I think this one is solved!
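
On the question in the first edit: 172.16.0.0/12 is one of the three IPv4 blocks that RFC 1918 sets aside for private networks, so it is not routed on the public internet. A minimal Python sketch to inspect the range with the standard-library ipaddress module (the two sample addresses are made up for illustration):

```python
import ipaddress

# One of the three RFC 1918 private IPv4 blocks.
block = ipaddress.ip_network("172.16.0.0/12")

first, last = block[0], block[-1]
print(first, "-", last)   # 172.16.0.0 - 172.31.255.255
print(block.is_private)   # True

# Hypothetical sample addresses, for illustration only.
docker_style = ipaddress.ip_address("172.18.0.1")
outside = ipaddress.ip_address("172.32.0.1")
print(docker_style in block)  # True
print(outside in block)       # False
```

Note that “private” only means the range is not publicly routable. Anything that reaches NGINX from inside that range (other containers, a Docker bridge gateway) still matches an allow rule for it.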

zhnu · July 29, 2021, 8:19pm

Sorry for the slow reply. I think you could route everything through the WireGuard container, including the NGINX container, and expose those ports on WireGuard to make it more “solid”. But I’m assuming it would take too much time and effort to make it work properly, because the most likely scenario is that WireGuard would break internal Docker DNS resolution and block everything not listed in a whitelist.

drafput · July 30, 2021, 10:56am

So, the solution with whitelisting internal IPs as suggested by @Novasty doesn’t seem to be working.

What I did was the following. I edited the daemon.json like so:

{
  "default-address-pools":
  [
    {"base":"192.168.0.0/16","size":24}
  ]
}

and I then edited the NGINX configuration like so:

server {
  listen 443 ssl;
  listen [::]:443 ssl;

  server_name private.example.com;

  location / {
    allow 192.168.0.0/16;
    deny all;
  }
}

Initially, everything seemed to be working. When I connected through WireGuard, I could reach pages on the private.example.com domain, but without WireGuard, I got a 403. Brilliant.

However, today on campus I opened the browser on my laptop (which doesn’t even have WireGuard installed) and I was greeted with the admin login at private.example.com. What the…

The access log shows requests were coming in from 192.168.3.1, an internal docker address. Yet no clients were connected through WireGuard and https://ifconfig.co/ showed me a different, public IP.

I’m running some dummy requests against it now, and it seems to semi-randomly let one through because as far as NGINX knows, the request does come from inside docker and should be let through.

Now, this is what bothers me the most. That internal IP (192.168.3.1)? Well, that’s coming from the subnet my postgres container is on… I noted similar behavior in the third post on this topic, but that was when connected through WireGuard. Could somebody explain how that is happening?
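
A rough way to see where 192.168.3.1 fits in, assuming each Docker network gets one /24 out of the default-address-pools entry and the bridge gateway takes the first host address of its subnet (the usual default, but treat both as assumptions here):

```python
import ipaddress

# Values taken from the daemon.json above.
pool = ipaddress.ip_network("192.168.0.0/16")
subnets = list(pool.subnets(new_prefix=24))
print(len(subnets))  # 256

logged = ipaddress.ip_address("192.168.3.1")

# Find the per-network /24 that contains the logged address.
owner = next(net for net in subnets if logged in net)
print(owner)                 # 192.168.3.0/24
print(subnets.index(owner))  # 3
print(next(owner.hosts()))   # 192.168.3.1

# The allow rule covers the whole pool, so every bridge gateway matches it.
print(logged in pool)        # True
```

If those assumptions hold, the logged address is the gateway of one of the Compose networks rather than a client, and an allow rule for the whole pool cannot tell WireGuard clients apart from anything else that arrives via a Docker bridge.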

Novasty · July 30, 2021, 4:14pm

What IP range does your campus use?

Move your whitelist to be right below the subdomain as well.

So it should look like:

  server_name private.example.com;
  allow 192.168.0.0/16;
  deny all;

drafput · August 3, 2021, 6:59pm

It looks like the network at my campus is on a 172.16.0.0/12 subnet, but that shouldn’t matter, right?

I’ve also moved the whitelist as @Novasty suggested, but that hasn’t changed anything.

I’ve also used a MITM-proxy to log the incoming requests to the server and they all have from-addresses that should be rejected by the whitelist, yet some show up in the access log with a docker-internal address.
Would you by any chance know of some network-tracing utility I could use to inspect packet routing inside of the docker network?

drafput · August 3, 2021, 8:01pm

I think I found the culprit! Drum roll, please… IPv6

By default, Docker seems to map all incoming IPv6 traffic to an internal IPv4 address. This means that any host connecting through IPv6 is automatically on the whitelist.

I remember IPv6 in Docker being a PITA in the early days, but I still can’t easily find any documentation that explains how or why this works (I suspect it has something to do with the userland proxy?). If anyone knows where to find this, I would love to know.

It also seems that IPv6 support in Docker is finally coming along, so I’ll do some experimenting with that and report back if I make some progress.
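
To illustrate why a rewritten source address defeats the whitelist, here is a small sketch that mimics how NGINX checks allow/deny rules in order until the first match. The function name is_allowed and the client addresses (taken from documentation ranges) are made up for the example; this is not NGINX code.

```python
import ipaddress

# Rules in the order they appear in the server block: (action, network).
# "deny all" is modelled as catch-all networks for both address families.
RULES = [
    ("allow", ipaddress.ip_network("192.168.0.0/16")),
    ("deny", ipaddress.ip_network("0.0.0.0/0")),
    ("deny", ipaddress.ip_network("::/0")),
]


def is_allowed(source: str) -> bool:
    """Return the verdict of the first rule that matches the source address."""
    addr = ipaddress.ip_address(source)
    for action, network in RULES:
        # Only compare addresses of the same IP version.
        if addr.version == network.version and addr in network:
            return action == "allow"
    # No rule matched (cannot happen here: the deny rules catch everything).
    return True


# The real IPv6 client address, if NGINX could see it (documentation range).
print(is_allowed("2001:db8::42"))  # False
# What NGINX actually logged once the connection arrived over internal IPv4.
print(is_allowed("192.168.3.1"))   # True
```

The rule set itself behaves as intended; the problem is that the address being tested is no longer the client's.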

drafput · August 4, 2021, 1:59am

I am starting to regret ever even looking into this…

So, my server only gets a /128, so I’ll have to resort to NATv6 if I want the actual client IPv6 to show up in the access log of NGINX.
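
For context on why a /128 pushes towards NAT: a /128 is a single address, so there is nothing left over to hand out to containers, whereas a larger prefix could be split into per-network subnets. A quick check in Python (the prefixes come from the 2001:db8::/32 documentation range, not my real ones):

```python
import ipaddress

single = ipaddress.ip_network("2001:db8::1/128")
routed = ipaddress.ip_network("2001:db8:0:1::/64")

print(single.num_addresses)           # 1
print(routed.num_addresses == 2**64)  # True

# A /64 can be divided into smaller per-network prefixes, e.g. /80s.
print(next(routed.subnets(new_prefix=80)))  # 2001:db8:0:1::/80
```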

I tried both the new experimental --ip6tables flag and ye olde robbertkl/docker-ipv6nat.
The official but experimental NATv6 broke WireGuard due to conflicting ip6tables rules and after my initial attempt, I can’t seem to get it to work anymore. The same goes for the NATv6-in-a-container solution (can’t get that working to begin with).

I’m also scratching my head at the fact that my access log does actually contain a lot of IPv6 addresses, but they suddenly stop (possibly after an update?).

To at least get some of my services back into use, I disabled IPv6 on the host and removed all AAAA records from my DNS. This is no solution however, since I’m on IPv6-only networks most of the time (yay for countries with high adoption rates, AMIRITE)

It feels like this is getting a bit off-topic, but if anyone has ever successfully gotten IPv6 working in Docker (be it with NAT or otherwise), I sure could use some pointers.

drafput · August 4, 2021, 10:14am

So, I got the robbertkl/docker-ipv6nat container set up. Now the correct IPv6 addresses show up in the access log and I can reach the internet when connected through WireGuard, but I can’t directly connect to any of the services running inside of docker anymore when connected through WireGuard. So it’s like this:

Wireguard off
GET admin.example.com/ → 403, IPv6 logged in access.log
GET example.com/ → 200, IPv6 logged in access.log*
GET eff.org/ → 200

Wireguard on
GET admin.example.com/ → err_connection_timed_out, nothing logged in access.log
GET example.com/ → 200, IPv4 logged in access.log*
GET eff.org/ → 200

* the top-level example.com/ is proxied through Cloudflare and stays reachable in both scenarios

daemon.json:

{
  "userland-proxy": false,
  "ipv6": true,
  "fixed-cidr": "192.168.0.0/16",
  "fixed-cidr-v6": "fd00:8008:8008::/64",
  "default-address-pools": [
    {
      "base": "192.168.0.0/16",
      "size": 24
    },
    {
      "base": "fd00:dead:beef::/48",
      "size": 64
    }
  ]
}

docker-compose.yml:

services:
  natv6:
    container_name: natv6
    image: robbertkl/ipv6nat
    network_mode: "host"
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "/lib/modules:/lib/modules:ro"
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    restart: unless-stopped

  nginx:
    container_name: nginx
    image: nginx:mainline-alpine
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ...
    networks:
      - default
      - wireguard
      - ...
    restart: unless-stopped
  
  wireguard:
    image: linuxserver/wireguard
    container_name: wireguard
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    environment:
      - ...
    volumes:
      - "/home/ubuntu/docker/wireguard-data:/config"
      - "/lib/modules:/lib/modules"
    ports:
      - 51820:51820/udp
    sysctls:
      - net.ipv4.conf.all.src_valid_mark=1
    restart: unless-stopped
    networks:
      - wireguard

networks:
  default:
    driver: "bridge"
    enable_ipv6: true
    driver_opts:
      com.docker.network.bridge.name: br_default
    ipam:
      config:
        - subnet: "fd00:8008:8008:1::/64"

  wireguard:
    name: wireguard
    driver: "bridge"
    driver_opts:
      com.docker.network.bridge.name: br_wireguard

nginx.conf:

...
server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;

    server_name  example.com;
    charset      utf-8;

    ...

    index index.html;
}

server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;

    server_name admin.example.com;
    charset     utf-8;

    ... 

    allow 192.168.0.0/16;
    deny all;

    location / {
        proxy_pass       http://django:80;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    location /stats {
        proxy_pass       http://expressjs:80;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
...

ThatGuyB · August 4, 2021, 1:02pm

Just curious as to why you’re using wireguard and even nginx inside docker, instead of directly on your VM. Especially running wireguard directly on the host OS might save you some headaches.

drafput · August 4, 2021, 8:28pm

There are actually several reasons.

The first being that I change servers and distros all the time, so I used to put more time into set-up and troubleshooting than I put into actually using my services. Containers were sold to me as a “set up once, redeploy easily” solution.

I used to run every service in its own VM, but since I’ve had to scale down my infrastructure significantly, the overhead was becoming a problem (in addition to no longer having access to a VMware student license). So far using Docker seems to have saved a nice chunk of system resources.

I made the move to VMs after a security scare where a classmate managed to deface my website through a Samba vulnerability. I was told that moving to containers could keep that isolation between services intact.

One nitpicky reason is that I’m allergic to the way applications leave junk or side effects behind after uninstalling. I like to try out new things, but the fear of cluttering my system is constantly holding me back. I hoped containers would allow me to quickly set up and test new things without the hassle of having to roll back everything if I don’t like it.

The final reason is that I’ll be looking for a job pretty soon and Docker is pretty high in demand around here.

ThatGuyB · August 4, 2021, 10:20pm

I think this is not healthy. You will very rarely see a medium-sized business switch distros, not to mention big enterprises. As for home use, I think that’s unhealthy too, unless really, really necessary. But that’s just my opinion.

However, wireguard is the same on every distro. You only have to do the conf file in /etc/wireguard and back it up. You copy the conf file whenever you switch distros or move servers, it’s the easiest thing in the world.

There are free virtualization alternatives out there, which you should probably be aware of, because some businesses may need to run different OSes alongside OCI containers (for example, a Windows Server for Active Directory or something). Proxmox, xcp-ng and libvirt (with virt-manager, OpenStack or OpenNebula) come to mind.

But I do agree containers save resources. I also run all of my services under their own VM, but lately I’ve been getting into Linux Containers (LXD), which are similar to how VMs work (they even have live migration and HA), but instead of virtualizing the hardware (like CPU and NIC), you only virtualize the OS. LXD is not in such high demand as Docker, but it’s arguably similarly fast to deploy if you take your time to automate stuff.

However, I still encourage you to keep learning OCI Containers (“docker”). They are in high demand, as you said, but don’t just blindly stick to one technology, use whatever is good for the job.

As for the Samba incident, keep your services up to date and secure them. For example, Samba only needs ports 445 and arguably 22 open (arguably, because you could use an administration interface, like VNC in Proxmox, and not need SSH); block everything else. Additionally, if you don’t use a separate AAA server, when you create user accounts that don’t need to SSH into the server, set their shells to /bin/false and only set a Samba account password, not a local account password. Moreover, disable SSH password authentication and only use SSH keys. Going even further, don’t allow root login: only log in via SSH to an unprivileged (non-sudoer) account, and only then su into a sudoer account (never to root).

Most businesses will be using either RHEL / CentOS (some may switch to CentOS alternatives after the CentOS 8 debacle, like Rocky or Alma), Oracle Linux, Ubuntu and SLES / openSUSE. Sometimes you may see Debian here and there and very rarely Gentoo. The safest bet, despite it being a beta for RHEL, is to stick with CentOS Stream.

LXD can help you test distros easily and maintain a pretty big homelab without the big resource consumption of traditional VMs.

I wish I could help you to debug your WireGuard issue, but I’m just now getting into OCI containers (I’ve only run prepackaged Docker containers, like bitwarden_rs). My recommendation is still to run wg as a VM, or rather, directly on the OS that runs your OCI containers for nginx and everything else. I prefer having my VPN on my router / firewall, because that way I can route traffic wherever I need to and filter traffic in one place, but that’s just me.

I believe the issue with wg may be the internal Docker networking. I don’t have sufficient knowledge to help you debug it unfortunately.

drafput · August 5, 2021, 11:18am

I completely agree with the points you make.

I had initially looked into LXD, Kata, and even Firecracker. None of those managed to persuade me away from Docker though (which often gets first-party container images, good community support, and pretty much all the other niceties that come with a large userbase).

I already apply most security best practices, but it’s always the updates or server migration where it goes wrong. That’s why I stick so desperately to docker. Nothing beats simply having to update the host OS and pulling the latest images.

I don’t want a sysadmin job on the side just to host my own email and Bitwarden. I thought docker could be my easy solution.

That being said, I’m still looking to get this working. If I don’t find a solution within the next few weeks, I’ll cut my losses and consider something else.

ThatGuyB · August 5, 2021, 11:25am

You don’t necessarily need a sysadmin job for the Bitwarden(_rs) stuff, however, you have no other choice but to be a sysadmin on the side if you self-host your email, unless your email server is internal only, unfortunately. It takes quite a lot of work to keep your email server secure and out of random blacklists.