[Help] Docker container dnsmasq not replying to queries on Home Network – "Macvlan" and/or "Host" Network with Portainer, Ubuntu LTS 24.04

As the title says… Here’s some more network topology info and troubleshooting steps already taken:
I have registered a public domain (leal.app.br) and pointed it at Cloudflare’s nameservers, so I can already ping it from anywhere (the response comes from a Cloudflare IP, as expected).


Network Topology

  • ISP Modem/Router (accesses external Internet):
    • IP: 192.168.17.1
    • Acts as DNS for 192.168.17.x, forwards to ISP DNS
  • Main Router (TP-Link Deco XE75 Pro) - 1st floor:
    • WAN IP: 192.168.17.2
    • LAN IP: 10.0.0.1
    • Subnet: 255.255.252.0 (10.0.0.0/22) - valid IP range from 10.0.0.2 to 10.0.3.254 (10.0.0.1 is the gateway itself and 10.0.3.255 is the broadcast address for this network).
    • DHCP for all 10.0.0.0/22 devices
  • 2.5G Switch (managed but also dumb, bought on AliExpress, “unknown” brand) - 2nd floor:
    • IP: 10.0.0.12
    • Connects main router, homelab server, main PC, and 2nd Deco
  • Homelab Server (Proxmox) - 2nd floor:
    • Proxmox: 10.0.0.10
    • Ubuntu Server VM: 10.0.0.100 (runs Docker, Portainer, Nginx Proxy Manager, etc.)
  • Main PC - 2nd floor:
    • IP: 10.0.2.1
  • Downstairs Windows PC:
    • IP: 10.0.1.1
  • 2nd Deco XE75 Pro (WiFi AP) - 2nd floor:
    • IP: 10.0.3.250
  • Dockerized Services on Ubuntu Server (network type, internal IP):
    • dnsmasq (macvlan, IP: 10.0.0.224)
    • Nginx Proxy Manager (bridge, 10.0.0.100:81)
    • Portainer (10.0.0.100:9443)
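
As a quick sanity check of the /22 arithmetic in the list above, here is a minimal Python sketch using only the standard ipaddress module. It just confirms that every LAN address mentioned in the topology falls inside 10.0.0.0/22; the labels in the dictionary are shorthand for the devices listed above, and nothing here touches the network.

```python
import ipaddress

# The LAN as described in the topology: 255.255.252.0 is a /22 mask.
lan = ipaddress.ip_network("10.0.0.0/255.255.252.0")

# Usable host addresses (network and broadcast addresses are excluded).
hosts = list(lan.hosts())
first_host, last_host = hosts[0], hosts[-1]

# Addresses taken from the topology list; the keys are just labels.
devices = {
    "main router (gateway)": "10.0.0.1",
    "switch": "10.0.0.12",
    "proxmox": "10.0.0.10",
    "ubuntu vm": "10.0.0.100",
    "main pc": "10.0.2.1",
    "downstairs pc": "10.0.1.1",
    "second deco": "10.0.3.250",
    "dnsmasq container": "10.0.0.224",
}


def in_lan(addr: str) -> bool:
    """Return True if addr belongs to the 10.0.0.0/22 LAN."""
    return ipaddress.ip_address(addr) in lan


if __name__ == "__main__":
    print(lan)  # 10.0.0.0/22
    print(first_host, "to", last_host, "broadcast", lan.broadcast_address)
    for label, addr in devices.items():
        print(f"{label:24s} {addr:12s} in LAN: {in_lan(addr)}")
```

The modem side (192.168.17.x) is a separate network, so in_lan("192.168.17.1") is False; traffic between the two has to cross the main router.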

Goal
Set up Split-Horizon DNS so that internal hostnames (e.g., pve.leal.app.br) resolve to internal IPs (e.g., 10.0.0.10) for devices on the LAN, using a Dockerized DNS server (dnsmasq, possibly Unbound or BIND9 if needed). The same server should also act as a simple DNS forwarder (to my ISP modem) for everything else. In the future, services that I expose externally (let’s say Plex or Jellyfin, at something like jellyfin.leal.app.br) should resolve to the internal server for LAN clients while staying reachable from outside my house.
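
To make the intended behaviour concrete, here is a toy Python model of the split-horizon decision: answer from a local table when the name is one of mine, otherwise hand the query to an upstream forwarder. This is only a sketch of the logic, not how dnsmasq is implemented. The only record taken from this post is pve.leal.app.br → 10.0.0.10; fake_upstream and its 203.0.113.10 answer are placeholders I made up for illustration.

```python
from typing import Callable, Dict

# Local overrides served only to LAN clients.
# pve.leal.app.br -> 10.0.0.10 is the example from the post.
LOCAL_RECORDS: Dict[str, str] = {
    "pve.leal.app.br": "10.0.0.10",
}


def normalize(name: str) -> str:
    """DNS names are case-insensitive and may carry a trailing dot."""
    return name.rstrip(".").lower()


def resolve(name: str, local: Dict[str, str], forward: Callable[[str], str]) -> str:
    """Return the local answer if one exists, otherwise ask the forwarder."""
    key = normalize(name)
    if key in local:
        return local[key]
    return forward(key)


def fake_upstream(name: str) -> str:
    """Hypothetical stand-in for the real upstream (ISP modem or 1.1.1.1).

    Always returns a documentation-range address instead of doing a lookup.
    """
    return "203.0.113.10"


if __name__ == "__main__":
    print(resolve("pve.leal.app.br", LOCAL_RECORDS, fake_upstream))  # 10.0.0.10
    print(resolve("example.com", LOCAL_RECORDS, fake_upstream))      # 203.0.113.10
```

Clients outside the LAN never reach this table; they keep getting whatever Cloudflare publishes for the same name, which is the whole point of the split.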


What Works
  • All devices have internet access and can ping each other.
  • Docker containers are running and accessible via their mapped ports.
  • I can ping the dnsmasq container’s IP (10.0.0.224) from my Windows PC (10.0.2.1).


What Doesn’t Work

  • When I set my Windows PC’s DNS to 10.0.0.224, all DNS queries time out and I lose internet access.
  • dnsmasq logs show queries arriving from my PC, but no replies are received by the client.

Troubleshooting Steps Taken

  1. Verified Container and Network:
     • dnsmasq is running in Docker (macvlan, static IP is 10.0.0.224, subnet mask set to 10.0.0.0/22 - same as 255.255.252.0).
     • Container is started via Portainer.
     • Confirmed with docker inspect that the container is on the correct macvlan network.
  2. Checked Listening Ports:
     • Inside the container, netstat -tulnp shows dnsmasq listening on 0.0.0.0:53 (UDP/TCP).
     • On the host, no process is listening on port 53 (as expected with macvlan).
  3. Tested Network Connectivity:
     • Can ping 10.0.0.224 from Windows PC.
     • tcpdump inside the container shows DNS queries arriving from the PC.
  4. Checked dnsmasq Logs:
     • Logs show queries from the PC (e.g., query[A] google.com from 10.0.2.1).
     • No reply lines or errors about upstream DNS.
  5. Tested Upstream DNS:
     • Configured dnsmasq to use only 1.1.1.1 as upstream (removed my local ISP modem as a DNS).
     • No change in behavior.
  6. Tested DNS from Inside the Container:
     • nslookup google.com inside the container works (using 127.0.0.1 or container’s own IP - 10.0.0.224 or 10.0.0.100 if network mode set to host).
  7. Tried Host Network Mode:
     • Ran dnsmasq in host network mode; still no replies to external clients. (10.0.0.100 is the Ubuntu Server host IP)
  8. Checked iptables/nftables:
     • UFW is inactive.
     • No obvious rules blocking UDP 53.
  9. Checked AppArmor:
     • No denials or restrictions found for dnsmasq.
  10. Tested with Local Records:
     • Attempted to resolve local records (e.g., leal.app.br); same issue: queries logged, no replies.
  11. Confirmed No Port Conflicts:
     • Nginx Proxy Manager and Portainer are on different ports and networks.
  12. Switch/Router:
     • No client isolation or port isolation enabled.
     • All devices are on the same subnet and can communicate.

Suspicions

  • Docker macvlan quirk: Host and containers on macvlan sometimes have reply path issues, but even with host networking, replies don’t reach clients.
  • dnsmasq-specific issue: Considering trying Unbound or BIND9 as an alternative.
  • Possible Docker/Portainer misconfiguration: But all other services work as expected.

What I Want to Achieve

  • Internal DNS resolution for custom domains (e.g., pve.leal.app.br → 10.0.0.10). (Split Horizon DNS)
  • Ideally, all LAN clients use the Dockerized DNS server for both local and external DNS.
  • No disruption to existing network/internet access.

Questions for the Community

  • Has anyone seen this behavior with Dockerized dnsmasq on macvlan or host networking?
  • Is there a known issue with reply packets not being routed back to clients in this setup?
  • Would switching to Unbound or BIND9 likely resolve this, or is there a deeper Docker networking issue at play?
  • Any tips for further debugging or a recommended configuration for Split-Horizon DNS in a home lab like this?

Thanks in advance for any help!
If more config files, logs, or network diagrams are needed, let me know.

P.S.: This is a summarized transcription of a ChatGPT conversation made via Abacus AI; I asked it nicely to format all the important data as a forum post.

Try querying the DNS server directly from your computer with nslookup (for example, nslookup google.com 10.0.0.224), without “set[ting] my Windows PC’s DNS to 10.0.0.224”.

Hi!

OK! I’ve changed the DNS container’s IP to 10.0.0.53, which makes more sense. I also added a macvlan interface on the Ubuntu host so it can see the container (by default, it couldn’t). I used Abacus AI to help me do that, so hopefully it is correct. I also asked it to write a summary of my new attempts:

Key Changes and Troubleshooting Steps

  1. Checked dnsmasq is running as root:
     • Verified that the dnsmasq process inside the container is running as root, eliminating permission issues as a potential cause.
  2. Investigated dnsmasq logs and configuration:
     • Confirmed that dnsmasq was starting correctly with the provided configuration.
     • Identified that dnsmasq was being terminated, likely due to a Docker health check or manual intervention.
  3. Modified dnsmasq.sh to run in the background:
     • Removed --no-daemon from the dnsmasq command.
     • Added & to run dnsmasq in the background.
     • Added tail -f /dev/null to keep the script running.
  4. Identified host’s inability to reach the container:
     • Confirmed that nslookup worked inside the container but timed out from the Ubuntu server (host).
     • Recognized the limitation of Docker macvlan networks where the host cannot communicate with containers by default.
  5. Created a macvlan interface on the host:
     • Identified the correct network interface name (enp6s18).
     • Created a macvlan interface (macvlan0) on the host linked to enp6s18.
     • Assigned an unused IP address (10.0.1.254) to the macvlan0 interface.
  6. Troubleshot routing issues:
     • Discovered duplicate routes for 10.0.0.0/22 on both enp6s18 and macvlan0.
     • Added a more specific route for the container’s IP (10.0.0.53/32) via macvlan0.
  7. Identified firewall issues:
     • Discovered that DNS packets were not reaching the macvlan0 interface.
     • Recognized that the host was receiving DNS queries on enp6s18 but not forwarding them to macvlan0.
  8. Configured IP forwarding and iptables:
     • Enabled IP forwarding in /etc/sysctl.conf.
     • Added iptables rules to forward UDP and TCP traffic on port 53 from enp6s18 to macvlan0.
     • Added a rule to allow established and related traffic from macvlan0 to enp6s18.
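
To illustrate why the /32 route from the routing step wins over the two duplicate /22 routes, here is a small Python sketch of longest-prefix-match route selection. It is a simplified model: the real kernel also considers metrics and other attributes, the default route via enp6s18 is my assumption, and the order of the two /22 entries (which decides the tie here) is assumed as well.

```python
import ipaddress
from typing import List, Optional, Tuple

Route = Tuple[str, str]  # (destination prefix, outgoing interface)

# Simplified copy of the host routing table after the changes above.
# The default route and the order of the /22 entries are assumptions.
ROUTES: List[Route] = [
    ("0.0.0.0/0", "enp6s18"),
    ("10.0.0.0/22", "enp6s18"),
    ("10.0.0.0/22", "macvlan0"),
    ("10.0.0.53/32", "macvlan0"),
]


def pick_route(dst: str, routes: List[Route]) -> Optional[Route]:
    """Return the matching route with the longest prefix.

    Ties keep the first route listed; a real kernel would break ties by metric.
    """
    addr = ipaddress.ip_address(dst)
    best: Optional[Route] = None
    best_len = -1
    for prefix, dev in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best, best_len = (prefix, dev), net.prefixlen
    return best


if __name__ == "__main__":
    for dst in ("10.0.0.53", "10.0.2.1", "1.1.1.1"):
        print(dst, "->", pick_route(dst, ROUTES))
```

In this model only traffic to 10.0.0.53 leaves through macvlan0; everything else on the LAN, including replies to the Windows PC at 10.0.2.1, still goes out through enp6s18.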

Current Status

  • IP forwarding is enabled.
  • iptables rules are in place to forward DNS traffic.
  • The next step is to test DNS resolution from the Windows PC or the Ubuntu host.
  • All required libraries for dnsmasq are present (no missing dependencies).
  • dnsmasq is listening on 0.0.0.0:53 and :::53 (so it should answer on all interfaces).
  • No SELinux/AppArmor denials (and the container is running unconfined).
  • No other process is using port 53.
  • Still, no queries are answered, even from inside the container.
  • Changed the last command in dnsmasq.sh back to --no-daemon.

About NSLOOKUPs:

From my Win 11 PC (10.0.2.1), if I run nslookup google.com 10.0.0.53, all I get is: DNS request timed out. timeout was 2 seconds. Server: UnKnown. Address: 10.0.0.53.
I can ping it just fine, though: 10.0.0.53 replies normally to pings from both my Windows PC and the Ubuntu server, so both can “see” it.
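
For anyone who wants to reproduce the nslookup test without nslookup, here is a minimal Python sketch that hand-builds a single A query and sends it over UDP with the same 2-second timeout. It uses only the standard library. The function names build_query and probe are mine, and the reply parsing is deliberately shallow: it reads only the response code and the answer count from the header.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record with recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def probe(server: str, name: str = "google.com", timeout: float = 2.0) -> str:
    """Send one UDP query to server:53 and describe what came back."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            data, peer = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return f"error: {exc}"
    if len(data) < 12 or data[:2] != query[:2]:
        return f"unexpected reply from {peer[0]}"
    rcode = data[3] & 0x0F                       # low 4 bits of the flags word
    ancount = struct.unpack("!H", data[6:8])[0]  # number of answer records
    return f"reply from {peer[0]}: rcode={rcode}, answers={ancount}"
```

Calling probe("10.0.0.53") from the Windows PC and from the Ubuntu host should show "timeout" if nothing comes back at all. Because the socket is not connected, it also accepts a reply that arrives from a different source address than the one queried and reports that address, which is a case nslookup would normally hide.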

P.S.: Forgot to say that when I do a DNS Lookup inside the container, via Portainer’s console UI thingy, I get the following:

leallab:/ nslookup google.com
Server:         127.0.0.11
Address:        127.0.0.11:53

Non-authoritative answer:
Name:   google.com
Address: 172.217.28.206

Non-authoritative answer:
Name:   google.com
Address: 2800:3f0:4001:846::200e

leallab:/ nslookup google.com 127.0.0.1
;; connection timed out; no servers could be reached

leallab:/ nslookup google.com 10.0.0.53
;; connection timed out; no servers could be reached

leallab:/ ping 10.0.0.100
PING 10.0.0.100 (10.0.0.100): 56 data bytes
64 bytes from 10.0.0.100: seq=0 ttl=64 time=0.059 ms
64 bytes from 10.0.0.100: seq=1 ttl=64 time=0.053 ms
^C
--- 10.0.0.100 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.053/0.056/0.059 ms
leallab:/ ping 10.0.0.53
PING 10.0.0.53 (10.0.0.53): 56 data bytes
64 bytes from 10.0.0.53: seq=0 ttl=64 time=0.025 ms
64 bytes from 10.0.0.53: seq=1 ttl=64 time=0.046 ms
64 bytes from 10.0.0.53: seq=2 ttl=64 time=0.045 ms
64 bytes from 10.0.0.53: seq=3 ttl=64 time=0.047 ms
^C
--- 10.0.0.53 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.025/0.040/0.047 ms

As you can see, by default nslookup goes through Docker’s DNS (127.0.0.11), which answers. When I point it at localhost (127.0.0.1) or at 10.0.0.53, the query times out, even though I can ping everything just fine.