Issue with Docker on BTRFS Drive – Disk Full but Only ~30GB Used

Hi everyone,

I’m running into a recurring issue with my Docker setup and could really use some help.

Docker’s root directory (by default /var/lib/docker) is located on a dedicated 240GB SSD formatted with BTRFS (in single mode).
I’m using Docker’s default overlay2 storage driver.

root@userver:~# cat /etc/docker/daemon.json 
{
    "storage-driver": "overlay2",
    "data-root": "/mnt/storage-docker/docker"
}
root@userver:~#

About once a week, the system reports the drive as full (df -h shows 100% usage), and Docker containers start failing or shutting down because they can’t write any data. However, when I check the actual disk usage with ncdu, it only shows around 30GB used.

To temporarily fix it, I stop Docker using systemctl stop docker, wait a little while, then start it again. After that, the disk space is reported correctly and everything goes back to normal. Just doing a fast restart doesn’t help—it has to be a proper stop/start cycle.
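
The cycle I do is roughly this (a sketch; the wait time is not exact, and on setups where docker.socket is enabled it may need stopping too, otherwise systemd re-activates the daemon on the next API call):

systemctl stop docker        # systemd may warn that docker.socket is still active
sleep 60                     # wait a little while
systemctl start docker
df -h /mnt/storage-docker    # usage is reported correctly again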

I’ve already tried scrubbing and rebalancing the BTRFS filesystem, but no issues were found.
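
Roughly what I ran (the exact flags may have differed):

sudo btrfs scrub start -B /mnt/storage-docker             # verify checksums of all data and metadata
sudo btrfs scrub status /mnt/storage-docker
sudo btrfs balance start -dusage=50 /mnt/storage-docker   # rewrite data chunks that are at most 50% full

Neither reported any errors.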

I’m planning to move to a mirrored BTRFS setup using multiple drives soon, and I really don’t want to bring this issue over to the new setup.

Has anyone seen something like this?
Any tips on how to fix it or prevent it from happening again?

PS: I have no idea why, but I cannot receive emails from this forum platform. I’ve tried a password reset but didn’t receive a thing. This has been going on for a while. I checked my Gmail filters and there’s nothing in there. The winraid forum seems to be having the same issue.

Thanks!

What does sudo btrfs filesystem df /my/disks/mountpoint show?

Maybe even sudo btrfs subvolume list (or show) /my/disks/mountpoint — could be millions of snapshots for all we know.

A lot of non-CoW tools in Linux have trouble making sense of the CoW mechanics. I wouldn’t trust anything other than the CLI commands for BTRFS. I remember my Nautilus stating 80TB used out of 1.9TB… it was just adding all the .snap directories to the total. Utterly useless :slight_smile:, but technically the snapshots are POSIX-compliant and have file sizes, because they’re browsable :wink:
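
For space accounting specifically, something like this is what I’d trust (the mountpoint is a placeholder):

sudo btrfs filesystem usage /my/disks/mountpoint   # allocated vs. used per profile, plus unallocated space
sudo btrfs filesystem df /my/disks/mountpoint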


Are you perhaps using -x (search single file system)? In that case it won’t be accurate since it stops at btrfs subvolume boundaries. Also make sure to run as root since otherwise not all files can be accounted for due to permissions.

https://linux.die.net/man/1/ncdu

Most probably the disk is actually full, and it works after a restart because temp files, caches, etc. get purged.
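
One hedged way to see what Docker itself thinks it is holding, and to clear the usual space hogs, is something along these lines:

docker system df -v     # per-image / per-container / per-volume / build-cache usage
docker builder prune    # drop dangling build cache
docker image prune      # drop dangling images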


When the issue happens, df shows the disk as full.

As of now, no snapshots or subvolumes have been created, due to the issue above.

That is why I also tried btdu, and the space it reports is still around 30GB, with the rest of the space marked as errors/unreachable.

Yep, running as root and with -x, but I’m not using subvolumes/snapshots.

I was thinking the same, but btdu shows that the fs has errors. By restarting Docker they go away… and I think Docker temp files should show up in the disk tree.

Run without -x. Docker uses subvolumes for its volumes and images…
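
i.e. as root and without -x, so subvolume boundaries don’t cut the scan short, roughly:

sudo ncdu /mnt/storage-docker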

What kind of errors? Possibly permission errors? btdu is not a filesystem checking tool.

Also when using "storage-driver": "overlay2"?
Why should the volumes or the images be freed up when stopping Docker?

I’m gonna run it without -x, but it will make zero sense, as most of my volumes are NFS shares and would show up in the ncdu results.


As I expected.

In btdu you can see the ‘unused’ space. When the issue shows up, the errors section contains the 40-ish GB and the ‘unused’ is 0.

I’m confused… Your opening post says /var/lib/docker is running out of space and now you show /mnt/storage-docker?

What does docker info say? If I read the documentation correctly, overlay2 is not supported on btrfs? So it might just ignore that setting. Or… perhaps that’s the issue? Would it be a lot of work to switch to the btrfs driver?
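
For reference, switching would roughly mean a daemon.json like the one below. Just a sketch: existing overlay2 images and containers would not carry over, so everything would have to be re-pulled or re-created.

{
    "storage-driver": "btrfs",
    "data-root": "/mnt/storage-docker/docker"
}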

My bad, I explained it wrong. I meant that the default is that path, but I’ve moved it to a dedicated drive that is BTRFS-formatted.

root@userver:~# docker info
Client: Docker Engine - Community
 Version:    28.3.0
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.25.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.37.3
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 34
  Running: 28
  Paused: 0
  Stopped: 6
 Images: 39
 Server Version: 28.3.0
 Storage Driver: overlay2
  Backing Filesystem: btrfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 CDI spec directories:
  /etc/cdi
  /var/run/cdi
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 05044ec0a9a75232cad458027ca83437aae3f4da
 runc version: v1.2.5-0-g59923ef
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.8.12-11-pve
 Operating System: Debian GNU/Linux 12 (bookworm)
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 46.96GiB
 Name: userver
 ID: 9a7b187d-3a36-4387-8a4e-ffaf9e32566c
 Docker Root Dir: /mnt/storage-docker/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  ::1/128
  127.0.0.0/8
 Live Restore Enabled: false

Somehow in my previous post I forgot to paste the command outputs:

root@userver:~# sudo btrfs subvolume list /mnt/storage-docker
root@userver:~# sudo btrfs filesystem df /mnt/storage-docker
Data, single: total=107.66GiB, used=102.12GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=2.03GiB, used=974.83MiB
GlobalReserve, single: total=190.95MiB, used=0.00B
root@userver:~#

Next time it happens, try to stop containers sequentially while logging disk usage. It could help to find the culprit that is holding all that space.
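
A rough sketch of what that could look like (the log path is arbitrary):

# stop the running containers one at a time and record df after each stop
for c in $(docker ps -q); do
    name=$(docker inspect -f '{{.Name}}' "$c")
    docker stop "$c"
    echo "--- after stopping $name" >> /root/space.log
    df -h /mnt/storage-docker >> /root/space.log
done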

Maybe there is a process keeping some big deleted files open. Until the process closes a file that has become anonymous, the hoarded fs space won’t be given back.
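
You can check for that directly, e.g. (assuming lsof is available):

sudo lsof +L1 /mnt/storage-docker    # open files on that filesystem with link count 0, i.e. deleted but still held open

or, without lsof:

sudo find /proc/*/fd -type l -lname '*(deleted)*' -printf '%p -> %l\n' 2>/dev/null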

Oh, I’ve already tried that. The only way is to restart Docker, and only Docker.

When it happens again I’ll show you :slight_smile:


Yesterday the drive was near the “time limit”, and today Docker is stuck.

user@userver:~$ df -h
Filesystem            Size  Used Avail Use% Mounted on
 ...
/dev/sdd1             112G  110G     64K 100% /mnt/storage-docker
...
user@userver:~$ sudo btrfs subvolume list /mnt/storage-docker
user@userver:~$ sudo btrfs filesystem df /mnt/storage-docker
Data, single: total=107.66GiB, used=107.66GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=2.03GiB, used=979.02MiB
GlobalReserve, single: total=190.83MiB, used=0.00B

I can confirm that even after stopping all the containers, the space is still allocated.
After a stop + start of Docker:

user@userver:~$ df -h
Filesystem            Size  Used Avail Use% Mounted on
...
/dev/sdd1             112G   41G     70G  37% /mnt/storage-docker
...

If this magically goes away by stopping + starting containers, it’s probably a Docker (or configuration) issue. BTRFS is either fragmented or it isn’t; stuff isn’t just fixed by stopping applications.

While according to the documentation, overlay2 is allegedly not supported on btrfs: Select a storage driver | Docker Docs
AFAIK they now want to recommend overlay2 on btrfs instead: docs: Add warning about using btrfs storage driver by vvoland · Pull Request #22621 · docker/docs · GitHub

Your issue reminds me of Docker gradually exhausts disk space on BTRFS · Issue #27653 · moby/moby · GitHub,
which happens to people using the btrfs driver and not overlay2. Which is strange; maybe you are actually using the btrfs driver even though you configured overlay2?

The last two posts in that thread mention how this seems to be a fundamental issue between Docker and btrfs (with the btrfs driver, anyway), and how using the overlay2 storage driver or podman (which is other software for running containers) works differently and does not have this issue on btrfs.
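
A quick way to confirm which driver the running daemon actually ended up with, and how the layer data is laid out on disk, would be something like:

docker info --format '{{.Driver}}'   # prints the storage driver in use, e.g. overlay2
ls /mnt/storage-docker/docker        # an overlay2/ directory vs. a btrfs/subvolumes layout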

It does not go away by stopping the containers. You have to stop Docker, wait, and start it back up.

There is no direct confirmation that overlay2 is not supported on btrfs. As stated, it’s the btrfs driver that is still having issues.

I can’t see how:

user@userver:~$ cat /etc/docker/daemon.json 
{
    "storage-driver": "overlay2",
    "data-root": "/mnt/storage-docker/docker"
}

And I’ve already shown above that docker info reports overlay2 as the driver.

ENOENT stands for error no entry, and seems to account for the difference between the used space reported by du and df?

https://btrfs.readthedocs.io/en/latest/trouble-index.html

I use docker routinely on btrfs too, and apparently also with overlay2 (which is the default on my fedora machines and I’ve never changed it). I’ve never seen such issues though…

It does seem like something strange is going on. Maybe another driver could help? But it seems worthy of a bug report, perhaps…

I’ve migrated the data to a new set of drives in RAID, as I originally intended. I’ll keep you guys updated.
