Moby: orphaned diffs

Created on 20 Apr 2016 · 126 comments · Source: moby/moby

I'd like to know why docker uses so much disk, even after removing _all_ containers, images, and volumes.
It looks like this "diff" has a layer, but the layer isn't referenced by anything at all.

/var/lib/docker/aufs/diff# du-summary
806628  c245c4c6d71ecdd834974e1e679506d33c4aac5f552cb4b28e727a596efc1695-removing
302312  a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea
302304  957e78f9f9f4036689734df16dabccb98973e2c3de0863ef3f84de85dca8d92d
302256  8db1d610f3fbc71415f534a5d88318bbd2f3f783375813f2288d15f15846d312
288204  ac6b8ff4c0e7b91230ebf4c1caf16f06c1fdceff6111fd58f4ea50fc2dd5050b
288180  04a478c413ea80bcfa7f6560763beef991696eace2624254479e5e5dd69708c6
287804  d033ab6e4e5231dc46c6c417c680b239bb0e843024738517cbb0397128e166ca
233420  8e21143dca49e30cae7475b71b5aee9b92abe2069fbb9ab98ce9c334e3f6d4fa
212668  a631b94f7a2d5d21a96a78e9574d39cdeebbc81b51ac6c58bd48dc4045656477
205120  ae13341f8c08a925a95e5306ac039b0e0bbf000dda1a60afb3d15c838e43e349
205120  8d42279017d6095bab8d533ab0f1f7de229aa7483370ef53ead71fe5be3f1284
205116  59b3acd8e0cfd194d44313978d4b3769905cdb5204a590069c665423b10150e3
205116  040af0eee742ec9fb2dbeb32446ce44829cd72f02a2cf31283fcd067e73798ab
158024  ef0a29ff0b515c8c57fe78bcbd597243de9f7b274d9b212c774d91bd45a6c9b1
114588  061bd7e021afd4aaffa9fe6a6de491e10d8d37d9cbe7612138f58543e0985280
114576  149e8d2745f6684bc2106218711991449c452d4c7e6203e2a0f46651399162b0
114532  52b28112913abb0ed1b3267a0baa1cacd022ca6611812d0a8a428e61ec399589
114300  52475beba19687a886cba4bdb8508d5aaf051ceb52fb3a65294141ab846c8294
76668   4e6afb958b5ee6dea6d1a886d19fc9c780d4ecc4baeebfbde31f9bb97732d10d
76640   c61340c6a962ddd484512651046a676dbbc6a5d46aecc26995c49fe987bf9cdc

/var/lib/docker/aufs/diff# du -hs a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea
296M    a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea

$ docker-find a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea
+ docker=/var/lib/docker
+ sudo find /var/lib/docker '(' -path '/var/lib/docker/aufs/diff/*' -o -path '/var/lib/docker/aufs/mnt/*' ')' -prune -o -print
+ grep a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea
/var/lib/docker/aufs/layers/a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea
+ sudo find /var/lib/docker '(' -path '/var/lib/docker/aufs/diff/*' -o -path '/var/lib/docker/aufs/mnt/*' ')' -prune -o -type f -print0
+ sudo xargs -0 -P20 grep -l a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea
/var/lib/docker/aufs/layers/993e4988c510ec3ab4f6d139740a059df40585576f8196817e573a9684554c5c
/var/lib/docker/aufs/layers/95e68d59a8704f2bb52cc1306ca910ddb7af8956eb7c57970fcf7d8b3d9baddb
/var/lib/docker/aufs/layers/4e6afb958b5ee6dea6d1a886d19fc9c780d4ecc4baeebfbde31f9bb97732d10d
/var/lib/docker/aufs/layers/fd895b6f56aedf09c48dba97931a34cea863a21175450c31b6ceadde03f7b3da
/var/lib/docker/aufs/layers/ac6b8ff4c0e7b91230ebf4c1caf16f06c1fdceff6111fd58f4ea50fc2dd5050b
/var/lib/docker/aufs/layers/f3286009193f95ab95a16b2561331db06803ac536cea921d9aa64e1564046579-init
/var/lib/docker/aufs/layers/d5bbef5adf2efb6f15d4f96c4bee21beb955255d1ec17baf35de66e98e6c7328
/var/lib/docker/aufs/layers/9646360df378b88eae6f1d6288439eebd9647d5b9e8a471840d4a9d6ed5d92a4
/var/lib/docker/aufs/layers/cf9fd1c4a64baa39b6d6d9dac048ad2fff3c3fe13924b07377e767eed230ba9f
/var/lib/docker/aufs/layers/f3286009193f95ab95a16b2561331db06803ac536cea921d9aa64e1564046579
/var/lib/docker/aufs/layers/23ce5a473b101d85f0e9465debe5a0f3b8a2079b99528a797b02052d06bc11d8
/var/lib/docker/image/aufs/layerdb/sha256/d1c659b8e3d0e893e95c8eedc755adcb91a1c2022e1090376b451f7206f9b1c0/cache-id

$ sudo cat /var/lib/docker/image/aufs/layerdb/sha256/d1c659b8e3d0e893e95c8eedc755adcb91a1c2022e1090376b451f7206f9b1c0/diff
sha256:b5185949ba02a6e065079660b0536672c9691fb0e0cb1fd912b2c7b29c91d625

$ docker-find sha256:b5185949ba02a6e065079660b0536672c9691fb0e0cb1fd912b2c7b29c91d625
+ docker=/var/lib/docker
+ sudo find /var/lib/docker '(' -path '/var/lib/docker/aufs/diff/*' -o -path '/var/lib/docker/aufs/mnt/*' ')' -prune -o -print
+ grep sha256:b5185949ba02a6e065079660b0536672c9691fb0e0cb1fd912b2c7b29c91d625
+ sudo find /var/lib/docker '(' -path '/var/lib/docker/aufs/diff/*' -o -path '/var/lib/docker/aufs/mnt/*' ')' -prune -o -type f -print0
+ sudo xargs -0 -P20 grep -l sha256:b5185949ba02a6e065079660b0536672c9691fb0e0cb1fd912b2c7b29c91d625
/var/lib/docker/image/aufs/layerdb/sha256/d1c659b8e3d0e893e95c8eedc755adcb91a1c2022e1090376b451f7206f9b1c0/diff
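(docker-find is a local helper script, not a Docker command. A minimal sketch consistent with the trace output above, an approximation rather than the exact script, would be:)

```bash
#!/bin/bash
# docker-find <string>: look for <string> in filenames and file contents
# under the docker root, skipping the large aufs diff/mnt trees themselves.
set -x
docker=/var/lib/docker
sudo find "$docker" '(' -path "$docker/aufs/diff/*" -o -path "$docker/aufs/mnt/*" ')' \
    -prune -o -print | grep "$1"
sudo find "$docker" '(' -path "$docker/aufs/diff/*" -o -path "$docker/aufs/mnt/*" ')' \
    -prune -o -type f -print0 | sudo xargs -0 -P20 grep -l "$1"
```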
area/storage/aufs kind/bug

Most helpful comment

# du -sh /var/lib/docker/aufs/diff/
1.9T    /var/lib/docker/aufs/diff/

All 126 comments

# docker --version
Docker version 1.10.3, build 99b71ce

# docker info
Containers: 3
 Running: 0
 Paused: 0
 Stopped: 3
Images: 29
Server Version: 1.10.3
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 99
 Dirperm1 Supported: false
Execution Driver: native-0.2
Logging Driver: json-file
Plugins:
 Volume: local
 Network: bridge null host
Kernel Version: 3.13.0-83-generic
Operating System: <unknown>
OSType: linux
Architecture: x86_64
CPUs: 24
Total Memory: 125.9 GiB
Name: dev34-devc
ID: VKMX:YMJ2:3NGV:5J6I:5RYM:AVBK:QPOZ:ODYE:VQ2D:AF2J:2LEM:TKTE
WARNING: No swap limit support

I should also show that docker lists no containers, volumes, or images:

$ docker images -a
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE

$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

$ docker volume ls
DRIVER              VOLUME NAME

strange; especially because of:

Containers: 3
 Running: 0
 Paused: 0
 Stopped: 3
Images: 29

which doesn't match the output of docker images / docker ps.

What operating system are you running on?

Operating System: <unknown>

@tonistiigi any idea?

That was afterward. I guess some processes kicked off in the meantime.

The state I'm referring to (I have now) is:

$ docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0

And I still have:

$ sudo du -hs /var/lib/docker/aufs/diff/a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea
296M    /var/lib/docker/aufs/diff/a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea

We're on Ubuntu Lucid with an upgraded kernel =/

$ uname -a
Linux dev34-devc 3.13.0-83-generic #127-Ubuntu SMP Fri Mar 11 00:25:37 UTC 2016 x86_64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 10.04.1 LTS
Release:        10.04
Codename:       lucid

This seems like an interesting issue.
Is there a way to reproduce it? @bukzor

Surely it's possible, but I don't know how.
Please try running the below script on one of your active docker hosts and see what's left.
In our case, there's always plenty of diffs left behind.

```bash
#!/bin/bash

set -eu

echo "WARNING:: This will stop ALL docker processes and remove ALL docker images."
read -p "Continue (y/n)? "
if [ "$REPLY" != "y" ]; then
    echo "Aborting."
    exit 1
fi

xdocker() { exec xargs -P10 -r -n1 --verbose docker "$@"; }

set -x

# remove containers
docker ps -q | xdocker stop
docker ps -aq | xdocker rm

# remove tags ('col 1 2' is a site-local helper that prints columns 1 and 2,
# like awk '{print $1, $2}')
docker images | sed 1d | grep -v '^<none>' | col 1 2 | sed 's/ /:/' | xdocker rmi

# remove images
docker images -q | xdocker rmi
docker images -aq | xdocker rmi

# remove volumes
docker volume ls -q | xdocker volume rm
```

One possible way I see this happening is if there are errors during aufs unmounting. For example, if there are EBUSY errors, then probably the image configuration has already been deleted beforehand.

@bukzor Would be very interesting if there was a reproducer that would start from an empty graph directory, pull/run images and get it into a state where it doesn't fully clean up after running your script.

That would be interesting, but sounds like a full day's work.
I can't commit to that.

Here's some more data regarding the (arbitrarily selected) troublesome diff above, a800.

```sh
$ docker-find a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea | sudo xargs -n1 wc -l | sort -rn
+ sudo find /nail/var/lib/docker '(' -path '/nail/var/lib/docker/aufs/diff/*' -o -path '/nail/var/lib/docker/aufs/mnt/*' ')' -prune -o -print
+ grep a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea
+ sudo find /nail/var/lib/docker '(' -path '/nail/var/lib/docker/aufs/diff/*' -o -path '/nail/var/lib/docker/aufs/mnt/*' ')' -prune -o -type f -print0
+ sudo xargs -0 -P20 grep -l a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea
15 /nail/var/lib/docker/aufs/layers/f3286009193f95ab95a16b2561331db06803ac536cea921d9aa64e1564046579
14 /nail/var/lib/docker/aufs/layers/f3286009193f95ab95a16b2561331db06803ac536cea921d9aa64e1564046579-init
13 /nail/var/lib/docker/aufs/layers/993e4988c510ec3ab4f6d139740a059df40585576f8196817e573a9684554c5c
12 /nail/var/lib/docker/aufs/layers/cf9fd1c4a64baa39b6d6d9dac048ad2fff3c3fe13924b07377e767eed230ba9f
11 /nail/var/lib/docker/aufs/layers/4e6afb958b5ee6dea6d1a886d19fc9c780d4ecc4baeebfbde31f9bb97732d10d
10 /nail/var/lib/docker/aufs/layers/23ce5a473b101d85f0e9465debe5a0f3b8a2079b99528a797b02052d06bc11d8
 9 /nail/var/lib/docker/aufs/layers/95e68d59a8704f2bb52cc1306ca910ddb7af8956eb7c57970fcf7d8b3d9baddb
 8 /nail/var/lib/docker/aufs/layers/ac6b8ff4c0e7b91230ebf4c1caf16f06c1fdceff6111fd58f4ea50fc2dd5050b
 7 /nail/var/lib/docker/aufs/layers/fd895b6f56aedf09c48dba97931a34cea863a21175450c31b6ceadde03f7b3da
 6 /nail/var/lib/docker/aufs/layers/d5bbef5adf2efb6f15d4f96c4bee21beb955255d1ec17baf35de66e98e6c7328
 5 /nail/var/lib/docker/aufs/layers/9646360df378b88eae6f1d6288439eebd9647d5b9e8a471840d4a9d6ed5d92a4
 4 /nail/var/lib/docker/aufs/layers/a8001a0e9515cbbda89a54120a89bfd9a3d0304c8d2812401aba33d22a2358ea
 0 /nail/var/lib/docker/image/aufs/layerdb/sha256/d1c659b8e3d0e893e95c8eedc755adcb91a1c2022e1090376b451f7206f9b1c0/cache-id
```

So we see there's a chain of child layers, with `f3286009193` as the tip.

```sh
$ docker-find f3286009193f95ab95a16b2561331db06803ac536cea921d9aa64e1564046579'$'
+ sudo find /nail/var/lib/docker '(' -path '/nail/var/lib/docker/aufs/diff/*' -o -path '/nail/var/lib/docker/aufs/mnt/*' ')' -prune -o -print
+ grep --color 'f3286009193f95ab95a16b2561331db06803ac536cea921d9aa64e1564046579$'
/nail/var/lib/docker/aufs/layers/f3286009193f95ab95a16b2561331db06803ac536cea921d9aa64e1564046579
+ sudo find /nail/var/lib/docker '(' -path '/nail/var/lib/docker/aufs/diff/*' -o -path '/nail/var/lib/docker/aufs/mnt/*' ')' -prune -o -type f -print0
+ sudo xargs -0 -P20 grep --color -l 'f3286009193f95ab95a16b2561331db06803ac536cea921d9aa64e1564046579$'
/nail/var/lib/docker/image/aufs/layerdb/mounts/eb809c0321a2501e61763333bc0dfb33ea0539c15957587f5de003ad21b8275e/mount-id
```

So that layer was used in mount `eb809c0321`. I don't find any references to that mount anywhere:

```sh
$ docker-find eb809c0321a2501e61763333bc0dfb33ea0539c15957587f5de003ad21b8275e
+ sudo find /nail/var/lib/docker '(' -path '/nail/var/lib/docker/aufs/diff/*' -o -path '/nail/var/lib/docker/aufs/mnt/*' ')' -prune -o -print
+ grep --color eb809c0321a2501e61763333bc0dfb33ea0539c15957587f5de003ad21b8275e
/nail/var/lib/docker/image/aufs/layerdb/mounts/eb809c0321a2501e61763333bc0dfb33ea0539c15957587f5de003ad21b8275e
/nail/var/lib/docker/image/aufs/layerdb/mounts/eb809c0321a2501e61763333bc0dfb33ea0539c15957587f5de003ad21b8275e/mount-id
/nail/var/lib/docker/image/aufs/layerdb/mounts/eb809c0321a2501e61763333bc0dfb33ea0539c15957587f5de003ad21b8275e/init-id
/nail/var/lib/docker/image/aufs/layerdb/mounts/eb809c0321a2501e61763333bc0dfb33ea0539c15957587f5de003ad21b8275e/parent
+ sudo find /nail/var/lib/docker '(' -path '/nail/var/lib/docker/aufs/diff/*' -o -path '/nail/var/lib/docker/aufs/mnt/*' ')' -prune -o -type f -print0
+ sudo xargs -0 -P20 grep --color -l eb809c0321a2501e61763333bc0dfb33ea0539c15957587f5de003ad21b8275e
```

Is there any way to find what container that mount was used for?
The doc only says the mount ID is no longer equal to the container ID, which isn't very helpful.
https://docs.docker.com/engine/userguide/storagedriver/aufs-driver/

@bukzor eb809c0321 is the container id. What the docs mean is that the aufs id (f3286009193f in your case) is not the container id.

/cc @dmcgowan as well

@tonistiigi OK.

Then obviously the mount has outlived its container.

At what point in the container lifecycle is the mount cleaned up?
Is this the temporary writable aufs for running/stopped containers?

@bukzor The (rw) mount is deleted on container deletion. Unmount happens on container process stop. Diff folders are the place where the individual layer contents are stored; it doesn't matter whether the layer is mounted or not.

@bukzor The link between the aufs id and container id can be found at image/aufs/layerdb/mounts/<container-id>/mount-id. From just knowing an aufs id the easiest way to find the container id is to grep the image/aufs/layerdb directory for it. If nothing is found, then the cleanup was not completed cleanly.
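Concretely (a sketch, assuming the default /var/lib/docker root and using the ids from above):

```bash
# container id -> aufs id
container_id=eb809c0321a2501e61763333bc0dfb33ea0539c15957587f5de003ad21b8275e
sudo cat "/var/lib/docker/image/aufs/layerdb/mounts/$container_id/mount-id"

# aufs id -> container id; no hit means cleanup didn't complete cleanly
sudo grep -r f3286009193f95ab95a16b2561331db06803ac536cea921d9aa64e1564046579 \
    /var/lib/docker/image/aufs/layerdb/
```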

Running into a similar issue.

We run daily CI on the docker daemon server. /var/lib/docker/aufs/diff takes up quite an amount of disk capacity, which it shouldn't.

Still 2gb in aufs/diff after trying everything reasonable suggested here or in related threads (including @bukzor's bash script above).

Short of a proper fix, is there any straightforward way of removing the leftover mounts without removing all other images at the same time? (If no containers are running currently, I guess there should be no mounts, right?)

I am experiencing the same issue. I am using this machine to test a lot of containers, then commit/delete. My /var/lib/docker/aufs directory is currently 7.9G heavy. I'm going to have to move this directory to another mount point, because storage on this one is limited. :(

# du -sh /var/lib/docker/aufs/diff/
1.9T    /var/lib/docker/aufs/diff/

@mcallaway Everything in aufs/diff is going to be fs writes performed in a container.

I have the same issue. All the containers I have are in a running state, but there are lots of aufs diff directories which don't relate to these containers; they relate to old, removed containers. I can remove them manually, but that is not a real option. There should be a reason for such behavior.

I use k8s 1.3.5 and docker 1.12.

Running docker run --rm -v /var/run/docker.sock:/var/run/docker.sock -v /etc:/etc spotify/docker-gc helped.

I have the same issue. I'm using Gitlab CI with dind (docker in docker).

IMHO when an image in the registry is updated under the same tag and then pulled, and the related container is restarted, the old container and image are not GCed unless you run spotify/docker-gc.

Can someone else confirm this?

@kayrus correct, docker will not automatically assume that an "untagged" image should also be _removed_. Containers could still be using that image, and you can still start new containers from that image (referencing it by its ID). You can remove "dangling" images using docker rmi $(docker images -qa -f dangling=true). Also, docker 1.13 will get data management commands (see https://github.com/docker/docker/pull/26108), which allow you to more easily cleanup unused images, containers, etc.
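For reference, the commands look like this (the prune variants assume Docker 1.13 or newer):

```bash
# remove dangling (untagged and unreferenced) images, works on any version
docker rmi $(docker images -qa -f dangling=true)

# Docker 1.13+ data management commands
docker container prune   # remove all stopped containers
docker image prune       # remove dangling images
docker volume prune      # remove volumes not used by any container
docker system prune      # stopped containers, dangling images, unused networks/volumes
```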

@thaJeztah does /var/lib/docker/aufs/diff/ actually contain the "untagged" images?

@kayrus yes they are part of the images (tagged, and untagged)

Getting a similar issue: no containers/images/volumes, ~13GB of diffs

$ docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 1.12.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 1030
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: null host bridge overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.13.0-32-generic
Operating System: Ubuntu 14.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 3.861 GiB
Name: gitrunner
ID: GSAW:6X5Z:SHHU:NZIM:O76D:P5OE:7OZG:UFGQ:BOAJ:HJFM:5G6W:5APP
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
 127.0.0.0/8
$ docker volume ls
DRIVER              VOLUME NAME
$ docker images -a
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
$
$ df -h
Filesystem                                 Size  Used Avail Use% Mounted on
...
/dev/mapper/gitrunner--docker-lib--docker   18G   15G  2.6G  85% /var/lib/docker
/var/lib/docker# sudo du -sm aufs/*
13782   aufs/diff
5       aufs/layers
5       aufs/mnt
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 1
Server Version: 1.12.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: xfs
 Dirs: 1122

Same issue here. I understand 1.13 may get data management commands but in the meantime, I just want to safely delete the contents of this directory without killing Docker.

This is relatively blocking at this point.

Same here. Still no official solution?

I've brought this up a few different times in (Docker Community) Slack. Each time a handful of people run through a list of garbage collection scripts/cmds I should run as a solution.

While those have helped (read: not solved - space is still creeping towards full) in the interim, I think we can all agree that's not the ideal long term fix.

@jadametz 1.13 has docker system prune.
Beyond that, I'm not sure how else Docker can help (open to suggestion). The images aren't just getting to the system on their own, but rather through pulls, builds, etc.

In terms of actual orphaned layers (no images on the system referencing them), we'll need to address that separately.

I have exactly the same issue!

docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 1.12.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 2501
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.13.0-96-generic
Operating System: Ubuntu 14.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 14.69 GiB
Name: ip-172-31-45-4
ID: R5WV:BXU5:AV6T:GZUK:SAEA:6E74:PRSO:NQOH:EPMQ:W6UT:5DU4:LE64
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
 127.0.0.0/8

No images, containers or volumes. 42Gb in aufs/diff

Anything to help clear this directory safely would be very useful! Tried everything in this thread without any success. Thanks.

@adamdry only third-party script: https://github.com/docker/docker/issues/22207#issuecomment-252560212

Thanks @kayrus, I did indeed try that; it increased my total disk usage slightly and didn't appear to do anything to the aufs/diff directory.

I also tried docker system prune which didn't run. And I tried docker rmi $(docker images -qa -f dangling=true) which didn't find any images to remove.

For anyone interested I'm now using this to clean down all containers, images, volumes and old aufs:

### FYI I am a Docker noob so I don't know if this causes any underlying issues but it does work for me - use at your own risk ###

Lots of inspiration taken from here: http://stackoverflow.com/questions/30984569/error-error-creating-aufs-mount-to-when-building-dockerfile

docker rm -f $(docker ps -a -q) && docker rmi -f $(docker images -q) && docker rmi -f $(docker images -a -q)
service docker stop
rm -rf /var/lib/docker/aufs
rm -rf /var/lib/docker/image/aufs
rm -f /var/lib/docker/linkgraph.db
service docker start

@adamdry Best to not use -f when doing rm/rmi as it will hide errors in removal.
I do consider the current situation... where -f hides an error and then we are left with some left-over state that is completely invisible to the user... as a bug.

I'm also seeing this on a completely new and unsurprising installation:

root@builder:/var/lib/docker# docker info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 1.12.4
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 63
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: overlay host null bridge
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options:
Kernel Version: 3.16.0-4-amd64
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 3.625 GiB
Name: builder
ID: 2WXZ:BT74:G2FH:W7XD:VVXM:74YS:EA3A:ZQUK:LPID:WYKF:HDWC:UKMJ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No memory limit support
WARNING: No swap limit support
WARNING: No kernel memory limit support
WARNING: No oom kill disable support
WARNING: No cpu cfs quota support
WARNING: No cpu cfs period support
Insecure Registries:
 127.0.0.0/8
root@builder:/var/lib/docker# docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
root@builder:/var/lib/docker# docker images -a
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
root@builder:/var/lib/docker# du -hd2
4.0K    ./swarm
6.0M    ./image/aufs
6.0M    ./image
4.0K    ./trust
28K ./volumes
4.0K    ./containers
276K    ./aufs/layers
292K    ./aufs/mnt
1.5G    ./aufs/diff <-------------------------
1.5G    ./aufs
4.0K    ./tmp
72K ./network/files
76K ./network
1.5G    .
root@builder:/var/lib/docker# 

@robhaswell Seeing as it's a new install, do you want to try this? https://github.com/docker/docker/issues/22207#issuecomment-266784433

@adamdry I've already deleted /var/lib/docker/aufs as it was blocking my work. What do you expect your instructions to achieve? If they stop the problem from happening again in the future, then I can try to recreate the issue and try your instructions. However if the purpose is just to free up the space then I've already achieved that.

@robhaswell Yeah, it was to free up disk space. I had follow-up issues when trying to rebuild my images, but following all the steps in that script resolved them.

During build, if the process is interrupted while a layer is being built (one which also contains a blob to be copied), followed by stopping the container, it leaves behind data in /var/lib/docker/aufs/diff/. It showed up as a dangling image, and cleaning that up didn't release the space either. Is it possible to include this as part of docker system prune? Only deleting the blob data inside this folder frees up the space, and I am not sure whether that will cause any issue or not.

Docker version : 1.13.0-rc1

During build, if the process is interrupted while a layer is being built (one which also contains a blob to be copied), followed by stopping the container, it leaves behind data

This could also be the cause of my problems - I interrupt a lot of builds.

During docker pull, I observed the following two cases:

  1. If the process is interrupted when it says downloading (which downloads the image layer into /var/lib/docker/tmp/), it cleans up all the data in that folder.
  2. If the process is interrupted when it says extracting (which I suppose extracts the layer from tmp into /var/lib/docker/aufs/diff/), it cleans up both the tmp and diff blob data.

During the image build process:

  1. On interrupting the process during "Sending build context to docker daemon" (which, in my case, copies blob data into /var/lib/docker/tmp/), the data remains there forever and cannot be cleaned by any command except manually deleting it. I am not sure how the apt-get updates in an image are handled.
  2. While a layer containing blob data (say a large software setup) is being built, if the process is interrupted, the docker container keeps working on the image. In my case only 1 layer's blob data, which is already available in the tmp folder, makes up the whole image. But if the container is stopped using the docker stop command, two cases happen:
    a. If the mount process is still happening, it will leave behind data in both the tmp and diff folders.
    b. If the data has been copied to the diff folder, it will remove the data from the tmp folder and leave data in the diff folder and maybe the mount folder.

We have an automated build process, which needs a control to stop any build process gracefully. Recently, a process got killed by the kernel due to an out-of-memory error on a low-spec machine.

If one image is to be built of 2 layers, layer 1 is built, and layer 2 is interrupted, docker system prune seems to clean up the data for the container of the interrupted layer once the container is stopped. But it doesn't clean up the data of the previous layers in case of an interrupt. Also, it didn't reflect the total disk space reclaimed. Ran these tests on AWS, Ubuntu 14.04, x86_64 with the aufs filesystem. Ran the docker prune test with docker 1.13.0-rc3 and docker 1.12.

@thaJeztah
Please let me know if I am misinterpreting anything.

I opened an issue for the /var/lib/docker/tmp files not being cleaned up; https://github.com/docker/docker/issues/29486

Docker system prune seems to clean up the data for the container of layer which was interrupted and container stopped. But it doesn't clean up the data of the previous layers in case of interrupt.

I tried to reproduce that situation, but wasn't able to see that with a simple case;

Start with a clean install (empty /var/lib/docker), create a big file for
testing, and a Dockerfile;

mkdir repro && cd repro
fallocate -l 300M bigfile
cat > Dockerfile <<EOF
FROM scratch
COPY ./bigfile /
COPY ./bigfile /again/
COPY ./bigfile /and-again/
EOF

start docker build, and cancel while building, but _after_ the build
context has been sent;

docker build -t stopme .
Sending build context to Docker daemon 314.6 MB
Step 1/4 : FROM scratch
 --->
Step 2/4 : COPY ./bigfile /
 ---> 28eb6d7b0920
Removing intermediate container 98876b1673bf
Step 3/4 : COPY ./bigfile /again/
^C

check content of /var/lib/docker/aufs/

du -h /var/lib/docker/aufs/
301M    /var/lib/docker/aufs/diff/9127644c356579741348f7f11f50c50c9a40e0120682782dab55614189e82917
301M    /var/lib/docker/aufs/diff/81fd6b2c0cf9a28026cf8982331016a6cd62b7df5a3cf99182e7e09fe0d2f084/again
301M    /var/lib/docker/aufs/diff/81fd6b2c0cf9a28026cf8982331016a6cd62b7df5a3cf99182e7e09fe0d2f084
601M    /var/lib/docker/aufs/diff
8.0K    /var/lib/docker/aufs/layers
4.0K    /var/lib/docker/aufs/mnt/9127644c356579741348f7f11f50c50c9a40e0120682782dab55614189e82917
4.0K    /var/lib/docker/aufs/mnt/81fd6b2c0cf9a28026cf8982331016a6cd62b7df5a3cf99182e7e09fe0d2f084
4.0K    /var/lib/docker/aufs/mnt/b6ffb1d5ece015ed4d3cf847cdc50121c70dc1311e42a8f76ae8e35fa5250ad3-init
16K /var/lib/docker/aufs/mnt
601M    /var/lib/docker/aufs/

run the docker system prune command to clean up images, containers;

docker system prune -a
WARNING! This will remove:
    - all stopped containers
    - all volumes not used by at least one container
    - all networks not used by at least one container
    - all images without at least one container associated to them
Are you sure you want to continue? [y/N] y
Deleted Images:
deleted: sha256:253b2968c0b9daaa81a58f2a04e4bc37f1dbf958e565a42094b92e3a02c7b115
deleted: sha256:cad1de5fd349865ae10bfaa820bea3a9a9f000482571a987c8b2b69d7aa1c997
deleted: sha256:28eb6d7b09201d58c8a0e2b861712701cf522f4844cf80e61b4aa4478118c5ab
deleted: sha256:3cda5a28d6953622d6a363bfaa3b6dbda57b789e745c90e039d9fc8a729740db

Total reclaimed space: 629.1 MB

check content of /var/lib/docker/aufs/

du -h /var/lib/docker/aufs/
4.0K    /var/lib/docker/aufs/diff
4.0K    /var/lib/docker/aufs/layers
4.0K    /var/lib/docker/aufs/mnt/b6ffb1d5ece015ed4d3cf847cdc50121c70dc1311e42a8f76ae8e35fa5250ad3-init
8.0K    /var/lib/docker/aufs/mnt
20K /var/lib/docker/aufs/

I do see that the -init mount is left behind, I'll check if we can solve
that (although it's just an empty directory)

The only difference in the Dockerfile I had used was (to create different layers):

FROM scratch
COPY ["./bigfile", "randomNoFile1", "/"]
COPY ["./bigfile", "randomNoFile2", "/"]

I am not sure if it makes a difference.

No, the problem isn't about the empty init folders. In my case, it was the blob. However, I can recheck it on Monday and update.

Also, I was using a 5GB file, created by reading bytes from /dev/urandom.
In your case, the same file is added twice. Would that create a single layer and mount the 2nd layer from it, or would it be 2 separate layers? In my case, it's always 2 separate layers.

@thaJeztah
Thank you for such a quick response on the issue. Addition of this feature would be of great help!

@monikakatiyar16 I tried to reproduce this as well with canceling the build multiple times during both ADD and RUN commands but couldn't get anything to leak to aufs/diff after deletion. I couldn't quite understand what container you are stopping because containers should not be running during ADD/COPY operations. If you can put together a reproducer that we could run that would be greatly appreciated.

It's possible that I could be doing something wrong. Since I am travelling over the weekend, I will reproduce it and update all the required info here on Monday.

@tonistiigi @thaJeztah
I feel you are right. There are actually no containers listed as active and running; instead there are dead containers. Docker system prune didn't work in my case, possibly because the process didn't get killed with Ctrl+C and instead kept running in the background. In my case, that would be the reason it couldn't remove those blobs.

When I interrupt the process using Ctrl+C, the build process gets killed, but a docker-untar process remains alive in the background, which keeps working on building the image. (Note: /var/lib/docker is soft-linked to /home/lib/docker to use the EBS volumes for large data on AWS)

root 12700 10781 7 11:43 ? 00:00:04 docker-untar /home/lib/docker/aufs/mnt/d446d4f8a7dbae162e7578af0d33ac38a63b4892905aa86a8d131c1e75e2828c

I have attached the script I had been using for creating large files and building the image (gc_maxpush_pull.sh)

Also attached is the behaviour of the build process when interrupting it with Ctrl+C (DockerBuild_WOProcessKill), and when interrupting it with Ctrl+C and killing the process (DockerBuild_WithProcessKill)

Using the commands -

To create large file : ./gc_maxpush_pull.sh 1 5gblayer 0 512 1

To build images : ./gc_maxpush_pull.sh 1 5gblayer 1 512 1

DockerBuild.zip

Steps to replicate:

  1. Create a large file of 5GB
  2. Start the build process and interrupt it only after "Sending build context" is over and it's actually copying the blob.
  3. It completes building the image after a while and shows it in docker images (as in case 1 attached by me - DockerBuild_WOProcessKill)
  4. If the process is killed, it takes a while and leaves the blob data in /diff (which it does when the process is killed abruptly, as attached in the file - DockerBuild_WithProcessKill)

If what I am assuming is correct, then this might not be an issue with docker prune, but rather with the killing of docker build, which is somehow not working for me.

Is there a graceful way of interrupting or stopping the image build process that also takes care of cleaning up the copied data (as is handled in docker pull)?

Previously, I was not killing the process. I am also curious what docker-untar does, and why it mounts into both the /mnt and /diff folders and later cleans out the /mnt folder.

Tested this with Docker version 1.12.5, build 7392c3b on AWS

docker info
Containers: 2
 Running: 0
 Paused: 0
 Stopped: 2
Images: 0
Server Version: 1.12.5
Storage Driver: aufs
 Root Dir: /home/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 4
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: overlay bridge null host
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.13.0-105-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.859 GiB
Name: master
ID: 2NQU:D2C5:5WPL:IIDR:P6FO:OAG7:GHW6:ZJMQ:VDHI:B5CI:XFZJ:ZSZM
Docker Root Dir: /home/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
 127.0.0.0/8

@monikakatiyar16 When I manually kill the untar process during build I get Error processing tar file(signal: killed): in the build output. Leaving the container behind in docker ps -a is the correct behavior; the same thing happens on any build error and lets you debug the problems that caused the build to fail. I have no problem with deleting this container though, and if I do that, all data in /var/lib/docker/aufs is cleaned up as well.

@tonistiigi Yes, you are correct. After killing the docker-untar process, I was able to delete the volume associated with the container and it cleaned up everything. Docker system prune also works in this case.

The actual issue that left over volumes occurred when, without killing the docker-untar process, I tried removing the docker container along with its volumes, which gave the following error:

docker rm -v -f $(docker ps -a -q)
Error response from daemon: Driver aufs failed to remove root filesystem 97931bf059a0ec219efd3f762dbb173cf9372761ff95746358c08e2b61f7ce79: rename /home/lib/docker/aufs/diff/359d27c5b608c9dda1170d1e34e5d6c5d90aa2e94826257f210b1442317fad70 /home/lib/docker/aufs/diff/359d27c5b608c9dda1170d1e34e5d6c5d90aa2e94826257f210b1442317fad70-removing: device or resource busy

Daemon logs:

Error removing mounted layer 78fb899aab981557bc2ee48e9738ff4c2fcf2d10a1984a62a77eefe980c68d4a: rename /home/lib/docker/aufs/diff/d2605125ef072de79dc948f678aa94dd6dde562f51a4c0bd08a210d5b2eba5ec /home/lib/docker/aufs/diff/d2605125ef072de79dc948f678aa94dd6dde562f51a4c0bd08a210d5b2eba5ec-removing: device or resource busy
ERRO[0956] Handler for DELETE /v1.25/containers/78fb899aab98 returned error: Driver aufs failed to remove root filesystem 78fb899aab981557bc2ee48e9738ff4c2fcf2d10a1984a62a77eefe980c68d4a: rename /home/lib/docker/aufs/diff/d2605125ef072de79dc948f678aa94dd6dde562f51a4c0bd08a210d5b2eba5ec /home/lib/docker/aufs/diff/d2605125ef072de79dc948f678aa94dd6dde562f51a4c0bd08a210d5b2eba5ec-removing: device or resource busy
ERRO[1028] Error unmounting container 78fb899aab981557bc2ee48e9738ff4c2fcf2d10a1984a62a77eefe980c68d4a: no such file or directory

It seems that the order to follow right now to interrupt a docker build is:
Interrupt docker build > kill the docker-untar process > remove container and volume: docker rm -v -f $(docker ps -a -q)

For docker v1.13.0-rc4, it can be: Interrupt docker build > kill the docker-untar process > docker system prune -a

This seems to work perfectly. There are no cleanup issues; the only issue is that the docker-untar process is not killed along with the docker-build process.
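Scripted, that recovery sequence might look like this (a rough sketch; the pkill pattern is an assumption):

```bash
#!/bin/bash
# recover from an interrupted `docker build` that left work behind
pkill -f docker-untar || true        # kill any leftover untar process
docker rm -v -f $(docker ps -a -q)   # remove dead containers and their volumes
docker system prune -a               # Docker 1.13+: reclaim everything unused
```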

I will search for/log a new issue for a graceful interrupt of docker build that also stops the docker-untar process along with it.

(Verified this with docker v1.12.5 and v1.13.0-rc4)

Update: On killing docker-untar during "Sending build context to docker daemon", it gives an error in the build: Error response from daemon: Error processing tar file(signal: terminated), but during layer copy it doesn't (for me).

Thanks for being so patient and for giving your time!

I'm seeing /var/lib/docker/aufs consistently increase in size on a docker swarm mode worker. This node is mostly autonomous, managed by the swarm manager, with very little manual container creation aside from some maintenance commands here and there.

I do run docker exec on service containers; not sure if that may be a cause.

My workaround to get this resolved in my case was to start up another worker, set the full node to --availability=drain and manually move over a couple of volume mounts.

ubuntu@ip-172-31-18-156:~$ docker --version
Docker version 1.12.3, build 6b644ec

This has hit our CI server for ages. This needs to be fixed.

@orf thanks

Same issue here. Neither removing containers, volumes, and images, nor the Docker 1.13 cleanup commands, have any effect.

I also confirm I did cancel some image builds. Maybe that leaves folders that can't be reached either.
I'll use the good old rm method for now, but this is clearly a bug.

Files in /var/lib/docker/aufs/diff fill up 100% of the 30G /dev/sda1 filesystem

root@Ubuntu:/var/lib/docker/aufs/diff# df -h

Filesystem Size Used Avail Use% Mounted on
udev 14G 0 14G 0% /dev
tmpfs 2.8G 273M 2.5G 10% /run
/dev/sda1 29G 29G 0 100% /
tmpfs 14G 0 14G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 14G 0 14G 0% /sys/fs/cgroup
/dev/sdb1 197G 60M 187G 1% /mnt
tmpfs 2.8G 0 2.8G 0% /run/user/1000

du -h -d 1 /var/lib/docker/aufs/diff | grep '[0-9]G\>'
shows

4.1G /var/lib/docker/aufs/diff/a0cde42cbea362bbb2a73ffbf30059bcce7ef0256d1d7e186264f915d15
14G /var/lib/docker/aufs/diff/59aee33d8a607b5315ce103cd99f17b4dfdec73c9a2f3bb2afc7d02bfae
20G /var/lib/docker/aufs/diff

Also tried, docker system prune, that did not help.

Has anyone found a solution for this ongoing issue of super large files in diff before this bug is fixed in the code?

Yes, the method has already been given, but here is an apocalypse snippet that just destroys everything (except local folders for the volumes), which I put in place here at work. Put it in your bashrc or another bash config file.

```
alias docker-full-cleanup='func_full-cleanup-docker'

func_full-cleanup-docker() {

  echo "WARN: This will remove everything from docker: volumes, containers and images. Will you dare? [y/N] "
  read choice

  if [ "$choice" = "y" ] || [ "$choice" = "Y" ]
  then
    sudo echo "> sudo rights check [OK]"
    sizea=`sudo du -sh /var/lib/docker/aufs`

    echo "Stopping all running containers"
    containers=`docker ps -a -q`
    if [ -n "$containers" ]
    then
      docker stop $containers
    fi

    echo "Removing all docker images and containers"
    docker system prune -f

    echo "Stopping Docker daemon"
    sudo service docker stop

    echo "Removing all leftovers in /var/lib/docker (bug #22207)"
    sudo rm -rf /var/lib/docker/aufs
    sudo rm -rf /var/lib/docker/image/aufs
    sudo rm -f /var/lib/docker/linkgraph.db

    echo "Starting Docker daemon"
    sudo service docker start

    sizeb=`sudo du -sh /var/lib/docker/aufs`
    echo "Size before full cleanup:"
    echo "        $sizea"
    echo "Size after full cleanup:"
    echo "        $sizeb"
  fi
}
```

I ran the rm -rf command to remove the files from the diff folder for now. I will probably have to look into the script if the diff folder occupies the entire disk space again.
I hope to see this issue fixed in the code, instead of workarounds.

Hi, I have the same issue on docker 1.10.2, running kubernetes. This is my docker info:

Containers: 7
 Running: 0
 Paused: 0
 Stopped: 7
Images: 4
Server Version: 1.10.2
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 50
 Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Plugins:
 Volume: local
 Network: bridge null host
Kernel Version: 4.4.0-31-generic
Operating System: Ubuntu 14.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 1.954 GiB
Name: ubuntu-k8s-03
ID: NT23:5Y7J:N2UM:NA2W:2FHE:FNAS:56HF:WFFF:N2FR:O4T4:WAHC:I3PO
Debug mode (server): true
 File Descriptors: 10
 Goroutines: 23
 System Time: 2017-02-14T15:25:00.740998058+09:00
 EventsListeners: 0
 Init SHA1: 3e247d0d32543488f6e70fbb7c806203f3841d1b
 Init Path: /usr/lib/docker/dockerinit
 Docker Root Dir: /var/lib/docker
WARNING: No swap limit support

I'm trying to track all unused diff directories under /var/lib/docker/aufs/diff and /var/lib/docker/aufs/mnt/ by analyzing the layer files under /var/lib/docker/image/aufs/imagedb; here is the script I used:

https://gist.github.com/justlaputa/a50908d4c935f39c39811aa5fa9fba33

But I ran into a problem when I stopped and restarted the docker daemon; it seems I've left docker in an inconsistent state:

/var/log/upstart/docker.log:

DEBU[0277] Cleaning up old shm/mqueue mounts: start.
DEBU[0277] Cleaning up old shm/mqueue mounts: done.
DEBU[0277] Clean shutdown succeeded
Waiting for /var/run/docker.sock
DEBU[0000] docker group found. gid: 999
DEBU[0000] Server created for HTTP on unix (/var/run/docker.sock)
DEBU[0000] Using default logging driver json-file
INFO[0000] [graphdriver] using prior storage driver "aufs"
DEBU[0000] Using graph driver aufs
INFO[0000] Graph migration to content-addressability took 0.00 seconds
DEBU[0000] Option DefaultDriver: bridge
DEBU[0000] Option DefaultNetwork: bridge
INFO[0000] Firewalld running: false
DEBU[0000] /sbin/iptables, [--wait -t nat -D PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]
DEBU[0000] /sbin/iptables, [--wait -t nat -D OUTPUT -m addrtype --dst-type LOCAL ! --dst 127.0.0.0/8 -j DOCKER]
DEBU[0000] /sbin/iptables, [--wait -t nat -D OUTPUT -m addrtype --dst-type LOCAL -j DOCKER]
DEBU[0000] /sbin/iptables, [--wait -t nat -D PREROUTING]
DEBU[0000] /sbin/iptables, [--wait -t nat -D OUTPUT]
DEBU[0000] /sbin/iptables, [--wait -t nat -F DOCKER]
DEBU[0000] /sbin/iptables, [--wait -t nat -X DOCKER]
DEBU[0000] /sbin/iptables, [--wait -t filter -F DOCKER]
DEBU[0000] /sbin/iptables, [--wait -t filter -X DOCKER]
DEBU[0000] /sbin/iptables, [--wait -t filter -F DOCKER-ISOLATION]
DEBU[0000] /sbin/iptables, [--wait -t filter -X DOCKER-ISOLATION]
DEBU[0000] /sbin/iptables, [--wait -t nat -n -L DOCKER]
DEBU[0000] /sbin/iptables, [--wait -t nat -N DOCKER]
DEBU[0000] /sbin/iptables, [--wait -t filter -n -L DOCKER]
DEBU[0000] /sbin/iptables, [--wait -t filter -n -L DOCKER-ISOLATION]
DEBU[0000] /sbin/iptables, [--wait -t filter -C DOCKER-ISOLATION -j RETURN]
DEBU[0000] /sbin/iptables, [--wait -I DOCKER-ISOLATION -j RETURN]
/var/run/docker.sock is up
DEBU[0000] Registering ipam driver: "default"
DEBU[0000] releasing IPv4 pools from network bridge (dcfcc71060f02440ae53da5ee0f083ca51c33a290565f1741f451754ae6b4257)
DEBU[0000] ReleaseAddress(LocalDefault/10.254.69.0/24, 10.254.69.1)
DEBU[0000] ReleasePool(LocalDefault/10.254.69.0/24)
DEBU[0000] Allocating IPv4 pools for network bridge (159d0a404ff6564b4fcfe633f0c8c123c0c0606d28ec3b110272650c5fc1bcb6)
DEBU[0000] RequestPool(LocalDefault, 10.254.69.1/24, , map[], false)
DEBU[0000] RequestAddress(LocalDefault/10.254.69.0/24, 10.254.69.1, map[RequestAddressType:com.docker.network.gateway])
DEBU[0000] /sbin/iptables, [--wait -t nat -C POSTROUTING -s 10.254.69.0/24 ! -o docker0 -j MASQUERADE]
DEBU[0000] /sbin/iptables, [--wait -t nat -C DOCKER -i docker0 -j RETURN]
DEBU[0000] /sbin/iptables, [--wait -t nat -I DOCKER -i docker0 -j RETURN]
DEBU[0000] /sbin/iptables, [--wait -D FORWARD -i docker0 -o docker0 -j DROP]
DEBU[0000] /sbin/iptables, [--wait -t filter -C FORWARD -i docker0 -o docker0 -j ACCEPT]
DEBU[0000] /sbin/iptables, [--wait -t filter -C FORWARD -i docker0 ! -o docker0 -j ACCEPT]
DEBU[0000] /sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT]
DEBU[0001] /sbin/iptables, [--wait -t nat -C PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]
DEBU[0001] /sbin/iptables, [--wait -t nat -A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER]
DEBU[0001] /sbin/iptables, [--wait -t nat -C OUTPUT -m addrtype --dst-type LOCAL -j DOCKER ! --dst 127.0.0.0/8]
DEBU[0001] /sbin/iptables, [--wait -t nat -A OUTPUT -m addrtype --dst-type LOCAL -j DOCKER ! --dst 127.0.0.0/8]
DEBU[0001] /sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -j DOCKER]
DEBU[0001] /sbin/iptables, [--wait -t filter -C FORWARD -o docker0 -j DOCKER]
DEBU[0001] /sbin/iptables, [--wait -t filter -C FORWARD -j DOCKER-ISOLATION]
DEBU[0001] /sbin/iptables, [--wait -D FORWARD -j DOCKER-ISOLATION]
DEBU[0001] /sbin/iptables, [--wait -I FORWARD -j DOCKER-ISOLATION]
WARN[0001] Your kernel does not support swap memory limit.
DEBU[0001] Cleaning up old shm/mqueue mounts: start.
DEBU[0001] Cleaning up old shm/mqueue mounts: done.
DEBU[0001] Loaded container 0790b33ec8e5345ac944d560263b8e13cb75f80dd82cd25753c7320bbcb2747c
DEBU[0001] Loaded container 0e36a6c9319e6b7ca4e5b5408e99d77d51b1f4e825248c039ba0260e628c483d
DEBU[0001] Loaded container 135fb2e8cad26d531435dcd19d454e41cf7aece289ddc7374b4c2a984f8b094a
DEBU[0001] Loaded container 2c28de46788ce96026ac8e61e99c145ec55517543e078a781e8ce6c8cddec973
DEBU[0001] Loaded container 35eb075b5815e621378eb8a7ff5ad8652819ec851eaa4f7baedb1383dfa51a57
DEBU[0001] Loaded container 6be37a301a8f52040adf811041c140408224b12599aa55155f8243066d2b0b69
DEBU[0001] Loaded container d98ac7f052fef31761b82ab6c717760428ad5734df4de038d80124ad5b5e8614
DEBU[0001] Starting container 2c28de46788ce96026ac8e61e99c145ec55517543e078a781e8ce6c8cddec973
ERRO[0001] Couldn't run auplink before unmount: exit status 22
ERRO[0001] error locating sandbox id d4c538661db2edc23c79d7dddcf5c7a8886c9477737888a5fc2641bc5e66da8b: sandbox d4c538661db2edc23c79d7dddcf5c7a8886c9477737888a5fc2641bc5e66da8b not found
WARN[0001] failed to cleanup ipc mounts:
failed to umount /var/lib/docker/containers/2c28de46788ce96026ac8e61e99c145ec55517543e078a781e8ce6c8cddec973/shm: invalid argument
ERRO[0001] Failed to start container 2c28de46788ce96026ac8e61e99c145ec55517543e078a781e8ce6c8cddec973: error creating aufs mount to /var/lib/docker/aufs/mnt/187b8026621da2add42330c9393a474fcd9af2e4567596d61bcd7a40c85f71da: invalid argument
INFO[0001] Daemon has completed initialization
INFO[0001] Docker daemon                                 commit=c3959b1 execdriver=native-0.2 graphdriver=aufs version=1.10.2
DEBU[0001] Registering routers
DEBU[0001] Registering HEAD, /containers/{name:.*}/archive

and when I try to create new containers with docker run, it fails with the message:

docker: Error response from daemon: error creating aufs mount to /var/lib/docker/aufs/mnt/f9609c0229baa2cdc6bc07c36970ef4f192431c1b1976766b3ea23d72c355df3-init: invalid argument.
See 'docker run --help'.

and the daemon log shows:

DEBU[0173] Calling POST /v1.22/containers/create
DEBU[0173] POST /v1.22/containers/create
DEBU[0173] form data: {"AttachStderr":false,"AttachStdin":false,"AttachStdout":false,"Cmd":["/hyperkube","kubelet","--api-servers=http://localhost:8080","--v=2","--address=0.0.0.0","--enable-server","--hostname-override=172.16.210.87","--config=/etc/kubernetes/manifests-multi","--cluster-dns=10.253.0.10","--cluster-domain=cluster.local","--allow_privileged=true"],"Domainname":"","Entrypoint":null,"Env":[],"HostConfig":{"Binds":["/sys:/sys:ro","/dev:/dev","/var/lib/docker/:/var/lib/docker:rw","/var/lib/kubelet/:/var/lib/kubelet:rw","/var/run:/var/run:rw","/etc/kubernetes/manifests-multi:/etc/kubernetes/manifests-multi:ro","/:/rootfs:ro"],"BlkioDeviceReadBps":null,"BlkioDeviceReadIOps":null,"BlkioDeviceWriteBps":null,"BlkioDeviceWriteIOps":null,"BlkioWeight":0,"BlkioWeightDevice":null,"CapAdd":null,"CapDrop":null,"CgroupParent":"","ConsoleSize":[0,0],"ContainerIDFile":"","CpuPeriod":0,"CpuQuota":0,"CpuShares":0,"CpusetCpus":"","CpusetMems":"","Devices":[],"Dns":[],"DnsOptions":[],"DnsSearch":[],"ExtraHosts":null,"GroupAdd":null,"IpcMode":"","Isolation":"","KernelMemory":0,"Links":null,"LogConfig":{"Config":{},"Type":""},"Memory":0,"MemoryReservation":0,"MemorySwap":0,"MemorySwappiness":-1,"NetworkMode":"host","OomKillDisable":false,"OomScoreAdj":0,"PidMode":"host","PidsLimit":0,"PortBindings":{},"Privileged":true,"PublishAllPorts":false,"ReadonlyRootfs":false,"RestartPolicy":{"MaximumRetryCount":0,"Name":"always"},"SecurityOpt":null,"ShmSize":0,"UTSMode":"","Ulimits":null,"VolumeDriver":"","VolumesFrom":null},"Hostname":"","Image":"gcr.io/google_containers/hyperkube:v1.1.8","Labels":{},"NetworkingConfig":{"EndpointsConfig":{}},"OnBuild":null,"OpenStdin":false,"StdinOnce":false,"StopSignal":"SIGTERM","Tty":false,"User":"","Volumes":{},"WorkingDir":""}
ERRO[0173] Couldn't run auplink before unmount: exit status 22
ERRO[0173] Clean up Error! Cannot destroy container 482957f3e4e92a0ba56d4787449daa5a8708f3b77efe0c603605f35d02057566: nosuchcontainer: No such container: 482957f3e4e92a0ba56d4787449daa5a8708f3b77efe0c603605f35d02057566
ERRO[0173] Handler for POST /v1.22/containers/create returned error: error creating aufs mount to /var/lib/docker/aufs/mnt/f9609c0229baa2cdc6bc07c36970ef4f192431c1b1976766b3ea23d72c355df3-init: invalid argument

Does anyone know whether my approach is correct or not? And why does the problem happen after I delete those folders?

I opened #31012 to at least make sure we don't leak these dirs under any circumstances.
We of course also need to look at the various causes of the busy errors.

This has been biting me for as long as I can remember. I got pretty much the same results as described above when I switched to the overlay2 driver some days ago and nuked the aufs folder completely (docker system df says 1.5 gigs, df says 15 gigs).

I had about 1T of diffs using storage. After restarting my docker daemon, I recovered about 700GB. So I guess stopping the daemon prunes these?

Restarting does nothing for me, unfortunately.

Service restart did not help. This is a serious issue. Removing all containers/images does not remove those diffs.

Stopping the daemon would not prune these.

If you remove all containers and you still have diff dirs, then likely you have some leaked rw layers.
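Building on the layerdb cross-check described earlier in this thread, a rough way to list candidates (a sketch; run as root, and review the output before deleting anything):

```bash
#!/bin/bash
# print aufs diff dirs that nothing in the layer database references
cd /var/lib/docker/aufs/diff || exit 1
for d in *; do
  id="${d%-init}"   # an -init dir shares its parent's id
  if ! grep -qr "$id" /var/lib/docker/image/aufs/layerdb/; then
    echo "possibly orphaned: /var/lib/docker/aufs/diff/$d"
  fi
done
```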

We just encountered this issue. /var/lib/docker/aufs/diff took up 28G and took our root filesystem to 100%, which caused our GitLab server to stop responding. We're using docker for GitLab CI. To fix this, I used some of the commands @sogetimaitral suggested above to delete the temp files, and we're back up and running. I restarted the server and sent in a new commit to trigger CI, and everything appears to be working just as it should.

I'm definitely concerned this is going to happen again. What's the deal here? Is this a docker bug that needs to be fixed?

  1. Yes there is a bug (both that there are issues on removal and that --force on rm ignores these issues)
  2. Generally one should not be writing lots of data to the container fs and instead use a volume (even a throw-away volume). A large diff dir would indicate that there is significant amounts of data being written to the container fs.
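As an illustration of that advice (the image name and command here are made up):

```bash
# -v /scratch creates an anonymous volume; writes there bypass the
# container's aufs rw layer. --rm also deletes anonymous volumes on exit.
docker run --rm -v /scratch some-build-image sh -c 'heavy-write-job /scratch'
```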

If you don't use "--force" on remove you would not run into this issue (or at least you'd see you have a bunch of "dead" containers and know how/what to clean up.).

I'm not manually using docker at all. We're using gitlab-ci-multi-runner. Could it be a bug on GitLab's end then?

It looks like (by default) it force-removes containers; https://gitlab.com/gitlab-org/gitlab-ci-multi-runner/blob/dbdbce2848530df299836768c8ea01e209a2fe40/executors/docker/executor_docker.go#L878. Doing so can result in failures to remove the container being ignored, and leading to the orphaned diffs.

Ok, then that tells me that this is a gitlab-ci-multi-runner bug. Is that a correct interpretation? I'm happy to create an issue for them to fix this.

It's a combination I guess; "force" remove makes it easier to handle cleanups (i.e., cases where a container isn't stopped yet, etc), at the same time (that's the "bug" @cpuguy83 mentioned), it can also hide actual issues, such as docker failing to remove the containers filesystem (which can have various reasons). With "force", the container is removed in such cases. Without, the container is left around (but marked "dead")

If the gitlab runner can function correctly without the force remove, that'll probably be good to change (or make it configurable)

I am using Drone and have the same issue. I didn't check how the code removes containers, but I guess it force-removes as well.

Could it be a Docker in Docker issue? I am starting Drone with docker-compose.

I decided to go ahead and submit a gitlab-ci-multi-runner issue just to loop the devs in: https://gitlab.com/gitlab-org/gitlab-ci-multi-runner/issues/2304

Honestly we worked around this by running Spotify's docker gc with drone CI.

@sedouard Thanks for this tip! Running docker-gc hourly from spotify solved the problem for me.

We are getting this issue running from Gitlab CI (not running in docker), using commands to build images / run containers (not the Gitlab CI Docker integration). We are not running any form of force removal, simply docker run --rm ... and docker rmi image:tag

EDIT: sorry, actually the original problem is the same. The difference is that running spotify/docker-gc does _not_ fix the problem.


As you can see below, I have 0 images, 0 containers, nothing!
docker system info agrees with me, but mentions Dirs: 38 for the aufs storage.

That's suspicious! If you look at /var/lib/docker/aufs/diff/, we see that there's actually 1.7 GB of data there, over 41 directories. And that's my personal box; on the production server it's 19 GB.

How do we clean this up? Using spotify/docker-gc does not remove these.

philippe@pv-desktop:~$ docker images -a
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE

philippe@pv-desktop:~$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES

philippe@pv-desktop:~$ docker system info
Containers: 0
 Running: 0
 Paused: 0
 Stopped: 0
Images: 0
Server Version: 17.03.1-ce
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 38
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-72-generic
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 31.34 GiB
Name: pv-desktop
ID: 2U5D:CRHS:RUQK:YSJX:ZTRS:HYMV:HO6Q:FDKE:R6PK:HMUN:2EOI:RUWO
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: silex
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

philippe@pv-desktop:~$ ls -alh /var/lib/docker/aufs/diff/
total 276K
drwxr-xr-x 40 root root 116K Apr 13 15:32 .
drwxr-xr-x  5 root root 4.0K Sep 18  2015 ..
drwxr-xr-x  4 root root 4.0K Jun 17  2016 005d00efb0ba949d627ad439aec8c268b5d55759f6e92e51d7828c12e3817147
drwxr-xr-x  8 root root 4.0K May  2  2016 0968e52874bbfaa938ffc869cef1c5b78e2d4f7a670e19ef47f713868b9bfbdf
drwxr-xr-x  4 root root 4.0K Jun 20  2016 188233e6dcc37e2308e69807ffd19aca3e61be367daae921f2bcb15a1d6237d0
drwxr-xr-x  6 root root 4.0K Jun 20  2016 188233e6dcc37e2308e69807ffd19aca3e61be367daae921f2bcb15a1d6237d0-init
drwxr-xr-x 21 root root 4.0K Apr  8  2016 250ecb97108a6d8a8c41f9d2eb61389a228c95f980575e95ee61f9e8629d5180
drwxr-xr-x  2 root root 4.0K Dec 22  2015 291f16f99d9b0bc05100e463dbc007ef816e0cf17b85d20cf51da5eb2b866810
drwxr-xr-x  2 root root 4.0K May  2  2016 3054baaa0b4a7b52da2d25170e9ce4865967f899bdf6d444b571e57be141b712
drwxr-xr-x  2 root root 4.0K Feb  5  2016 369aca82a5c05d17006b9dca3bf92d1de7d39d7cd908ed665ef181649525464e
drwxr-xr-x  3 root root 4.0K Jun 17  2016 3835a1d1dfe755d9d1ada6933a0ea7a4943caf8f3d96eb3d79c8de7ce25954d2
(...strip)

philippe@pv-desktop:~$ du -hs /var/lib/docker/aufs/diff/
1.7G    /var/lib/docker/aufs/diff/

philippe@pv-desktop:~$ docker system prune -a
WARNING! This will remove:
    - all stopped containers
    - all volumes not used by at least one container
    - all networks not used by at least one container
    - all images without at least one container associated to them
Are you sure you want to continue? [y/N] y
Total reclaimed space: 0 B

Can I safely rm -r /var/lib/docker/aufs and restart the docker daemon?

Running spotify/docker-gc does not clean those orphans.

EDIT: thanks @CVTJNII!

Stopping the Docker daemon and erasing all of /var/lib/docker is safer. Erasing only /var/lib/docker/aufs will cause you to lose your images anyway, so it's better to start with a clean /var/lib/docker in my opinion. This is the "solution" I've been using for this problem for several months now.
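Spelled out as commands (a sketch; assumes systemd manages the daemon, and note it destroys all Docker state: images, containers, and volumes):

systemctl stop docker
rm -rf /var/lib/docker
systemctl start docker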

Starting with 17.06 there should no longer be any new orphaned diffs.
Instead you may start seeing containers in the Dead state; this happens when there was a non-recoverable error during removal, which may require an admin to deal with.

In addition, removal is a bit more robust, and less prone to error due to race conditions or failed unmounts.
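For reference, containers in the Dead state can be listed with the standard status filter, and removed once whatever holds their mounts is gone (a minimal sketch):

# list containers stuck in the Dead state
docker ps -a --filter status=dead
# retry the removal once the offending mounts have been released
docker rm $(docker ps -aq --filter status=dead)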

@cpuguy83: great news, can you explain what the admin would need to do if that happens?

@Silex It depends on the cause.
Typically what has happened is there is a device or resource busy error due to some mount being leaked into a container. If you are running something like cadvisor this is pretty much a guarantee as the instructions say to mount the whole docker dir into the cadvisor container.

This can be tricky: you may have to stop the offending container(s) and then remove the dead container.

If you are on a newer kernel (3.15+) it is unlikely that you would see the issue anymore, though there may still be some edge cases.
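To track down which process is pinning a leaked mount (a sketch; assumes Linux procfs and the default aufs root):

# every /proc/<pid>/mountinfo that lists a path under /var/lib/docker/aufs
# belongs to a process whose mount namespace still holds that mount
grep -l /var/lib/docker/aufs /proc/*/mountinfo 2>/dev/null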

Docker version 17.06.0-ce, build 02c1d87

I tried removing all images, volumes, networks and containers, but it did not help.
I also tried these commands:

docker system prune -af
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock -v /etc:/etc:ro spotify/docker-gc

These files still remain:

root@Dark:/var/lib/docker/aufs# ls -la *
diff:
total 92
drwx------ 12 root root 45056 Jul 28 17:28 .
drwx------  5 root root  4096 Jul  9 00:18 ..
drwxr-xr-x  4 root root  4096 Jul 10 01:35 78f8ecad2e94fedfb1ced425885fd80bb8721f9fd70715de2ce373986785b882
drwxr-xr-x  6 root root  4096 Jul 10 01:35 78f8ecad2e94fedfb1ced425885fd80bb8721f9fd70715de2ce373986785b882-init
drwxr-xr-x  5 root root  4096 Jul 10 01:35 7caa9688638ea9669bac451b155b65b121e99fcea8d675688f0c76678ba85ccd
drwxr-xr-x  6 root root  4096 Jul 10 01:35 7caa9688638ea9669bac451b155b65b121e99fcea8d675688f0c76678ba85ccd-init
drwxr-xr-x  4 root root  4096 Jul 12 14:45 b7b7770aae461af083e72e5e3232a62a90f934c83e38830d06365108e302e7ac
drwxr-xr-x  6 root root  4096 Jul 12 14:45 b7b7770aae461af083e72e5e3232a62a90f934c83e38830d06365108e302e7ac-init
drwxr-xr-x  4 root root  4096 Jul 10 01:35 d5752b27b341e17e730d3f4acbec04b10e41dc01ce6f9f98ff38208c0647f2e4
drwxr-xr-x  6 root root  4096 Jul 10 01:35 d5752b27b341e17e730d3f4acbec04b10e41dc01ce6f9f98ff38208c0647f2e4-init
drwxr-xr-x  6 root root  4096 Jul 10 01:35 e412d3c6f0f5f85e23d7a396d47c459f5d74378b474b27106ab9b82ea829dbfb
drwxr-xr-x  6 root root  4096 Jul 10 01:35 e412d3c6f0f5f85e23d7a396d47c459f5d74378b474b27106ab9b82ea829dbfb-init

layers:
total 52
drwx------ 2 root root 45056 Jul 28 17:28 .
drwx------ 5 root root  4096 Jul  9 00:18 ..
-rw-r--r-- 1 root root     0 Jul 10 01:35 78f8ecad2e94fedfb1ced425885fd80bb8721f9fd70715de2ce373986785b882
-rw-r--r-- 1 root root     0 Jul 10 01:35 78f8ecad2e94fedfb1ced425885fd80bb8721f9fd70715de2ce373986785b882-init
-rw-r--r-- 1 root root     0 Jul 10 01:35 7caa9688638ea9669bac451b155b65b121e99fcea8d675688f0c76678ba85ccd
-rw-r--r-- 1 root root     0 Jul 10 01:35 7caa9688638ea9669bac451b155b65b121e99fcea8d675688f0c76678ba85ccd-init
-rw-r--r-- 1 root root     0 Jul 12 14:45 b7b7770aae461af083e72e5e3232a62a90f934c83e38830d06365108e302e7ac
-rw-r--r-- 1 root root     0 Jul 12 14:45 b7b7770aae461af083e72e5e3232a62a90f934c83e38830d06365108e302e7ac-init
-rw-r--r-- 1 root root     0 Jul 10 01:35 d5752b27b341e17e730d3f4acbec04b10e41dc01ce6f9f98ff38208c0647f2e4
-rw-r--r-- 1 root root     0 Jul 10 01:35 d5752b27b341e17e730d3f4acbec04b10e41dc01ce6f9f98ff38208c0647f2e4-init
-rw-r--r-- 1 root root     0 Jul 10 01:35 e412d3c6f0f5f85e23d7a396d47c459f5d74378b474b27106ab9b82ea829dbfb
-rw-r--r-- 1 root root     0 Jul 10 01:35 e412d3c6f0f5f85e23d7a396d47c459f5d74378b474b27106ab9b82ea829dbfb-init

mnt:
total 92
drwx------ 12 root root 45056 Jul 28 17:28 .
drwx------  5 root root  4096 Jul  9 00:18 ..
drwxr-xr-x  2 root root  4096 Jul 10 01:35 78f8ecad2e94fedfb1ced425885fd80bb8721f9fd70715de2ce373986785b882
drwxr-xr-x  2 root root  4096 Jul 10 01:35 78f8ecad2e94fedfb1ced425885fd80bb8721f9fd70715de2ce373986785b882-init
drwxr-xr-x  2 root root  4096 Jul 10 01:35 7caa9688638ea9669bac451b155b65b121e99fcea8d675688f0c76678ba85ccd
drwxr-xr-x  2 root root  4096 Jul 10 01:35 7caa9688638ea9669bac451b155b65b121e99fcea8d675688f0c76678ba85ccd-init
drwxr-xr-x  2 root root  4096 Jul 12 14:45 b7b7770aae461af083e72e5e3232a62a90f934c83e38830d06365108e302e7ac
drwxr-xr-x  2 root root  4096 Jul 12 14:45 b7b7770aae461af083e72e5e3232a62a90f934c83e38830d06365108e302e7ac-init
drwxr-xr-x  2 root root  4096 Jul 10 01:35 d5752b27b341e17e730d3f4acbec04b10e41dc01ce6f9f98ff38208c0647f2e4
drwxr-xr-x  2 root root  4096 Jul 10 01:35 d5752b27b341e17e730d3f4acbec04b10e41dc01ce6f9f98ff38208c0647f2e4-init
drwxr-xr-x  2 root root  4096 Jul 10 01:35 e412d3c6f0f5f85e23d7a396d47c459f5d74378b474b27106ab9b82ea829dbfb
drwxr-xr-x  2 root root  4096 Jul 10 01:35 e412d3c6f0f5f85e23d7a396d47c459f5d74378b474b27106ab9b82ea829dbfb-init
# docker system df
TYPE                TOTAL               ACTIVE              SIZE                RECLAIMABLE
Images              0                   0                   0B                  0B
Containers          0                   0                   0B                  0B
Local Volumes       0                   0                   0B                  0B

How can these be deleted?

@haos616 try stopping all running containers first, and then run docker system prune -af.
That did the trick for me.
It didn't work while I still had a container running.
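In other words (a sketch of the two steps):

# stop everything first, then prune; -a removes all unused images, -f skips the prompt
docker stop $(docker ps -q)
docker system prune -af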

If it's an upgrade from a previous version of docker, it's possible those diffs were generated / left behind by that version. Docker 17.06 won't remove a container if its layers failed to be removed (when using --force); older versions did, which could lead to orphaned layers.

@julian-pani I did so in the beginning but it does not help.

# docker system df
TYPE                TOTAL               ACTIVE              SIZE                RECLAIMABLE
Images              0                   0                   0B                  0B
Containers          0                   0                   0B                  0B
Local Volumes       0                   0                   0B                  0B

@thaJeztah No. I cleaned Docker up one or two months ago, and the version was already 17.06 then. I used the command docker system prune -af; it removed everything.

Running https://github.com/spotify/docker-gc as a container worked for me, but it went a step extra and deleted some of my required images too :(

So, to be safe, I've put the small wrapper script below in front of it:

#!/bin/sh
# save the IDs of all current images so docker-gc leaves them alone;
# spotify/docker-gc reads its exclude list from /etc/docker-gc-exclude,
# which is why the host's /etc is mounted read-only into the container
docker images -q > /etc/docker-gc-exclude
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock -v /etc:/etc:ro spotify/docker-gc

thanks again to spotify

IIUC, the spotify script just calls docker rm and docker rmi - did it actually remove orphaned diffs?

Just some feedback for the community: I've read through all of this, and none of the solutions actually seem to work consistently or reliably. My "fix" was simply to double the amount of disk space on my AWS instances. I know all too well that's a crappy fix, but it's the best workaround I've found for Docker's bloated aufs storage. This really, really needs to be fixed.

@fuzzygroup 17.06 should no longer create orphaned diffs, but it won't clean up the old ones yet.

I was able to clean up with this script. I don't see why it wouldn't work, but who knows.
Anyway, it's working fine for me. It will delete all images, containers, and volumes... Since it should not run very often, I find that a minor side effect. But it's up to you whether to use it.

https://gist.github.com/Karreg/84206b9711cbc6d0fbbe77a57f705979
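For reference, the core idea behind such cleanup scripts (a rough sketch of the approach, not the gist's actual code; assumes the aufs driver and the default /var/lib/docker root):

#!/bin/sh
# layerdb records which aufs diff dirs are still referenced:
#   cache-id -> image layers, mount-id/init-id -> container layers
root=/var/lib/docker
used=$(cat "$root"/image/aufs/layerdb/sha256/*/cache-id \
           "$root"/image/aufs/layerdb/mounts/*/mount-id \
           "$root"/image/aufs/layerdb/mounts/*/init-id 2>/dev/null)
for d in "$root"/aufs/diff/*; do
  name=$(basename "$d")
  case "$used" in
    *"${name%-init}"*) ;;            # still referenced: keep
    *) echo "orphan: $name" ;;       # unreferenced: candidate for removal
  esac
done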

https://stackoverflow.com/q/45798076/562769 seems to be related. I've posted a quick fix.

FYI, still seeing this with 17.06.1-ce

Containers: 20
 Running: 0
 Paused: 0
 Stopped: 20
Images: 124
Server Version: 17.06.1-ce
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 185
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: 949e6fa
Security Options:
 apparmor
Kernel Version: 4.4.0-83-generic
Operating System: Ubuntu 14.04.5 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 7.796GiB
Name: gitlab-cirunner
ID: PWLR:R6HF:MK3Y:KN5A:AWRV:KHFY:F36D:WASF:7K7B:U7FY:2DJA:DBE2
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support

/var/lib/docker/aufs/diff contains lots of directories with the -init-removing and -removing suffixes:

ffd5477de24b0d9993724e40175185038a62250861516030a33280898243e742-removing
ffd900de0634992e99c022a16775805dfd0ffd1f6c89fece7deb6b1a71c5e38c-init-removing
ffd900de0634992e99c022a16775805dfd0ffd1f6c89fece7deb6b1a71c5e38c-removing

FYI, still seeing this with 17.06.1-ce

Still seeing what, exactly?
There should not be any way for a diff dir to leak anymore; however, diff dirs that already existed at the time of the upgrade will still be there.

Still seeing orphaned diffs as far as I can tell. docker system prune didn't remove them, neither did docker-gc. Manually running rm -rf /var/lib/docker/aufs/diff/*-removing seems to be working.
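A slightly safer variant of that manual cleanup (a sketch; assumes systemd manages the daemon, so nothing is mid-removal while you delete):

systemctl stop docker
rm -rf /var/lib/docker/aufs/diff/*-removing
systemctl start docker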

Yes, docker will not clean up old orphaned dirs yet.

By "old" do you mean those created by a previous version of docker with this issue?

This is a fresh install of Docker from about two weeks ago, so those orphans must have been created since then; it seems docker must still be creating them?

I mean, in the last half an hour I've got 112 new diffs with -removing, since I rm'ed them manually.

$ ls /var/lib/docker/aufs/diff/ | grep removing | wc -l
112

You said "17.06 should no longer create orphaned diffs, but it won't clean up the old ones yet.", but surely this cannot be correct, or am I missing something? Are those tagged with -removing not orphaned?

@orf On a newer kernel, it's highly unexpected to have any issue at all during removal. Are you mounting /var/lib/docker into a container?

I'll check the aufs driver to see if there's a specific issue with it reporting a successful remove when the remove actually failed.

We are not mounting /var/lib/docker into a container.

$ uname -a
Linux gitlab-cirunner 4.4.0-83-generic #106~14.04.1-Ubuntu SMP Mon Jun 26 18:10:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

We are running 14.04 LTS

Let me know if there is anything I can provide to help debug this.

For other reasons (swarm mode networking) I moved off 14.04 for Docker machines.

This appears to be worse with 17.06.01-ce. I updated a build machine to this version and immediately started seeing the *-init-removing and *-removing directories left around as part of the build process. I stopped the service, removed the /var/lib/docker directory, restarted the service, and after a few builds was close to out of disk space again. I stopped the service again, ran apt-get purge docker-ce, removed /var/lib/docker again, and installed 17.06.0-ce. Now I'm not getting the extra directories in /var/lib/docker/aufs/diff, and disk usage is representative of the images on the build machine. I've reproduced the behavior on my development machine as well: just building an image seems to create these extra directories for each layer of the image, so I would run out of disk space really quickly. Again, reverting to 17.06.0-ce seems not to have the problem, so I'm going to stay there for now.

@mmanderson Thanks for reporting. Taking a look at changes in the AUFS driver.

@mmanderson Do you have any containers in the Dead state in docker ps -a?

All of my docker build servers are running out of space.
I have upgraded within the last week or so to Docker version 17.06.1-ce, build 874a737. I believe that nothing else has changed and that this issue either emerged or manifested as part of the upgrade process. The aufs diff directory is massive and I already pruned all images and dangling volumes.

issue-22207.txt
@cpuguy83 No containers in any state. Here is what I just did to demonstrate this with 17.06.01-ce:

  1. Started with a fresh install of docker 17.06.01-ce on Ubuntu 16.04.03 LTS (i.e. docker not installed and no /var/lib/docker directory). After the install, verified an empty /var/lib/docker/aufs/diff directory.
  2. Ran a docker build with a fairly simple dockerfile based on ubuntu:latest - all it does is pull statsd_exporter from github and extract it into /usr/bin (see attached file).
  3. After running the build, ran docker ps -a to show no containers in any state. There are several *-removing folders in the /var/lib/docker/aufs/diff folder.
  4. Ran docker system df to verify images, containers, and volumes. Result is
TYPE                TOTAL               ACTIVE              SIZE                RECLAIMABLE
Images              2                   0                   132.7MB             132.7MB (100%)
Containers          0                   0                   0B                  0B
Local Volumes       0                   0                   0B                  0B
  5. Running du -sch /var/lib/docker/*/ shows 152M for /var/lib/docker/aufs/.
  6. Ran docker rmi $(docker images -q) to remove the built image layers. Running docker system df after this shows all zeros, but du -sch /var/lib/docker/*/ still shows 152M for /var/lib/docker/aufs/, and there are now *-removing folders for all of the folders that didn't have them before, along with the ones that were already there (a quick check is sketched below).
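The quick check referenced in step 6 (a sketch):

# count leaked diff dirs and show where the space sits
ls /var/lib/docker/aufs/diff/ | grep removing | wc -l
du -sch /var/lib/docker/*/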

@erikh is this the issue you are experiencing?

@cpuguy83 After uninstalling 17.06.01-ce, removing the /var/lib/docker directory, and installing 17.06.0-ce, I tried to run the same build. The build fails because of the ADD-from-remote-URL bug that was fixed in 17.06.01. However, I don't get any *-removing directories for the steps that do complete, and after cleaning up everything with docker system prune and docker rmi $(docker images -q), the /var/lib/docker/aufs/diff directory is again empty and the space is freed.

Thanks all, this is a regression in 17.06.1...
PR to fix is here: https://github.com/moby/moby/pull/34587

awesome, thanks for the quick patch @cpuguy83! /cc @erikh

@rogaha! yes, thanks to you and @cpuguy83!

Thank you so much @Karreg for your excellent script. After getting rid of all the old orphaned diffs and freeing huge amounts of lost disk space, we are now using it regularly to clean our VMs before installing new docker images. A great help and an almost perfect workaround for this issue. @TP75

Looks like Docker, Inc. have some contracts with computer data storage manufacturers.

@Karreg's script worked fine for me and I freed all the space in the diffs directory.

Having the same issue.
Docker Host Details

root@UbuntuCont:~# docker info
Containers: 3
 Running: 0
 Paused: 0
 Stopped: 3
Images: 4
Server Version: 17.06.1-ce
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 14
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 6e23458c129b551d5c9871e5174f6b1b7f6d1170
runc version: 810190ceaa507aa2727d7ae6f4790c76ec150bd2
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-93-generic
Operating System: Ubuntu 16.04.3 LTS
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 3.358GiB
Name: UbuntuCont
ID: QQA5:DC5S:C2FL:LCC6:XY6E:V3FR:TRW3:VMOQ:QQKD:AP2M:H3JA:I6VX
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

root@UbuntuCont:/var/lib/docker/aufs/diff# ls
031c85352fe85f07fede77dee0ac9dc2c7723177a819e72c534e1399208c95fa
09d53040e7e6798b5987ea76fe4f84f0906785b94a392a72e8e41a66cd9f242d
09d53040e7e6798b5987ea76fe4f84f0906785b94a392a72e8e41a66cd9f242d-init
0fb1ffc90969e9706801e2a18870f3ecd857a58f1094fbb968b3fa873e4cf2e4
10549179bd21a9c7af018d4ef305bb9196413b9662fce333b607104c40f38781
10d86a48e03cabf9af2c765dc84824809f24674ac339e4b9ffe572f50bd26b9c-init-removing
10d86a48e03cabf9af2c765dc84824809f24674ac339e4b9ffe572f50bd26b9c-removing
2e226946e8e6c2b3613de2afcff4cbb9890b6d9bd365fdda121a51ae96ec5606
2e226946e8e6c2b3613de2afcff4cbb9890b6d9bd365fdda121a51ae96ec5606-init
3601f6953132f557df8b52e03016db406168d3d6511d7ff5c08a90925ea288da-init-removing
3601f6953132f557df8b52e03016db406168d3d6511d7ff5c08a90925ea288da-removing
4b29141243aea4e70472f25a34a91267ab19c15071862c53e903b99740603d4c-init-removing
4b29141243aea4e70472f25a34a91267ab19c15071862c53e903b99740603d4c-removing
520e3fcf82e0fbbb48236dd99b6dee4c5bb9073d768511040c414f205c787dc5-init-removing
520e3fcf82e0fbbb48236dd99b6dee4c5bb9073d768511040c414f205c787dc5-removing
59cbb25a4858e7d3eb9146d64ff7602c9abc68509b8f2ccfe3be76681481904f
5d1c661b452efce22fe4e109fad7a672e755c64f538375fda21c23d49e2590f6
605893aba54feee92830d56b6ef1105a4d2166e71bd3b73a584b2afc83319591
63bd53412210f492d72999f9263a290dfee18310aa0494cb92e0d926d423e281-init-removing
63bd53412210f492d72999f9263a290dfee18310aa0494cb92e0d926d423e281-removing
72146e759ab65c835e214e99a2037f4b475902fdbe550c46ea0d396fb5ab2779-init-removing
72146e759ab65c835e214e99a2037f4b475902fdbe550c46ea0d396fb5ab2779-removing
8147e0b06dcbce4aa7eb86ed74f4ee8301e5fe2ee73c3a80dcb230bd0ddfcc26-init-removing
8147e0b06dcbce4aa7eb86ed74f4ee8301e5fe2ee73c3a80dcb230bd0ddfcc26-removing
a72735551217bb1ad01b77dbdbb9b8effa9f41315b0c481f8d74b5606c50deb4
aa58f2000b9f7d1ed2a6b476740c292c3c716e1d4dc04b7718580a490bba5ee8
b552cb853e33a8c758cb664aec70e2c4e85eacff180f56cbfab988a8e10c0174-removing
cd80c351b81ed13c4b64d9dfdc20c84f6b01cbb3e26f560faf2b63dae12dec55-init-removing
cd80c351b81ed13c4b64d9dfdc20c84f6b01cbb3e26f560faf2b63dae12dec55-removing
fe903be376821b7afee38a016f9765136ecb096c59178156299acb9f629061a2
fe903be376821b7afee38a016f9765136ecb096c59178156299acb9f629061a2-init

@kasunsjc please read the posts just above yours.

I confirm upgrading to 17.06.2-ce solved this issue. I didn't have to manually remove the directories (this time) after the upgrade, either.

17.06.2-ce _appears_ to have fixed this for me as well. No more -removing directories in there, got a decent amount of space back.

I'm assuming that the -init directories I have in aufs/diff are unrelated (some of them are pretty old). They are all small, though, so it hardly matters.

Updating to 17.07.0 solved the issue here too. Before, not even docker system prune --all -f would remove the directories, but after upgrading they were removed automatically on reboot.

Confirming this issue was resolved on Ubuntu 16.04 with 17.06.2-ce. As soon as it was updated, the space cleared.
