With the current approach in the Docker entrypoint for updating files to the new custom UID/GID, the process takes so long that it times out under a reasonable production container health check.
Why not just use chown -Rf mastodon:mastodon /mastodon/public/system instead of finding and filtering non-mastodon files?
@wonderfall
The goal is not to chown /mastodon/public/system. It would take a long time to do that (believe me, I tried every possible combination). With -path path -prune -o, find won't even descend into it (-not -path path would also exclude it, but find would still traverse it, so it would take time). The current command is fast because it updates all permissions except those under /public/system, which likely contains a lot of data.
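To make the difference between the two find forms concrete, here is a minimal sketch on a throwaway directory tree. The directory and file names are made up for illustration, and `!` is used as the portable spelling of -not. Both commands print the same files; what differs is that the -prune form never walks into the excluded directory, which is where the time is saved on a large media folder.

```shell
#!/bin/sh
# Build a tiny throwaway tree that mimics the layout discussed above.
# All names below are illustrative assumptions, not the real Mastodon tree.
root=$(mktemp -d)
mkdir -p "$root/app" "$root/public/system/media"
touch "$root/app/a.rb" "$root/public/system/media/big.bin"

# -prune: find stops at public/system and never descends into it.
pruned=$(find "$root" -path "$root/public/system" -prune -o -type f -print)

# ! -path: the subtree is still traversed; its entries are only filtered out afterwards.
filtered=$(find "$root" -type f ! -path "$root/public/system/*")

echo "with -prune:  $pruned"
echo "with ! -path: $filtered"

# Clean up the temporary tree.
rm -rf "$root"
```

On a tree this small both forms finish instantly; the gap should only show up when the excluded directory holds many files.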
So if I understand correctly, this command takes a long time for you?
Can you run something like time docker run -ti --rm mastodon true?
Can you send me your docker info?
I'm having the same issue, here's some info:
docker info
Containers: 53
Running: 53
Paused: 0
Stopped: 0
Images: 130
Server Version: 17.05.0-ce
Storage Driver: aufs
Root Dir: /var/lib/docker/aufs
Backing Filesystem: extfs
Dirs: 477
Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.8.0-46-generic
Operating System: Ubuntu 17.04
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.66GiB
Name: concorde.dissidence.ovh
ID: UEKG:ZQYV:I6EF:VIMY:TV5W:HXRD:SEDE:GPHJ:LKUV:OFGB:NQDY:VVNM
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: fmauneko
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
time docker run --rm -it gargron/mastodon:v1.4rc2 true
Creating mastodon user (UID : 991 and GID : 991)...
Updating permissions...
Executing process...
docker run --rm -it gargron/mastodon:v1.4rc2 true 0.01s user 0.01s system 0% cpu 1:29.87 total
No problem with :
# docker info
Containers: 34
Running: 34
Paused: 0
Stopped: 0
Images: 51
Server Version: 17.05.0-ce
Storage Driver: btrfs
Build Version: Btrfs v4.7.3
Library Version: 101
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.11.1
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.16GiB
Name: drogon
ID: TGPJ:KNGK:XV7N:LHP3:BXZG:AHLJ:QOR3:AL6F:ZHME:LHFZ:YBRP:MT4M
Docker Root Dir: /docker
Debug Mode (client): false
Debug Mode (server): false
Username: wonderfall
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: true
# time docker run -ti --rm mastodon true
Creating mastodon user (UID : 991 and GID : 991)...
Updating permissions...
Executing process...
docker run -ti --rm mastodon true 0.01s user 0.01s system 0% cpu 5.303 total
Only thing I see is the storage driver, and indeed on my desktop computer which has btrfs:
docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 15
Server Version: 17.05.0-ce
Storage Driver: btrfs
Build Version: Btrfs v4.10.2
Library Version: 102
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 4.10.13-1-ARCH
Operating System: Arch Linux
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.55GiB
Name: izanami
ID: LY2X:EOSA:YURV:H2OK:MVIU:HWPB:DXCX:GTUJ:DDKJ:2CUC:IZZT:RNFA
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
time docker run --rm -it gargron/mastodon:v1.4rc2 true
Creating mastodon user (UID : 991 and GID : 991)...
Updating permissions...
Executing process...
docker run --rm -it gargron/mastodon:v1.4rc2 true 0,02s user 0,02s system 0% cpu 12,941 total
I tried on my Mac, which is using aufs, and it takes around 5 seconds. Perhaps that's because of the SSD. But aufs is clearly less performant than btrfs.
# time find /mastodon -path /mastodon/public/system -prune -o -not -user mastodon -not -group mastodon -print0 | xargs -0 chown -f mastodon:mastodon
real 0m 4.26s
user 0m 0.10s
sys 0m 0.45s
# I ran another container to try a "non-optimised" command
# time chown -R 991:991 *
real 0m 5.45s
user 0m 0.18s
sys 0m 1.55s
Yeah so I guess we can accurately say that this is not an issue with Mastodon, but that it's linked to the Docker storage driver choice.
I agree with @fmauNeko, unfortunately we can't do more here.
If you're curious, this article also explains why we should use an entrypoint rather than hardcoding something in the Dockerfile: https://denibertovic.com/posts/handling-permissions-with-docker-volumes/ (I just found it, but it seems we had exactly the same idea! That's also what I'm doing for all my images, which you can find at wonderfall/dockerfiles)
cc @xataz if you have an idea.
You are right, it took 9m20s to update permissions on my cloud server (overlay2 storage driver) and 7s on my local machine (aufs storage driver). I thought the find command would cost more, but that's not the case. Looks like a storage driver issue. I'll look into improving my cloud Docker setup.
I tried to change the storage driver from overlay2 to aufs on my Debian jessie VC1S Scaleway instance, but the Docker daemon fails to start because my kernel doesn't support aufs.
I will build my image without the chown command in the entrypoint script.
But I think it is better to tell admins in the release notes to run the command before pulling new images than to force all admins on overlayfs to build their own images if they don't want to wait 30 minutes (yes, with 3 containers it takes time) to start their containers.
It's weird to have such bad performance on overlay2, which should be better than aufs. I'm also running on scaleway (VC1M).
OK, I can override the entrypoint script: https://docs.docker.com/compose/compose-file/#entrypoint
Maybe we should add this in the documentation?
@katarpilar I believe it would be best to figure out why we have such bad performance on our cloud setup and add the solution to the Troubleshoot docs. I assume a lot of admins use Scaleway services.
FYI I just got 13s run on another aufs non-ssd cloud service.
That's the reason why I gave up on overlay2. The performance is terrible, not with this command in particular, but in general.
That being said, aufs is still the standard storage driver for Docker, so I thought no one would complain (but I was wrong :sad: ). Note that overlayfs shouldn't really be recommended for production environments, even though it has been seen as a potential successor to aufs:
As promising as OverlayFS is, it is still relatively young. Therefore caution should be taken before using it in production Docker environments.
Source : Docker documentation
Speaking of OverlayFS, I noticed this performance issue a while ago. I came up with a hack which consists of refreshing (somehow) the files in the layers (so basically this could fix your issue), but it stopped working with a Docker update.
Well I have bad performance with aufs myself, so that's strange.
Same issue (overlay2) :
# docker info
Containers: 14
Running: 14
Paused: 0
Stopped: 0
Images: 61
Server Version: 17.05.0-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9048e5e50717ea4497b757314bad98ea3763c145
runc version: 9c2d8d184e5da67c95d601382adf14862e4f2228
init version: 949e6fa
Kernel Version: 4.9.0-0.bpo.2-amd64
Operating System: Debian GNU/Linux 8 (jessie)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.26GiB
Name: tortank
ID: JHJB:GWQI:ZGEE:WEZ4:RLH5:RVRR:UWPA:ZFAH:5MK4:LBAM:4DRA:ZZ36
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: oxynux
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: true
I am also having HORRIBLE load times with permissions running locally on macOS. 9+ minutes
@malicioustoker is it the overlay2 storage driver? You can get the information with the docker info command.
Yeah, it is using Overlay2
Containers: 12
Running: 6
Paused: 0
Stopped: 6
Images: 56
Server Version: 17.03.1-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
seccomp
Profile: default
Kernel Version: 4.9.27-moby
Operating System: Alpine Linux v3.5
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 15.65 GiB
Name: moby
ID: JN5A:LMHV:2GUL:FGKP:GTOH:5IWI:V3TP:GTUD:TEWR:5RBY:EWZ3:U6UY
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: 64
Goroutines: 63
System Time: 2017-05-22T15:10:42.611514466Z
EventsListeners: 1
No Proxy: *.local, 169.254/16
Registry: https://index.docker.io/v1/
Experimental: true
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
How can I change it from Overlay2 to something else? That's just the default option when Docker is installed on macOS
@malicioustoker Interesting, I still have aufs on my macOS machine (that said, I don't have the latest version yet).
But you can change your storage-driver easily:
I really think we should open an issue at Docker (moby/moby) rather than arguing about this change, because what else can we do? overlay2 shouldn't have performance issues, while btrfs is as fast as it would be on a classic filesystem.
I do understand your frustration if it's taking too long (10 minutes??? Come on!), and I suffered from this bug for months before finally giving up on overlay2. The thing is, overlay2 will overtake aufs in the future (btrfs, devicemapper, which shouldn't be used, and zfs remain as alternative options), so I'm concerned too.
@Waterfall trying that now - thanks 😊 I also sent you an email yesterday - do you have time to chat about something. You're the only other person I know running Mastodon off macOS
For Scaleway users, this is how you change to aufs:
Run sudo modprobe aufs. If it exits with no output, the module is there.
Then set the storage driver in /etc/docker/daemon.json:
{
"storage-driver": "aufs"
}
Restart the Docker service and that's it.
Changing to aufs fixed the issue - it now takes no more than 5 seconds to change permissions - thanks everyone!
Can someone try the --squash option? Someone still using overlay2, I mean.
docker build --squash -t mastodon .
For this to work you'll have to enable experimental features in Docker; put this in /etc/docker/daemon.json:
{
"experimental": true
}
What's the difference between overlay2 and the other storage drivers? If overlay2 is supposed to be better, I'll switch back and test this command out for you if you'd like.
OverlayFS implements a union mount, it's supposed to be faster, and it's in the upstream Linux kernel. It will overtake aufs for these reasons once it's mature (and that has already happened in RHEL/CentOS).
That said, there are other alternatives:
overlay2 already improves a lot on overlay, but OverlayFS is still a bit immature; that's why stable Docker still uses aufs as the default (Edge Docker uses overlay2 now, AFAIK).
Its only advantage now is that it's already in the upstream kernel source.
If you need to set up a new server for Docker, though, use btrfs or zfs, as they are natively copy-on-write filesystems.
I'm on overlay2 too and it takes over 30 minutes for me. I've spent all morning upgrading because the recreate and migrate commands start the chown in the entrypoint.
And spinning up the containers starts 3 chown jobs... a full hour before just getting Mastodon up.
@xsteadfastx you have 2 options :
@katarpilar I chose overlay2 because the Docker docs say I should... ;-)
I know I can override the entrypoint script, but maybe we can have a discussion about running this chown on every start of every container.
The reason why the chown is done in the entrypoint and not in the Dockerfile: https://github.com/tootsuite/mastodon/issues/3194#issuecomment-302946031
@fmauNeko I know this, and I also do this in my Docker images. But maybe this could be a task that you run as a command... like the asset compiling or DB migration... if it's needed....
@xsteadfastx Actually it's needed every time, because the chown changes the mastodon source files, not the data volumes, which are explicitly ignored by the find command at https://github.com/tootsuite/mastodon/blob/master/docker_entrypoint.sh#L11.
This is needed because the Dockerfile's COPY at https://github.com/tootsuite/mastodon/blob/master/Dockerfile#L46 creates files with UID 0.
The way it's built in the entrypoint is common practice.
I'm not sure where Docker recommended overlay2 for production environments, but I've learned that this is not a correct statement. I think it's safe to say this is not a Mastodon problem, this is a storage driver problem. Use dd to compare IO performance.
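A rough sketch of such a dd comparison follows, assuming you run it once on the host and once inside a container and compare the figures dd reports. The file location and the 16 MiB size are arbitrary choices. Keep in mind that dd measures sequential write throughput, while the chown pass is dominated by per-file metadata operations, so this is only a coarse indicator.

```shell
#!/bin/sh
# Write a 16 MiB test file; dd prints the elapsed time and throughput on stderr.
# The temporary file path and the size are illustrative assumptions.
testfile=$(mktemp)
dd if=/dev/zero of="$testfile" bs=1048576 count=16

# Confirm how many bytes actually landed on disk.
size=$(wc -c < "$testfile" | tr -d ' ')
echo "wrote $size bytes to $testfile"

# Remove the test file again.
rm -f "$testfile"
```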
They recommend it because it's in the mainline kernel, and they made it the default driver on the edge versions. The stable versions are staying with aufs as a default.
But from my experience I'd say that the best for production environments performance-wise would be zfs and btrfs.
I know it's common practice, but running the same thing for each command over and over again is not the best. Running it through Docker Compose starts the find three times just to get it up. And after one run this should not be needed any more.
For me it's more of a maintenance command if it takes up to an hour and runs for every single command on a pretty normal Docker setup like one using overlay2.
I understand why this is needed, but I don't know if it's the best way.
The command was designed with this issue in mind: it won't execute chown where it's not needed. It's almost instant on every file system except overlay2 (find will be very slow too...).
As I suggested earlier, can someone using overlay2 try to squash the image during the build process? There will be no Docker cache, but the command can be much faster since we're using a single layer in the final image.
Otherwise, I suggest that someone open an issue at moby/moby. This shouldn't be a Mastodon issue. I believe it's a serious performance issue, and if something should change, it would be in overlay2. Or I'm missing something else, the way it works, etc., but anyway, the end user doesn't care, and I don't know why they should be forced to use a buggy feature. And I don't know why we should revert a common and good practice because, among the several choices, only one has a serious issue.
ok i will try to switch to aufs.
it looks like aufs is not possible on a ubuntu 16.04 LTS. too bad. so im stuck with hours of chowning files.
Do you have the linux-image-extra package installed for your kernel branch/version? If so, what's your docker info?
@Wonderfall I tried squashing and it didn't help at all.
@xsteadfastx No, you're not stuck. You can still use btrfs:
[...] /docker.
[...] /etc/docker/daemon.json: [...] /var/lib/docker to /docker
[...] storage-driver (or add it) value to "btrfs"
@fmauNeko yep... installed...
Containers: 21
Running: 7
Paused: 0
Stopped: 14
Images: 165
Server Version: 17.03.1-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.4.0-78-generic
Operating System: Ubuntu 16.04.2 LTS
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 3.736 GiB
Name: rorschach
ID: 46XI:SRHG:O332:EDNX:4Q6P:ZOEW:XRA2:EGIU:2ANA:I2AZ:JOJ7:NLUD
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: true
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
@xsteadfastx try installing linux-image-extra-virtual, then sudo modprobe aufs
@xsteadfastx also, don't forget to add this to /etc/docker/daemon.json:
{
"storage-driver": "aufs"
}
@miguelpeixe no kernel support for aufs on ubuntu 16.04
@Wonderfall i dont have a spare partition for btrfs... too bad... else i would test it right away
Hum, that's weird, I'm using aufs on all my 16.04 servers, including my local setup.
@xsteadfastx
now, I'm updating. overlay2 -> aufs is ok. (sorry, Ubuntu 16.10)
$ sudo apt-get update
$ sudo apt-get install linux-image-extra-$(uname -r) linux-image-extra-virtual
$ sudo modprobe aufs
$ grep aufs /proc/filesystems
$ sudo nano /etc/docker/daemon.json
{
"storage-driver": "aufs"
}
$ sudo service docker restart
$ docker info
OK, I have to say sorry... modprobe aufs after installing linux-image-extra-virtual did the trick... Sorry for this discussion about the Docker image... it works pretty well, I was just in a bad mood because it took hours to upgrade Mastodon.
thanks for all the help.
There seems to be no "fix" for this issue yet, and some of us don't have aufs-compatible kernels or the luxury of creating a VM or attaching an extra partition with btrfs or zfs for the sake of prototyping.
So my question here would be, how harmful exactly is just doing:
chown mastodon:mastodon /mastodon/public/system
instead of: find /mastodon -path /mastodon/public/system -prune -o -not -user mastodon -not -group mastodon -print0 | xargs -0 chown -f mastodon:mastodon
Would it suffice to just add a check to the script and, when overlay or overlay2 is detected, print a warning and just chown the entire system directory? It seems like a good compromise considering the speed "bug" is with Docker (or overlay2... depending on how you think about it) and may not be fixed for a while, while Docker is slowly migrating to overlay and Linux distros are slowly removing aufs support from the default shipped kernels.
My current fix is to run the Mastodon instance with the original chown (so as not to risk any issues) and "manually" modify the script inside the image to chown only the directory when running any other tasks (e.g. creating admin users) much faster. But that is hardly the most convenient thing to do, since it requires a file edit every time I want to reboot my instance.
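The detection idea raised above could be sketched roughly as follows. Both helper names are hypothetical, and the sketch parses docker info text on the host; the docker CLI is usually not available inside the container where the entrypoint runs, so a real check would need another source for the driver name.

```shell
#!/bin/sh
# Hypothetical helper: extract the storage driver name from `docker info` text on stdin.
storage_driver_from_info() {
    sed -n 's/^[[:space:]]*Storage Driver:[[:space:]]*//p' | head -n 1
}

# Hypothetical helper: succeed only when the driver is overlay or overlay2.
is_overlay_driver() {
    case "$1" in
        overlay|overlay2) return 0 ;;
        *) return 1 ;;
    esac
}

# Sample input copied from the docker info reports in this thread.
driver=$(printf 'Server Version: 17.05.0-ce\nStorage Driver: overlay2\nBacking Filesystem: extfs\n' \
    | storage_driver_from_info)

# Warn on stderr when the slow case is detected.
if is_overlay_driver "$driver"; then
    echo "warning: storage driver is $driver, updating permissions may be slow" >&2
fi
echo "$driver"
```

On a real host the sample input would be replaced with `docker info | storage_driver_from_info`.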
Well it won't be recursive, and I'm not sure chown -R would be faster than the current solution.
Just tried it on my 2 CPU, 8 GB RAM server and it took 49 mins... 😦
What do you think about
https://github.com/tootsuite/mastodon/pull/6510
For me it would be a nice workaround.
Not sure how big of an issue it is, but while running this commit, I now get this error
Step 16/19 : COPY --chown=${UID}:${GID} . /mastodon
ERROR: Service 'web' failed to build: unable to convert uid/gid chown string to host mapping: can't find uid for user ${UID}: no such user: ${UID}
Closed #3194 (https://github.com/tootsuite/mastodon/issues/3194) via #6514 (https://github.com/tootsuite/mastodon/pull/6514).
@moritzheiber
The solution here would be to use the user/group _names_ instead of the variables.
I'll come up with a PR.
@malicioustoker This should be fixed now.
Can confirm that it has been fixed, thanks!