Description
I have a local (insecure) registry for debugging purposes. But I cannot run manifest inspect against it, even with --insecure.
$ docker -D manifest inspect --insecure 172.17.0.4:5000/foo:latest
DEBU[0000] hostDir: /etc/docker/certs.d/172.17.0.4:5000
DEBU[0000] endpoints for 172.17.0.4:5000/foo:latest: []
DEBU[0000] hostDir: /etc/docker/certs.d/172.17.0.4:5000
DEBU[0000] endpoints for 172.17.0.4:5000/foo:latest: []
open /etc/docker/certs.d/172.17.0.4:5000: permission denied
Steps to reproduce the issue:
Just run docker -D manifest inspect --insecure 172.17.0.4:5000/foo:latest.
The registry and the image don't need to exist to hit the problem, since it occurs before any network sockets are opened (according to my quick look over the strace logs).
Describe the results you received:
open /etc/docker/certs.d/172.17.0.4:5000: permission denied
Describe the results you expected:
The manifest to be printed.
Additional information you deem important (e.g. issue happens only occasionally):
The directory /etc/docker/certs.d doesn't exist, but in any case the permissions on /etc/docker are 0700. It also seems odd to rely only on a system-wide directory here. I couldn't find any setting or config file option which would redirect this to e.g. ~/.docker/certs.d.
Output of docker version:
$ docker version
Client:
Version: 18.06.0-ce
API version: 1.38
Go version: go1.10.3
Git commit: 0ffa825
Built: Wed Jul 18 19:09:33 2018
OS/Arch: linux/amd64
Experimental: true
Server:
Engine:
Version: 18.06.0-ce
API version: 1.38 (minimum version 1.12)
Go version: go1.10.3
Git commit: 0ffa825
Built: Wed Jul 18 19:07:38 2018
OS/Arch: linux/amd64
Experimental: true
Also reproduced with 96dba79d99d69df9adc67b230d9dd39849733ef2 (recent master).
Output of docker info:
Containers: 3
Running: 3
Paused: 0
Stopped: 0
Images: 178
Server Version: 18.06.0-ce
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: d64c661f1d51c48782c9cec8fda7604785f93587
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
apparmor
seccomp
Profile: default
Kernel Version: 4.14.0-3-amd64
Operating System: Debian GNU/Linux buster/sid
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 15.53GiB
Name: bokrug
ID: AU33:BO7D:7VGM:MOLB:RSDF:IBRV:GCWT:THHM:OEVM:TX3C:BNLM:WHAR
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
File Descriptors: 41
Goroutines: 67
System Time: 2018-09-10T10:09:15.802266701+01:00
EventsListeners: 0
Username: ijc25
Registry: https://index.docker.io/v1/
Labels:
Experimental: true
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
WARNING: No swap limit support
Additional environment details (AWS, VirtualBox, physical, etc.):
Native Linux (Debian) running docker-ce packages from download.docker.com apt repo.
Taking a look
@ijc -- What registry are you running? If it's one you created using registry:2, can you list the config details? It's really odd that a) the list of endpoints is empty, and b) you're getting that hosts dir output if you used the insecure flag.
Also, unfortunately, you can't override that certs dir location AFAIK. See @dnephin's comment at https://github.com/docker/cli/pull/138#issuecomment-341560126. (Just FYI) If anyone knows otherwise, lmk.
@clnperez it's a container I've had for a while, my notes claim I created it with:
docker volume create registry
docker pull registry:2
docker run -d -p 5000:5000 --name registry -v registry:/var/lib/registry registry:2
I've not (knowingly) reconfigured anything so I think this is the default:
# cat /etc/docker/registry/config.yml
version: 0.1
log:
fields:
service: registry
storage:
cache:
blobdescriptor: inmemory
filesystem:
rootdirectory: /var/lib/registry
http:
addr: :5000
headers:
X-Content-Type-Options: [nosniff]
health:
storagedriver:
enabled: true
interval: 10s
threshold: 3
But I don't think it even gets as far as opening a socket; it hits a client-side error before that.
Let me know if you need anything else.
@ijc thanks. just one more thing -- the hash of the registry:2 image you've got running.
@clnperez
$ docker container inspect registry | jq .[0].Image
"sha256:b2b03e9146e1c7197e63f67d0d48b87b2b18a6e40660f9d89e6d0b450b6bfa38"
@ijc, looks like that one isn't in the registry any more. Can you give me the output from:
docker run --rm registry:2 -v
@clnperez I just wanted to reiterate something I've said a couple of times now: the failure is occurring before the CLI has made any network calls at all, so I don't think the version of the registry it is talking to can be in any way relevant (in fact it doesn't even need to exist or be running).
$ grep -E 'socket|accept|listen|bind|172.17.0.4|AF_INET|gethostbyaddr' /tmp/docker-manifest-inspect.strace.*
/tmp/docker-manifest-inspect.strace.17582:execve("/usr/bin/docker", ["docker", "-D", "manifest", "inspect", "--insecure", "172.17.0.4:5000/foo:latest"], 0x7ffd843df480 /* 55 vars */) = 0
/tmp/docker-manifest-inspect.strace.17582:socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
/tmp/docker-manifest-inspect.strace.17582:connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
/tmp/docker-manifest-inspect.strace.17582:socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
/tmp/docker-manifest-inspect.strace.17582:connect(3, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
/tmp/docker-manifest-inspect.strace.17582:socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
/tmp/docker-manifest-inspect.strace.17582:socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 5
/tmp/docker-manifest-inspect.strace.17586:openat(AT_FDCWD, "/home/ijc/.docker/manifests/172.17.0.4-5000_foo-latest", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
/tmp/docker-manifest-inspect.strace.17586:openat(AT_FDCWD, "/etc/docker/certs.d/172.17.0.4:5000", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
/tmp/docker-manifest-inspect.strace.17586:openat(AT_FDCWD, "/etc/docker/certs.d/172.17.0.4:5000", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
You can see that there are no socket calls other than the Unix domain ones for /var/run/nscd and nothing else network related.
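The same check can be done mechanically rather than by eye. The following is a minimal sketch, assuming strace output in the format shown above; the function names `socket_families` and `made_network_sockets` are made up for illustration and are not part of any Docker tooling.

```python
import re

# Matches strace lines such as:
#   socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
# including when grep has prefixed the line with "<file>:".
SOCKET_CALL = re.compile(r"\bsocket\((AF_[A-Z0-9_]+),")


def socket_families(strace_lines):
    """Count socket() calls per address family in strace output lines."""
    counts = {}
    for line in strace_lines:
        match = SOCKET_CALL.search(line)
        if match:
            family = match.group(1)
            counts[family] = counts.get(family, 0) + 1
    return counts


def made_network_sockets(strace_lines):
    """Return True if any socket() call used an IPv4 or IPv6 family."""
    families = socket_families(strace_lines)
    return any(family in families for family in ("AF_INET", "AF_INET6"))
```

Fed the grep output above, `made_network_sockets` would be expected to return False, since every `socket()` call there uses `AF_UNIX`.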
For some reason (unknown to me) my /etc/docker has permissions 0700, hence it sees EACCES rather than ENOENT when looking at /etc/docker/certs.d/. The fix might be as simple as some code on the client side handling those two errors the same way?
I've reported the 0700 permissions thing as an engine issue in https://github.com/moby/moby/issues/37840.
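To make the "handle both errors the same way" idea concrete, here is a minimal sketch in Python. It is not the Docker CLI's code (which is written in Go) and not the eventual fix; `list_cert_files` is a hypothetical helper, and its `listdir` parameter exists only so the permission-denied case can be simulated without changing real file modes.

```python
import os


def list_cert_files(host_dir, listdir=os.listdir):
    """Return the sorted entries of host_dir, or [] if it cannot be read.

    A missing directory (ENOENT) and an unreadable one (EACCES) are both
    treated as "no per-host certificates configured". Any other error,
    for example host_dir being a regular file, still propagates.
    """
    try:
        return sorted(listdir(host_dir))
    except (FileNotFoundError, PermissionError):
        return []
```

With this behaviour, a caller asking for `/etc/docker/certs.d/172.17.0.4:5000` under a 0700 `/etc/docker` would get an empty list and could continue, instead of aborting before any endpoint is tried. Whether silently ignoring EACCES is the right policy is a design decision for the CLI maintainers; a debug log line at that point may be preferable to complete silence.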
@ijc i read what you were saying and i don't think it needs reiterating. i'm sorry if i didn't acknowledge what you'd said. but what's still strange to me is that your list of endpoints is empty. the permissions one is probably something that the manifest command can/should work around, but what i'm seeing in the current code doesn't make sense that it got to that point.
However the empty list of endpoints arises, it has nothing to do with the version or configuration of the registry on the server side, since the client never even tries to talk to the server, but you were asking questions specific to the server side of things.
From strace I can see where the endpoints message is printed, the context is:
openat(AT_FDCWD, "/etc/docker/certs.d/172.17.0.4:5000", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
write(2, "\33[37mDEBU\33[0m[0000] endpoints fo"..., 66) = 66
write(2, "\33[37mDEBU\33[0m[0000] hostDir: /et"..., 66) = 66
openat(AT_FDCWD, "/etc/docker/certs.d/172.17.0.4:5000", O_RDONLY|O_CLOEXEC) = -1 EACCES (Permission denied)
write(2, "\33[37mDEBU\33[0m[0000] endpoints fo"..., 66) = 66
So it does seem like the empty endpoints is pretty directly related to the EACCES.
If I correct the permissions on /etc/docker to 0755 then instead I see:
$ docker -D manifest inspect --insecure 172.17.0.3:5000/foo:latest
DEBU[0000] hostDir: /etc/docker/certs.d/172.17.0.3:5000
DEBU[0000] hostDir: /etc/docker/certs.d/172.17.0.3:5000
DEBU[0000] endpoints for 172.17.0.3:5000/foo:latest: [{false https://172.17.0.3:5000 v2 false false true 0xc4204c6c00} {false https://172.17.0.3:5000 v1 false false true 0xc4204c6d80}]
DEBU[0000] error with repo endpoint {%!s(*registry.RepositoryInfo=&{{172.17.0.3:5000 foo} 0xc4202e09c0 false }) {%!s(bool=false) %!s(*url.URL=&{https <nil> 172.17.0.3:5000 false }) %!s(registry.APIVersion=2) %!s(bool=false) %!s(bool=false) %!s(bool=true) %!s(*tls.Config=&{<nil> <nil> [] map[] <nil> <nil> <nil> <nil> <nil> [] 0 <nil> true [49196 49200 49195 49199 49162 49161 49172 49171 53 47] true false [123 26 177 2 12 180 4 49 139 85 145 57 51 219 174 34 158 52 4 193 205 74 40 116 204 145 78 129 114 212 233 189] <nil> 769 0 [] false 0 <nil> {{0 0} 1} {{0 0} 0 0 0 0} [{[227 213 248 79 41 196 201 98 205 114 18 228 223 136 13 1] [208 147 212 136 191 26 213 163 113 236 105 77 100 109 10 13] [197 138 17 48 72 221 91 95 152 147 185 166 152 158 79 144]}]})}}: failed to configure transport: error pinging v2 registry: Get https://172.17.0.3:5000/v2/: http: server gave HTTP response to HTTPS client
DEBU[0000] skipping v1 endpoint https://172.17.0.3:5000
DEBU[0000] hostDir: /etc/docker/certs.d/172.17.0.3:5000
DEBU[0000] hostDir: /etc/docker/certs.d/172.17.0.3:5000
DEBU[0000] endpoints for 172.17.0.3:5000/foo:latest: [{false https://172.17.0.3:5000 v2 false false true 0xc4204c7500} {false https://172.17.0.3:5000 v1 false false true 0xc4204c7680}]
DEBU[0000] error with repo endpoint {%!s(*registry.RepositoryInfo=&{{172.17.0.3:5000 foo} 0xc4202e11a0 false }) {%!s(bool=false) %!s(*url.URL=&{https <nil> 172.17.0.3:5000 false }) %!s(registry.APIVersion=2) %!s(bool=false) %!s(bool=false) %!s(bool=true) %!s(*tls.Config=&{<nil> <nil> [] map[] <nil> <nil> <nil> <nil> <nil> [] 0 <nil> true [49196 49200 49195 49199 49162 49161 49172 49171 53 47] true false [34 46 124 210 178 22 157 181 68 91 245 180 234 75 45 56 200 226 120 142 124 158 92 153 189 194 125 158 90 70 211 68] <nil> 769 0 [] false 0 <nil> {{0 0} 1} {{0 0} 0 0 0 0} [{[244 139 193 158 233 153 230 227 247 97 164 238 52 208 83 230] [70 233 228 182 253 84 246 155 55 129 117 2 192 1 249 2] [159 27 220 251 253 6 92 11 166 225 249 108 109 85 103 137]}]})}}: failed to configure transport: error pinging v2 registry: Get https://172.17.0.3:5000/v2/: http: server gave HTTP response to HTTPS client
DEBU[0000] skipping v1 endpoint https://172.17.0.3:5000
no such manifest: 172.17.0.3:5000/foo:latest
(Note that between my initial report and now the registry's IP address changed because I restarted.)
If I hit a non-existent server address with the perms corrected then I see:
```console
$ docker -D manifest inspect --insecure 172.17.0.4:5000/foo:latest
DEBU[0000] hostDir: /etc/docker/certs.d/172.17.0.4:5000
DEBU[0000] hostDir: /etc/docker/certs.d/172.17.0.4:5000
DEBU[0000] endpoints for 172.17.0.4:5000/foo:latest: [{false https://172.17.0.4:5000 v2 false false true 0xc4200b1200} {false https://172.17.0.4:5000 v1 false false true 0xc4200b1380}]
DEBU[0000] not continuing on error (*url.Error) Get https://172.17.0.4:5000/v2/foo/manifests/latest: dial tcp 172.17.0.4:5000: connect: connection refused
DEBU[0000] hostDir: /etc/docker/certs.d/172.17.0.4:5000
DEBU[0000] hostDir: /etc/docker/certs.d/172.17.0.4:5000
DEBU[0000] endpoints for 172.17.0.4:5000/foo:latest: [{false https://172.17.0.4:5000 v2 false false true 0xc42009d380} {false https://172.17.0.4:5000 v1 false false true 0xc42009d500}]
DEBU[0000] not continuing on error (*url.Error) Get https://172.17.0.4:5000/v2/foo/manifests/latest: dial tcp 172.17.0.4:5000: connect: connection refused
Get https://172.17.0.4:5000/v2/foo/manifests/latest: dial tcp 172.17.0.4:5000: connect: connection refused
```
Both of those seem pretty sensible (there is no `foo:latest` image on my registry).
I was initially unable to recreate this because I forgot that loopback (which is what I was using) is in the engine's insecure registries list by default. I mapped it to the host's actual IP and, with the permissions set to 700, can recreate it.
Also, I realized what you were trying to get at after I sent that last message. I was looking for something the wrong way. So, apologies for being a little slow there.
Thanks for all that detail.
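The loopback point above can be illustrated with a small CIDR check. This is a simplified sketch, assuming only the `127.0.0.0/8` entry listed under "Insecure Registries" in the `docker info` output earlier; `host_in_insecure_cidrs` is a hypothetical name, and the engine's real matching logic may do more than this (hostnames, for instance, are not handled here).

```python
import ipaddress

# Taken from the "Insecure Registries" section of the docker info output.
INSECURE_CIDRS = ["127.0.0.0/8"]


def host_in_insecure_cidrs(host, cidrs=INSECURE_CIDRS):
    """Return True if host (an IP, optionally with ":port") is in any CIDR."""
    # Strip a trailing ":port" from IPv4-style "a.b.c.d:port" values only.
    address = host.rsplit(":", 1)[0] if host.count(":") == 1 else host
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False  # not an IP literal; hostnames are out of scope here
    return any(ip in ipaddress.ip_network(cidr) for cidr in cidrs)
```

Under these assumptions a registry at `127.0.0.1:5000` matches the default list while `172.17.0.4:5000` does not, which is consistent with the problem only showing up once the registry was addressed by a non-loopback IP.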
Opened https://github.com/moby/moby/pull/37847 to address this issue
@thaJeztah @vdemeester I think this one should stay open until the permissions handling on the CLI side is fixed too.
Had some time to look at this today and figured out what I missed. Now I need to figure out the fix. Will try to get it done this week.
Thanks for using the manifest command @ijc, and the bug report!