The fetch of the APK index just hangs. I have now hit this on both an Ubuntu server and Docker for Windows.
Step 1/14 : FROM maven:3.3.9-jdk-8-alpine
---> dd9d4e1cd9db
Step 2/14 : RUN apk update && apk upgrade && apk add --no-cache --update ca-certificates bash wget curl tree libxml2-utils putty git && rm -rf /var/lib/apt/lists/* && rm -rf /var/cache/apk/*
---> Running in 536cbd484c36
fetch http://dl-cdn.alpinelinux.org/alpine/v3.5/main/x86_64/APKINDEX.tar.gz
Docker version 17.03.1-ce, build c6d412e
Are you able to start another container shell and curl dl-cdn.alpinelinux.org? Sounds like a networking issue somewhere.
Seems like a DNS issue. Not sure why; I've set the correct DNS settings in %programdata%\docker\config\daemon.json
nslookup dl-cdn.alpinelinux.org
nslookup: can't resolve '(null)': Name does not resolve
Name: dl-cdn.alpinelinux.org
Address 1: 151.101.48.249
Got around this by switching to HTTPS:
RUN sed -i 's/http\:\/\/dl-cdn.alpinelinux.org/https\:\/\/alpine.global.ssl.fastly.net/g' /etc/apk/repositories
Doesn't seem like a DNS issue, since the name has resolved. Unfortunately, I've been stuck at the same point for a while: names are resolved, but I can't connect to anything.
In another issue I thought this could be a DNS issue because the CDN POP IP addresses may change more frequently. If DNS is being cached somewhere and TTL not honored then an outdated IP address may be returned for dl-cdn.alpinelinux.org.
This is why I need debugging information to help pinpoint it. When the issue happens, I need the curl -v -s http://dl-cdn.alpinelinux.org/alpine/v3.5/main/x86_64/APKINDEX.tar.gz > /dev/null output from inside a container and the host so we can compare.
Also facing this thing from time to time. Here's a typical output of GitLab CI when fetching fails:

Manually stopping and retrying the stuck CI job helps, but there's no guarantee of reliability
/ # curl -v -s http://dl-cdn.alpinelinux.org/alpine/v3.5/main/x86_64/APKINDEX.tar.gz > /dev/null
* Trying 151.101.84.249...
* TCP_NODELAY set
* Connected to dl-cdn.alpinelinux.org (151.101.84.249) port 80 (#0)
> GET /alpine/v3.5/main/x86_64/APKINDEX.tar.gz HTTP/1.1
> Host: dl-cdn.alpinelinux.org
> User-Agent: curl/7.56.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx
< Content-Type: application/octet-stream
< Last-Modified: Mon, 04 Dec 2017 09:08:18 GMT
< ETag: "5a251082-b3195"
< Accept-Ranges: bytes
< Content-Length: 733589
< Accept-Ranges: bytes
< Date: Mon, 04 Dec 2017 16:03:17 GMT
< Via: 1.1 varnish
< Connection: keep-alive
< X-Served-By: cache-bma7035-BMA
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1512403397.178165,VS0,VE64
<
{ [8688 bytes data]
* Connection #0 to host dl-cdn.alpinelinux.org left intact
/ #
Success while another process hangs
Step 3/12 : RUN apk add --no-cache --update icu-dev libxml2-dev openldap-dev php7-xdebug && mv /usr/lib/php7/modules/xdebug.so /usr/local/lib/php/extensions/no-debug-non-zts-20160303 && rm -f /etc/php7/conf.d/xdebug.ini && docker-php-ext-install intl ldap soap && curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin --filename=composer
---> Running in 1e9ed33cd4b3
fetch http://dl-cdn.alpinelinux.org/alpine/v3.4/main/x86_64/APKINDEX.tar.gz
One more with matching url
/ # curl -v -s http://dl-cdn.alpinelinux.org/alpine/v3.4/main/x86_64/APKINDEX.tar.gz > /dev/null
* Trying 151.101.84.249...
* TCP_NODELAY set
* Connected to dl-cdn.alpinelinux.org (151.101.84.249) port 80 (#0)
> GET /alpine/v3.4/main/x86_64/APKINDEX.tar.gz HTTP/1.1
> Host: dl-cdn.alpinelinux.org
> User-Agent: curl/7.56.1
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx
< Content-Type: application/octet-stream
< Last-Modified: Thu, 23 Nov 2017 09:53:36 GMT
< ETag: "5a169aa0-a53fe"
< Accept-Ranges: bytes
< Content-Length: 676862
< Accept-Ranges: bytes
< Date: Mon, 04 Dec 2017 16:06:47 GMT
< Via: 1.1 varnish
< Connection: keep-alive
< X-Served-By: cache-bma7022-BMA
< X-Cache: MISS
< X-Cache-Hits: 0
< X-Timer: S1512403608.785535,VS0,VE75
<
{ [2896 bytes data]
* Connection #0 to host dl-cdn.alpinelinux.org left intact
/ #
I ran into this issue in kubernetes. I bounced the kube-dns pod to flush any records it might be caching. This fixed the problem for me.
EDIT: Actually it didn't. Still having the problem.
Are you running Docker-in-Docker by any chance? I have this issue only in dind containers.
I only have this with dind + kubernetes. However it doesn't happen if I use '--network host' or '--net host'. I am using weave overlay network.
@andremarianiello thanks for info, I am also using dind+kubernetes (though with flannel). Have you enabled hostNetwork: true for dind pods?
@nogoegst No I haven't. It worked without doing that.
@andremarianiello so you mean you set --network host for dockerized docker daemon? Where did you set it?
I added it to my docker client commands, e.g. 'docker build --network host ...'
I've seen the k8s issue quite a bit. Wireshark shows Fastly getting stuck sending oversized packets with the Don't Fragment flag set. I don't think this is OP's issue though, as it's Docker for Windows.
I recently started running into a similar issue to this as well though. On linux but the behavior is the same, apk fails fetching and mostly on the index. Again, pulled up wireshark and recreated the problem. I see things going smoothly then the apk process seems to stop ACK'ing segments from the fastly server. Fastly starts throttling and resending segments and it lags out.
I've never recreated this with curl, but it looks like apk uses a built in BSD libfetch for its HTTP communications so maybe there's a bug in there?
My network communication understanding is just enough to get me this far so here's a link to the wireshark log of the communications. hopefully an alpine dev has a better understanding and can parse out a clue or find the problem.
It seems like Fastly is filtering ICMP "fragmentation needed" packets, which means that path MTU discovery (PMTUD) does not work. This can be a problem if your traffic goes via a network link that has an MTU lower than 1500 (typically tunnels/VPNs, PPPoE and similar). This can be worked around by enabling TCP MSS clamping in the network.
Yeah I was treating this as a different issue because it has slightly different characteristics and not the same as #279.
The Wireshark link in https://github.com/gliderlabs/docker-alpine/issues/307#issuecomment-387613710 shows a different traffic behavior. Instead of the traffic being killed at the bridge, it is never ACK'd by libfetch, and Fastly's TCP session gets stuck trying to recover. I don't know if it's even Fastly's fault, as on the surface it seems to be doing the right thing.
... misc traffic
ack: container -> bridge -> fastly
Transmission: container <- bridge <- fastly
Transmission: container <- bridge <- fastly
ack: container -> bridge -> fastly
Transmission: container <- bridge <- fastly
Transmission: container <- bridge <- fastly
ack: container -> bridge -> fastly
Transmission: container <- bridge <- fastly 1
Transmission: container <- bridge <- fastly 2
Transmission: container <- bridge <- fastly 3
Transmission: container <- bridge <- fastly 4
Transmission: container <- bridge <- fastly 5
... some number of other packets
Transmission: container <- bridge <- fastly X
Transmission: container <- bridge <- fastly 1
Transmission: container <- bridge <- fastly 1
Transmission: container <- bridge <- fastly 1
Transmission: container <- bridge <- fastly 1
Transmission: container <- bridge <- fastly 1
Transmission: container <- bridge <- fastly 1
Transmission: container <- bridge <- fastly 1
Transmission: container <- bridge <- fastly 1
.... repeat
observation, networking is hard.
I added it to my docker client commands, e.g. 'docker build --network host ...'
Where exactly did you put this? I'm facing this on Kubernetes right now, where GitLab spins up a container with Docker running in Docker... it has been driving me nuts for the last 4 hours.
@evanrich My gitlab CI was using docker:dind as a service container, and my main build container had a docker client in it which I used to connect to the service container. My repo has a dockerfile in it that I need to be built by the gitlab runner. My .gitlab-ci.yaml file contained the command
docker build .
This builds my docker image. One of my layers in the dockerfile runs apk update. This command hangs, causing the docker build command and the CI as a whole to fail.
However, if I modify my .gitlab-ci.yaml file to have
docker build --network host .
docker will run the apk update command from my dockerfile without hanging.
I believe that the problem is that in docker the MTU is lower than on the host. The way this is supposed to work is via path MTU discovery, but fastly appears to block the PMTU icmp packet (I guess it is a part of their DDoS defence). The way to "fix" this properly is to enable MSS clamping on the host.
https://blog.ipspace.net/2013/01/tcp-mss-clamping-what-is-it-and-why-do.html
The other alternative is to use a different mirror that does not block the PMTU traffic.
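For anyone wanting to try the MSS clamping mentioned above, here is a minimal sketch. It assumes a Linux host with iptables and IPv4 traffic; the helper names mss_for_mtu and clamp_rule are made up for illustration and do not come from this thread. The helpers only print the rule, so it can be reviewed before being run as root. The MSS is taken as the MTU minus 40 bytes (20 for the IPv4 header, 20 for the TCP header, no options), and the MTU of 1450 is just an assumed example value.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread): print an iptables MSS-clamping rule.

# MSS for IPv4 = MTU - 20 (IP header) - 20 (TCP header), assuming no options.
mss_for_mtu() {
    echo $(( $1 - 40 ))
}

# With no argument: let the kernel derive the MSS from the MTU of the outgoing route.
# With an MTU argument: force the MSS that fits that MTU.
clamp_rule() {
    base='iptables -t mangle -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS'
    if [ -n "${1:-}" ]; then
        echo "$base --set-mss $(mss_for_mtu "$1")"
    else
        echo "$base --clamp-mss-to-pmtu"
    fi
}

clamp_rule        # prints the rule ending in --clamp-mss-to-pmtu
clamp_rule 1450   # prints the rule ending in --set-mss 1410
```

The rule rewrites the MSS option in forwarded SYN packets, so both ends agree on smaller segments without relying on the ICMP messages that appear to be filtered. Whether the FORWARD chain is the right place depends on how your container traffic is routed, so treat this as a starting point.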
@ncopa How can we check to see if our docker mtu is lower than our host mtu?
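One possible way to check this, as a rough sketch: on Linux every interface exposes its MTU under /sys/class/net/<name>/mtu, so the values can be listed on the host and again inside the container (or inside the dind container) and compared. The function names and the SYSFS_NET override below are hypothetical, added only so the helpers can be tried against a fake directory tree; the interface names docker0 and eth0 in the usage note are assumptions too.

```shell
#!/bin/sh
# Hypothetical helpers (not from the thread) for comparing interface MTUs on Linux.
# SYSFS_NET can be overridden so the functions can be tried against a fake tree.

# Print "<interface> <mtu>" for every interface visible in this network namespace.
list_mtus() {
    for dir in "${SYSFS_NET:-/sys/class/net}"/*; do
        if [ -r "$dir/mtu" ]; then
            echo "${dir##*/} $(cat "$dir/mtu")"
        fi
    done
}

# Print the MTU of one interface, e.g. `iface_mtu docker0`.
iface_mtu() {
    cat "${SYSFS_NET:-/sys/class/net}/$1/mtu"
}

# Succeed when an inner (bridge/container) MTU fits within the outer (uplink) MTU.
mtu_fits() {
    [ "$1" -le "$2" ]
}

list_mtus
```

For example, mtu_fits "$(iface_mtu docker0)" "$(iface_mtu eth0)" || echo "bridge MTU is larger than the uplink MTU" would flag the case that seems to matter in the dind reports here: a Docker bridge left at the default 1500 on top of an overlay interface with a smaller MTU.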
Are you not using Auto DevOps? I haven't specified a .gitlab-ci.yml file yet. I seem to have worked around part of it by switching to alpine.global.ssl.fastly.net, but I get this:
Status: Downloaded newer image for golang:alpine
---> 95ec94706ff6
Step 2/13 : RUN sed -i 's/http\:\/\/dl-cdn.alpinelinux.org/https\:\/\/alpine.global.ssl.fastly.net/g' /etc/apk/repositories
---> Running in a3de349b32f8
Removing intermediate container a3de349b32f8
---> 39505fc0c5f2
Step 3/13 : RUN apk update; apk add git gcc build-base; go get -v github.com/cloudflare/cloudflared/cmd/cloudflared
---> Running in 548789a2500b
fetch https://alpine.global.ssl.fastly.net/alpine/v3.8/main/x86_64/APKINDEX.tar.gz
fetch https://alpine.global.ssl.fastly.net/alpine/v3.8/community/x86_64/APKINDEX.tar.gz
v3.8.1-22-g24d67bab3a [https://alpine.global.ssl.fastly.net/alpine/v3.8/main]
v3.8.1-16-g96e1e57fed [https://alpine.global.ssl.fastly.net/alpine/v3.8/community]
OK: 9539 distinct packages available
(1/25) Installing binutils (2.30-r5)
and it just hangs at installing binutils every time. Found this: https://github.com/gliderlabs/docker-alpine/issues/279. It seems to be a widespread issue in k8s due to the lower MTU.
I was able to get slightly further with changing my mirror from a fastly mirror to mirror.clarkson.edu using
RUN sed -i 's/http\:\/\/dl-cdn.alpinelinux.org/http\:\/\/mirror.clarkson.edu/g' /etc/apk/repositories
builds are running, will update when they finish.
Edit: Just finished successfully... build 174 (that's how many tries it has taken to get this to work)
Removing intermediate container 5c42267a84e9
---> 339cedacd0cf
Step 12/13 : EXPOSE 54/udp
---> Running in 8308f4f1cb00
Removing intermediate container 8308f4f1cb00
---> b917125f9e41
Step 13/13 : EXPOSE 34411/tcp
---> Running in 5d3115c32a0f
Removing intermediate container 5d3115c32a0f
---> 33616623b643
Successfully built 33616623b643
Successfully tagged registry.evanrichardsonphotography.com/docker/cloudflared/master:a66a757bee6a6de2276ed4a8d3a8de121efc8705
Pushing to GitLab Container Registry...
The push refers to repository [registry.evanrichardsonphotography.com/docker/cloudflared/master]
75ddfc9ca656: Preparing
ff665015151e: Preparing
434f9e907dc9: Preparing
e834c1681702: Preparing
676adc5a23cc: Preparing
e834c1681702: Layer already exists
676adc5a23cc: Layer already exists
434f9e907dc9: Pushed
ff665015151e: Pushed
75ddfc9ca656: Pushed
a66a757bee6a6de2276ed4a8d3a8de121efc8705: digest: sha256:75efdf757e24da3a27a3674f49508e9f85d0d115e921231ae52835f56a28e1b7 size: 1368
Job succeeded
On Kubernetes one should run these containers with hostNetwork: true.
If you come here from Drone CI and their drone plugin, set the MTU that fits you in the settings of the plugin. Probably could save you some hours of debugging and desperate attempts:
kind: pipeline
type: kubernetes
name: default
steps:
- name: dockerize
  image: plugins/docker
  settings:
    ...
    mtu: 1000
Hi, is anyone still suffering from this issue? It seems to have gone away somehow.
I can't reproduce the "hangs" now.
20 days ago it still was present, see above
It was still present last week. What about the last two days? Could you give it a try? 👀
I tried and it worked for these two days. However, it hangs again now.
Hi, thanks for this. It helped my build. I remember finding an article about MTU that may be useful for more information:
https://medium.com/@liejuntao001/fix-docker-in-docker-network-issue-in-kubernetes-cc18c229d9e5
@smnbbrv Thanks a lot for the mtu hint, now drone is finally building ...
For me, apk fetch hangs indefinitely, and it's not because of the MTU but because of a network glitch.
My alpine and apk version:
~# cat /etc/os-release
NAME="Alpine Linux"
ID=alpine
VERSION_ID=3.11.3
PRETTY_NAME="Alpine Linux v3.11"
HOME_URL="https://alpinelinux.org/"
BUG_REPORT_URL="https://bugs.alpinelinux.org/"
~# apk --version
apk-tools 2.10.4, compiled for x86_64.
I can reproduce the issue by just shutting down my eth when apk fetch is downloading apks.
~# apk fetch -R linux-lts
Downloading linux-firmware-sun-20191215-r0
Downloading linux-firmware-microchip-20191215-r0
Downloading linux-firmware-rtl_nic-20191215-r0
Downloading xz-libs-5.2.4-r0
Downloading linux-firmware-keyspan-20191215-r0
Downloading linux-firmware-mwlwifi-20191215-r0
Downloading linux-firmware-cpia2-20191215-r0
Downloading libcrypto1.1-1.1.1d-r3
Downloading linux-firmware-ti-connectivity-20191215-r0
Downloading linux-firmware-slicoss-20191215-r0
Downloading linux-firmware-korg-20191215-r0
Downloading linux-firmware-atmel-20191215-r0
Downloading linux-firmware-tehuti-20191215-r0
Downloading linux-firmware-nvidia-20191215-r0
Downloading linux-firmware-netronome-20191215-r0
5% ###########
Then I bring eth back up and check that network connectivity is fine.
But apk fetch neither fails nor resumes downloading.
UPDATE 2020-06-16:
Since ssbarnea mentioned about IPv6, I checked my environment,
~# sysctl net.ipv6.conf.all.disable_ipv6
net.ipv6.conf.all.disable_ipv6 = 1
And I am running the test from a physical server, not inside container.
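Since apk neither fails nor resumes in the situation described above, one possible workaround is to bound each attempt with a time limit and retry. This is only a sketch: the function name retry_with_timeout is made up, and it assumes a timeout utility that accepts "timeout SECONDS COMMAND" (GNU coreutils and recent BusyBox do; some older BusyBox releases used a different syntax). The attempt count and limit in the usage comment are arbitrary example values.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): run a command under a time limit
# and retry it a fixed number of times before giving up.
# Usage: retry_with_timeout ATTEMPTS SECONDS COMMAND [ARGS...]
retry_with_timeout() {
    attempts=$1
    limit=$2
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        # timeout kills the command if it runs longer than $limit seconds
        if timeout "$limit" "$@"; then
            return 0
        fi
        echo "attempt $n/$attempts failed or timed out: $*" >&2
        n=$((n + 1))
    done
    return 1
}

# Example (values are assumptions): give apk 120 seconds, try up to 3 times.
#   retry_with_timeout 3 120 apk fetch -R linux-lts
```

This does not address why the transfer stalls; it only turns an indefinite hang into a bounded failure that a script or CI job can react to.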
I was able to narrow down the issue, and it is IPv6. If the Docker host has IPv6 enabled you are pretty much f** as apk fetch from inside a container will get stuck trying to fetch from dl-cdn.alpinelinux.org, which returns "dualstack" results, but we all know that IPv6 does not work in containers.
APK gets fully stuck without ever timing out or trying to use the IPv4 addresses, which would likely work.
That problem is a huge PITA as normal debugging techniques will not give any usable results:
- --network host does not matter
- ping or nslookup on dl-cdn.alpinelinux.org from inside the container works too
- wget works (curl is absent from the base image)
I can confirm that the https://stackoverflow.com/a/41497555/99834 hack works on both docker and podman, mainly adding --dns-opt='options single-request' --sysctl net.ipv6.conf.all.disable_ipv6=1 when running/building the containers.
Old problem, but it still happens!
For me, none of the options worked!
I will mention some of the steps that alleviated the problem and allowed me to build the image, even if only after 2 or 3 attempts, which is already good, since before I could not build the image at all!
1 # Change the repository to any other mirror, by adding a RUN line or joining an existing RUN:
echo "http://dl-4.alpinelinux.org/alpine/v3.12/main" > /etc/apk/repositories \
&& apk update ...
The Official List is here: https://mirrors.alpinelinux.org/
2 # The one that behaved best was changing the DNS of the image, by adding a RUN line or joining an existing RUN:
RUN printf "nameserver 208.67.222.222\nnameserver 8.8.4.4\nnameserver 1.1.1.1\nnameserver 9.9.9.9\nnameserver 8.8.8.8\n" > /etc/resolv.conf \
&& apk update && apk add ...
* This Line must be included for all RUNs that update.
3 # Change the Docker DNS:
On Ubuntu, just edit the file /etc/default/docker
Ex: sudo gedit /etc/default/docker
And include this line in the file:
DOCKER_OPTS="--dns 208.67.222.222 --dns 8.8.8.8 --dns 1.1.1.1 --dns 8.8.4.4 --dns 208.67.220.220 --dns 9.9.9.9"
Running the official Drone helm chart on k3os (v0.11), I had to set the MTU to 1450 for my build to finish and not stall on fetching the APKINDEX.
- name: docker-build
  image: plugins/docker
  settings:
    mtu: 1450
I had a similar issue. We have a docker-in-docker build container within a Rancher 2 / Kubernetes environment. I had to decrease the MTU of the inner docker service by adding "mtu": 1200 to /etc/docker/daemon.json. The host server's MTU is 1500.
daemon.json
{
"mtu": 1200
}
did the trick, thx!