Sourcegraph: dockerd refuses to handle "More than 100 concurrent queries" in its internal DNS resolver

Created on 4 May 2018  ·  4 Comments  ·  Source: sourcegraph/sourcegraph

  • Issue type: bug report
  • Sourcegraph version: 2.7.6 (14795_2018-05-01_bff0d51)
  • OS Version: Amazon Linux 2 LTS Candidate AMI 2017.12.0.20180328.1 x86_64 HVM GP2 (ami-07eb707f)
  • Docker version:
Client:
 Version:      18.05.0-ce-rc1
 API version:  1.37
 Go version:   go1.9.5
 Git commit:   33f00ce
 Built:        Thu Apr 26 01:03:26 2018
 OS/Arch:      linux/amd64
 Experimental: false
 Orchestrator: swarm

Server:
 Engine:
  Version:      18.05.0-ce-rc1
  API version:  1.37 (minimum version 1.12)
  Go version:   go1.9.5
  Git commit:   33f00ce
  Built:        Thu Apr 26 01:01:34 2018
  OS/Arch:      linux/amd64
  Experimental: false

When using docker run --network=lsp, where lsp is a bridge network, dockerd forces /etc/resolv.conf to use 127.0.0.11, its internal DNS resolver. This lets containers on that network reach each other by name.

Unfortunately, dockerd's internal DNS resolver has a hardcoded limit of 100 concurrent DNS queries:

https://github.com/moby/moby/blob/b159da1/vendor/github.com/docker/libnetwork/resolver.go#L70
https://github.com/moby/moby/blob/b159da1/vendor/github.com/docker/libnetwork/resolver.go#L467-L475
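To make the effect of that limit concrete, here is a minimal sketch of a counter-based concurrency cap. It is not the libnetwork code: the type queryLimiter and its methods start and done are hypothetical names chosen for illustration, and only the value 100 comes from the log message below.

```go
package main

import (
	"fmt"
	"sync"
)

// maxConcurrent matches the limit of 100 reported in the dockerd log message.
const maxConcurrent = 100

// queryLimiter is a hypothetical stand-in for the resolver's bookkeeping.
type queryLimiter struct {
	mu    sync.Mutex
	count int
}

// start reserves a slot for one query, or returns false when the cap is reached.
func (q *queryLimiter) start() bool {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.count >= maxConcurrent {
		return false // a real resolver would log an error and drop the query here
	}
	q.count++
	return true
}

// done releases a slot once a query has finished.
func (q *queryLimiter) done() {
	q.mu.Lock()
	defer q.mu.Unlock()
	if q.count > 0 {
		q.count--
	}
}

func main() {
	var q queryLimiter
	rejected := 0
	// Simulate 150 queries arriving before any of them completes.
	for i := 0; i < 150; i++ {
		if !q.start() {
			rejected++
		}
	}
	fmt.Println("in flight:", q.count, "rejected:", rejected)
}
```

With a cap like this, any burst larger than 100 unanswered lookups from a single container has its excess queries rejected, which the client sees as a resolution failure.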

dockerd will log the following errors:

May  4 15:04:37 ip-10-64-11-81 dockerd: time="2018-05-04T15:04:37.372778431Z" level=error msg="More than 100 concurrent queries from 172.18.0.2:47001"
May  4 15:04:43 ip-10-64-11-81 dockerd: time="2018-05-04T15:04:43.447534478Z" level=error msg="More than 100 concurrent queries from 172.18.0.2:56258"

And gitserver will log these errors (edited to redact sensitive information):

14:03:52 gitserver | 2018-05-04T14:03:52+0000 lvl=eror msg="Failed to update" repo=github.example.com/slaw/docker-sourcegraph error="exit status 128" output="fatal: unable to access 'https://[email protected]/slaw/docker-sourcegraph/': Could not resolve host: github.example.com\n"

Steps to reproduce:

  1. Enable around 3000 repos
  2. Wait until gitserver tries to update the cloned repos
  3. gitserver fails to update the cloned repos with the error "Could not resolve host"

Expected behavior:

gitserver should throttle itself so it doesn't try updating more than 100 repositories at a time, when running under Docker in bridged networking mode.

gitserver already throttles itself while cloning, so there is a good precedent for this behavior.

bug

All 4 comments

Thanks for the great bug report! We actually already have a fix committed to master that will throttle updates per the existing gitMaxConcurrentClones so this should be fixed in the next release, which will probably be 2.8 the week of May 21.

If you want to try our latest developer build, you can use the sourcegraph/server:insiders docker image.

@nicksnyder sourcegraph/server:insiders has indeed fixed this issue. I'll wait until 2.8 releases before closing this bug.

Just wanted to give you a heads up that our 2.8 release is now scheduled for May 21. Thanks for helping us improve the insiders build!

Closing this since 2.8 is now released.
