Describe the bug
Something in the Ubuntu 1804 (20200130 update) release broke our tests, and we're not sure what. Starting on 1/31 (20200130 was released on 1/30), our smoke tests, which run in Node.js under docker-compose, began throwing ETIMEDOUT errors on external requests, for example:
UnhandledPromiseRejectionWarning: RequestError: Error: connect ETIMEDOUT <ip>:<port>
Here <ip>:<port> is one of our ELBs, so we suspect a DNS resolution issue.
We've confirmed this isn't caused by a code change: we opened a pull request that reverts to a commit from before 1/30, and it still fails with the error above.
We've also tested with Ubuntu 16.04 20200130 and it does not have this issue.
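Since `connect ETIMEDOUT <ip>:<port>` is raised only after the hostname has already been resolved to an address, one way to narrow this down might be to check whether the failing IP is still among the addresses the ELB hostname currently resolves to. The following is a minimal sketch, not something from our workflow: `ip_in_resolved_set` and `check_failing_ip` are hypothetical helper names, and the host and IP arguments are placeholders you would fill in from the error message.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not part of the original report):
# succeeds if the IP given as $1 appears as a whole line in the
# newline-separated address list read from stdin.
ip_in_resolved_set() {
  grep -Fxq -- "$1"
}

# Hypothetical wrapper: compare the IP from the ETIMEDOUT message ($2)
# with what the hostname ($1) resolves to right now via dig.
check_failing_ip() {
  local host="$1" ip="$2"
  if dig +short "$host" | ip_in_resolved_set "$ip"; then
    echo "$ip is still in the DNS answer for $host; connectivity may be the problem"
  else
    echo "$ip is not in the current DNS answer for $host; a stale or wrong answer is possible"
  fi
}
```

Running `check_failing_ip` both on the runner and inside the container (via `docker exec`) would show whether the two see different answers.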
Area for Triage:
Deployment/Release
Question, Bug, or Feature?:
Bug
Virtual environments affected
Expected behavior
We expect not to see this DNS resolution issue, as was the case with the 20200119 release. _Ideally we could pin to that release using the runs-on property, e.g. runs-on: ubuntu-18.04@20200119._
Actual behavior
See the "Describe the bug" section above. I don't have a public repo to share that reproduces this bug.
Confirmed that this is a DNS resolution issue. We added a step just before the smoke tests that appends a line to the container's /etc/hosts file for the external hostname we request. The logic looks something like this:
- name: Update /etc/hosts with External IP address
  run: |
    external_host=$(<ugly, fragile logic to get the hostname dynamically>)
    ip_address="$(dig +short "$external_host" | head -n 1)"
    echo "appended line to /etc/hosts: $ip_address $external_host"
    docker exec "$container_making_external_request" \
      sh -c "echo '$ip_address $external_host' >> /etc/hosts"
Smoke tests are passing consistently now, but this is less than ideal.
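One fragile spot in the step above is `head -n 1`: when the hostname is a CNAME, `dig +short` prints the CNAME target before the A records, so the first line may not be an IP address at all. A minimal sketch of a slightly more defensive version follows, assuming GNU grep on the runner; `first_ipv4` and `hosts_entry` are hypothetical helper names introduced here for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first stdin line that looks like a
# dotted-quad IPv4 address, skipping CNAME targets such as "lb.example.com.".
first_ipv4() {
  grep -E -m 1 '^[0-9]{1,3}(\.[0-9]{1,3}){3}$'
}

# Hypothetical helper: format one /etc/hosts entry from an IP ($1) and a
# hostname ($2), failing instead of writing an entry with an empty IP.
hosts_entry() {
  local ip="$1" host="$2"
  if [ -z "$ip" ]; then
    echo "no IPv4 address resolved for $host" >&2
    return 1
  fi
  printf '%s %s\n' "$ip" "$host"
}
```

In the step, this would look like `ip_address="$(dig +short "$external_host" | first_ipv4)"`, with the output of `hosts_entry "$ip_address" "$external_host"` being the line appended inside the container.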
Hello, @sean-krail
The AzP.20200211.ubuntu18.1 image has been deployed. Please try your test once more.
Hi @al-cheb, it's still failing with the same ETIMEDOUT error
Hello, @sean-krail
Could you please change the DNS server to 8.8.8.8 and re-run the test:
- name: Set 8.8.8.8 as dns server
  run: |
    sudo sed -i 's/#DNS=/DNS=8.8.8.8 8.8.4.4/g' /etc/systemd/resolved.conf
    sudo systemctl daemon-reload
    sudo systemctl restart systemd-networkd
    sudo systemctl restart systemd-resolved
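Note that the `sed` call above changes nothing, and reports no error, if the file has no `#DNS=` line (for example, if it was already edited). A minimal sketch of the same edit with an explicit check is shown below, assuming GNU sed on Linux; `set_dns_servers` is a hypothetical helper name, and it takes the file path as an argument so it can be tried on a copy first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: in a resolved.conf-style file ($1), replace the
# commented-out "#DNS=" line with "DNS=<servers>" ($2, space-separated),
# then verify that the expected line is actually present.
set_dns_servers() {
  local conf="$1" servers="$2"
  sed -i "s/^#DNS=.*/DNS=${servers}/" "$conf"
  # Returns non-zero if the pattern did not match and nothing was changed.
  grep -Fqx -- "DNS=${servers}" "$conf"
}
```

On the runner the target would be /etc/systemd/resolved.conf, the edit would need root, and systemd-resolved would still need a restart afterwards, as in the step above.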
That worked for us, @al-cheb!
@al-cheb - is this something that will be fixed on the virtual environment itself? Or do we need to permanently modify our workflow? CC: @sean-krail
Hello, @sean-krail and @hiradp
I will try to escalate the issue and get you an answer ASAP. Could you share the DNS hostnames with which you saw the issue?
@al-cheb, we're no longer able to reproduce the issue with the latest Ubuntu build, ubuntu18/20200217.1. FYI, we never tested ubuntu18/20200211.1. I'm not sure what changed that would have fixed this issue for us. We'll continue to monitor it on our end.
Closing this for now since it no longer reproduces and we didn't find anything on our side to explain it. @sean-krail, if it pops up again, please let us know. Thank you.