Virtual-environments: Windows runners frequently cannot resolve www.cygwin.com

Created on 9 Apr 2020 · 25 Comments · Source: actions/virtual-environments

Describe the bug
I frequently get

> powershell -Command "(New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup-x86_64.exe', 'setup-x86_64.exe')" 
Exception calling "DownloadFile" with "2" argument(s): "The remote name could not be resolved: 'www.cygwin.com'"
At line:1 char:1
+ (New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup- ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : WebException

See, e.g., the "Install cygwin" step of https://github.com/mit-plv/fiat-crypto/runs/570956988?check_suite_focus=true

Area for Triage:

Question, Bug, or Feature?:

Virtual environments affected

  • [ ] macOS 10.15
  • [ ] Ubuntu 16.04 LTS
  • [ ] Ubuntu 18.04 LTS
  • [ ] Windows Server 2016 R2
  • [x] Windows Server 2019

Expected behavior
I expect the runners to not be so flaky about finding cygwin.com.

Actual behavior
https://github.com/mit-plv/fiat-crypto/runs/570956988?check_suite_focus=true

Common Tools Windows bug investigate

All 25 comments

Hi @JasonGross!
Could you please add this before the download step
nslookup www.cygwin.com
and this right after the download
nslookup www.cygwin.com 8.8.8.8
It will help us to determine if there is an issue with the particular DNS or with cygwin.com itself

From https://github.com/mit-plv/fiat-crypto/runs/574218618?check_suite_focus=true

d:\a\fiat-crypto\fiat-crypto>nslookup www.cygwin.com 
Non-authoritative answer:

Server:  UnKnown
Address:  168.63.129.16

Name:    server2.sourceware.org
Address:  2620:52:3:1:0:246e:9693:128c
Aliases:  www.cygwin.com


d:\a\fiat-crypto\fiat-crypto>powershell -Command "(New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup-x86_64.exe', 'setup-x86_64.exe')" 
Exception calling "DownloadFile" with "2" argument(s): "The remote name could not be resolved: 'www.cygwin.com'"
At line:1 char:1
+ (New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup- ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : WebException


d:\a\fiat-crypto\fiat-crypto>nslookup www.cygwin.com 8.8.8.8 
Non-authoritative answer:

Server:  dns.google
Address:  8.8.8.8

Name:    server2.sourceware.org
Addresses:  2620:52:3:1:0:246e:9693:128c
      8.43.85.97
Aliases:  www.cygwin.com

For reference, when the download succeeds, it looks like this (https://github.com/mit-plv/fiat-crypto/runs/574218103?check_suite_focus=true):

d:\a\fiat-crypto\fiat-crypto>nslookup www.cygwin.com 
Non-authoritative answer:

Server:  UnKnown
Address:  168.63.129.16

Name:    server2.sourceware.org
Addresses:  2620:52:3:1:0:246e:9693:128c
      8.43.85.97
Aliases:  www.cygwin.com


d:\a\fiat-crypto\fiat-crypto>powershell -Command "(New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup-x86_64.exe', 'setup-x86_64.exe')" 

d:\a\fiat-crypto\fiat-crypto>nslookup www.cygwin.com 8.8.8.8 
Non-authoritative answer:

Server:  dns.google
Address:  8.8.8.8

Name:    server2.sourceware.org
Addresses:  2620:52:3:1:0:246e:9693:128c
      8.43.85.97
Aliases:  www.cygwin.com

So it looks like it is a problem with the particular DNS, which sometimes reports 8.43.85.97 in addition to 2620:52:3:1:0:246e:9693:128c, and sometimes does not.

On the other hand, here's a run where there is no difference in nslookup:

d:\a\fiat-crypto\fiat-crypto>nslookup www.cygwin.com 
Non-authoritative answer:

Server:  UnKnown
Address:  168.63.129.16

Name:    server2.sourceware.org
Addresses:  2620:52:3:1:0:246e:9693:128c
      8.43.85.97
Aliases:  www.cygwin.com


d:\a\fiat-crypto\fiat-crypto>powershell -Command "(New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup-x86_64.exe', 'setup-x86_64.exe')" 
Exception calling "DownloadFile" with "2" argument(s): "The remote name could not be resolved: 'www.cygwin.com'"
At line:1 char:1
+ (New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup- ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : WebException


d:\a\fiat-crypto\fiat-crypto>nslookup www.cygwin.com 8.8.8.8
Non-authoritative answer:

Server:  dns.google
Address:  8.8.8.8

Name:    server2.sourceware.org
Addresses:  2620:52:3:1:0:246e:9693:128c
      8.43.85.97
Aliases:  www.cygwin.com

https://github.com/mit-plv/fiat-crypto/runs/578537990?check_suite_focus=true

@JasonGross thanks for the data! The last attempt looks really strange. Could you please try downloading with the following retry logic? Does the download eventually succeed (and after how many attempts), or does it loop forever?

while ($true)
{
    try
    {
        (New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup-x86_64.exe', 'setup-x86_64.exe')
        break
    }
    catch
    {
        Write-Host "There is an error during package downloading:`n $_"
    }
}

I'm having some trouble getting this logic to work with nslookup; see https://github.com/mit-plv/fiat-crypto/pull/745/checks?check_run_id=583090618 . In particular, when I move the nslookup command from cmd to powershell, I get

Non-authoritative answer:

Server:  UnKnown
Address:  168.63.129.16

Name:    server2.sourceware.org
Address:  8.43.85.97
Aliases:  www.cygwin.com

nslookup : Non-authoritative answer:
At D:\a\_temp\11ad56e9-c870-4d15-b171-ed284b67f5a5.ps1:15 char:1
+ nslookup www.cygwin.com 8.8.8.8
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (Non-authoritative answer::String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError
##[error]Process completed with exit code 1.

and then the powershell script quits. What do you advise doing here?

Looks odd. I just tried it and everything works fine for me. Maybe there were some Google DNS issues?

jobs:
  build:
    runs-on: windows-latest
    steps:
    - name: retry download
      run: |
        while ($true)
        {
            try
            {
                (New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup-x86_64.exe', 'setup-x86_64.exe')
                break
            }
            catch
            {
                Write-Host "There is an error during package downloading:`n $_"
            }
        }
        nslookup www.cygwin.com 8.8.8.8
      shell: powershell

@miketimofeev What if you also do nslookup before the loop? In https://github.com/JasonGross/test-windows-cygwin-download/runs/586283926?check_suite_focus=true , I get

Non-authoritative answer:

Server:  UnKnown
Address:  168.63.129.16

Name:    server2.sourceware.org
Addresses:  2620:52:3:1:0:246e:9693:128c
      8.43.85.97
Aliases:  www.cygwin.com

There is an error during package downloading:
 Exception calling "DownloadFile" with "2" argument(s): "The remote name could not be resolved: 'www.cygwin.com'"
There is an error during package downloading:
 Exception calling "DownloadFile" with "2" argument(s): "The remote name could not be resolved: 'www.cygwin.com'"
nslookup : Non-authoritative answer:
At D:\a\_temp\0818ab92-143d-459e-8cbb-363a122d583d.ps1:15 char:1
+ nslookup www.cygwin.com 8.8.8.8
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (Non-authoritative answer::String) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError

##[error]Process completed with exit code 1.

with this workflow:

name: CI (Windows)

on:
  push:
  pull_request:
  schedule:
    - cron: '0 0 1 * *'

jobs:
  build:

    runs-on: windows-latest

    steps:
    - uses: actions/checkout@v2
    - name: Download cygwin
      run: |
        nslookup www.cygwin.com
        while ($true)
        {
            try
            {
                (New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup-x86_64.exe', 'setup-x86_64.exe')
                break
            }
            catch
            {
                Write-Host "There is an error during package downloading:`n $_"
            }
        }
        nslookup www.cygwin.com 8.8.8.8
      shell: powershell

(First try on https://github.com/JasonGross/test-windows-cygwin-download/ )

@JasonGross yeah, got it. PowerShell treats the "Non-authoritative answer:" line, which nslookup writes to stderr, as an error, so let's call nslookup through cmd instead:

    steps:
      - uses: actions/checkout@v2
      - name: Download cygwin
        run: |
          & cmd /c 'nslookup www.cygwin.com 2>&1'
          while ($true)
          {
              try
              {
                  (New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup-x86_64.exe', 'setup-x86_64.exe')
                  break
              }
              catch
              {
                  Write-Host "There is an error during package downloading:`n $_"
              }
          }
          & cmd /c 'nslookup www.cygwin.com 8.8.8.8 2>&1'
        shell: powershell

Thanks! Note that it looks like you already have something of an answer to your question, though. On the dedicated repository, it looks like the download failed twice in a row and then succeeded.
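
To record the attempt count explicitly instead of reading it off the log, the infinite loop could be replaced with a bounded one. This is only a sketch: the helper name Invoke-WithRetry and its parameters (MaxAttempts, DelaySeconds) are hypothetical and do not come from this thread.

```powershell
# Hypothetical helper (not from the thread): run a script block until it
# succeeds or the attempt limit is reached, and return the attempt count.
function Invoke-WithRetry {
    param(
        [Parameter(Mandatory = $true)] [scriptblock] $Action,
        [int] $MaxAttempts = 10,
        [int] $DelaySeconds = 2
    )
    for ($attempt = 1; $attempt -le $MaxAttempts; $attempt++) {
        try {
            # Discard the action's output so only the count is returned.
            & $Action | Out-Null
            return $attempt
        }
        catch {
            Write-Host "Attempt $attempt failed: $($_.Exception.Message)"
            if ($attempt -lt $MaxAttempts) {
                Start-Sleep -Seconds $DelaySeconds
            }
        }
    }
    throw "Action failed after $MaxAttempts attempts"
}

# Usage (needs network access, so it is left commented out here):
# $n = Invoke-WithRetry -Action {
#     (New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup-x86_64.exe', 'setup-x86_64.exe')
# }
# Write-Host "Download succeeded after $n attempt(s)"
```

Bounding the loop also keeps a persistent DNS failure from holding the job until the workflow timeout.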

It's worth checking the InnerException to see if there are more details. Can you change the catch block to recurse through the InnerException properties and dump the exception types and messages?

If you suggest code, I'm happy to use it and report back on the details. I've literally never written powershell before (this code was taken from elsewhere), so I don't actually know how to do that (and don't have time right now to dig into how to do it).

@chkimes @JasonGross Looks like there is no inner exception. I tried to catch it like this and got nothing except the usual error (The remote name could not be resolved: 'www.cygwin.com'):

    catch [System.Net.WebException] 
    {
        if ($_.Exception.InnerException) 
        {
            $_.Exception.InnerException.Message
        }
        else 
        {
            $_.Exception.Message
        }
    }
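
The catch block above looks only one level down. A minimal sketch of the recursive dump requested earlier could walk the whole InnerException chain and print each type and message. The helper name Get-ExceptionChain is hypothetical, and whether the chain holds anything beyond the message already seen is not confirmed by this thread.

```powershell
# Hypothetical helper (not from the thread): emit one line per exception in
# the chain, outermost first, showing nesting depth, type name and message.
function Get-ExceptionChain {
    param(
        [Parameter(Mandatory = $true)] [System.Exception] $Exception
    )
    $depth = 0
    $current = $Exception
    while ($null -ne $current) {
        "[{0}] {1}: {2}" -f $depth, $current.GetType().FullName, $current.Message
        $current = $current.InnerException
        $depth++
    }
}

# Usage inside the download loop (commented out because it needs network):
# try {
#     (New-Object Net.WebClient).DownloadFile('http://www.cygwin.com/setup-x86_64.exe', 'setup-x86_64.exe')
# }
# catch {
#     Get-ExceptionChain -Exception $_.Exception | Write-Host
# }
```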

@JasonGross we took packet captures from inside the VM and found differences in the DNS response for succeeding and failed requests. These differences are not observable when using other DNS providers.

I have escalated to the Azure DNS team.

As a workaround, consider forcing a different DNS server:

Get-NetAdapter | Set-DnsClientServerAddress -ServerAddresses 8.8.8.8
cmd /c "ipconfig /flushdns 2>&1"

I am not able to repro this issue any longer - does anyone else see a consistent repro or can we now close this issue?

@chkimes I'll try to repro it today

@chkimes yeah, the issue is still here
There is an exception Exception calling "DownloadFile" with "2" argument(s): "The remote name could not be resolved: 'www.cygwin.com'"

Thanks - I got a hit from a scheduled run so I have provided the VM and timing information to Azure.

For reference: Internal ticket number 184150533

I had a chat with the DNS team; they believe this is related to a bug in WinDNS that is being triggered by an unusual DNS configuration for www.cygwin.com.

https://www.digwebinterface.com/?hostnames=www.cygwin.com%0D%0Aserver2.sourceware.org.%0D%0Abing.com%0D%0Agoogle.com&type=A&trace=on&ns=resolver&useresolver=8.8.4.4&nameservers=

See the above traces, with bing.com and google.com included for comparison.

sourceware.org.     86400   IN  NS  server3.sourceware.org.
sourceware.org.     86400   IN  NS  server2.sourceware.org.
sourceware.org.     86400   IN  NS  sourceware.org.
sourceware.org.     86400   IN  NS  ns.elastic.org.
;; Received 291 bytes from 199.19.54.1#53(199.19.54.1) in 45 ms

server2.sourceware.org. 86400   IN  A   8.43.85.97
sourceware.org.     86400   IN  NS  ns.elastic.org.
sourceware.org.     86400   IN  NS  server3.sourceware.org.
sourceware.org.     86400   IN  NS  server2.sourceware.org.
sourceware.org.     86400   IN  NS  sourceware.org.
;; Received 247 bytes from 8.43.85.97#53(8.43.85.97) in 30 ms

www.cygwin.com is CNAME'd to server2.sourceware.org. That domain has an interesting setup where it has an NS record for itself. Note that examples bing.com and google.com do not do this.

I didn't receive any ETA from the WinDNS team for when this would be fixed. Does anyone happen to know any contacts for cygwin.com or perhaps sourceware.org that we could contact about updating their DNS setup?

You could always email the cygwin mailing list.

> Does anyone happen to know any contacts for cygwin.com or perhaps sourceware.org that we could contact about updating their DNS setup?

│17:23:57 @fche | pls let them know RH IT has been requested to change it to an A record │

The team at www.cygwin.com has updated their DNS to use an explicit A record which has resolved this issue. Unfortunately that means that IPv6 connectivity was lost since the CNAME was resolving AAAA records as well, but only an A record was added after removing the CNAME.

I have reached out to the cygwin mailing list again informing them of the issue. The intermittent connectivity problems referenced in this issue should be fixed, but I will leave this issue open to track getting IPv6 connectivity back up for www.cygwin.com.

I pinged the mailing list a few times, but didn't get a response re: IPv6. I'm going to close this issue since the original issue has been worked around.
