LightGBM: [Blocking] [ci] CI for R freeze

Created on 7 Aug 2020  ·  21 Comments  ·  Source: microsoft/LightGBM

refer to https://github.com/microsoft/LightGBM/runs/955837580
It seems the installation of Rtools (with R 4.0) freezes.
ping @jameslamb

blocking

All 21 comments

Ok I can look in 10 minutes

ok looking right now

I'm going to push changes to #3277 to test, since that is a small R change that was already approved.

😕 I tried removing /VERYSILENT so we could get more logs installing Rtools, but no luck. I don't see any other configuration options that can help here.

https://github.com/microsoft/LightGBM/pull/3277/checks?check_run_id=956113469

Telling R to use MinGW
Downloading R and Rtools
Downloading https://cloud.r-project.org/bin/windows/base/old/3.6.3/R-3.6.3-win.exe
Downloading https://cloud.r-project.org/bin/windows/Rtools/Rtools35.exe
Installing R
Done installing R
Installing Rtools
##[error]The operation was canceled.

Those logs tell us that:

  • R was found and could be downloaded
  • Rtools was found and could be downloaded
  • installation of R succeeds
  • installation of Rtools times out

Since there has not been an Rtools release since June 22, and since we're seeing this issue even on Rtools35, I have these theories:

  1. Some service Rtools relies on (like a CTAN mirror or a package server for pacman) is down
  2. Some subtle networking thing has changed in Azure that is blocking traffic to some place Rtools wants to hit

I just looked on the pacman mirrors and I see a LOT of errors recently!!!

image

and other mirrors not reporting errors are way out of sync

image

Given this, @guolinke I think you should use your administrator privileges to merge #3071 . We don't know how long it will take for this to be resolved, and it shouldn't hold up releases for all the other non-R components any longer.

@jameslamb thank you so much.
No problem, I will merge it first.

I just checked and the pacman repos still seem to be down. I hope it's resolved soon.

I have been trying with #3277 and I still see this issue :/

This is difficult because even after removing the /VERYSILENT, we're still not getting any logs. Going to pull out my Windows laptop and try running this code there.

It seems that a lot of the pacman mirrors are working now...maybe that was never the issue to begin with, I don't know. Maybe something in Rtools is asking for a user input and hung waiting on it.

Ok making progress!!! Now I think that the pacman thing was completely unrelated.

I found that the download code we've been using to get Rtools is returning a successful status, but not actually downloading anything!!!

image

and then if you run the code to install Rtools and the file doesn't exist, you get a dialog box that has to be confirmed in a GUI. That is why it's hanging in CI!!

image
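The failure described above (a download step that reports success but leaves no file on disk) can be caught before the installer ever runs. Here is a minimal sketch in Python rather than the PowerShell used in the CI script; `verify_download` and its `min_bytes` threshold are hypothetical names for illustration, not part of LightGBM's CI:

```python
import os


def verify_download(path, min_bytes=1):
    """Return the size of `path`, raising if it is missing or suspiciously small."""
    if not os.path.isfile(path):
        # This is the case from the thread: "success" status, but no file.
        raise FileNotFoundError(f"expected a downloaded file at {path!r}")
    size = os.path.getsize(path)
    if size < min_bytes:
        raise ValueError(f"{path!r} is only {size} bytes; the download likely failed")
    return size
```

Failing here gives a readable error in the CI log instead of a GUI dialog that nobody can click.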

I'm investigating how to fix this. Maybe the download link from CRAN is now redirecting and the powershell code in test_r_package_windows.ps1 isn't following that?

hmmmm doesn't seem like it's a redirect.

image

According to your screenshots you are downloading from cran.r-project.org, but our CI tries to download from cloud.r-project.org. Maybe this matters...

https://github.com/microsoft/LightGBM/blob/33af069cc174a4c56c5f1e2f351196a284248f80/.ci/test_r_package_windows.ps1#L105-L106

On my local machine CI code works fine and downloads both R and Rtools successfully.

😱 😱 😱 😱 😱

when I change to cloud, it still does not download a file on my laptop

image

when I switch to Invoke-WebRequest with the same URL, a file is downloaded!

image

I'm so confused 😕

I just switched to Invoke-WebRequest on #3277 and R Windows jobs for R 3.6 are now passing, but the ones for R 4.0 are still pending after 25 minutes :/

This is tough because you cannot (as far as I know) pin to one distribution of Rtools as long as it is not "frozen" (https://cran.r-project.org/bin/windows/Rtools/history.html).

image

So the Rtools.exe you get today might not be the one you get from the same link tomorrow.
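Since the installer behind an unfrozen link can change without notice, one way to at least notice a change is to log a checksum of whatever was downloaded on each run. This is only a sketch of that idea; `sha256_of_file` is a hypothetical helper, and nothing in this thread says LightGBM's CI does this:

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so a large installer is never held in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the logged digest between a passing run and a failing run would show whether the binary itself changed.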

I just pushed a commit to #3277 that switches from cloud. links to cran. links for R and Rtools. I also added an ls after the downloads, so we can see in the logs whether the files were downloaded and whether that was the issue.

ok from the logs on #3277, it seems that the files ARE getting downloaded successfully.

So something else is failing for Rtools 4.0 specifically :/

image

My guesses now are:

  1. some package manager (pacman or something else) used in the install process is not responding
  2. some weird networking issue with Azure that is preventing Rtools from completing some request
  3. something in Rtools is popping up a user dialogue and hangs forever waiting for it to be answered
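Whichever of these guesses is right, a hung installer is easier to debug if the step fails fast instead of running until the CI job is canceled. Here is a rough sketch of wrapping a command in a timeout, written in Python rather than the CI's PowerShell; `run_with_timeout` is a hypothetical helper, and the timeout value is up to the caller:

```python
import subprocess


def run_with_timeout(cmd, timeout_seconds):
    """Run `cmd`; return its exit code, or None if it exceeded the timeout."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising TimeoutExpired.
        return None
    return completed.returncode
```

A `None` result then means "the installer hung", which is a much clearer signal than "The operation was canceled."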

When I run this on my laptop

Start-Process -FilePath C:\Users\James\repos\sandbox\Rtools.exe -NoNewWindow -Wait -ArgumentList "/DIR=C:\Users\James\repos\sandbox"

A dialogue box pops up

image

oh I forgot to add /VERYSILENT above. When I do that

Start-Process -FilePath C:\Users\James\repos\sandbox\Rtools.exe -NoNewWindow -Wait -ArgumentList "/VERYSILENT /DIR=C:\Users\James\repos\sandbox\rtools"

the process just hangs forever :/

Can you please try to ~duplicate (escape) \ signs in DIR argument and~ add SUPPRESSMSGBOXES argument?

https://github.com/r-lib/actions/blob/32bf69a8dfce4486d4f2221a3a6ea9b1dd75eadb/setup-r/lib/installer.js#L270-L272

UPD: OK, according to the screenshot DIR is set correctly.

oooo sure!

I also just found that I can stop the "hangs forever" problem locally by changing the command to this:

 C:\Users\James\repos\sandbox\Rtools.exe /VERYSILENT /DIR=C:\users\James\repos\sandbox\rtools

So maybe something about the new Rtools does not work well with Start-Process on some versions of PowerShell?
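For reference, the flags discussed so far can be put together into a single direct invocation, without going through Start-Process. The sketch below only builds the argument list; `build_silent_install_command` is a hypothetical name, and including /SUPPRESSMSGBOXES follows the suggestion above rather than anything this thread confirms was in the final fix:

```python
def build_silent_install_command(installer_path, target_dir):
    """Return the installer command as a list of arguments (nothing is executed)."""
    return [
        installer_path,
        "/VERYSILENT",          # flag already used by the CI script
        "/SUPPRESSMSGBOXES",    # suggested in this thread; an assumption here
        f"/DIR={target_dir}",   # install location, as in the commands above
    ]
```

The returned list could be handed to something like `subprocess.run`, which starts the executable directly.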

hey maybe that is working?

Here is one of the R 4.0 jobs getting past Rtools (https://github.com/microsoft/LightGBM/pull/3277/checks?check_run_id=962455060)

image

I need to step away for a few hours, but hopefully that was it and the build on #3277 will work. @StrikerRUS please feel free to push to that PR if you want, I know that this issue is blocking everything right now. I'll be back on in about 3 hours to try other things.

this was fixed by #3277
