Arcade: Official Build Failures in dotnet/installer when publishing

Created on 22 Apr 2020 · 29 Comments · Source: dotnet/arcade

All 29 comments

cc/ @ilyas1974

From reading several logs, the failure here is consistently right when this is called (container ExistsAsync()):
https://github.com/emgarten/Sleet/blob/master/src/SleetLib/FileSystem/AzureFileSystem.cs#L71

I am chasing down a repro now. I have a couple theories and will update the issue as I find out.

Un-assigning myself as I'm investing all my time on this other issue: https://github.com/dotnet/arcade/issues/5328

I believe it was updating the .NET runtime version used here: https://github.com/dotnet/installer/commit/2db1aa8754755daf6947cdd4d964cac8270af2f2 that broke this. It's the first commit where this occurred, and it also updates the dotnet SDK version used.

To evaluate whether this is accurate, I'm queueing a test build which won't publish with that line rolled back to see what happens.

Rolled back that single line with the first commit hash that had the problem in this build: https://dev.azure.com/dnceng/internal/_build/results?buildId=612622&view=results

That didn't work; the other changes were dependent on the newer SDK. I am now rebuilding (with only dev publishing enabled) the previous good commit. If it's green, this will confirm that the commit (updating dependencies from dotnet/sdk) was the cause. Once that's known we can discuss what to do, since we don't actually build the API that's blowing up.

Heads up @jaredpar and @marcpopMSFT: Looks like we likely have another "chicken and egg" situation brewing w/ the latest SDK that Arcade is pointing to. Meaning... this'll come in as a breaking change to installer when we promote. (tomorrow??)

I think this problem will actually get swept under the rug once we promote Arcade, as the payload includes the changes to stop publishing to the dotnet-core sleet feed, so it won't be that noisy.

Still probably worth understanding how version 5.0.100-preview.4.20217.10 of the SDK is breaking the Azure APIs that we're using, if that's indeed the culprit.

Update this morning; forgot to add a channel for that update, so spun again

The updated build does indeed prove that the problem is, maddeningly, the .NET 5 preview SDK being used. I am attempting to get a local repro, however it's also possible we could turn off sleet publishing for master and make this go away that way. I am asking around.

@marcpopMSFT / @jaredpar / @markwilkie Heads up:

  • It's now apparent that updating from .NET Core 5.0.100 Preview 4 -> 5 breaks the sleet library, which is using boring API calls in (old) Azure storage client libraries. This may have some actual impact for customers, though it's also possible the changes that triggered it were important and security-related (e.g. forced TLS changes).
  • Arcade has already been planning to turn off Sleet publishing for master branch builds anyways; if you're OK with this (not publishing to dotnetfeed) you could turn this off in your repo and immediately unblock installer builds
  • In the meantime I am still trying to get a standalone repro of this which I can share with the runtime team to see if it's scary from a customer compatibility standpoint.

I have boiled down a repro and it is indeed specific to running this in the context of an msbuild task on preview 4 .NET Core 5.0.100. I will get this simplified and shared with the msbuild folks after some investigation. Likely just an indirect dependency difference?

I have an assembly that will repro/notrepro depending on whether it runs identical code as an msbuild task in the latest .NET Msbuild.

Both things are checking for an extant container in Microsoft.WindowsAzure.Storage , v 9.03.2.0.
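
For readers following along, a container existence check of this kind might look roughly like the sketch below. It is not the actual repro from this thread. It assumes a package reference to WindowsAzure.Storage 9.3.x, and the environment variable name `STORAGE_CONNECTION_STRING` and the default container name are hypothetical placeholders.

```csharp
// Minimal sketch, not the actual repro from this thread.
// Assumes a package reference to WindowsAzure.Storage 9.3.x and a connection
// string in the (hypothetical) STORAGE_CONNECTION_STRING environment variable.
using System;
using System.Runtime.InteropServices;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

internal static class Program
{
    private static async Task<int> Main(string[] args)
    {
        string connectionString = Environment.GetEnvironmentVariable("STORAGE_CONNECTION_STRING");
        if (string.IsNullOrEmpty(connectionString))
        {
            Console.Error.WriteLine("Set STORAGE_CONNECTION_STRING first.");
            return 1;
        }

        // Hypothetical container name; pass a real one as the first argument.
        string containerName = args.Length > 0 ? args[0] : "example-container";

        CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
        CloudBlobClient client = account.CreateCloudBlobClient();
        CloudBlobContainer container = client.GetContainerReference(containerName);

        // Print the runtime so runs under different SDK/runtime versions can be compared.
        Console.WriteLine($"Runtime: {RuntimeInformation.FrameworkDescription}");

        // The existence check that the failing logs point at.
        bool exists = await container.ExistsAsync();
        Console.WriteLine($"Container '{containerName}' exists: {exists}");
        return 0;
    }
}
```

Running the same binary under two different runtimes and comparing the printed runtime line against the outcome of the call is one way to narrow a difference like this down to the runtime bits.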

The only interesting difference seems to be the runtime bits: they're version 5.00.20.21702 when I run under msbuild, versus 4.700.20.12001 from the same dotnet SDK when targeting 3.1. I'm trying the same thing with the latest public 5.0 SDK to see what I get.

Tried with latest public SDK, 5.00.20.21406, no repro. Will try newer as I figure out how to do it (VS doesn't seem to care much for private preview)

OK final update here. @marcpopMSFT @dsplaisted FYI.

  • I am able to reproduce this just by calling Sleet APIs from a console app and toggling my runtime between 5.0.100-preview.4.20217.10 and whatever the latest public preview 3 is. This is probably interesting.

  • I can confirm Sleet is gone with the next automatically flowed update; simply waiting until build promotion occurs should fix this publish problem.

  • If this is not an option you can revert the dotnet/sdk updates that brought you to preview 4.

I am happy to provide help with repro stuff if it's useful. For now I'm throwing this in tracking/blocked as we understand the issue and a fix is in motion.

Perhaps this will be interesting to @emgarten .

@missymessa FYI, this will keep failing until the above stuff happens.

@MattGal Arcade version 5.0.0-beta.20221.14 was just promoted to .NET Eng - Latest

Looks like this is solved already, but if you need help moving Sleet to a different Azure SDK I'm happy to help.

Since copy to latest is a project specific to installer, you should consider updating the Azure SDK that is being used inside the project.

In master we're continuing to see publishing failures, with a new error now:

Unable to upload to Sdk/5.0.100-preview.5.20227.4/productVersion.txt.sha due to System.IO.IOException: The process cannot access the file 'D:\a\1\a\BlobArtifacts\productVersion.txt.sha' because it is being used by another process.

Builds with this issue:
https://dev.azure.com/dnceng/internal/_build/results?buildId=618825&view=results
https://dev.azure.com/dnceng/internal/_build/results?buildId=618682&view=results
https://dev.azure.com/dnceng/internal/_build/results?buildId=618475&view=results
https://dev.azure.com/dnceng/internal/_build/results?buildId=618421&view=results
https://dev.azure.com/dnceng/internal/_build/results?buildId=618343&view=results
https://dev.azure.com/dnceng/internal/_build/results?buildId=619090&view=results

@mmitche Looks like there's more to do besides https://github.com/dotnet/installer/pull/7290

Actually, I failed to notice: dotnet/installer#7290 was targeted to the preview4 branch and not master. @sfoslund, porting that change to master should fix those issues.

Those failures should be fixed by porting my p4 change to master. The CopyToLatest issue is different. I'll attempt to upgrade the storage sdk there.

However, we need to understand why this happens because we're not supposed to break older applications.

Great, I just made a PR to port the change to master: dotnet/installer#7320

@MattGal Can you package this repro up and file an issue in runtime? This should not happen.

Closing as this seems to be done.
