Arcade: Restoring from private azure devops feeds is flaky

Created on 12 Sep 2019 · 10 Comments · Source: dotnet/arcade

We have been seeing multiple issues when restoring from private Azure DevOps feeds.

These issues usually manifest as 401/403/404 errors when attempting to restore packages, and are caused by a mix of bugs in NuGet, the credential provider, and MSBuild.

While we have some more information as to when the issues will be fixed, in the meantime the Azure Artifacts team recommended a workaround:

  • Clear the NuGet http-cache before attempting to restore any packages from these feeds
  • Set the following environment variables:
    NUGET_PLUGIN_HANDSHAKE_TIMEOUT_IN_SECONDS=20
    NUGET_PLUGIN_REQUEST_TIMEOUT_IN_SECONDS=20
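
For anyone scripting this locally, the two steps above can be sketched roughly as follows. This is only an illustration, not what Arcade itself does: the helper names (`restore_environment`, `workaround_commands`, `restore_with_workaround`) and the `project` argument are hypothetical, and it assumes `dotnet` is on the PATH. The commands used are `dotnet nuget locals http-cache --clear` and `dotnet restore`.

```python
import os
import subprocess

# Timeout values taken from the workaround described above.
PLUGIN_TIMEOUTS = {
    "NUGET_PLUGIN_HANDSHAKE_TIMEOUT_IN_SECONDS": "20",
    "NUGET_PLUGIN_REQUEST_TIMEOUT_IN_SECONDS": "20",
}


def restore_environment(base_env=None):
    """Return a copy of the environment with the plugin timeouts applied."""
    env = dict(os.environ if base_env is None else base_env)
    env.update(PLUGIN_TIMEOUTS)
    return env


def workaround_commands(project):
    """Commands to run, in order: clear the http-cache, then restore."""
    return [
        ["dotnet", "nuget", "locals", "http-cache", "--clear"],
        ["dotnet", "restore", project],
    ]


def restore_with_workaround(project, run=subprocess.run):
    """Run the workaround steps; `run` is injectable so it can be stubbed."""
    env = restore_environment()
    for command in workaround_commands(project):
        run(command, env=env, check=True)
```

Calling `restore_with_workaround("MyProject.csproj")` (a placeholder project name) would clear the cache and then restore with both timeouts set for the child processes only, leaving the parent environment untouched.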

All 10 comments

After multiple fixes on the NuGet and credential provider side, we expect things to be much better. During a future servicing release we will attempt to remove the workarounds we introduced for this, and if everything works we can remove them for good and resolve this issue.

Moving to the .NET 5 epic.

Doesn't look like the failure modes we've seen for this.

Doesn't mean it's not a new failure mode. Is it happening consistently? If not, it might just have been a problem with the agent.

Will says he hasn't seen it before and it's likely intermittent. I just wasn't sure if it was something we were already tracking or should be tracking.

I think this might just have been an agent network problem. Looking at the logs for that failed build leg, I see output such as:

Failed to download package 'System.Reflection.Metadata.1.4.1' from 'https://dotnetfeed.blob.core.windows.net/dotnet-core/flatcontainer/system.reflection.metadata/1.4.1/system.reflection.metadata.1.4.1.nupkg'.
  Unable to read data from the transport connection: Connection reset by peer.
    Connection reset by peer
  Retrying 'FindPackagesByIdAsync' for source 'https://dotnetfeed.blob.core.windows.net/dotnet-core/flatcontainer/system.runtime.interopservices.runtimeinformation/index.json'.
  The SSL connection could not be established, see inner exception.
    Unable to read data from the transport connection: Connection reset by peer.
    Connection reset by peer
  Retrying 'FindPackagesByIdAsync' for source 'https://dotnetfeed.blob.core.windows.net/dotnet-core/flatcontainer/system.linq.expressions/index.json'.
  The SSL connection could not be established, see inner exception.
    Unable to read data from the transport connection: Connection reset by peer.
    Connection reset by peer
  Retrying 'FindPackagesByIdAsync' for source 'https://dotnetfeed.blob.core.windows.net/dotnet-core/flatcontainer/microsoft.csharp/index.json'.
  The SSL connection could not be established, see inner exception.
    Unable to read data from the transport connection: Connection reset by peer.
    Connection reset by peer
  Retrying 'FindPackagesByIdAsync' for source 'https://dotnetfeed.blob.core.windows.net/dotnet-core/flatcontainer/system.collections/index.json'.
  The SSL connection could not be established, see inner exception.
    Unable to read data from the transport connection: Connection reset by peer.
    Connection reset by peer
  Retrying 'FindPackagesByIdAsync' for source 'https://dotnetfeed.blob.core.windows.net/dotnet-core/flatcontainer/system.objectmodel/index.json'.
  The SSL connection could not be established, see inner exception.
    Unable to read data from the transport connection: Connection reset by peer.
    Connection reset by peer

Which means it had problems restoring from even the sleet feeds, and not just with the artifacts feeds.
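
A rough way to check this kind of thing across a whole log is to count network-level errors per feed host. The sketch below is only an illustration: the function name `network_failures_by_host` is made up, and the two marker strings are simply the messages that appear in the log excerpt above, not an exhaustive list. If hosts other than the authenticated Azure Artifacts feeds show up in the result, the failure is more likely a network problem on the agent than a credential problem.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Matches a quoted URL such as 'https://host/path' in a restore log line.
URL_PATTERN = re.compile(r"'(https?://[^']+)'")

# Messages from the log excerpt that indicate a transport-level failure.
NETWORK_MARKERS = (
    "Connection reset by peer",
    "The SSL connection could not be established",
)


def network_failures_by_host(log_lines):
    """Count operations per host that were followed by a network error."""
    counts = Counter()
    pending_host = None  # host of the latest operation not yet counted
    for line in log_lines:
        match = URL_PATTERN.search(line)
        if match:
            pending_host = urlparse(match.group(1)).netloc
        elif pending_host and any(m in line for m in NETWORK_MARKERS):
            counts[pending_host] += 1
            pending_host = None  # count each operation only once
    return counts
```

Each operation is counted once even though the log prints the same error on several consecutive lines.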

The NuGet changes will be available in the 3.1.200 SDK. We should update Arcade (and dependent repos) to that version when it's released to see if that solves the flakiness.

Given the dotnet/runtime bring up, these types of issues should be treated with higher than usual urgency.

I queued the following builds to see if it's feasible to remove the workarounds now:

dotnet-runtime: https://dnceng.visualstudio.com/internal/_build/results?buildId=759850&view=results
dotnet-aspnetcore: https://dnceng.visualstudio.com/internal/_build/results?buildId=759860&view=results

Two Mono legs in the runtime build hit the dreaded error where the credential provider crashes:

/Users/runner/work/1/s/.dotnet/sdk/5.0.100-preview.8.20362.3/NuGet.targets(128,5): error : Problem starting the plugin '/Users/runner/.nuget/plugins/netcore/CredentialProvider.Microsoft/CredentialProvider.Microsoft.dll'. Plugin 'CredentialProvider.Microsoft' failed within 5.444 seconds with exit code . [/Users/runner/work/1/s/src/libraries/restore/netfx/netfx.depproj]
##[error].dotnet/sdk/5.0.100-preview.8.20362.3/NuGet.targets(128,5): error : (NETCORE_ENGINEERING_TELEMETRY=Restore) Problem starting the plugin '/Users/runner/.nuget/plugins/netcore/CredentialProvider.Microsoft/CredentialProvider.Microsoft.dll'. Plugin 'CredentialProvider.Microsoft' failed within 5.444 seconds with exit code .
/Users/runner/work/1/s/.dotnet/sdk/5.0.100-preview.8.20362.3/NuGet.targets(128,5): error : Problem starting the plugin '/Users/runner/.nuget/plugins/netcore/CredentialProvider.Microsoft/CredentialProvider.Microsoft.dll'. Plugin 'CredentialProvider.Microsoft' failed within 5.444 seconds with exit code . [/Users/runner/work/1/s/src/libraries/restore/netfx/netfx.depproj]
/Users/runner/work/1/s/.dotnet/sdk/5.0.100-preview.8.20362.3/NuGet.targets(128,5): error :   Plugin 'CredentialProvider.Microsoft' failed within 5.444 seconds with exit code . [/Users/runner/work/1/s/src/libraries/restore/netfx/netfx.depproj]
/Users/runner/work/1/s/.dotnet/sdk/5.0.100-preview.8.20362.3/NuGet.targets(128,5): error :   A task was canceled. [/Users/runner/work/1/s/src/libraries/restore/netfx/netfx.depproj]

followed by multiple 401s when trying to access an authenticated feed.

The plugin failures are consistent, and do not go away if I force the reinstallation of the credential provider. The next test will remove only the clearing of the NuGet http-cache, leaving the environment variables in place.
