Sdk: Running publish can hang after printing the banner

Created on 12 Jul 2018 · 5 Comments · Source: dotnet/sdk

Today I built an updated Yocto image that contains four SDK-based projects. The publish step hung for each of the four projects during the build.

Yocto builds the projects in parallel. Each build first creates the client-side packages with node and then runs the equivalent of dotnet publish -c Release -o … -f net471. Each project has its own private source tree (a copy of the actual Git repository), so they share no code files or bin/obj folders, but they do share a common NuGet cache. From the build logs, I can see that the publish command got as far as printing the startup banner:

Microsoft (R) Build Engine version 15.7.179.6572 for .NET Core
Copyright (C) Microsoft Corporation. All rights reserved.

and did nothing beyond that. The dotnet processes still existed (according to pstree) but weren't doing anything. There were two kinds: some attached directly to the init process, and others spawned from the shell by the Yocto bitbake infrastructure.

After killing the entire build process, I restarted it and the publish commands hung again.
Running one of the publish commands manually from the command line, outside the build scripts, hung as well.

I then ran killall dotnet to terminate all existing dotnet processes, and everything went back to normal. I haven't seen a hang since.

Steps to reproduce

I don't know.

Environment data

dotnet --info output:

.NET Core SDK (reflecting any global.json):
 Version:   2.1.301
 Commit:    59524873d6

Runtime Environment:
 OS Name:     ubuntu
 OS Version:  16.04
 OS Platform: Linux
 RID:         ubuntu.16.04-x64
 Base Path:   /usr/share/dotnet/sdk/2.1.301/

Host (useful for support):
  Version: 2.1.1
  Commit:  6985b9f684

.NET Core SDKs installed:
  1.0.1 [/usr/share/dotnet/sdk]
  2.0.0 [/usr/share/dotnet/sdk]
  2.1.3 [/usr/share/dotnet/sdk]
  2.1.301 [/usr/share/dotnet/sdk]

.NET Core runtimes installed:
  Microsoft.AspNetCore.All 2.1.1 [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
  Microsoft.AspNetCore.App 2.1.1 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 1.0.4 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 1.1.1 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.0.0 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.0.4 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
  Microsoft.NETCore.App 2.1.1 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
  https://aka.ms/dotnet-download
Bug

All 5 comments

cc @rainersigwald

@Tragetaschen To help try to narrow down one possible contributing factor, does this repro if you export MSBUILDDISABLENODEREUSE=1 after all prior dotnet instances have been terminated?
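A minimal sketch of that experiment, assuming a POSIX shell. The helper name without_node_reuse and the "$out_dir" placeholder are hypothetical; only the variable name MSBUILDDISABLENODEREUSE comes from the comment above.

```shell
# Hypothetical helper: run one command with MSBuild node reuse disabled,
# without changing the environment of the calling shell.
without_node_reuse() {
    env MSBUILDDISABLENODEREUSE=1 "$@"
}

# Intended use (not executed here): terminate leftover processes first,
# then publish. "$out_dir" stands in for the real output path.
#   killall dotnet
#   without_node_reuse dotnet publish -c Release -o "$out_dir" -f net471
```

Scoping the variable to a single command, rather than exporting it for the whole session, makes it easier to compare runs with and without node reuse from the same shell.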

I'll run a stress test next week to see if it reproduces at all and then try the export.

It would also be great if you could enable and capture logging, which might give more clues as to where the hang is happening. Normally I'd suggest using a binary log by passing -bl to MSBuild, but since this is a hang, a text log might be safer in the face of unexpected process termination. Something like -filelog -fileloggerparameters:verbosity=diagnostic;LogFile=Publish.log would be helpful.
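One practical detail when passing those logger parameters from a shell: an unquoted ; ends the command, so the parameter string has to be quoted. The sketch below only builds the switch; the helper name file_logger_switch is hypothetical. It emits just the parameters switch, on the assumption (based on MSBuild's command-line help) that this switch implies the file logger itself.

```shell
# Hypothetical helper: build the file-logger parameters switch for a log path.
file_logger_switch() {
    printf '%s' "-fileloggerparameters:verbosity=diagnostic;LogFile=$1"
}

# Intended use (not executed here); the double quotes keep the ';' intact
# so the shell passes the whole switch as a single argument:
#   dotnet publish -c Release "$(file_logger_switch Publish.log)"
```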

I tried reproducing it as-is with no success. I looped a hundred times each

  • change source code -> build
  • change source code -> killall dotnet -> build

and all finished successfully.
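A loop like the ones described above could be sketched as follows, with a time limit added so that a hang is reported instead of blocking forever. This is not the script that was actually used; the function name stress_run is hypothetical, and it assumes timeout from GNU coreutils, which exits with status 124 when the limit expires.

```shell
# Repeat a command and stop at the first run that fails or exceeds the limit.
# Usage: stress_run ITERATIONS SECONDS COMMAND [ARGS...]
stress_run() {
    iterations=$1
    limit=$2
    shift 2
    i=1
    while [ "$i" -le "$iterations" ]; do
        timeout "$limit" "$@"
        status=$?
        if [ "$status" -eq 124 ]; then
            # timeout(1) reports an expired limit with exit status 124.
            echo "run $i: no exit within ${limit}s, possible hang" >&2
            return 124
        elif [ "$status" -ne 0 ]; then
            echo "run $i: failed with status $status" >&2
            return "$status"
        fi
        i=$((i + 1))
    done
    echo "all $iterations runs finished"
}
```

Note that timeout terminates the command it started, so to inspect a hung process with pstree or a debugger it may be better to drop the limit for that run.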

Either I'm overlooking something or there's a very narrow race :-/

I've not been able to reproduce this with another 100+100 runs.

I blame cosmic rays.
