Arcade: New MSBuild logger increases build time by 5x in aspnet/Extensions

Created on 14 May 2019 · 10 Comments · Source: dotnet/arcade

The latest version of Arcade enables a custom MSBuild logger by default, on Windows builds only. In the aspnet/Extensions repo, this significantly degrades build performance on Windows.

For example, see this PR: https://github.com/aspnet/Extensions/pull/1709. Here are the observed timings:

* This logger is only enabled on Windows so far, so I haven't observed build perf changes on *nix yet.

natemcmaster

Most helpful comment

Why are we sending telemetry on every build event? That is definitely something we should not be doing for this exact reason.

Networking during a build, or a PR, should always be treated with immense skepticism. It will fail; the question is just how often. Rather than sending telemetry during our build, we should log it somewhere, likely to disk, and then have a follow-up stage in our pipelines that uploads it to wherever it needs to go.
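The "log to disk, upload in a later stage" idea can be sketched roughly as follows. This is an illustrative Python sketch, not the Arcade logger: the function names, the JSON-lines file format, and the `send_batch` callback are all assumptions made for the example.

```python
import json
import os
import tempfile


def record_event(log_path, event):
    """Build stage: append one telemetry event to a local JSON-lines file.

    No network I/O happens here, so a flaky connection cannot slow or fail the build.
    """
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


def upload_recorded_events(log_path, send_batch):
    """Follow-up stage: read every recorded event and hand them over in one call.

    `send_batch` is a hypothetical callable standing in for the real uploader.
    """
    with open(log_path, encoding="utf-8") as f:
        events = [json.loads(line) for line in f if line.strip()]
    if events:
        send_batch(events)
    return len(events)


def demo():
    """Record three events, then 'upload' them into an in-memory list."""
    sent_batches = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "telemetry.jsonl")
        for i in range(3):
            record_event(path, {"event": "ProjectFinished", "project": f"proj{i}"})
        count = upload_recorded_events(path, sent_batches.append)
    return count, sent_batches
```

Because the upload step is separate, it could be retried on its own without rerunning the build.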

jaredpar on 16 May 2019

👍3

All 10 comments

cc @jaredpar @ChadNedzlek

natemcmaster on 14 May 2019

@natemcmaster do you have an ETL trace that I could take a look at?

jaredpar on 14 May 2019

I don’t have a trace, but this repros consistently on CI on the aspnet/Extensions repo. I worked around it by breaking the “don’t-touch-this” rule of eng/common/ and setting $pipelinesLog = $false in tools.ps1

natemcmaster on 14 May 2019

cc/ @rainersigwald

markwilkie on 14 May 2019

Yeah, I suspect switching to INodeLogger is in order. I'll do that, just a sec.

rainersigwald on 16 May 2019

The event forwarding isn't the problem here. Lots of discussion in https://github.com/dotnet/arcade/pull/2811.

How often would ProjectFinished get called for these builds? The telemetry part makes one HTTP call to App Insights for every ProjectFinished event. If that event fires many times per project, once for each of the various targets that project references call, that might be the problem.

That was a great insight, @alexperovich! I added a couple of commits to https://github.com/aspnet/Extensions/pull/1719 to disable the telemetry: on Windows I unset one of the is-this-a-build-machine properties, and on *nix I configured the forwarding logger to NOT send ProjectFinished events. Both builds completed quickly.

You will get one ProjectFinished event for each combination of the following, so the counts multiply:

* Project
* Target (ProjectReferences call many targets, and so does NuGet restore)
* TargetFramework
* ProjectReference

I hacked a local copy of the logger to write a line every time ReportToAnalytics was called in the Extensions build: CallsToReportToAnalytics.txt. There are about 13,000.
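To get a feel for why roughly 13,000 calls matter, here is a back-of-the-envelope estimate in Python. The call count comes from the measurement above; the per-call latency is a purely hypothetical figure, and the estimate assumes each call effectively blocks the build, which may overstate the real cost.

```python
# Approximate number of ReportToAnalytics calls reported in the thread.
CALLS = 13_000

# Hypothetical round-trip time per HTTP call; not measured anywhere in this issue.
ASSUMED_LATENCY_SECONDS = 0.1


def blocking_overhead_minutes(calls, latency_seconds):
    """Total added time, in minutes, if every call blocks for `latency_seconds`."""
    return calls * latency_seconds / 60


overhead_minutes = blocking_overhead_minutes(CALLS, ASSUMED_LATENCY_SECONDS)
print(f"Estimated overhead: {overhead_minutes:.1f} minutes")
```

Under these assumptions the overhead comes to a little over 21 minutes, and it scales linearly with both the number of events and the latency.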

I think we should fix this in two phases:

1. Back out the analytics change and broadly deploy the logger goodness.
2. Reimplement the analytics so that it is batched and runs on another thread (logging is async, but it does apply some backpressure).

Does that sound reasonable @ChadNedzlek?
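As a rough illustration of "batched and in another thread", here is a minimal Python sketch. It is not the Arcade logger's design: the class name, the `send_batch` callback, and the batch and queue sizes are all hypothetical. The bounded queue is one possible way to model the backpressure mentioned above.

```python
import queue
import threading


class BatchingTelemetrySender:
    """Hypothetical sketch: queue events and send them in batches on a worker thread."""

    _STOP = object()  # sentinel telling the worker to flush and exit

    def __init__(self, send_batch, batch_size=100, max_queue=10_000):
        self._send_batch = send_batch
        self._batch_size = batch_size
        # A bounded queue makes report() block if the sender falls far behind.
        self._queue = queue.Queue(maxsize=max_queue)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def report(self, event):
        """Called from the logging path: enqueue only, no network I/O."""
        self._queue.put(event)

    def close(self):
        """Flush any partial batch and wait for the worker to finish."""
        self._queue.put(self._STOP)
        self._worker.join()

    def _run(self):
        batch = []
        while True:
            item = self._queue.get()
            if item is self._STOP:
                break
            batch.append(item)
            if len(batch) >= self._batch_size:
                self._send_batch(batch)
                batch = []
        if batch:
            self._send_batch(batch)


# Usage: 250 events with a batch size of 100 are delivered as three batches.
sent = []
sender = BatchingTelemetrySender(sent.append, batch_size=100)
for i in range(250):
    sender.report({"event": "ProjectFinished", "id": i})
sender.close()
batch_sizes = [len(b) for b in sent]  # [100, 100, 50]
```

With this shape, 250 events cost three send calls instead of 250, and the logging path only pays for a queue insertion unless the queue is full.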

rainersigwald on 16 May 2019

Regardless of how the telemetry is implemented, it is slightly tangential to my understanding of the initial intent of the pipeline logger. As far as I can tell, now is a good time to examine how we're handling telemetry. I'd prefer that we step back, disable the telemetry element of the pipeline logger, and agree on a reasonable, cohesive approach to telemetry in our builds. The initial implementation of the pipeline logger would still provide value, as it surfaces contextual error information to developers, the AzDo summary page, and GitHub checks.

@markwilkie @ChadNedzlek @Chrisboh thoughts?

chcosta on 16 May 2019

I agree with @jaredpar and @chcosta: back it out, redesign telemetry.

One easy possibility would be to split the telemetry (as currently written) into its own logger, then replay the .binlog produced during the real build through it in a separate step. Or, even better, as a different phase in a release that's independently retryable and doesn't fail the build itself. That would also allow getting telemetry on the first-phase restore.

rainersigwald on 16 May 2019

👍1

My hope is that we can take advantage of the fact that @rainersigwald is local and hash a strawman plan for telemetry out in 30 minutes sometime today. The need for the data is urgent....

markwilkie on 16 May 2019
