Project-system: BuildUpToDateCheck.OnChanged allocates 4% of all data opening Roslyn.sln

Created on 18 Jul 2018  路  25Comments  路  Source: dotnet/project-system

GC time (both Gen0 and Gen1) is hurting the scenario of opening Roslyn.sln, on this path, below is allocating ~4% of all allocations splitting strings:

image

Bug Feature-Up-to-date Tenet-Performance

Most helpful comment

From a VS perspective, I think it's going to be fine to say we only respect imports in $(MSBuildAllProjects). If the up-to-date is overaggressive, we can point them to <UpToDateCheckInput />. Every project in the world is going to pay for those allocations (to split it up); and they aren't cheap. I did a pass on GitHub and couldn't find a bad usage, though I didn't do a full pass - looks like everyone in the world checks in their obj folder, so it mainly picks up NuGet generated files.

All 25 comments

This is caused by splitting the MSBuildAllProjects property.

This might be fixed by https://github.com/Microsoft/msbuild/issues/1299. We should verify.

@drewnoakes Can you verify that this no longer shows up? If you've not used PerfView I can show how use this next week. Jeff said this went into Dev16, so you'll need a recent build of that.

We might want to avoid the split too (to avoid the string[]) if there's no ";" as the MSBuild change means only a single target should appear in it.

@davkean Running VS 16.0 28022.4003.master the MSBuildAllProjects still splits into an array of 36 elements.

I'm not sure how to verify the version of msbuild running though.

image

We might want to avoid the split too (to avoid the string[]) if there's no ";" as the MSBuild change means only a single target should appear in it.

@davkean I wasn't able to observe a situation where MSBuildAllProjects contained a single path. Above I quoted 36 strings for a solution I generated. For Roslyn.sln on Dev16 I saw 54 strings.

My interpretation of the change to MSBuildAllProjects (this comment in particular) is that the last-modified path will always be _first_. So if we are dealing with MSBuild 16.0 or greater, we need only take the first value for our checks.

Any thoughts on the best way to perform this version test?

@jeffkl @andygerlicher - do you know if an MSBuild with these changes has been inserted into Dev16 builds yet?

https://github.com/Microsoft/msbuild/pull/3605 That's the PR that started the change. We also need everyone else to stop setting it when working with MSBuild 16 (either via different targets like with the SDK or a condition). I'll make sure we get a PR into CLI with updated MSBuild and removing the setting from the SDK.

Quick note on the actual change, we ended up going with making that property special. The evaluator will always put the newest project at the beginning of that list even if no one ever sets it. So for up-to-date there's no reason to ever set that just read it as an input.

Oh and to your actual question we inserted to lab/ml within the last few days.

I commented on the PR as well, but overall I think this is perfect. But the _comsumption_ of that property should remain exactly the same (list of 1 or more potential up-to-date candidates).

But the _comsumption_ of that property should remain exactly the same (list of 1 or more potential up-to-date candidates).

@AndyGerlicher #4024 is based on the idea that the first up-to-date candidate is the newest, so we only need to consider that candidate's time stamp. Is that incorrect? Would we need to consider anything other than the first in MSBuild 16?

Because the user can put anything they want in that list and may be using it for their on incremental build invalidation. MSBuild will guarantee that the newest import is at the beginning of that list so there's no point in setting it anymore, but we can't control what the user has put in there.

@drewnoakes What's state of this one?

I'd hoped it would be possible to not consider all paths in the MSBuildAllProjects property, using just the first one as the 'earliest', but projects can and do append their own paths, so we have to check them all.

The implementation was changed in #4012 to use LazyStringSplit which avoids array allocation and reduces the chance of strings ending up in gen1.

I'm not sure if further improvement here is possible. I don't know enough about MSBuildAllProjects and its usage. Can we stop adding any projects to it, or have MSBuild include only the newest, so that property only contains projects from build extensions?

I thought MSBuild changed the logic here to list the newest first?

@AndyGerlicher @rainersigwald Can you help me understand this? If the newest import is always first in that list, why we do need to walk past it? Is this in case someone has put a random file in there? If so, I think I'd happily say to them; don't do that.

@AndyGerlicher Also did we file a bug against NuGet to stop using MSBuildAllProjects in generated props/targets files? I cannot see one.

The problem is that user code (or NuGet packages or something) can _arbitrarily_ insert files into this item and use it as a kind of force-rebuild. For imports that added it according to the normal pattern, you're right--there's never a need to go past the first file. But if a user appends foo.cs for whatever reason, there would be an observable behavior change if it wasn't checked. IIRC we found some cases like that but I don't see them in the bugs at the moment.

For such cases, wouldn't UpToDateCheckInput be a better fit? If we decided to go ahead with the change in #4333, would guidance for breakages be to use UpToDateCheckInput instead?

Yeah, that is a much more direct approach for VS scenarios, and since this doesn't affect build-time behavior (which will still depend on $(MSBuildAllProjects)), I think it's a reasonable answer. Just have to decide if the benefit is worth pushing for that change.

I had been working on updating some places where MSBuildAllProjects is set but haven't had a chance to submit PRs now that 16.0 is a thing:

NuGet: https://github.com/jeffkl/NuGet.Client/commit/cfb31da2723a2fb1b45e325cffb06b0d18f7e172
VSTest: https://github.com/jeffkl/vstest/commit/c7bb28a591e4b96f345b27cc4a72753bcdf4312c
.NET Core SDK: https://github.com/jeffkl/sdk/commit/0510dc13cb9f1aaa291d5756f35eb17a91cf30df

Another related issue: https://github.com/dotnet/sdk/issues/380

From a VS perspective, I think it's going to be fine to say we only respect imports in $(MSBuildAllProjects). If the up-to-date is overaggressive, we can point them to <UpToDateCheckInput />. Every project in the world is going to pay for those allocations (to split it up); and they aren't cheap. I did a pass on GitHub and couldn't find a bad usage, though I didn't do a full pass - looks like everyone in the world checks in their obj folder, so it mainly picks up NuGet generated files.

Thanks @jeffkl!

Was this page helpful?
0 / 5 - 0 ratings