Msbuild: MSBuild File globbing is slow on large directory layouts (e.g. node_modules)

Created on 22 Jun 2017 · 28 Comments · Source: dotnet/msbuild

All 28 comments

From internal investigation:

When there's a large number of files, Microsoft.Build's FileMatcher.GetFilesRecursive allocates a huge number of strings.

Name | Inc % | Inc
Type System.String | 55.8 | 1,446,726,272

  • microsoft.build.ni![COLD] Microsoft.Build.Shared.FileUtilities.PathsEqual(System.String, System.String) 27.3 708,307,520
    |+ Microsoft.Build!Microsoft.Build.Shared.FileMatcher+<>c__DisplayClass29_0.b__0(class System.String) 27.3 708,307,520
    | + microsoft.build.ni![COLD] Microsoft.Build.Shared.FileMatcher.GetFilesRecursive(System.Collections.Generic.IList1, RecursionState, System.String, Boolean, 27.3 708,307,520
    | + microsoft.build.ni!FileMatcher.GetFilesRecursive 27.3 707,986,112
    | |+ microsoft.build.ni!FileMatcher.GetFiles 27.3 707,986,112
    | | + microsoft.build.ni!EngineFileUtilities.GetFileList 27.3 707,986,112
    | | + microsoft.build.ni!EngineFileUtilities.GetFileListEscaped 27.3 707,986,112
    | | + microsoft.build.ni!Microsoft.Build.Evaluation.LazyItemEvaluator4+LazyItemOperation[System.__Canon,System.__Canon,System.__Canon,System.__Canon].Apply 27.3 707,986,112
    | | + microsoft.build.ni!Microsoft.Build.Evaluation.LazyItemEvaluator4+MemoizedOperation[System.__Canon,System.__Canon,System.__Canon,System.__Canon].Appl 27.3 707,986,112
    | | + microsoft.build.ni!Microsoft.Build.Evaluation.LazyItemEvaluator4+LazyItemList[System.__Canon,System.__Canon,System.__Canon,System.__Canon].ComputeI 27.3 707,986,112
    | | + microsoft.build.ni!Microsoft.Build.Evaluation.LazyItemEvaluator4+LazyItemList[System.__Canon,System.__Canon,System.__Canon,System.__Canon].GetItem 27.3 707,986,112
    | | + microsoft.build.ni!Microsoft.Build.Evaluation.LazyItemEvaluator4+<>c[System.__Canon,System.__Canon,System.__Canon,System.__Canon].b_ 27.3 707,986,112
    | | + microsoft.build.ni!System.Linq.Buffer1[Microsoft.Build.Evaluation.LazyItemEvaluator4+ItemData[System.__Canon,System.__Canon,System.__Canon,Syst 27.3 707,986,112
    | | + microsoft.build.ni!System.Linq.OrderedEnumerable1+<GetEnumerator>d__1[Microsoft.Build.Evaluation.LazyItemEvaluator4+ItemData[System.__Canon,Sy 27.3 707,986,112
    | | + microsoft.build.ni!System.Collections.Generic.List1[Microsoft.Build.Evaluation.LazyItemEvaluator4+ItemData[System.__Canon,System.__Canon,Syst 27.3 707,986,112
    | | + microsoft.build.ni!System.Linq.Enumerable.ToList[Microsoft.Build.Evaluation.LazyItemEvaluator4+ItemData[System.__Canon,System.__Canon,System. 27.3 707,986,112
    | | + microsoft.build.ni!Microsoft.Build.Evaluation.LazyItemEvaluator4[System.__Canon,System.__Canon,System.__Canon,System.__Canon].GetAllItems() 27.3 707,986,112
    | | + microsoft.build.ni!Microsoft.Build.Evaluation.Evaluator4[System.__Canon,System.__Canon,System.__Canon,System.__Canon].Evaluate() 27.3 707,986,112
    | | + microsoft.build.ni!Microsoft.Build.Evaluation.Evaluator4[System.__Canon,System.__Canon,System.__Canon,System.__Canon].Evaluate(Microsoft.B 27.3 707,986,112
    | | + microsoft.build.ni!Project.Reevaluate 27.3 707,986,112
    | | + microsoft.build.ni!Project.ReevaluateIfNecessary 27.3 707,986,112
    | | + microsoft.build.ni!Project.Initialize 27.3 707,986,112

Relevant code is here:
https://github.com/Microsoft/msbuild/blob/master/src/Shared/FileMatcher.cs#L765

                            excludeNextSteps[i].Subdirs.Any(excludedDir => FileUtilities.PathsEqual(excludedDir, subdir)))

This Any() allocates a delegate on every iteration of the loop, and it runs PathsEqual once per exclude item per subdirectory, which can add up to a lot of calls.
The function that compares two paths allocates at least four strings and two char arrays (from ToSlash() and TrimTrailingSlashes()):
https://github.com/Microsoft/msbuild/blob/master/src/Shared/FileUtilities.cs#L1042-L1056

This could be mitigated by running the comparison on fewer files (ideal). Alternatively, it could be rewritten with a custom comparison function that ignores trailing slashes and treats forward and backward slashes as equivalent. The comparisons could then be done without allocating.

The orthogonal optimization to back out of certain excluded directories is tracked here: #2000

@jasselin It is related, though I suspect #2000 would have a larger impact in that scenario. Since source files in ASP.NET Core projects are included by glob, the rename operation is essentially:

  1. Rename the file (on disk), telling MSBuild nothing about it
  2. Ask MSBuild to reevaluate the project, discovering all files referred to by the globs

The latter step hits these performance limitations.

@rainersigwald Is there any workaround like opting out of globbing and reverting to the old way of including files in a project? I tried setting EnableDefaultItems to false and including every file manually but it doesn't seem to help.

I'm seeing what looks like a similar issue in a slightly different codepath (ComparePathsNoThrow -> NormalizePathForComparisonNoThrow):

Name | Inc % | Inc
-- | -- | --
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|+ microsoft.build!ItemFragment.MatchCount | 40.7 | 18,266
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|+ microsoft.build!Microsoft.Build.Internal.EngineFileUtilities+<>c__DisplayClass6_0.<GetFileSpecMatchTester>b__0(class System.String) | 40.2 | 18,042
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|+ microsoft.build!FileUtilities.ComparePathsNoThrow | 40.1 | 18,021
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|+ microsoft.build!FileUtilities.NormalizePathForComparisonNoThrow | 38.8 | 17,441
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|\|+ microsoft.build!FileUtilities.GetFullPathNoThrow | 24.6 | 11,029
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|\|+ OTHER <<mscorlib.ni!Path.Combine>> | 6.0 | 2,690
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|\|+ OTHER <<clr!COMString::Replace>> | 2.5 | 1,126
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|\|+ microsoft.build!FileUtilities.PathIsInvalid | 1.7 | 762
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|\|+ microsoft.build!FileUtilities.get_InvalidFileNameChars | 1.3 | 573
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|\|+ OTHER <<?!?>> | 0.4 | 193
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|\|+ OTHER <<clr!COMString::IndexOfCharArray>> | 0.3 | 147
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|\|+ microsoft.build!FileUtilities.NormalizePath | 0.0 | 8
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|\|+ OTHER <<clr!JIT_NewArr1>> | 0.0 | 3
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|+ OTHER <<?!?>> | 0.1 | 43
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|+ OTHER <<mscorlib.ni!Path.Combine>> | 0.0 | 12
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|+ microsoft.build!FileUtilities.GetFullPathNoThrow | 0.0 | 10
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|+ microsoft.build!FileUtilities.PathIsInvalid | 0.0 | 8
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|+ OTHER <<clr!COMString::Replace>> | 0.0 | 7
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|\|+ OTHER <<clr!ThePreStub>> | 0.0 | 1
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|\|+ microsoft.build!FileUtilities.NormalizePathForComparisonNoThrow | 0.0 | 4
\|\|\| \|\|\|\|\| \|\| \|\|\| \|\|\|\|+ microsoft.build!FileUtilities.ComparePathsNoThrow | 0.0 | 3


@jasselin Sorry, missed this:

Is there any workaround like opting out of globbing and reverting to the old way of including files in a project?

Yes, you can always avoid the implicit globbing and explicitly specify the files of your choice, either individually or via your own more-specific glob. ~It looks like you'll have to do it for each item type, unfortunately. I would have also expected that setting EnableDefaultItems to false would do the trick.~
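As a rough sketch of the explicit approach described above, a project could turn off the SDK's default globs and list only what it needs. The file and folder names here are hypothetical and not taken from the thread; only EnableDefaultItems and the item types come from the discussion.

```xml
<!-- Sketch only: Program.cs, Controllers, and the wwwroot subfolders are made-up names. -->
<PropertyGroup>
  <!-- Turn off the implicit Compile/EmbeddedResource/None/Content globs from the SDK. -->
  <EnableDefaultItems>false</EnableDefaultItems>
</PropertyGroup>

<ItemGroup>
  <!-- Individual files, or narrower globs that never reach node_modules. -->
  <Compile Include="Program.cs" />
  <Compile Include="Controllers\**\*.cs" />
  <Content Include="wwwroot\css\**" />
  <Content Include="wwwroot\js\**" />
</ItemGroup>
```

A narrower glob still walks the directories it names, so the benefit depends on keeping those patterns away from large trees such as node_modules.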

@rainersigwald Setting EnableDefaultItems to false will disable the implicit items for all of the item types. See for example https://github.com/dotnet/sdk/blob/a4bceb67b160c72318a581d87d52796e4fa31794/src/Tasks/Microsoft.NET.Build.Tasks/build/Microsoft.NET.Sdk.DefaultItems.props#L25-L28

You're right! Corrected my comment.

@rainersigwald @dsplaisted Still having the same issue with EnableDefaultItems set to false. It seems devenv.exe is still scanning every subdirectory. Is it MSBuild or Visual Studio related?

edit: msbuild.exe does not seem to scan the directories anymore, but devenv.exe still does.

@jasselin How are you setting EnableDefaultItems?

@rainersigwald @cdmihai We should sync on path manipulation - I'm in the process of rewriting CPS's to avoid much the same issues.

@davkean Yes, to "False".

@jasselin Sorry my question is "_how_ do you set it?"

@davkean Sorry, misread.

In the csproj file.

  <PropertyGroup>
    <AssemblyName>App</AssemblyName>
    <PackageId>App</PackageId>
    <EnableDefaultItems>False</EnableDefaultItems>
  </PropertyGroup>

If it helps with the diagnosis, I found out that the whole thing originates from this:

87,1% MoveNext • 8 268 ms • Microsoft.VisualStudio.ProjectSystem.Items.SourceItemsService+<SetUnvaluatedIncludesCoreWithGlobbingHelperAsync>d__117.MoveNext()
84,4% ReevaluateIfNecessary • 8 011 ms • Microsoft.Build.Evaluation.Project.ReevaluateIfNecessary()
84,4% ReevaluateIfNecessary • 8 011 ms • Microsoft.Build.Evaluation.Project.ReevaluateIfNecessary(ILoggingService, ProjectLoadSettings)
84,4% Reevaluate • 8 011 ms • Microsoft.Build.Evaluation.Project.Reevaluate(ILoggingService, ProjectLoadSettings)
84,4% Evaluate • 8 011 ms • Microsoft.Build.Evaluation.Evaluator4.Evaluate(IEvaluatorData, ProjectRootElement, ProjectLoadSettings, Int32, PropertyDictionary, ILoggingService, IItemFactory, IToolsetProvider, ProjectRootElementCache, BuildEventContext, ProjectInstance)
84,3% Evaluate • 8 003 ms • Microsoft.Build.Evaluation.Evaluator4.Evaluate()
83,3% GetAllItems • 7 910 ms • Microsoft.Build.Evaluation.LazyItemEvaluator4.GetAllItems()
...

@cdmihai @rainersigwald @AndyGerlicher I did some investigation into this and made a local fix to optimize ComparePathsNoThrow, which sped things up significantly. On further investigation, though, I think we have an O(n^2) perf issue with the following code pattern:

<None Update="@(None)" Foo="Bar" />

Adding this line to a very simple project with around 11,000 None items changed the build time from 0.63 seconds to 32 seconds (after applying my other fix; before that, it took about 8 minutes).

UpdateOperation.Apply calls ItemSpec.MatchesItem for each item currently in the list. When there is a fragment of type ItemExpressionFragment, then it ends up comparing the item against each referenced item, which when using the pattern <None Update="@(None)" /> is all of the items in the list. So each item ends up being compared with all the items in the list, resulting in an n^2 runtime.
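For comparison, MSBuild item definitions offer another way to give every item of a type the same metadata without an Update that refers back to the item list. This is only a sketch and was not proposed in the thread. Whether it avoids the quadratic cost described above is an assumption that has not been measured here, and the semantics differ slightly: an item definition supplies a default value, so it does not override Foo on items that already set it.

```xml
<!-- Sketch only: Foo/Bar are the placeholder metadata name and value from the example above. -->
<ItemDefinitionGroup>
  <None>
    <!-- Default metadata applied to every None item that does not set Foo itself. -->
    <Foo>Bar</Foo>
  </None>
</ItemDefinitionGroup>
```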

@rainersigwald @davkean @dsplaisted I did some further testing and I am pretty sure EnableDefaultItems is ignored or overridden by Visual Studio. When I update or add something to DefaultItemExcludes, I see (using procmon) that the pattern I added is looked for in every subdirectory. Is there a way to see the MSBuild output when called from within devenv.exe so I can confirm the value used for EnableDefaultItems?

@jasselin Yes; check out https://github.com/dotnet/project-system/blob/master/docs/design-time-builds.md#diagnosing-design-time-builds -- because you're using a new-sdk project, follow the "new project system" instructions.

I wonder if this is related to the VS-ignores-conditions-so-it-can-display-source-files-from-all-flavors-of-the-project behavior?

@rainersigwald Sadly, I think you're right. When I comment out the following sections, everything is lightning fast.

https://github.com/dotnet/sdk/blob/a4bceb67b160c72318a581d87d52796e4fa31794/src/Tasks/Microsoft.NET.Build.Tasks/build/Microsoft.NET.Sdk.DefaultItems.props#L25-L33

https://github.com/aspnet/websdk/blob/dev/src/ProjectSystem/Microsoft.NET.Sdk.Web.ProjectSystem.Targets/netstandard1.0/Microsoft.NET.Sdk.Web.ProjectSystem.props#L27-L45

I was able to confirm using the "Build - Design-time" output that EnableDefaultItems is set to False. Is there anything we can do to make this work properly?

edit: That seems to do it, am I wrong?

  <ItemGroup Condition=" '$(EnableDefaultItems)' != 'true' ">
    <Compile Remove="**/*" />
    <EmbeddedResource Remove="**/*" />
    <None Remove="**/*" />
    <Content Remove="**/*" />
  </ItemGroup>

I have a project with two node_modules folders which make Visual Studio extremely slow. After adding <EnableDefaultItems>false</EnableDefaultItems> the performance is better, but then there seems to be a problem when precompiling Razor views: the .PrecompiledViews.dll is generated but it is empty.

When setting <EnableDefaultItems>true</EnableDefaultItems> the dll is generated with the razor views included.

Does anyone have the same problem?

@DavidUrting See this comment for a workaround which should address the perf issue you are hitting.

  <PropertyGroup>
    <DefaultItemExcludes>$(DefaultItemExcludes);wwwroot\node_modules\**</DefaultItemExcludes>
  </PropertyGroup>

Helped me.

Does not work for me.

@StarpTech, where is your node_modules located?

@ofir-shapira-como wwwroot/node_modules

@ofir-shapira-como Forget it, it seems to work now. Thank you.

There's been a lot of perf work in this area, so we're going to close this. If you have a specific case that's still bad, please open a new issue with repro steps.
