MSBuild: Globbing optimization for excludes with complex glob patterns

Created on 20 Apr 2017 · 4 Comments · Source: dotnet/msbuild

The current file system (FS) walking code that collects files matching a glob accepts a set of exclude patterns so it can prune the FS search tree in a single file walk. It has some optimizations that let it back out of a directory entirely when an exclude pattern covers everything under it.

This optimization does not work for complex patterns like **/foo/**. When a FS walk goes into a directory like /a/b/foo, it should backtrack out because the **/foo/** exclude would match all files under /a/b/foo. Instead, the current code walks the entire subtree under /a/b/foo and uses expensive Regex matching to exclude every file in the subtree.
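The pruning described above can be sketched roughly as follows. This is only an illustration in Python, not MSBuild's actual implementation: the function names `excluded_dir_name` and `walk_with_pruning` are hypothetical, and the sketch only recognizes excludes of the exact form `**/NAME/**` with a literal, case-sensitive `NAME`. When the walk reaches a directory whose name matches, it skips the whole subtree instead of visiting it and filtering every file afterwards.

```python
import os


def excluded_dir_name(pattern):
    """Return NAME if pattern has the exact form **/NAME/** with a literal
    NAME (no wildcards); otherwise return None. Hypothetical helper."""
    parts = pattern.replace("\\", "/").split("/")
    if len(parts) == 3 and parts[0] == "**" and parts[2] == "**":
        name = parts[1]
        if name and not any(ch in name for ch in "*?"):
            return name
    return None


def walk_with_pruning(root, exclude_patterns):
    """Walk root once and return (files, visited_dirs) as sorted lists of
    '/'-separated paths relative to root. Directories named by a
    **/NAME/** exclude are never entered."""
    prune_names = {
        name
        for name in (excluded_dir_name(p) for p in exclude_patterns)
        if name is not None
    }
    files, visited = [], []
    for dirpath, dirnames, filenames in os.walk(root):
        visited.append(os.path.relpath(dirpath, root).replace(os.sep, "/"))
        # Editing dirnames in place tells os.walk not to descend into the
        # removed entries, so the excluded subtree is never enumerated.
        dirnames[:] = [d for d in dirnames if d not in prune_names]
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            files.append(os.path.relpath(full_path, root).replace(os.sep, "/"))
    return sorted(files), sorted(visited)
```

With an exclude of `**/foo/**`, a walk that reaches /a/b sees the child directory foo, drops it from the list of directories to descend into, and never touches anything below it. Patterns that do not fit this simple shape would still need the general (regex-based) matching.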

A common scenario where this hits is when npm-enabled projects place their node_modules in a subdirectory relative to the project directory. This causes all globs to recurse into the node_modules directory.

Labels: Feature - Globbing, Performance-Scenario-General, performance

All 4 comments

I just hit this on a .NET Core project with node_modules in a subdirectory. Build times have gone from ~7s on cli 1.x to ~70s on 2.0. That's a regression of 900%...

@dplasteid has a nice workaround he can share, though it will become a permanent fixture in affected project files unless folks revert it once a fix for this issue is available.

Ping me if you need a real-world repro.

The workaround for that is to ensure that there's an exclusion for the (anchored) relative path of node_modules, something like

<DefaultItemExcludes>$(DefaultItemExcludes);path\to\node_modules\**</DefaultItemExcludes>

(Doesn't mean we shouldn't fix this, but if you're hitting it today . . .)

It would be extremely nice to have this bug fixed, as it adds significant overhead to my build.
I really want this to work so I can tell globs to avoid traversing symlinks/reparse points that are present under directories being scanned for a wildcard pattern.

I don't seem to be able to get the workaround to function with reparse points, which is disappointing because my build relies on them. MSBuild falls down into the reparse/symlink directory and starts scanning in the link target, which is totally unnecessary and, in my use case, very expensive.
