Runtime: TPA list building is contributing large % of time to startup

Created on 11 Nov 2020 · 15 Comments · Source: dotnet/runtime

A runtime host is responsible for building up the TRUSTED_PLATFORM_ASSEMBLIES list that defines the assemblies the runtime should be able to resolve by default. To build up this list, all of our hosts iterate through the relevant directories in the file system four times, once for each of the ".dll", ".ni.dll", ".exe", and ".ni.exe" extensions.

https://github.com/dotnet/runtime/blob/1923bba394c31bc2d4f62a948eac8dbe64339ab2/src/coreclr/src/hosts/corerun/corerun.cpp#L223

https://github.com/dotnet/runtime/blob/6072e4d3a7a2a1493f514cdf4be75a3d56580e84/src/coreclr/src/hosts/coreshim/CoreShim.cpp#L247

https://github.com/dotnet/runtime/blob/c6eac1f2d523ee90f536ad9ed1bcb520025401f6/src/coreclr/src/hosts/unixcoreruncommon/coreruncommon.cpp#L224

https://github.com/dotnet/runtime/blob/6072e4d3a7a2a1493f514cdf4be75a3d56580e84/src/coreclr/src/hosts/coreconsole/coreconsole.cpp#L218

But repeating such I/O four times is a very expensive way to do this. There are cheaper ways to provide the same guarantee that the list is ordered the way we want, with certain extensions preferred (e.g. just building up the list in a collection before serializing it to a string).

To validate this beyond profiles, I hacked up corerun's AddFilesFromDirectoryToTPAList to just iterate the directory once (with '*'), and the time to execute .\corerun helloworld.dll on my machine dropped from ~75ms to ~40ms.
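A minimal C++17 sketch of the single-enumeration idea, assuming nothing about the actual corerun change: enumerate the directory once, keep the best candidate per assembly simple name in a collection, then serialize. The names `build_tpa_from_names` and `build_tpa_single_pass`, the caller-supplied preference order, the case-sensitive suffix match, and the `/` join are all hypothetical choices made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <map>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper type: the best file seen so far for one simple name.
struct TpaEntry {
    std::size_t rank;  // index into the preference list; lower is preferred
    std::string path;  // full path of the chosen file
};

static bool ends_with(const std::string& s, const std::string& suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Pure core: picks one file per simple name from an already-enumerated list.
std::string build_tpa_from_names(const std::string& dir,
                                 const std::vector<std::string>& file_names,
                                 const std::vector<std::string>& preferred_exts,
                                 char separator) {
    assert(!preferred_exts.empty() && "at least one extension is required");
    std::map<std::string, TpaEntry> best;  // simple name -> preferred file

    for (const std::string& name : file_names) {
        // Use the longest matching suffix so "a.ni.dll" counts as ".ni.dll".
        std::size_t rank = preferred_exts.size();
        std::size_t ext_len = 0;
        for (std::size_t i = 0; i < preferred_exts.size(); ++i) {
            const std::string& ext = preferred_exts[i];
            if (ext.size() > ext_len && ends_with(name, ext)) {
                rank = i;
                ext_len = ext.size();
            }
        }
        if (rank == preferred_exts.size()) continue;  // not an assembly file

        const std::string simple = name.substr(0, name.size() - ext_len);
        auto it = best.find(simple);
        if (it == best.end() || rank < it->second.rank) {
            best.insert_or_assign(simple, TpaEntry{rank, dir + "/" + name});
        }
    }

    // Serialize only after the whole directory has been processed.
    std::string tpa;
    for (const auto& [simple, entry] : best) {
        if (!tpa.empty()) tpa += separator;
        tpa += entry.path;
    }
    return tpa;
}

// Thin wrapper: one directory enumeration instead of one per extension.
std::string build_tpa_single_pass(const fs::path& dir,
                                  const std::vector<std::string>& preferred_exts,
                                  char separator) {
    std::vector<std::string> names;
    for (const auto& entry : fs::directory_iterator(dir)) {
        if (entry.is_regular_file()) {
            names.push_back(entry.path().filename().string());
        }
    }
    return build_tpa_from_names(dir.string(), names, preferred_exts, separator);
}
```

Note that this sketch emits entries sorted by simple name rather than grouped by extension, so it preserves the "preferred extension wins" guarantee but not the exact ordering a four-pass scan would produce.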

We should in particular fix the production hosts to whatever extent they have the same problem.

cc: @brianrob, @DamianEdwards

Relates to https://github.com/dotnet/runtime/issues/44598 and https://github.com/dotnet/msbuild/issues/5876

area-Host tenet-performance

stephentoub

👍6

Most helpful comment

@jkotas - I personally agree with both of those - let's break these things. I'm simply describing how the host has been treating compatibility so far. So making these changes should be a very conscious decision - along with defining the new compatibility promise around the host itself.

Partially this is probably because some parts of the host must go by much stricter compatibility rules than others (hostfxr versus hostpolicy) - and so I'm probably stuck on "this is about host - let's be super careful".

vitek-karas on 11 Nov 2020

❤1 😄1

All 15 comments

Tagging subscribers to this area: @vitek-karas, @agocke
See info in area-owners.md if you want to be subscribed.

msftbot[bot] on 11 Nov 2020

Yes - this shows up commonly in traces, and would be a great win.

brianrob on 11 Nov 2020

I want to point out that the problem with test hosts (corerun and friends) is VERY different from the problem in the production host (hostpolicy in this case).

Currently the production host doesn't do blind scans of directories (almost ever); instead it relies on .deps.json. The problem is that it must go over all the assets listed in that .deps.json (and probe them in the file system). For a typical ASP.NET app the list of assets it must go through is on the order of ~200 files, which is definitely not fast.
It does do probing, but fortunately the current behavior is such that for typical SDK-produced apps it will very rarely fail on the first probe. So it translates to roughly the same number of file system probes as there are assets.

This is one of the reasons why startup is faster on Linux, because file system probes on Linux are generally faster than on Windows.

There are probably things we can improve in the current implementation on the small scale... but nothing substantial.

The really great win would be to redesign the TPA and the runtime interface. Currently the interaction is:

1. Host builds the full list of all assemblies and passes it to the runtime through TPA
2. Runtime uses TPA and will not talk to the host again

A much better way to do this would be to:

1. Host only parses the .json files into internal data structures, with no probing.
2. It initializes the runtime with an empty TPA and provides a callback.
3. Runtime uses the callback instead of maintaining the TPA, but does it lazily, only when necessary.
4. Host implements the callback by walking the internal data structures and performing localized probing as necessary.

For a simple ASP.NET app this would reduce the number of file system probes from ~200 to roughly 40 (a guess, I would have to measure this).
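A rough C++17 sketch of the callback-based flow proposed above, under stated assumptions: `LazyAssemblyResolver`, `declare`, and `resolve` are hypothetical names, not part of hostfxr, hostpolicy, or the runtime, and the probe function is injected so the number of file system checks can be observed. It only illustrates that probing happens when an assembly is requested, not at startup.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical host-side resolver; none of these names are real host APIs.
class LazyAssemblyResolver {
public:
    using ProbeFn = std::function<bool(const std::string&)>;

    explicit LazyAssemblyResolver(ProbeFn probe) : probe_(std::move(probe)) {
        assert(probe_ && "a probe function is required");
    }

    // Record what the parsed .json files declare, without touching the disk.
    void declare(const std::string& simple_name,
                 std::vector<std::string> candidate_paths) {
        declared_[simple_name] = std::move(candidate_paths);
    }

    // The callback the runtime would invoke only when it needs an assembly.
    // Probes just the candidates for that one name and caches the answer.
    std::optional<std::string> resolve(const std::string& simple_name) {
        if (auto hit = cache_.find(simple_name); hit != cache_.end()) {
            return hit->second;
        }
        std::optional<std::string> result;
        if (auto it = declared_.find(simple_name); it != declared_.end()) {
            for (const std::string& candidate : it->second) {
                ++probe_count_;
                if (probe_(candidate)) {
                    result = candidate;
                    break;
                }
            }
        }
        cache_.emplace(simple_name, result);
        return result;
    }

    // Number of file system checks performed so far.
    std::size_t probe_count() const { return probe_count_; }

private:
    ProbeFn probe_;
    std::unordered_map<std::string, std::vector<std::string>> declared_;
    std::unordered_map<std::string, std::optional<std::string>> cache_;
    std::size_t probe_count_ = 0;
};
```

With this shape, declaring ~200 assets costs no probes at all; only the assemblies the app actually loads are checked on disk, and a missing file surfaces at the point of use rather than at startup.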

Note that starting with .NET 5 we partially do this with single-file. For a stock single-file app the TPA is actually empty, and the runtime does make a callback to the host to resolve assemblies from TPA. The host still does probe for everything on startup though (although in the case of single-file the probe is very quick, since it doesn't go to the file system but only to the single-file manifest).

The problem with this approach is that it has observable behavioral changes. Specifically around error reporting. Today if the app is missing a file, it will immediately fail on startup. With this change it would only fail if the file is needed at runtime. In a way this is a good change (we've had quite a few issues which were basically hitting the eager failure problem), but it's still a rather noticeable change.

Also this is not a simple change - rewriting the host to be able to resolve assemblies on demand is non-trivial.

vitek-karas on 11 Nov 2020

Also I should add that the production host does not do the .dll, .ni.dll, .exe, .ni.exe probing, it knows the desired file name exactly.

vitek-karas on 11 Nov 2020

> Also I should add that the production host does not do the .dll, .ni.dll, .exe, .ni.exe probing, it knows the desired file name exactly.

If the deps file exists. If it doesn't exist, it does appear to still be doing that probing:
https://github.com/dotnet/runtime/blob/040301836ddc1c8d63025ac5e578050b48554563/src/installer/corehost/cli/hostpolicy/deps_resolver.cpp#L544-L551
https://github.com/dotnet/runtime/blob/040301836ddc1c8d63025ac5e578050b48554563/src/installer/corehost/cli/hostpolicy/deps_resolver.cpp#L122-L123
but only doing the file system scan once:
https://github.com/dotnet/runtime/blob/040301836ddc1c8d63025ac5e578050b48554563/src/installer/corehost/cli/hostpolicy/deps_resolver.cpp#L127
and then processing the results, which is the main thing I was commenting on the test hosts not doing.

stephentoub on 11 Nov 2020

> The problem is that it must go over all the assets listed in such .deps.json (and probe them in the file system).

The original design point of .deps.json was to avoid file system probing as much as possible. The probing in the file system was added as a diagnostic improvement later, without considering the performance consequences.

jkotas on 11 Nov 2020

👍2

The probing is something that has shown up fairly heavily in profiles. I'm wondering if the probing can be removed and we fall back to errors that the loader provides. We could also put the probing behind a flag so that it's still available.
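To make the trade-off concrete, here is a small C++17 sketch of building the TPA directly from the paths listed in .deps.json, with on-disk verification as an opt-in. The function `build_tpa_from_deps`, the `verify_on_disk` parameter, and the `TpaBuild` struct are hypothetical; no such flag exists in the host today as far as this thread states.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical result type for the sketch.
struct TpaBuild {
    std::string tpa;                   // serialized list handed to the runtime
    std::vector<std::string> missing;  // assets that failed verification
    std::size_t probes = 0;            // file system checks performed
};

// Builds the TPA from asset paths already resolved from .deps.json.
// With verify_on_disk == false the paths are trusted and no probing happens,
// so a missing file would only surface later as a loader error.
TpaBuild build_tpa_from_deps(const std::vector<std::string>& asset_paths,
                             bool verify_on_disk,
                             const std::function<bool(const std::string&)>& exists,
                             char separator = ':') {
    assert(!verify_on_disk || exists);
    TpaBuild out;
    for (const std::string& path : asset_paths) {
        if (verify_on_disk) {
            ++out.probes;
            if (!exists(path)) {
                out.missing.push_back(path);  // eager diagnostic
                continue;
            }
        }
        if (!out.tpa.empty()) out.tpa += separator;
        out.tpa += path;
    }
    return out;
}
```

In the unverified mode the cost is proportional to string concatenation only; the verified mode performs one probe per asset, which is the startup cost being discussed.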

brianrob on 11 Nov 2020

> The problem with this approach is that it has observable behavioral changes. Specifically around error reporting. Today if the app is missing a file, it will immediately fail on startup. With this change it would only fail if the file is needed at runtime.

+1. I agree that this would be a good change. It should not require rewriting the host to be lazy.

I am not sure whether full laziness would be a performance improvement with a larger number of assemblies.

jkotas on 11 Nov 2020

> We could also put the probing behind a flag so that it's still available.

We should just delete it, unless we have explicit customer asks to keep it around.

jkotas on 11 Nov 2020

👍2

> it still probes if there's no .deps.json

That's true, but it's very rare. SDK will never produce an app like that.

> delete probing

This is somewhat problematic without changing other semantics. For example today if the app lists System.Console.dll as one of the assemblies which are from the app (so should "ship" in the app), but it's actually missing on disk, we will load it from the framework. In this case some level of probing is "necessary".

There are also "additionalProbingPaths" which are just designed to do probing.

That said, we could get around this for the most part. After all, we know which TFM the app we're running has. So if we're running a net6 app we could "break" these weird cases and introduce a new behavior. It would not simplify the host, since we would still have to keep the old behavior along with the new, but it would help perf.

Another problem:
We fully support running 3.1 apps on the 5.0 runtime, so 5.0 hostpolicy runs over 3.1 input. As such, deleting stuff is problematic. But we can change behavior based on the version of the app - I think that would be safe.

vitek-karas on 11 Nov 2020

> if the app lists System.Console.dll as one of the assemblies which are from the app (so should "ship" in the app), but it's actually missing on disk, we will load it from the framework

This sounds like a bug to me. If the app says that it comes with System.Console.dll, why are we falling back to loading it from the framework when the .dll is missing on disk?

> We fully support running 3.1 apps on 5.0 runtime,

We have lower compatibility promise on major version roll forward. We should only worry about properly authored apps, not worry about bug-for-bug compatibility.

jkotas on 11 Nov 2020

👍1

vitek-karas on 11 Nov 2020

❤1 😄1

I'm trying to cast my mind back to 2.x when we made changes regarding the host's behavior when loading dependencies that were in the app folder and in the framework location (e.g. which one wins in that case). Is the behavior being discussed for removal here part of that? E.g. is there a scenario in which an app declares a dependency on an assembly but it isn't found in the app folder and it relies on it coming from the framework? Would that be the case for transitive dependencies on assemblies that are in the shared framework and as such knocked out (and lifted) during restore/build?

DamianEdwards on 12 Nov 2020

@DamianEdwards The case I described should basically never happen with the SDK building the app. If the assembly exists in the framework and the app has a dependency on it, the SDK will "unify" and only use the one from the framework - in which case the app will not even mention that assembly in its .deps.json.

The scenario which was relatively common for ASP.NET was that both the framework and the app have a given assembly and then there's a version resolution algorithm which picks one over the other (generally higher version wins). That scenario would be largely unaffected because in that case the assembly is actually mentioned in both the .deps.json of the app and .deps.json of the framework. So we could do the versioning resolution purely on the data from .deps.json - which is more or less what the host does already - no probing necessary.
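As a hedged illustration of resolving such conflicts purely from .deps.json data, the C++17 sketch below picks, for each assembly name, the entry with the higher version without touching the disk. `DepsAsset`, `parse_version`, and `unify_by_version` are hypothetical names; comparing a single dotted version of up to four well-formed numeric parts, and letting the app's entry win ties, are simplifying assumptions rather than the host's documented rules.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical view of one asset entry parsed from a .deps.json file.
struct DepsAsset {
    std::string name;     // assembly simple name
    std::string version;  // dotted version, e.g. "2.1.0"
    std::string path;     // where that .deps.json says the file lives
};

using Version = std::array<int, 4>;

// Parses up to four numeric components; missing components count as 0.
// Assumes well-formed input (std::stoi throws on non-numeric parts).
Version parse_version(const std::string& text) {
    assert(!text.empty());
    Version v{0, 0, 0, 0};
    std::istringstream in(text);
    std::string part;
    std::size_t i = 0;
    while (i < v.size() && std::getline(in, part, '.')) {
        v[i++] = std::stoi(part);
    }
    return v;
}

// Chooses one asset per name using only .deps.json data: the framework's
// entry replaces the app's entry only when its version is strictly higher.
std::map<std::string, DepsAsset> unify_by_version(
        const std::vector<DepsAsset>& app,
        const std::vector<DepsAsset>& framework) {
    std::map<std::string, DepsAsset> chosen;
    for (const DepsAsset& a : app) {
        chosen.insert_or_assign(a.name, a);
    }
    for (const DepsAsset& f : framework) {
        auto it = chosen.find(f.name);
        if (it == chosen.end() ||
            parse_version(it->second.version) < parse_version(f.version)) {
            chosen.insert_or_assign(f.name, f);
        }
    }
    return chosen;
}
```

Versions are compared numerically per component, so "2.10.0" correctly beats "2.1.0", which a plain string comparison would get wrong.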

The problematic behavior is that host builds a set of probing paths: app, framework1, framework2, additionalProbingPath1, ... and then each asset from .deps.json is probed for using this set (going in order). The first successful match wins.
For SDK built apps the first probe will pretty much always be successful - so there's no reason to probe at all.
The change of behavior comes if the files on disk don't match exactly the .deps.json and thus probing is necessary.
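The ordered probing described above can be sketched in C++17 roughly as follows. `probe_in_order` and `ProbeResult` are hypothetical names, the existence check is injected rather than hitting the real file system, and paths are joined with `/` purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical result: the first match, plus how many probes it took.
struct ProbeResult {
    std::optional<std::string> path;
    std::size_t probes = 0;
};

// Tries each probing directory in order (app, frameworks, additional
// probing paths, ...) and stops at the first one containing the asset.
ProbeResult probe_in_order(const std::string& relative_asset,
                           const std::vector<std::string>& probe_dirs,
                           const std::function<bool(const std::string&)>& exists) {
    assert(exists);
    ProbeResult result;
    for (const std::string& dir : probe_dirs) {
        const std::string candidate = dir + "/" + relative_asset;
        ++result.probes;
        if (exists(candidate)) {
            result.path = candidate;  // first successful match wins
            break;
        }
    }
    return result;
}
```

When the files on disk match .deps.json, every asset is found on its first probe, so the total cost is one probe per asset; extra probes only occur when the layout on disk diverges from what .deps.json describes.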

This was VERY important in 2.* for developer scenarios because SDK built the app with all of the nugets listed in .deps.json but NOT copied to the output. It relied on additionalProbingPaths to point to NuGet cache.
In 3.0 we changed this and SDK will always copy all of the dependencies to the output - so the chance that .deps.json doesn't match what's on disk is MUCH lower.

vitek-karas on 12 Nov 2020

Today PowerShell loads 58 DLLs at startup, and we have a test that tracks this to catch regressions. My point is that this list is known in advance _at design time_. Perhaps the list could be converted to _code_ at design time (or at dotnet publish time), as source generators do, with a fallback to probes if needed.