Project system half of dotnet/sdk#2100:
Notes from @abpiskunov from our meeting:
Here is what we got at the meeting today:
Currently, too much unnecessary data is sent through DT (design-time) builds for the NuGet packages tree, and the assets file is read multiple times for each configuration (TFM), and then again whenever a new DT build is triggered.
We agreed to redesign it in the following way:
- The DT build targets would send us only top-level resolved packages (with the same metadata they had before, to avoid compatibility issues) plus a timestamp for the assets file those packages came from. The SDK can cache this info and avoid re-reading an unchanged assets file.
- In VS we will read top-level packages for each configuration (TFM) in a similar way to today, through DT builds. In addition, we would add new logic that can parse the assets file (we will schedule parsing of the assets file, and stop/avoid parsing if the file has an older or mismatched timestamp compared to what we received from the last DT build) and then map it to the top-level packages. This would let us show top-level packages quickly and then produce the secondary, lower-level tree data.
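The timestamp gating described above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual SDK/VS code; the `AssetsFileCache` class and all names are hypothetical:

```python
import os

class AssetsFileCache:
    """Illustrative cache: re-parse project.assets.json only when the
    timestamp reported by the last design-time (DT) build has changed."""

    def __init__(self):
        # assets file path -> (timestamp, parsed dependency data)
        self._cache = {}

    def get_dependencies(self, assets_path, dt_build_timestamp, parse):
        cached = self._cache.get(assets_path)
        if cached is not None and cached[0] == dt_build_timestamp:
            return cached[1]  # unchanged since the last DT build

        # The file on disk may be newer than the DT build snapshot;
        # skip parsing to avoid mixing stale and fresh data.
        if os.path.getmtime(assets_path) != dt_build_timestamp:
            return None  # wait for the next DT build

        data = parse(assets_path)
        self._cache[assets_path] = (dt_build_timestamp, data)
        return data
```

The key point is that the timestamp from the DT build, not the file itself, decides whether a cached parse is still valid.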
By having this new logic we could also try to optimize the data structure in VS and keep unique package objects across the solution, instead of duplicating them per project (this is an orthogonal problem, but we might think about it while making the matching DT changes).
cc @Pilchie @davkean @abpiskunov
At last weekly sync up, there was talk that this would probably be pushed out to 15.8. The sdk side is currently scheduled for 2.1.4xx. At the earliest it could be pulled into 2.1.3xx, but I think we're pretty booked up at this point. cc @livarcocc
Okay - thanks for the heads up. I think it's too bad not to do more here in 15.7, but I guess there is at least the hammer option of turning the Dependencies node off entirely with targets files.
One potential quick workaround would be to change the SDK so that it sends the same data, but only for top-level dependencies, with no lower-level hierarchies at all. This would not require project system changes, only SDK changes. Keep this change off by default and add an option to turn the workaround on for people who want it.
We will break this down into tasks and assign those tasks to 16.5 and 16.6.
This work is underway. Capturing design notes here for posterity.
- project.assets.json is available in evaluation via the `<ProjectAssetsFile>` property.
- AdditionalDependentFileTimes items.
- IAttachedCollection APIs rather than progression/graph node APIs.
- IDependencyModel.DependencyIDs will become obsolete and unused.
- I will sync with WebTools to prevent regressions in web projects (pinging @abpiskunov).

In the notes above @abpiskunov wrote:
> By having this new logic we could also try to optimize the data structure in VS and keep unique package objects across the solution, instead of duplicating them per project (this is an orthogonal problem, but we might think about it while making the matching DT changes)
@nguerrera @davkean and others, two questions around this:
If the answers are "yes" and "no" respectively, then we can create a globally scoped cache of this data and not need to worry about updating contents within.
I suspect the answer to question 1 is "no" however, considering the case where a package's child references are conditioned on something that can differ from project to project, in which case its graph would vary between projects.
Can we safely assume that, for a given target, a package's data is identical across all projects in the solution?
PrivateAssets, ExcludeAssets, and IncludeAssets can all affect a package's resulting data in the assets file for a given project.
Is there a case where data associated with a resolved package version could change over time?
If the package is resolved from a different source it could be different, but I probably wouldn't worry about that case.
In that case:
This is progressing smoothly, with work happening on the feature/dependencies-tree-performance branch. That branch will address several issues once merged.
It's using IAttachedCollection* APIs which are a pleasure to work with, but have no built-in support for search (unlike the graph/progression APIs). I need to explore options around this.
I also need to think through whether we need to change the public API here, specifically considering WebTools dependencies (npm/bower).
How often do you see different data for the same package? Once in a while, or very often? Depending on that, sharing package data could save a huge amount of memory. You could also consider a "flyweight": share package metadata, but allow projects to have some metadata that overrides it or differs.
@abpiskunov that's proposed in #3335.
With the current work we will only eagerly populate hierarchy items, and there's little data to share there.
For transitive dependencies, these are lazily populated as the user expands. We could share nodes, but I suspect that for 99.99% of sessions there'd be no noticeable benefit. There will be an up-front cost for de-duping nodes too, but that might not be too great.
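The lazy population described above can be sketched as follows. This is illustrative only (the real implementation sits behind the IAttachedCollection* APIs in VS); the class and member names are hypothetical:

```python
class PackageNode:
    """Illustrative lazily-expanded dependency node: children are not
    materialized until the user expands the node in the tree."""

    def __init__(self, name, child_names, graph):
        self.name = name
        self._child_names = child_names  # cheap: package ids only
        self._graph = graph              # package id -> list of child ids
        self._children = None            # materialized on first expand

    @property
    def has_items(self):
        # The expander glyph only needs to know *whether* children exist,
        # which does not require building the child nodes.
        return bool(self._child_names)

    def expand(self):
        if self._children is None:
            self._children = [
                PackageNode(n, self._graph.get(n, []), self._graph)
                for n in self._child_names
            ]
        return self._children
```

Until `expand` is called, only the top-level nodes exist; the transitive graph is walked one level at a time as the user drills in.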
Flyweight is interesting and I'll take a look to see if it'd help. We're removing redundant metadata from task items, and CPS is pretty good at keeping the sizes of snapshots down using shared key dictionaries.
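A flyweight along the lines @abpiskunov suggests might look like the sketch below: intern the immutable metadata once, and layer project-specific state over it. This is not the actual CPS mechanism; all names are hypothetical:

```python
class PackageMetadata:
    """Immutable shared (intrinsic) package metadata, interned in a pool
    so identical metadata is stored once per process."""
    _pool = {}

    def __new__(cls, package_id, version, description):
        key = (package_id, version, description)
        inst = cls._pool.get(key)
        if inst is None:
            inst = super().__new__(cls)
            inst.package_id = package_id
            inst.version = version
            inst.description = description
            cls._pool[key] = inst
        return inst

class ProjectPackage:
    """Per-project (extrinsic) state layered over shared metadata,
    allowing projects to override or differ without duplication."""
    def __init__(self, metadata, is_implicit=False, diagnostic_level=None):
        self.metadata = metadata        # shared flyweight
        self.is_implicit = is_implicit  # project-specific
        self.diagnostic_level = diagnostic_level
```

Each project pays only for its small extrinsic wrapper; the bulk of the metadata is shared solution-wide.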
PR for the SDK changes is up: https://github.com/dotnet/sdk/pull/11358