Project-system: Design-time build only provides top-level package/project dependencies, sourcing transitive references from project.assets.json

Created on 30 Mar 2018  Â·  12Comments  Â·  Source: dotnet/project-system

Project system half of dotnet/sdk#2100:

Notes from @abpiskunov from our meeting:

Here is what we got at the meeting today:

  • Currently there is too much unnecessary data being sent through DT builds for nuget packages tree and assets file is being read multiple times for each configuration (TFM) and then again when new DT build is triggered.

  • We agreed to redesign it in following way:

    • DT build targets would send us only top level resolved packages (with the same metadata they had before, to avoid compatibility issues) and a timestamp for the assets file those packages are received from. SDK can cache this info and avoid reading unchanged assets file.
    • In VS we will read top level packages for each configuration (TFM) in the similar way we do now – through DT builds. In addition to that we would have new logic that would be able to parse assets file (we will schedule parsing the assets file and stop/avoid parsing if file have old/mismatching timestamp than what we received form last DT build) and then map to top level packages. This would allow to show top level package quickly and then produce secondary lower level tree data.

By having this new logic we could also try to optimize data structure in VS and keep unique package objects across solution, instead duplicating it per each project (this is orthogonal problem, but we might think about this while doing matching DT changes)

cc @Pilchie @davkean @abpiskunov

Feature Request Feature-Dependency-Node Performance-Scenario-Solution-Open Tenet-Performance Tracking Triage-Approved VS1ES

Most helpful comment

PR for the SDK changes is up: https://github.com/dotnet/sdk/pull/11358

All 12 comments

At last weekly sync up, there was talk that this would probably be pushed out to 15.8. The sdk side is currently scheduled for 2.1.4xx. At the earliest it could be pulled into 2.1.3xx, but I think we're pretty booked up at this point. cc @livarcocc

Okay - thanks for the heads up. I think it's too bad to not do more here in 15.7, but I guess there is at least the hammer option that you can turn Dependencies node off entirely with targets files.

One of the potential quick workarounds would be to change SDK in such a way that it send same data , but only for top level dependencies, as there no lower level hierarchies at all. This would not require project system changes, only sdk. Keep this change off by default and add an option to turn on this workaround for people who wants it.

We will break this down into tasks and assign those tasks to 16.5 and 16.6.

This work is underway. Capturing design notes here for posterity.

  • The path to project.assets.json is available in evaluation via the <ProjectAssetsFile> property.
  • The timestamp of the assets file is available in evaluation via AdditionalDependentFileTimes items.
  • The assets file should provide all transitive reference information for package and projects.
  • The assets file should provide all log message for package (and project?) references (addressing #3780).
  • The assets file will be parsed lazily, only when the "Dependencies" node is expanded to show packages/projects.
  • We will need a mechanism (property?) to instruct the DTB targets to only return top-level items, as those same targets are usable from different versions of Visual Studio.
  • There will be a new snapshot that contains data from the assets file, and it will be available via dataflow.
  • Transitive nodes will (hopefully) be displayed via IAttachedCollection APIs rather than progression/graph node APIs.
  • IDependencyModel.DependencyIDs will become obsolete and unused. I will sync with WebTools to prevent regressions in web projects (pinging @abpiskunov).

In the notes above @abpiskunov wrote:

By having this new logic we could also try to optimize data structure in VS and keep unique package objects across solution, instead duplicating it per each project (this is orthogonal problem, but we might think about this while doing matching DT changes)

@nguerrera @davkean and others, two questions around this:

  1. Can we safely assume that for a given target a package's data is identical across all projects in the solution?
  2. Is there a case where data associated with a resolved package version could change over time?

If the answers are "yes" and "no" respectively, then we can create a globally scoped cache of this data and not need to worry about updating contents within.

I suspect the answer to question 1 is "no" however, considering the case where a package's child references are conditioned on something that can differ from project to project, in which case its graph would vary between projects.

Can we safely assume that for a given target a package's data is identical across all projects in the solution

PrivateAssets, ExcludeAssets, IncludeAssets can all affect a resulting package's data in the assets file for a given project.

Is there a case where data associated with a resolved package version could change over time?

If the package is resolved from a different source, it could be different, but I probably would worry about that case.

In that case:

  • I won't share structures across projects within a solution
  • I will de-duplicate strings across the solution
  • I will investigate reusing existing structures when freshly parsed structures are identical to existing ones, so we collect objects in younger generations

This is progressing smoothly, with work happening on the feature/dependencies-tree-performance branch. That branch will address several issues once merged.

It's using IAttachedCollection* APIs which are a pleasure to work with, but have no built-in support for search (unlike the graph/progression APIs). I need to explore options around this.

I also need to think through whether we need to change the public API here, specifically considering WebTools dependencies (npm/bower).

How often you see different data for the same package? once in a while or very often? Depending on that sharing package would save huge amount of memory. Also you could consider a "flyweight" share package metadata, but allow projects have some metadata that could override/be different?

@abpiskunov that's proposed in #3335.

With the current work we will only eagerly populate hierarchy items, and there's little data to share there.

For transitive dependencies, these are lazily populated as the user expands. We could share nodes, but I suspect that for 99.99% of sessions there'd be no noticeable benefit. There will be an up-front cost for de-duping nodes too, but that might not be too great.

Flyweight is interesting and I'll take a look to see if it'd help. We're removing redundant metadata from task items, and CPS is pretty good at keeping the sizes of snapshots down using shared key dictionaries.

PR for the SDK changes is up: https://github.com/dotnet/sdk/pull/11358

Was this page helpful?
0 / 5 - 0 ratings