Sdk: Support extending which assemblies are trimmable

Created on 12 Jun 2020  路  25Comments  路  Source: dotnet/sdk

The current linker defaults make it difficult to modify the linker behavior to achieve "aggressive" trimming. We should introduce an MSBuild property to turn on aggressive trimming - that is, to pass all linker inputs with the action "link" instead of "copy" or "copyused". We need to consider how this interacts with existing mechanisms to modify linker behavior on individual assemblies.

It should be possible to enable different levels of trimming on a per-assembly basis as well, especially by different SDK components. The primary use case we would like to support immediately is blazor.

Most helpful comment

I don't like the term "link" - I think it has too much history which is about tools not related to our "linker" functionality (there is some overlap, but relatively small and definitely hard to explain). Not counting that when we introduce true AOT, there might actually be a true platform linker in the toolchain as well - which would create lot of confusion.

So in that sense I prefer anything else - "trim"/"trimmer" is at least unique enough. So going with that logic, I think we should name all public (and as much of internal ones as possible) properties/items with that term "trim".

The above proposal looks good with the exception of ManagedAssemblyToLink - it just doesn't fir with the rest. It's not ideal either, but ManagedAssemblyToTrim would at least fit with the rest well. Even if we later on tell the "Trimmer" to skip the assembly for example.

As for the future analyze-only scenarios... if we have to I would be OK with using "trim" there as well (after all the trim part is by far the most complex and likely heavily used). Having an action "analyze" which would apply as the default or per-assembly would make that work reasonably well.

If we could improve on that, then I would probably go with a completely separate set of properties/items - the example ManagedAssemblyToAnalyze sounds good. The tricky bit would be how to unify the two sets (trim and analyze) when both are used and we would use both to tell linker what to do.

One other thing to consider - rename the actions - specifically "link" is just bad (see the above reasons). Not only it uses terminology which is typically associated with a different action, but even in the linker it makes little sense - it actually means "trim" or "sweep" or... but not "link" (whatever that means). So ideally the values used by the msbuild properties/items would be from a different set which align the terms with the above discussion. Implementation wise we could either do the translation in the msbuild or ILLink task, or we could teach linker to accept multiple values for the same thing.

While we're at it, I think it's a good idea to design ahead and incorporate the "trim granularity" into it - so basically rename "copy" to "TrimAssembly" or similar which would then probably mean renaming "link" to "TrimMembers" - not bad. We don't have to support or even define all of those yet if we don't need to.

Nit: For actions and related settings I would not use the name of the tool, but rather the name of the action, so: TrimmerGranularity -> TrimGranularity. TrimmerDefaultAction is different as that talks about the tool - so that could stay like that.

All 25 comments

Background

The SDK publish targets run ILLink, which has subtargets that process ResolvedFileToPublish. This list gets filtered down to the managed assemblies with PostProcessAssemblies == true, which are passed to the linker. Those with IsTrimmable != true by default are rooted and linked with the copy action, and the rest have the action determined by _TrimmerDefaultAction, which defaults to copyused.

It is worth reiterating that there are three conditions that influence the behavior:

  1. PostProcessAssemblies controls whether the linker will see the assembly at all
  2. IsTrimmable controls whether the linker will tree-shake the assembly (if not, it gets rooted, and gets action copy)
  3. the action (per-assembly Action metadata, or default) controls the level of tree-shaking

Blazor has their own linker targets that run during build, which filter assemblies by filename to determine the linker behavior. For .NET5, they are planning to move to the built-in linker targets that run during publish. They need a place to hook into the pipeline where they can continue filtering assemblies by name, and also a way to set the linker default behavior to aggressive trimming instead of assembly-level trimming (which is the .NET SDK default). In addition, they generate custom "type-granularity" roots for some assemblies, which will be done in the same place.

Proposal

TrimMode

To enable aggressive trimming instead of assembly-level trimming, we provide a public property TrimMode. Setting this tolink will change the default behavior from copyused to link (aggressive trimming) for assemblies that don't have per-assembly TrimMode. TrimMode can also be set as Item metadata to override the global property per-assembly.

PrepareForILLink

We will provide a new public target, PrepareForILLink that runs before ILLink target, and provides a convenient place to hook into the pipeline to modify metadata for trimming. SDK components can use this as an extension point via BeforeTargets and AfterTargets.

The global TrimMode may be set any time before PrepareForILLink runs, which will set it to a default value if not set previously.

ManagedAssemblyToLink

This target will have a dependency that creates the ItemGroup ManagedAssemblyToLink, which represents the set of assemblies that will be passed to the linker. Custom targets may modify IsTrimmable and TrimMode metadata on these assemblies before PrepareForILLink, which will set the assembly action based on this metadata, or they may modify the metadata after PrepareForILLink has run.

It will be illegal to change the items in ManagedAssemblyToLink, since this represents the set that needs to be filtered and replaced in the publish output. To change which assemblies are passed to the linker, a different extension point should be used to set PostProcessAssemblies metadata.

Examples

This shows how a developer could turn on aggressive trimming for framework assemblies (which are defined to be IsTrimmable by the SDK):

<PropertyGroup>
  <TrimMode>link</TrimMode>
</PropertyGroup>

This shows how Blazor (or a developer) could hook into the build to opt assemblies into different levels of trimming based on the filename:

<Target Name="PrepareForBlazorILLink"
        BeforeTargets="PrepareForILLink">
  <PropertyGroup>
    <!-- Set the default TrimMode for IsTrimmable assemblies -->
    <TrimMode>link</TrimMode>
  </PropertyGroup>
  <ItemGroup>
    <ManagedAssemblyToLink Condition="'$([System.String]::Copy('%(ManagedAssemblyToLink.Filename)').StartsWith('Microsoft.AspNetCore.'))">
      <!-- Trim these assemblies using the global TrimMode -->
      <IsTrimmable>true</IsTrimmable>
    </ManagedAssemblyToLink>
    <ManagedAssemblyToLink Condition="'$([System.String]::Copy('%(ManagedAssemblyToLink.Filename)').StartsWith('ThirdPartyAssembly.'))">
      <!-- Trim these assemblies with assembly-level trimming. Implies IsTrimmable. -->
      <TrimMode>copyused</TrimMode>
    </ManagedAssemblyToLink>
  </ItemGroup>
</Target>

Notes

IsTrimmable vs TrimMode

IsTrimmable exists in addition to TrimMode so that there can be a global default for assemblies without a per-assembly TrimMode. This lets the global property be used to set the mode for all IsTrimmable assemblies, and it lets individual assemblies be opted into trimming using the default mode set by the SDK for the target form factor.

Naming of TrimMode values

We have considered a few naming conventions for the TrimMode values:

  • Conservative/Aggressive - avoids complex terminology and would be easy to use for app developers without requiring an understanding of the linker, and might let us change optimization levels in the future, but hides details from developers who are interested in the underlying behavior
  • TrimAssembly/TrimMembers - describes what the linker is doing in each mode, but is incomplete because it doesn't mention the various optimizations that are turned on
  • copyused/link - maps directly to the underlying terminology used in the linker, letting developers who understand the linker make informed decisions, but requires more understanding of the linker

For now, the proposal is to stay with the copyused/link terminology that is used by the tool itself. IsTrimmable allows opting into or out of trimming without referencing this terminology. If we add higher-level options to the linker in the future, we could expose those as new TrimMode values, or aliases for existing values.

Relation to existing options

TrimMode (global and per-assembly) will replace _TrimmerDefaultAction and the per-assembly Action metadata. PrepareForILLink and ManagedAssemblyToLink will replace the _SetILLinkDefaults target and the _ManagedAssembliesToLink ItemGroup.

Build vs Publish

The public properties and targets exposed in this design do not require modifying ResolvedFileToPublish or other MSBuild entities that are related to publish, leaving some room for us to potentially reuse targets if we ever need to run the linker during build instead of publish.

@eerhardt @pranavkm @marek-safar

/cc @akoeplinger

I think this proposal looks really good. Thanks for putting it together. I have a couple comments/questions:

  1. > Custom targets may modify IsTrimmable and Action metadata on these assemblies

    • What's the difference between IsTrimmable and Action? Do we need both?

  2. The usage of the terms link and trim confuses me (this isn't just based on this proposal, we sort of have issues everywhere). The target is named ILLink, which respects a property named IsTrimmable. And then we have a property named TrimmerDefaultAction, which I can set to link.

Minor nit, The SDK typically names items in the singular e.g. ResolvedFileToPublish, ContentWithTargetPath etc. Consider naming the item group ManagedAssemblyToLink

What's the difference between IsTrimmable and Action? Do we need both?

IsTrimmable was originally intended as a user-facing opt-in that deals less with the implementation details. Its current semantics aren't captured by Action alone since non-IsTrimmable assemblies are also rooted. (Action just represents the linker action).

The usage of the terms link and trim confuses me

Agreed. :( To explain how we got here: ILLink is used because it's the name of the tool (originally monolinker). PMs recommended to use "Trimmer" to describe the tool to the public and for the public-facing MSBuild options in 3.0. Now more of the extension points are reflecting naming conventions used by the tool - and we are even considering using the tool in an analysis mode, where the input assemblies, tentatively named ManagedAssemblyToLink in the proposal, would be analyzed without being trimmed.

If I were starting fresh, I might consider naming the project something like "IL Analyzer" or "IL Optimizer", and use "ManagedAssemblyToAnalyze", "AnalyzerAction: Trim", etc.

Consider naming the item group ManagedAssemblyToLink

Edited above, thanks!

a property named TrimmerDefaultAction, which I can set to link

An alternative to TrimmerDefaultAction would be to define a different property like TrimmerGranularity (member/assembly) that maps to _TrimmerDefaultAction = link or copyused. Another idea is TrimmerBehavior (TrimMembers/TrimAssemblies), which we could extend to Analyze for a future analyze-but-don't-trim mode. Thoughts?

@vitek-karas @MichalStrehovsky

What about instead of TrimmerDefaultAction to have TrimmingMode with link, copy etc options. It could also be handy to allow the same attribute to be used on the project/assembly/packages references (can be done later).

I don't like the term "link" - I think it has too much history which is about tools not related to our "linker" functionality (there is some overlap, but relatively small and definitely hard to explain). Not counting that when we introduce true AOT, there might actually be a true platform linker in the toolchain as well - which would create lot of confusion.

So in that sense I prefer anything else - "trim"/"trimmer" is at least unique enough. So going with that logic, I think we should name all public (and as much of internal ones as possible) properties/items with that term "trim".

The above proposal looks good with the exception of ManagedAssemblyToLink - it just doesn't fir with the rest. It's not ideal either, but ManagedAssemblyToTrim would at least fit with the rest well. Even if we later on tell the "Trimmer" to skip the assembly for example.

As for the future analyze-only scenarios... if we have to I would be OK with using "trim" there as well (after all the trim part is by far the most complex and likely heavily used). Having an action "analyze" which would apply as the default or per-assembly would make that work reasonably well.

If we could improve on that, then I would probably go with a completely separate set of properties/items - the example ManagedAssemblyToAnalyze sounds good. The tricky bit would be how to unify the two sets (trim and analyze) when both are used and we would use both to tell linker what to do.

One other thing to consider - rename the actions - specifically "link" is just bad (see the above reasons). Not only it uses terminology which is typically associated with a different action, but even in the linker it makes little sense - it actually means "trim" or "sweep" or... but not "link" (whatever that means). So ideally the values used by the msbuild properties/items would be from a different set which align the terms with the above discussion. Implementation wise we could either do the translation in the msbuild or ILLink task, or we could teach linker to accept multiple values for the same thing.

While we're at it, I think it's a good idea to design ahead and incorporate the "trim granularity" into it - so basically rename "copy" to "TrimAssembly" or similar which would then probably mean renaming "link" to "TrimMembers" - not bad. We don't have to support or even define all of those yet if we don't need to.

Nit: For actions and related settings I would not use the name of the tool, but rather the name of the action, so: TrimmerGranularity -> TrimGranularity. TrimmerDefaultAction is different as that talks about the tool - so that could stay like that.

Adding @samsp-msft as the PM representative.

Thanks for the feedback! How does the following sound?

  • ManagedAssemblyToLink -> ManagedAssemblyToTrim (note that this set will include IsTrimmable != true assemblies)
  • TrimmerDefaultAction (link/copyused/copy/...) -> TrimMode (TrimMembers/TrimAssembly/Keep...)

My only concern is that Keep might be read by some as "root", which isn't what it means. Can anyone suggest a better name to replace copy?

I think it would make sense to do the translation in the task as long as there is a 1-to-1 mapping of these settings to AssemblyAction.

note that this set will include IsTrimmable != true assemblies

I still don't understand why we need IsTrimmable at all. Why can't we just have TrimMode, and if TrimMode isn't set, it defaults to Keep/copy?

Why can't we just have TrimMode, and if TrimMode isn't set, it defaults to Keep/copy?

We also want non-IsTrimmable assemblies to be rooted, but I don't think an empty TrimMode should have those semantics. Rooting means something different from the assembly action - which is more like a decision about what to output.

There are two ways to set the TrimMode/action at the level of illink.dll: a global setting, or per-assembly settings that take precedence over the global setting. The addition of IsTrimmable also lets you use opt an assembly into trimming without explicitly specifying the action.

That's how I think about it at least - but maybe there is a cleaner way to do it that I'm not seeing.

While we're at it, I think it's a good idea to design ahead and incorporate the "trim granularity" into it - so basically rename "copy" to "TrimAssembly" or similar which would then probably mean renaming "link" to "TrimMembers" - not bad. We don't have to support or even define all of those yet if we don't need to.

I don't like this approach. I think we are just switching from one incorrect naming into another similarly incorrect naming. For illustration when someone sets TrimMode=TrimMembers you are actually instructing linker to remove resources, remove type-forwarders, drop interfaces, etc. Similarly for TrimMode=TrimAssembly, which is even more confusing.

It seems to me we should make the option more user friendly and hide the implementation details and configuration options. We could do that by defining different modes/levels. For example, we could have TrimMode with the following options (each of them with defined ILLinker behaviour)

  • None
  • Conservative (never breaks stuff)
  • Aggressive (similar to link today)
  • Maximal (link today with most optimizations enabled)

and we could tweak their underlying setting as we make more stuff reliable.

Had an offline discussion with @marek-safar:
There should be an easy to use setting for app developers and it should not use any complex terminology.

This basically means that we should have one property which is the main knob for users to "play with". The TrimMode seems like a good candidate. Its values should be easy to understand - so the proposed None, Conservative, Aggressive, ... looks like a good start. SDK would then map that to the specific settings on the linker tool (enable/disable various options, set default action, maybe even change per-assembly actions, ...). Ideally the target specific SDK (Blazor in this case) would also have some relatively simple way to influence the mapping.

So I think the most important change is to introduce (or rename) the TrimMode property and define at least the 3 modes:

  • None - I think this basically maps to PublishTrimmed=false - but I think we should discuss this. We could also use this as a way to run the linker without it actually doing anything but analysis.
  • Conservative - would map to the 3.0 PublishTrimmed behavior
  • Aggressive - would map to the new .NET 5 behavior for Blazor - In this case the Blazor would help define what it means per-assembly.

The next one is to define the per-assembly settings. I think the above proposal are OK - but I ended up agreeing with @marek-safar on not using the "TrimMembers"/"TrimAssembly" terminology. We could go with either the current names linker has for actions (copy, link, ...) or rename those. For simplicity we can probably keep the linker names - we can always rename them in the future (basically add aliases).

I would try to stick to the terminology separation between the capability => "trim" and the tool => "ILLinker".

make the option more user friendly and hide the implementation details and configuration options
we could tweak their underlying setting as we make more stuff reliable

I like the idea that these options should be user-friendly - though I would suggest that the configuration options we define should themselves be user-friendly so that we don't have to hide them. If we are going to be changing the meanings of the optimization levels over time, maybe they belong in the linker itself rather than in the MSBuild glue?

That said, the proposed TrimMode seem like they would map pretty directly to the existing settings, so maybe that's not necessary right now.

It sounds like the suggestion is to have TrimMode:

  • Conservative -> default action: copyused
  • Aggressive -> default action: link

For both Conservative and Aggressive, !IsTrimmable would still mean root and copy.

Does that match your intent @marek-safar @vitek-karas ?

About None - I would avoid defining a new option that means the same thing as the well-established PublishTrimmed=false, which matches PublishReadyToRun and PublishSingleFile.

I would try to stick to the terminology separation between the capability => "trim" and the tool => "ILLinker".

To me that suggests using ManagedAssemblyToLink since this represents the set that is processed by the tool - whether they are rooted, copied, analyzed, trimmed, etc. Otherwise we end up with ManagedAssemblyToTrim including !IsTrimmable assemblies.

maybe they belong in the linker itself rather than in the MSBuild glue

I think they should be defined in ILLink.Task

About None - I would avoid defining a new option that means the same thing as the well-established PublishTrimmed=false

None would mean that linker is still run but it marks everything which is not same as PublishTrimmed=false

make the option more user friendly and hide the implementation details and configuration options

I don't like the Conservative vs Aggressive as it doesn't help the developer make an informed choice - our options should be clear about what level of shaking is involved. Its then much easier to reason about what the breakages and impact is.

My only concern is that Keep might be read by some as "root", which isn't what it means. Can anyone suggest a better name to replace copy?

DoNotTrim, DoNotModify, KeepAsIs

None would mean that linker is still run but it marks everything

This is what !IsTrimmable means today.

@eerhardt was suggesting this as well - to fold IsTrimmable and the action into a single TrimMode - but we hit the following issue: How would we handle assemblies without TrimMode metadata?

  • If they default to the None behavior, then the global TrimMode has no effect.
  • If they default to the global TrimMode, then assemblies (including nugets) will be trimmed by default, which is not what we want.

We need a separate IsTrimmable that can be used to opt assemblies into trimming without specifying the TrimMode per-assembly (and makes it possible to use a global TrimMode to influence all IsTrimmable assemblies at the same time).

I've updated the proposal to use TrimMode (using the existing action names), added notes and examples to clarify the IsTrimmable behavior, and summarized the feedback about naming the options.

Just to summarize the current state of the proposal:

  • PrepareForILLink and ManagedAssemblyToLink as extension points
  • TrimMode global property or per-assembly metadata: copyused/link
  • IsTrimmable metadata, where !IsTrimmable means copy and root

TrimMode global property or per-assembly metadata: copyused/link

Were we going to use different names for these modes? Or are we sticking with the linker .exe names?

I am suggesting we stick with the linker names - see the notes in "Naming of TrimMode values" above for a summary of the feedback I've received.

This proposal sounds good to me. I think it will address the short term needs for Blazor, and makes a good foundation for future improvements.

I agree. Let's get this implemented in time for the last preview to test it works for the scenario

Was this page helpful?
0 / 5 - 0 ratings