Runtime: Run illink task at the end of the build

Created on 4 Feb 2020  ·  58 Comments  ·  Source: dotnet/runtime

Problem Statement

Today, when linking a Xamarin or Blazor application using the dotnet/runtime/src/libraries, we are not able to trim a lot of unused code from the libraries. One major reason unused code is left in the libraries is that they use ILLinkTrim.xml files to preserve members that are invoked dynamically across assemblies.

Goals

  1. Ensure all required code is preserved by the linker.
  2. Allow Xamarin and Blazor applications to use the dotnet/runtime/src/libraries and be as small as possible.

    • Both of these application models are very sensitive to the size of the downloaded application. As such every KB of app size we can reduce is a win for our customers.

Background

Take an example case where AppDomain is calling a private method in System.Security.Claims.

https://github.com/dotnet/runtime/blob/9af1c5e4b072b88b78fcc2d35a350ad4db737c3c/src/libraries/System.Private.CoreLib/src/System/AppDomain.cs#L389-L400

https://github.com/dotnet/runtime/blob/9af1c5e4b072b88b78fcc2d35a350ad4db737c3c/src/libraries/System.Security.Claims/src/System/Security/Claims/GenericPrincipal.cs#L86-L87

A simple usage of Reflection like this can be understood by the mono/linker when both assemblies are given to it at the same time. It sees that a hard-coded Type and MethodInfo are being retrieved, and it understands that AppDomain.GetThreadPrincipal() has a dependency on GenericPrincipal.GetDefaultInstance(). Thus, when AppDomain.GetThreadPrincipal() must be kept, GenericPrincipal.GetDefaultInstance() must also be kept. In the reverse case, if an application doesn't need to execute AppDomain.GetThreadPrincipal(), the linker can safely remove GenericPrincipal.GetDefaultInstance().

For cases the linker can't figure out on its own, we are adding a new attribute, DynamicDependency (#30902), which we can place on AppDomain.GetThreadPrincipal() to annotate that it uses Reflection to call GenericPrincipal.GetDefaultInstance(). So even if the linker can't recognize the pattern, we are still safe.
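As a rough sketch of what that annotation could look like (illustrative only; the string-based overload shown here is one of the shapes approved in #30902, and the actual source may differ):

```csharp
using System.Diagnostics.CodeAnalysis;
using System.Security.Principal;

public partial class AppDomain
{
    // Tells the linker: whenever GetThreadPrincipal() is kept, also keep
    // GenericPrincipal.GetDefaultInstance(), which is only reached via
    // reflection and is otherwise invisible to static analysis.
    [DynamicDependency("GetDefaultInstance",
        "System.Security.Claims.GenericPrincipal", "System.Security.Claims")]
    internal IPrincipal? GetThreadPrincipal()
    {
        // ...reflection call into System.Security.Claims elided...
        return null;
    }
}
```

Because the dependency is attached to a specific method, the linker can drop both sides when the application never calls GetThreadPrincipal().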

Today, when we run the linker on the dotnet/runtime/src/libraries, we link a single assembly at a time. So when we link System.Security.Claims, the linker has no idea that AppDomain.GetThreadPrincipal() requires GenericPrincipal.GetDefaultInstance() (since AppDomain is in a different assembly). To prevent the linker from removing GenericPrincipal.GetDefaultInstance(), we use an ILLinkTrim.xml file that tells the linker to keep the private method. We then embed the ILLinkTrim.xml file into the assembly so that when other apps using System.Security.Claims are linked, the linker will still know that GenericPrincipal.GetDefaultInstance() is needed and not remove it.

The problem is that ILLinkTrim.xml files tell the linker to ALWAYS leave a member in the assembly. It doesn't know WHY the member is needed. Thus, even if the application doesn't use AppDomain.GetThreadPrincipal(), the linker is still leaving in GenericPrincipal.GetDefaultInstance().
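For reference, a descriptor of this kind looks roughly like the following (illustrative; the actual file in the repo may differ, but this is the mono/linker XML descriptor format):

```xml
<linker>
  <assembly fullname="System.Security.Claims">
    <type fullname="System.Security.Claims.GenericPrincipal">
      <!-- Kept unconditionally: the descriptor has no way to say
           "only keep this when AppDomain.GetThreadPrincipal is kept". -->
      <method name="GetDefaultInstance" />
    </type>
  </assembly>
</linker>
```

The comment marks the limitation described above: the format can only root members, never express why they are needed.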

Proposed Solution

Instead of linking a single assembly at a time, we should link in the context of the entire "distributable". We can then remove the ILLinkTrim.xml files, and instead add the correct annotations to the code directly. This will give the linker the whole picture of what code is required dynamically by what other code. And when using the library in a different "distributable", the linker can better tell what code can be trimmed safely.

The one "distributable" we are linking for today is the Microsoft.NETCore.App shared framework. This would mean linking the entire shared framework at the end of our build, rather than during each individual library's build.

One drawback of this approach is that when incrementally building a single library we won't be trimming it anymore during the inner dev loop. It currently takes ~30s to link the entire shared framework, which we wouldn't want to do during the inner dev loop. Unless the linker can support incremental linking, we won't be trimming the individual library. This may result in differences in behavior between the inner dev loop and a full build. Code will be left in the assembly during the inner dev loop that may be trimmed away in the full build. However, these situations wouldn't be able to get past CI since it does a full build.

Alternative Solutions

  1. Still link individual assemblies, but only use the ILLinkTrim.xml files when linking for the shared framework.

    • This is not ideal because it will lead to duplicate information between the attributes in the code, and the ILLinkTrim.xml. We will still need to add attributes to the code in order for the libraries to be linker safe when linking for Xamarin applications. And the linker won't be able to see the attributes in other assemblies when linking a single assembly - so we will also need to keep the information in the ILLinkTrim.xml.
  2. Don't trim the libraries in the shared framework at all.

    • If we truly don't care about the size of the shared framework, then we can just turn this functionality off. But completely turning it off would mean a size increase in our distributable. And we wouldn't be able to add more linker optimizations as outlined in #31785 (although adding more optimizations is not a direct goal of this work).
    • With the current master branch, here are the size differences in the final shared framework on win-x64:

| | Size on disk | Bytes |
| ------------- | ------------- | ----- |
| Trimmed | 65.8 MB | 69,058,560 bytes |
| Untrimmed | 67.8 MB | 71,118,848 bytes |

  3. Use "fake" public methods.

    • The idea here is if the linker thinks the method is "public", then it won't trim it, even if it doesn't see any calls to it. So we would mark the method "public" in the implementation assembly, but leave it out of the ref assembly.

    • This approach honestly feels like a hack and I wouldn't recommend wide usage of it. It also breaks the developer flow of re-generating the ref source file using our tools (t:GenerateReferenceSource). Now the developer will have to remember to go remove these "fake" public methods in the ref assembly every time it is re-generated.

Original comments

This will allow it to apply more optimizations than when running with reference assemblies (e.g. constant propagation). It'll be required to reduce framework size when built-in substitutions are applied (e.g. intrinsics).

Another benefit we get is that PreserveDependency will work correctly as it will be applied to final assembly, not some intermediate reference.

area-Infrastructure-libraries linkable-framework untriaged

Most helpful comment

I've updated the top comment on this issue to be more like a one-page spec of what we are trying to solve with this work. I hope it better describes the goals we are trying to achieve, and why we should do this.

Please take a look and let me know if I've missed something or if you have other feedback.

All 58 comments

It can absolutely be explored. We just need to be cognizant with such a change of inner-loop development, where a developer iterates on an individual library. If illink doesn't run as part of that, it's more likely we'll see invalid tests and the like get written.

Also, we can do this only for assemblies that ship in netcoreapp and not in a NuGet package.

Can we estimate the benefits so that this can be prioritized against the other work?

@danmosemsft could you please prioritize this, we'd like to start building in this mode ASAP to test the changes.

@eerhardt is this something you can help with?

We just need to be cognizant with such a change of inner-loop development, where a developer iterates on an individual library. If illink doesn't run as part of that, it's more likely we'll see invalid tests and the like get written.

I'm not sure this is a blocker though. The same situation applies with cross-platform. ex. I have my inner-loop development on Windows, but I just wrote a test that only fails on Linux.

I think it is important that we run our CI tests on the output of the linker, though. That way we ensure what we ship to our customers works the same as what we test.


I looked into it a bit today with the help of @safern @ViktorHofer and @joperezr.

A problem that came up with this work is that there isn't a single place where we can run the linker that will produce netcoreapp assemblies that get picked up for both:

  1. What we run tests against
  2. What we install on customers machines

So doing this work with our current infrastructure means we would need to illink twice during our build. Once for the testhost and again against the shared fx that gets put into the installer/runtime pack. This is very similar to the situation with crossgen today. Today we crossgen what gets put into the installer, and #34365 is adding another call to crossgen the testhost assemblies before tests are executed.

cc @ericstj @janvorli

I'm not sure this is a blocker though. The same situation applies with cross-platform. ex. I have my inner-loop development on Windows, but I just wrote a test that only fails on Linux.

It's not a blocker per se, but I don't agree it's the same situation. I can iterate locally and know that the tests are passing on whatever OS I'm working on locally; I can iterate locally on multiple OSes and know the tests are passing on those; etc. With this change, I can iterate locally to my heart's content, but unless I'm doing a full build, I may still hit failures in the tests in CI on that exact OS I was using.

So doing this work with our current infrastructure means we would need to illink twice during our build.

I think you are on the right path, except for the fact that illink will be run once as part of the build and then on a libraries-illink-test lane as a new testing configuration (this should have been done for 3.1). This is something that could leverage the in-place runtime packs for mobile testing we just added for the Mono runtime, or you could write your own version.

We may want to explicitly rethink the approach and guarantees of the dev inner loop, and accept that the inner-loop and whole product build are not the same and you may see a failure in one that you do not see in the other.

This and other issues (https://github.com/dotnet/runtime/issues/35112) suggest that we should:

  • Do a regular build of whole product, with IL linker, crossgen, Release flavor (?)
  • The fast inner loop iteration on a single library can then build without IL linker and crossgen, debug/checked flavor, etc.

I looked into doing this today, and I think we would only need to link one time during the full build.

My current thinking is the following:

  1. Remove the linker target for individual project builds that output to the shared framework. (I assume we still want to run the linker for OOB assemblies?)
  2. In ~src/libraries/src.proj, we add a step that will run the linker with arguments like the following (note 2 directories are passed in to -d):
-c link -d $(LibrariesSharedFrameworkBinArtifactsPath) -d $(CoreLibLocation) ...
-r Microsoft.CSharp 
...
-r System.ObjectModel 
-r System.Private.CoreLib 
...
  3. And overwrite all the files in $(LibrariesSharedFrameworkBinArtifactsPath) and $(CoreLibLocation) with the trimmed/linked output.
  4. Then in ~src\libraries\pretest.proj, we copy the trimmed assemblies in $(LibrariesSharedFrameworkBinArtifactsPath) and $(CoreLibLocation) to the testhost folder.

Now when our tests run, they will run against the trimmed assemblies. And the trimmed assemblies will get picked up in the runtime pack and shared framework installer.

@ViktorHofer @ericstj @stephentoub @jkotas @joperezr @safern @marek-safar - Thoughts?

[Aside since you mention overwrite, it is generally fragile to overwrite files in a build since if the build doesn't complete successfully the next build can be bad.]

I assume we still want to run the linker for OOB assemblies?

Yep. Do we still have cases where the same binary ships both in an OOB NuGet package and in the shared framework? I assume they would need to be special-cased somehow; or we may just get rid of them completely.

CoreLib

  • We have two different CoreLibs - one for CoreCLR and one for Mono, but libraries are shared between CoreCLR and Mono. I assume that we would do the shared framework trimming twice (once for CoreCLR and once for Mono) and end up with two shared frameworks at the end of the build?
  • For CoreCLR, we would also need to move crossgening of CoreLib from where it is today to the end (i.e. get rid of the Clr.NativeCoreLib subset).

it is generally fragile to overwrite files

Agree. Stay away from overwriting files.

I assume that we would do the shared framework trimming twice (once for CoreCLR and once for Mono) and ended up with two shared frameworks at the end of the build?

I didn't think we were building a shared framework for Mono. https://github.com/dotnet/runtime/pull/35127 says:

Since mono distributes only runtime packs

But if we were to ever add a shared framework for mono, yes we would need to build 2 shared frameworks and link them separately.

For CoreCLR, we would also need to move crossgening of CoreLib from where it is today

Good point. I'll investigate what to do for CoreLib.

I didn't think we were building a shared framework for Mono

Ok, replace "shared framework" with "runtime pack" in my comment. Runtime packs have the same problem.

I think there are two paths which need fixing

  1. Runtime Pack (aka self-contained) builds/targets
    This one should be easy, as we already build RID-specific output in $(NETCoreAppRuntimePackPath), which we can run the ILLinker on at the end of the build and which the tests pick up the assemblies from.

We'll also run the linker a second time (in selected configurations) on the output as part of the testing matrix.

  2. Shared Framework
    This is a CoreCLR-only configuration (and will most likely stay that way for net5). We could run the linker at the end of the build, which would trim for CoreCLR only, and that's fine.

/cc @steveisok

To clarify re: https://github.com/dotnet/runtime/pull/35127... That is true right now in an installer context. The rest of the build for desktop Mono behaves the same as you're used to. We do plan on making desktop self-contained at some point.

For CoreCLR, we would also need to move crossgening of CoreLib from where it is today

Where should we move crossgening of CoreLib? The logical place feels like crossgening it with the rest of the libraries when we build the shared framework (during the src\installer build). But that would mean it wouldn't be crossgen'd for the tests. Will that be an issue?

Yes, the tests will run noticeably slower, especially if you are on the checked runtime. It is one of the reasons why I have suggested https://github.com/dotnet/runtime/issues/31712#issuecomment-616844904

I agree https://github.com/dotnet/runtime/issues/31712#issuecomment-616844904 is where we will want to get, but I'm not sure we can complete it all in one shot.

To make incremental progress toward that goal, and make progress on our size goals, what if we did the following:

  1. Keep CoreCLR's CoreLib as it is today - it gets linked and crossgen'd as part of src\coreclr's build.
  2. No longer ILLink src\libraries during the individual project's build. (From what I can tell, we only ILLink "CoreCLR shared framework" libraries today. Not OOB libraries.)
  3. At the end of the libraries build, add a step to run ILLink on all the libraries + CoreLib. But since CoreLib is already linked and crossgen'd, pass in -p copy System.Private.CoreLib. This will cause CoreLib to be analyzed, but not linked/trimmed again. That way, PreserveDependency attributes in CoreLib that point to src\libraries code will be respected.
  4. Binplace the trimmed files to where the installer and testhost can pick them up (ensuring we don't overwrite files, as above)
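Step 3 could translate into an invocation roughly like the following (paths and the root list are illustrative; `-c`, `-p`, `-d`, `-r`, and `-out` are existing mono/linker options):

```sh
illink -c link \
  -d "$LibrariesSharedFrameworkBinArtifactsPath" \
  -d "$CoreLibLocation" \
  -p copy System.Private.CoreLib \
  -r System.Private.CoreLib \
  -r System.Runtime \
  -out "$TrimmedOutputPath"
```

`-p copy System.Private.CoreLib` keeps CoreLib in the analysis graph (so its annotations are honored) without rewriting the already linked and crossgen'd binary.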

This should allow us to get rid of a bunch of ILLinkTrim.xml files in src\libraries. One that can't be removed with this plan is the sgen => System.Private.Xml issue. We will have to solve that separately.

I have a prototype of the above plan here: https://github.com/eerhardt/runtime/commits/ILLinkEndOfBuild.

@marek-safar @jkotas - do you think this is an acceptable, incremental step? After this is done, we can also make the step ILLink the Runtime Packs that are created when $(TargetsMobile) is true, if applicable.

Make sure that the inner loop (iterating build + test on a single library without full build) still works after the change. Otherwise, it sounds reasonable to me.

Since inner loop already requires one full build pass, can we allow incremental build to rerun illink and crossgen on the library as it's rebuilt? This way we still test the same thing that we ship?

can we allow incremental build to rerun illink and crossgen on the library as it's rebuilt? This way we still test the same thing that we ship?

On my machine ILLink on the entire CoreCLR shared framework takes about 30 seconds. I know @vitek-karas and company are working on making it faster, but I don't believe this is something we want to do inside of the inner loop. (Note: you need to analyze the entire shared framework because some other assembly may have a PreserveDependency on a type/method in the assembly you just built.)

Do we get the same results by rerunning the linker on a single assembly, but giving it runtime assemblies as input? Or do we need to replay linking the entire shared framework every time a single assembly changes?

If you want it to be 100% correct, you are going to need to run it against the whole thing again. If you add or remove any linker annotations (like PreserveDependency), it can affect what is trimmed/kept in other assemblies that you aren't building.

Tell me more about PreserveDependency.

@jkotas are you OK with the idea that the assemblies we are shipping are now not independently serviceable? That was one of the benefits of crossgen over ngen. I know we service the entire shared framework as a set now, but this is the first time I recall us pursuing a change where the content of one library changes the content of another. Is it impossible for us to design the linker metadata in such a way that we can represent it in reference assemblies?

are you OK with the idea that the assemblies we are shipping are now not independently serviceable?

I do not think that this change alone would prevent independent serviceability. This change means that one needs to be more careful about it.

Also, we are not really testing or exercising independent serviceability at all. If we ever needed to do that, we would likely find a number of gotchas.

@marek-safar @jkotas - do you think this is an acceptable, incremental step?

Yes, I think this is an excellent first step to unblock other work based on this

I don't believe this is something we want to do inside of the inner loop

I think eventually we want to run some linker logic during inner dev loop but that will most likely be achieved by analyzers or some other linker mode. I don't think it needs to be done now.

Tell me more about PreserveDependency.

We just reviewed and approved the API for this today: #30902. It was renamed to DynamicDependencyAttribute. Check out the video of the discussion if you're interested.

@ericstj asked me some other questions offline that I figured I'd answer here:

Have you considered other alternatives: like only applying ILLinkTrim files for the target frameworks where linking needs to be applied, and avoiding them when we build for Xamarin frameworks / runtimes?

This approach is not ideal because we would need to duplicate information between the ILLinkTrim.xml file and attributes added directly to the code. Take this example. Here is a case where AppDomain is calling a private method in System.Security.Claims.

When you link the entire shared fx together, the linker will see that AppDomain has a [DynamicDependency] to this function, and it knows to leave it in. But if you are relying on the ILLinkTrim.xml file, and only linking System.Security.Claims by itself, you will need to also put it in the ILLinkTrim.xml file. This results in duplicate effort/maintenance.

and avoiding them when we build for Xamarin frameworks / runtimes?

You can't avoid it in the code attributes because there are cases where the linker can't figure it out by itself, and you need to apply the [DynamicDependencyAttribute] to tell the linker. That way if the Xamarin application is calling into AppDomain, the linker won't trim away the System.Security.Claims private method.

I've heard:
1: run ILLink while giving it implementation assemblies instead of reference assemblies.
2: eliminate ILLinkTrim.xml files because they cause some bloat.

At least #1 seems like it can be solved by just consuming different input for linking, which we can still make happen on rebuild of a library if we can ensure the rest of the framework has been built.

During the initial build of the repo, we can't run the linker on any individual assembly until all the shared fx assemblies are built. Because some assembly that is built later might have a [DynamicDependency] into an earlier built assembly.

As for the inner-dev loop scenario, see https://github.com/dotnet/runtime/issues/31712#issuecomment-618519126. Because of these dynamic dependencies, in order to be correct we need to re-link the whole shared framework again.

I don't quite understand #2 yet. What bloat is this and why? All these files are included because they were representing some real implementation dependency; we can't simply remove that, we'd need some other way to express it.

Check out https://github.com/dotnet/runtime/issues/35199#issuecomment-616559764, for more information on this. Basically, when you use ILLinkTrim.xml, it means ALWAYS keep this type/method unconditionally. Instead, we want to add annotations to the code that describe these relationships that declare WHERE/WHY this type/method should be kept. In the above linked example, if an application never calls into AppDomain.GetThreadPrincipal, there is no reason to keep this private method in System.Security.Claims. But when using ILLinkTrim.xml, the linker has no idea WHY it is being kept. It just unconditionally keeps it.

we can't simply remove that, we'd need some other way to express it

Correct - the other ways of expressing it are:

  1. The linker figures it out itself when it has the whole picture (i.e. the entire shared framework at once).
  2. By expressing it in code using the new attributes. See #30902, #33861

If you want it to be 100% correct, you are going to need to run it against the whole thing again. If you add or remove any linker annotations (like PreserveDependency), it can affect what is trimmed/kept in other assemblies that you aren't building.

I see. That's not all that different than an API change in an assembly. We allow this today without rebuilding the whole product, but technically you need to rebuild the product to make sure it still works. I think we'd be OK permitting incremental re-link of a single assembly for inner-loop with that caveat.

But when using ILLinkTrim.xml, the linker has no idea WHY it is being kept. It just unconditionally keeps it.

Isn't this correct for the linking that happens for the shared framework and runtime packs? I don't expect we'll want to build any dead code where we rely on the linker to remove code from our shared shipping assets that aren't specialized for an application. Really the only reason we run the linker when building the libraries is to help get coverage / test the linker. We effectively delete any code that it legitimately removes and force-root anything that it cannot see we are using. Sure, we get some marginal benefits like trimming out internal constants or methods from internal shared source files that we haven't gone and factored more granularly, but that isn't the reason we run the linker. I agree that an application needs to have different behavior, but isn't that a different concern from the linking that happens when building our product?

I think we'd be OK permitting incremental re-link of a single assembly for inner-loop

I would not bother with incremental re-link of single assembly for inner-loop at all, until it is proven that it is a problem that needs solving.

I would not bother with incremental re-link of single assembly for inner-loop at all, until it is proven that it is a problem that needs solving.

Not solving a problem, just trying to describe how this proposal can avoid breaking existing functionality. Actually, it would be nice if this issue was actually phrased as the problem it's aiming to solve rather than as an implementation change that has a number of side-effects.

I wonder how much we really care about running the linker on the shared framework and runtime packs. When added initially, it was only to get coverage of the linker and trim dead code. We're not really expecting it to provide a ton of value in size gains on the shared framework and runtime-packs. I bet we can measure what we're getting today.

We'd do a lot better testing of the linker and the annotations by running it against self-contained apps/tests that only use some subset of the framework, since that's the real production scenario for the linker.

We're not really expecting it to provide a ton of value in size gains

The linker provides measurable size reduction for shared framework (IIRC, it was like 5%). There is a lot of unused bits in the shared framework binaries that are not practical to remove by editing the sources.

The linker provides measurable size reduction for shared framework (IIRC, it was like 5%). There is a lot of unused bits in the shared framework binaries that are not practical to remove by editing the sources.

Yes. Though how much more can be saved by trimming the whole framework together rather than as individual libraries? Do we have an estimate for that? I'm not clear what more we actually expect to be removed.

The largest benefit here would seem to be that we don't have to pin internals in one assembly in the framework used via reflection by other assemblies in the framework. There are of course other ways to achieve that.

There is a lot of unused bits in the shared framework binaries that are not practical to remove by editing the sources.

Just curious. Is this because it would be a bunch of work? Or some technical reason?

Just curious. Is this because it would be a bunch of work? Or some technical reason?

e.g. it can remove the fields for internal/private consts. That can't be done in source.
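For example (hypothetical type and names), every use of a private const is baked into the IL as a literal at compile time, so the backing field is dead metadata that the linker can strip, while deleting it from source is not an option:

```csharp
internal static class Buffers
{
    // Every use of this const is inlined into the IL as a literal,
    // so the field itself is never read at runtime. The linker can
    // remove the field; removing it from source would break Rent().
    private const int DefaultSize = 4096;

    internal static byte[] Rent() => new byte[DefaultSize];
}
```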

Those sorts of optimizations wouldn't require us to run the linker on the entire framework.

The largest benefit here would seem to be that we don't have to pin internals in one assembly in the framework used via reflection by other assemblies in the framework. There are of course other ways to achieve that.

We could tell the linker to treat those as if they were public, in the case of public trimming, rather than treat them as always rooted as it seems to be doing today. Or actually make them public, even if only in the implementation.

Or actually make them public, even if only in the implementation.

This solution was used to solve a similar problem for .NET Native, and there may still be some leftovers in the tree. It is indeed very simple, but it was frowned upon as non-pure.

The largest benefit here would seem to be that we don't have to pin internals in one assembly in the framework used via reflection by other assemblies in the framework. There are of course other ways to achieve that.

I think that's an understatement. There are two areas where the current setup is problematic: reflection/dynamic calls inside the framework, and size. The first one is undoubtedly more problematic, and unless we can remove all reflection/dynamic calls in src/libraries it'll be very tricky to ILLink only individual assemblies and ensure consistency and meet the size constraints at the same time.

Saying that we don't care about the ILLinker in the shared framework is also not that simple. The early complaint @stephentoub raised was that the inner dev loop on a single assembly will be different. Making a shared framework build different from runtime packs is the other side of the same coin. Many developers develop for the shared framework model only, which means the same problem happens when their dev loop on the shared framework breaks the self-contained version.

and size

Can you give some examples to help me understand where trimming the whole shared framework en masse rather than with individual libraries will net size wins? As noted above, I'm still unclear on that.

Can you give some examples to help me understand where trimming the whole shared framework en masse rather than with individual libraries will net size wins?

Try to look at this from the angle of runtime packs. We can compile a runtime pack for AOT on platform X, and we could do it in a way where we stub, for example, RuntimeFeature.IsDynamicCodeSupported as always false, and the ILLinker is able to propagate this through all method calls, so you might end up with methods becoming constants which could be propagated again, and so on. I'm not saying this is the major reason, as we can do all this at app-building time (and pay the cost there); I'm just giving an example. The problem with dynamic member access is more pressing, though.
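A sketch of the kind of propagation being described (hypothetical types and helper names; assumes the linker substitutes the property with a constant `false` for the AOT runtime pack):

```csharp
using System.Runtime.CompilerServices;

static Serializer Create(Type type)
{
    if (RuntimeFeature.IsDynamicCodeSupported)
    {
        // Once the substitution makes this condition constantly false,
        // the branch is dead; the Reflection.Emit-based path and
        // everything only it references can be trimmed away.
        return EmitDynamicSerializer(type);
    }
    return BuildReflectionSerializer(type);
}
```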

RuntimeFeature.IsDynamicCodeSupported

Even better example may be hardware intrinsics. This would allow us to strip all e.g. ARM codepaths for x64 runtime pack or shared framework; and vice versa.

I've updated the top comment on this issue to be more like a one-page spec of what we are trying to solve with this work. I hope it better describes the goals we are trying to achieve, and why we should do this.

Please take a look and let me know if I've missed something or if you have other feedback.

Can you give some examples to help me understand where trimming the whole shared framework en masse rather than with individual libraries will net size wins? As noted above, I'm still unclear on that.

@stephentoub - The reason we are proposing to trim the whole shared framework en mass rather than individual libraries isn't for net size wins of the shared framework. It is because:

  1. The linker can better understand the code when it sees the entire "distributable".
  2. We can move to adding attributes in the code to describe what needs to be kept and why.
  3. We can remove ILLinkTrim.xml files, which can't be used when linking for Xamarin/Blazor because they are preserving code when it isn't necessary. And ILLinkTrim.xml will be duplication of the above attributes.

The main point is the last one. We want to enable trimming as much code as possible for Xamarin/Blazor. (The main "goal" here IMO.)

Great summary. That helped me understand better.

The main point is the last one.

Yes, this is the reflection case I already commented on.

isn't for net size wins of the shared framework. It is because: linker can better understand the code when it sees the entire "distributable".

What does that help other than net size wins?

We can move to adding attributes in the code to describe what needs to be kept and why.

Why is that only relevant when linking the whole framework? Isn't the only case it helps with cross-assembly private reflection?

Even better example may be hardware intrinsics

That's a good example.

What does that help other than net size wins?

Why is that only relevant when linking the whole framework? Isn't the only case it helps with cross-assembly private reflection?

My understanding is that the main goals we are trying to achieve with doing this work are:

  1. Ensure all required code is preserved by the linker.
  2. Allow Xamarin and Blazor applications to use the dotnet/runtime/src/libraries and be as small as possible.

The problem is the usage of ILLinkTrim.xml files to achieve goal (1) is interfering with goal (2). The cross-assembly private reflection is one main reason we are using ILLinkTrim.xml files, and doing this work allows us to remove most of them.
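For readers unfamiliar with the files in question, an ILLinkTrim.xml descriptor is roughly of this shape (the member names are taken from the example in the issue description; this is a sketch, not an exact copy of any file in the repo):

```xml
<!-- Unconditionally roots the member, so it survives trimming even in apps
     that never call AppDomain.GetThreadPrincipal. -->
<linker>
  <assembly fullname="System.Security.Claims">
    <type fullname="System.Security.Claims.GenericPrincipal">
      <method name="GetDefaultInstance" />
    </type>
  </assembly>
</linker>
```

The unconditional nature of these entries is exactly why they interfere with goal (2): the member is kept whether or not the app ever exercises the reflecting code path.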

Should shared framework size be a goal here? It was not on my list for reasons to do this originally, but it may change the approach we decide to take to achieve the two goals above.

> The problem is the usage of ILLinkTrim.xml files to achieve goal (1) is interfering with goal (2). The cross-assembly private reflection is one main reason we are using ILLinkTrim.xml files, and doing this work allows us to remove most of them.

Is it really most of them? From https://github.com/dotnet/runtime/issues/35199#issuecomment-616713152 it looks like it's a minority. Regardless, I'm no fan of the ILLinkTrim.xml files. I'm simply seeing that this cross-assembly private reflection is not a huge problem, showing up in just a handful of cases, so "duplication" to me isn't a huge downside (I'd also love to understand how much trimming these specific methods vs not is impacting app size). And as you noted in your revised intro post above, there are alternatives. For example, an attribute on internal/private members that need to be kept when the library is built (i.e. the first linker pass) but that themselves can be stripped during that build such that the subsequent app-build trimming could shake those out.
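The alternative sketched above (an attribute honored only during the library build, then stripped) might look something like this; the name and shape are entirely hypothetical:

```csharp
using System;

// Hypothetical attribute: the library-build linker pass would treat members
// carrying it as roots, then strip the attribute from the output so that
// app-level trimming can still remove the member when nothing reflects into it.
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Constructor, Inherited = false)]
internal sealed class PreservedForCrossAssemblyReflectionAttribute : Attribute { }
```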

All that said, if shared framework size is a goal, then I agree, we should change how we do the build, as we'd have to change something significant about the build anyway (e.g. we don't currently build assemblies other than corelib per architecture).

> Should shared framework size be a goal here?

I would think so, otherwise I question our doing this at all. If it's not, changing the workflow to account for a handful of cases here that could be addressed in other ways seems like overkill to me.

I also pointed out offline to @eerhardt another issue that will need to be addressed as part of this work. Today the linker is used not only for trimming code but also for stripping initlocals. If the linker moves to only be applicable when processing the whole shared framework, then so does its stripping of initlocals, which has a very measurable impact on inner-loop perf testing for a good number of our assemblies and scenarios. Thus, if we do change to a model where the linker doesn't run as part of the inner loop, it'll be important to switch the repo to using [SkipLocalsInit] before committing that change. That's something we should do anyway; previously it would just have been a nice-to-have, but from my perspective it must be a precursor to any change here. I opened https://github.com/dotnet/runtime/issues/35527 for that.
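For context, the switch could be as small as a module-level attribute (a sketch; applying SkipLocalsInit requires compiling with AllowUnsafeBlocks, and the attribute lives in System.Runtime.CompilerServices):

```csharp
// Suppresses the CLR's zero-initialization of locals for every method in
// this module, matching the effect of the linker's initlocals stripping.
[module: System.Runtime.CompilerServices.SkipLocalsInit]
```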

> e.g. it can remove the fields for internal/private consts. That can't be done in source.

SkipLocalsInit wouldn't do this either, right? Could we imagine a compiler directive to do it? Those also make lots of noise when we generate the dead code reports.

> SkipLocalsInit wouldn't do this either, right?

Nope. It's unrelated.

Right... would it be reasonable for the compiler to do it?

> I'd also love to understand how much trimming these specific methods vs not is impacting app size

That's being tracked in #35199 and right now it's over 50% for small apps for self-contained setups (this is not a scientific measurement and it's tricky to measure with all the scheduled work).

> If the linker moves to only be applicable when processing the whole shared framework

I think the option of running it at multiple places (during inner library dev loop + during bcl build) is also a possibility.

> and right now it's over 50% for small apps for self-contained setups

Because of the few internal methods being accessed here via cross-assembly reflection?? I'm having a hard time believing that. Is it mainly the XML ones?

For reference #35547

I just saw the update to the original issue comment and wanted to share a bit of a comment / clarification on my suggestion.

> Use "fake" public methods.
> The idea here is if the linker thinks the method is "public", then it won't trim it, even if it doesn't see any calls to it. So we would mark the method "public" in the implementation assembly, but leave it out of the ref assembly.
> This approach honestly feels like a hack and I wouldn't recommend wide usage of it. It also breaks the developer flow of re-generating the ref source file using our tools (t:GenerateReferenceSource). Now the developer will have to remember to go remove these "fake" public methods in the ref assembly every time it is re-generated.

I disagree with this. It is not a hack to declare on an internal or private method that it may be called by another assembly. I actually think it is desirable to represent that API as something more than internal/private, which is why I initially suggested maybe making it public would be the right thing to do. If folks don't want to make things public, it doesn't seem unreasonable to mark things as preserved in the public case. I honestly think trimming the shared framework down to its public surface is a distinctly different task than trimming an application.
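To illustrate the "fake public" idea being debated (the member name is borrowed from the earlier GenericPrincipal example; the src/ref file split is schematic):

```csharp
// src/.../GenericPrincipal.cs (implementation assembly): declared public so
// the linker's trim-to-public-surface pass roots it without any descriptor file.
public static GenericPrincipal GetDefaultInstance() =>
    new GenericPrincipal("defaultuser", roles: null);

// ref/System.Security.Claims.cs (reference assembly): the method is simply
// omitted here, so consumers cannot compile against it as public API.
```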

One thing that is interesting is to look at the limited usage of ILLinkTrim in the framework today: https://gist.github.com/ericstj/b7da2181d5f098a7421d425c6667ad16

At this point, to make forward progress, #35547 is taking the Alternative Solution 1 approach above. While it involves duplication, the limited usage of ILLinkTrim in the framework does make it a feasible approach.

If we agree that is a valid solution, we can use the exact same solution for the following assemblies:

  • System.Security.Claims
  • System.Security.Principal.Windows
  • System.Diagnostics.StackTrace

    • this one is already duplicated; we would just need to strip the resource

  • System.ComponentModel.TypeConverter

After those, the remaining ILLinkTrim.xml usages fall into one of the following buckets:

  1. Only needed for Debugging support
  2. COM support
  3. XElement's default constructor is internal, but needed by XmlSerializer's code gen.

The first 2 could be eliminated if we added new features to mono/linker. XElement is a special case that might need to be handled specially.
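The XElement constraint in bucket (3) comes from code that must instantiate the type through its non-public parameterless constructor; illustratively (this is not the actual generated serializer code):

```csharp
using System;
using System.Xml.Linq;

// XmlSerializer's generated code needs to create XElement instances, but
// XElement's parameterless constructor is internal, so creation goes through
// non-public reflection along these lines. Trimming that constructor away
// would break this at runtime.
object element = Activator.CreateInstance(typeof(XElement), nonPublic: true)!;
```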

This would allow us to remove all the ILLinkTrim.xml files from the compiled libraries and allow the produced assemblies to be linker safe.

Would moving forward with this plan be acceptable to everyone?

If we later decided to link the entire shared framework en masse during the build, all we would need to do is remove the ILLinkTrim.xml files.

That sounds like a good path forward to me.

The shared framework size issue can still be followed up on, but as a separate issue that need not block forward progress with this approach.

And we can always revise/revert this approach later if we end up doing the whole framework linking, anyway, for shared framework size reasons.

> After those, the remaining ILLinkTrim.xml usages fall into one of the following buckets:

The bucketing makes sense to me but I disagree that we should use the exact same solution for all the problems (e.g. COM is supported on a small subset of platforms so the solution should reflect that). I'd suggest creating 3 different tracking issues or PRs where we can discuss the right solution for each of them.

> I'd suggest creating 3 different tracking issues or PRs where we can discuss the right solution for each of them.

Done.

> 1. Only needed for Debugging support
> 2. COM support
> 3. XElement's default constructor is internal, but needed by XmlSerializer's code gen.

Since it sounds like the consensus is to not pursue this approach for now, I'm going to close this issue.

To address https://github.com/dotnet/runtime/issues/35199, we can follow approach Alternative Solution 1. And for cases that approach won't address, the remaining 3 buckets are tracked by the 3 bugs above.
