Runtime: Move XElement ILLinkTrim.xml file to only be used during the library build

Created on 1 May 2020 · 21Comments · Source: dotnet/runtime

Today, we have an internal XElement constructor that isn't invoked directly:

https://github.com/dotnet/runtime/blob/1d70b31cecddc5d6db0b0f103b1a6cfe963a083b/src/libraries/System.Private.Xml.Linq/src/System/Xml/Linq/XElement.cs#L134-L137

However, we need to tell the IL Linker that it needs to preserve this constructor:

https://github.com/dotnet/runtime/blob/1d70b31cecddc5d6db0b0f103b1a6cfe963a083b/src/libraries/System.Private.Xml.Linq/src/ILLinkTrim.xml#L3-L5

because it is used by the XmlSerializer when generating code. See #20445, #21454, and https://github.com/dotnet/corefx/pull/19244 for some information about this.

However, XmlSerializer is already special casing creating an XElement when it generates IL:

https://github.com/dotnet/runtime/blob/1d70b31cecddc5d6db0b0f103b1a6cfe963a083b/src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializationWriterILGen.cs#L2394-L2418

The only place this constructor is truly necessary is when XmlSerializer generates C# code.

https://github.com/dotnet/runtime/blob/1d70b31cecddc5d6db0b0f103b1a6cfe963a083b/src/libraries/System.Private.Xml/src/System/Xml/Serialization/XmlSerializationWriter.cs#L1939-L1965

Here the XmlSerializer relies on the fact that there is an internal, parameterless constructor for XElement.

Investigating an early Blazor on .NET 5 app, leaving XElement rooted is causing roughly 500KB out of 4MB of unnecessary IL code to be left in the application, since this one constructor will bring in a large amount of XML code.

~We should change the C# generation to special case XElement (just like we did for the IL generation case), and have it generate new XElement("default") instead of calling Activator.CreateInstance. This will allow us to remove this "dead" constructor from XElement, and it will allow us to delete the ILLinkTrim.xml file. This way the linker will never have to worry about preserving the code (in the shared framework, or a linked app).~

~We should make the default constructor public, so the linker knows if it can trim it successfully or not. If no one is calling the constructor, it can be trimmed.~

We should rename the ILLinkTrim.xml file to ILLinkTrim_LibraryBuild.xml so it gets used only while building the System.Private.Xml.Linq assembly. While linking/trimming a complete application, the constructor will be preserved if it is needed by the generated code since the generated code uses this pattern:

C# o.@Element = (global::System.Xml.Linq.XElement)ReadSerializable(( System.Xml.Serialization.IXmlSerializable)System.Activator.CreateInstance(typeof(global::System.Xml.Linq.XElement), System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.CreateInstance | System.Reflection.BindingFlags.NonPublic, null, new object[0], null), true

The linker will be able to recognize Activator.CreateInstance(typeof(global::System.Xml.Linq.XElement) and preserve the constructors on XElement.

cc @krwq @mconnew @StephenMolloy

area-System.Xml linkable-framework untriaged

Source

eerhardt

👍1

Most helpful comment

Another option here I discussed with @ericstj is that if we made XElement's parameterless constructor public this would all go away as well. We could mark it as EditorBrowsableState.Never and doc it as "This method is not intended to be called from consumer code.". That seems like a pretty reasonable way forward here that would resolve the linkability concern.

@krwq @ericstj - thoughts?

eerhardt on 21 May 2020

👍3

All 21 comments

Tagging subscribers to this area: @buyaa-n, @krwq
Notify danmosemsft if you want to be subscribed.

msftbot[bot] on 1 May 2020

I like the plan, but it does mean that previously generated C# code might break. What's been our general stance on compatibility for XmlSerializer-generated code? Hopefully we'd be willing to say that if you want to use it with .NET 5 and you're (de)serializing XElements, you need to re-generate the serialization code, but I don't know enough about use cases to know whether that's feasible.

stephentoub on 1 May 2020

👍1

Nice catch on breaking a previously generated serialization assembly running on .NET 5. I added the breaking-change label to call it out.
Even though it is an internal constructor, and we're using private reflection to call it, it is us who are using private reflection against ourselves.

eerhardt on 1 May 2020

@marklio

danmosemsft on 2 May 2020

Adding @PriyaPurkayastha

How do we feel about folks who code-generated XmlSerializers and compiled them into NuGet packages? It's hard to query the data for arbitrary Activator.CreateInstance calls for XElement. Are there other calls that would be made by such serializers that would be relatively unique to those scenarios? Then, we could know the extent of the break.

On the surface, the benefit here seems super low-value in comparison the the downside of the break.

marklio on 2 May 2020

(whatever we do here, we should fix the generator to stop generating such private reflection into persisted code/assemblies)

stephentoub on 2 May 2020

👍3

@mconnew ?

StephenMolloy on 2 May 2020

@krwq @ericstj - thoughts?

eerhardt on 21 May 2020

👍3

I don't mind making it public for serialization purposes but I'm against adding EditorBrowsableState, not a fan of hidden APIs (although I'll leave the decision about attribute for API reviewers)

krwq on 21 May 2020

Isn't exposing it but not hiding it a pit of failure? Seems like it wasn't exposed because you want to guide developers constructing instances to always pass an element name, and the public parameterless constructor will be the first thing someone sees, and very difficult to use correctly.

stephentoub on 21 May 2020

👍1

Could we be a little more friendly on the documentation and mention that it is the equivalent of calling new XElement(XName.Get("default")) as well as saying it's not intended to be called from consumer code?

mconnew on 21 May 2020

XElement.Name is settable so I don't see any reason why to hide it, it's the same as having a struct and assigning fields vs having specialized constructor, I'd just let it be, there is nothing wrong or confusing about this constructor IMO. With this public you can do:

XElement e = new XElement() { Name = "foo" };

krwq on 21 May 2020

I renamed the issue and edited the top comment to align with the current proposal. If everyone agrees this is the correct approach I can mark this as api-ready-for-review.

eerhardt on 11 Jun 2020

I still think this is creating a pit of failure. But if I'm the only one who thinks so, so be it.

stephentoub on 11 Jun 2020

I still think this is creating a pit of failure.

Exposing it at all? Or exposing it, but not hiding it?

We can bring up the EditorBrowsable concern in the API review and see what the consensus is there.

Also - we don't know of any other options here right? The current options I'm aware of are:

Make the ctor public
a. Optionally add EditorBrowsable(Never) to the ctor
Change sgen to not rely on an internal constructor, and remove the internal constructor
- as pointed out above this would be a breaking change that will break already generated binaries, which makes it infeasible.

Any other options?

eerhardt on 11 Jun 2020

Can you remind me why it's insufficient to just have the internal ctor survive the assembly build but then be trimmable if the app doesn't use it? If app-level linking is performed and generated serializers are involved, won't the linker see the Activator.CreateInstance usage and end up rooting the type's ctors?

stephentoub on 11 Jun 2020

That should work too. The way sgen generates the code looks like:

Which the linker should be able to recognize since the hard-coded Type is passed directly into Activator.CreateInstance.

Although, using sgen in a trimmed app is going to have other issues, specifically that the generated XmlSerializers assembly is loaded weakly, which I don't believe the linker will be able to handle.

https://github.com/dotnet/runtime/blob/1d70b31cecddc5d6db0b0f103b1a6cfe963a083b/src/libraries/System.Private.Xml/src/System/Xml/Serialization/Compilation.cs#L172-L175

eerhardt on 11 Jun 2020

That should work too.

That's my suggested route, then. No new public surface area, we still get to trim when it counts, and the only thing we need to do is rename https://github.com/dotnet/runtime/blob/master/src/libraries/System.Private.Xml.Linq/src/ILLinkTrim.xml.

stephentoub on 11 Jun 2020

👍1

Although, using sgen in a trimmed app is going to have other issues

Using sgen in a trimmed app, or using the generated serializer in a trimmed app? The former doesn't seem like much of a concern. The latter is more impactful; if it's the latter, sounds like we should have a separate issue about it.

stephentoub on 11 Jun 2020

The issue is using the generated serializer in a trimmed app. Logged https://github.com/dotnet/runtime/issues/37754.

eerhardt on 11 Jun 2020