Runtime: Expose top-level nullability information from reflection

Created on 31 May 2019  ·  63 Comments  ·  Source: dotnet/runtime

With C# 8, developers will be able to express whether a given reference type can be null:

```C#
public void M(string? nullable, string notNull, IEnumerable nonNullCollectionOfPotentiallyNullEntries);
```

(Please note that existing code that wasn't compiled using C# 8 with nullable turned on is considered to be unknown.)

This information isn't only useful for the compiler but also attractive for reflection-based tools to provide a better experience. For example:

* **MVC**
    - Provides a way to automatically deserialize inputs to controller methods ("model binding")
    - Would like to provide model validation so that the existing pattern would allow code to bail early
    - Without it, customers would have to apply a custom attribute, such as `[Required]`, or resort to additional null-checks
    - Only needs top-level annotations, i.e. `string?` but not nested, such as `IEnumerable<string?>`

* **EF**
    - Provides a way to generate database schemas from user classes ("code first")
    - Would like to use nullability information to infer whether columns should be nullable or non-nullable (they already do that for nullable value types)
    - Without it, customers would have to apply a custom attribute to repeat that information.
    - Also only needs top-level annotations

The nullable information is persisted in metadata using custom attributes. In principle, any interested party can already read the custom attributes without additional work from the BCL. However, this is not ideal because the encoding is somewhat non-trivial:

* **Custom attribute might be generated**. The custom attribute might have been generated (meaning is embedded in the user's assembly) or might use the [to-be-provided attribute](https://github.com/dotnet/corefx/issues/36222).
* **Encoded as a byte array**. The tri-state is encoded as a linearized version of the constructed generic type.
* **Compressed**. Right now, each member will have the attribute when nullability is turned on but this causes metadata bloat. We're working on a proposal that allows the containing member, type, or assembly to state a default to reduce the number of attribute applications.
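To make the encoding concrete, here is a rough sketch of what the compiler emits today for a nested generic signature (decompiled and simplified; the byte values and layout are an implementation detail of the C# compiler and may change):

```C#
// Source (C# 8, nullable enabled):
//     public void M(IEnumerable<string?> p) { }
//
// Approximately what the compiler emits: a NullableAttribute whose
// byte array is a pre-order walk of the constructed type.
// 0 = oblivious/unknown, 1 = non-nullable, 2 = nullable.
public void M([Nullable(new byte[] { 1, 2 })] IEnumerable<string> p) { }
//                                   ^  ^
//                IEnumerable<...> --+  +-- string?
```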

Exposing nullability in reflection for nested generics isn't straightforward. We can't just expose an additional property on `Type` because at runtime there is no difference between `string` (unknown), `string?` (nullable), and `string` (non-null). So we'd have to expose some sort of API that allows consumers to walk the type structure and extract the information. Given that we have other cases with similar issues, such as `dynamic` and tuple names, we might want to generalize this. For .NET Core 3.0 we're likely running out of time to design a complex API like this.

However, the 80% scenario is just answering the question whether the given return value or parameter can be null *as a whole*. This simplifies the design tremendously as it only requires exposing APIs on the `MethodInfo`, `ParameterInfo`, `FieldInfo`, `PropertyInfo`, and `EventInfo`. Even if we were to provide the more complete API, these simple APIs would still be useful as they would be much easier to use.

### Unifying nullable-value types and nullable-reference types

It was suggested that these APIs also return `NullableState.MaybeNull` for nullable value types, which seems desirable indeed. Boxing a nullable value type causes the non-nullable representation to be boxed. Which also means you can always cast a boxed non-nullable value type to its nullable representation. Since the reflection API surface is exclusively around `object` it seems logical to unify these two models. For customers that want to differentiate the two, they can trivially check the top-level type to see whether it's a reference type or not.
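A quick, self-contained illustration of that boxing behavior (standard C#; nothing here depends on the proposed APIs):

```csharp
using System;

class NullableBoxingDemo
{
    static void Main()
    {
        // Boxing a non-empty int? boxes the underlying int, not the Nullable<int> wrapper.
        int? nullable = 42;
        object boxed = nullable;
        Console.WriteLine(boxed.GetType()); // System.Int32

        // Conversely, a boxed int can always be unboxed to int?.
        object boxedPlain = 42;
        int? roundTripped = (int?)boxedPlain;
        Console.WriteLine(roundTripped.Value); // 42

        // Boxing an empty int? produces a null reference.
        int? empty = null;
        object boxedEmpty = empty;
        Console.WriteLine(boxedEmpty == null); // True
    }
}
```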

### API proposal

The APIs are declared virtual so that derived reflection models can implement them in the appropriate way (for example, the metadata reader might read attributes in a better way which can make these cheaper to compute).

```C#
namespace System.Reflection
{
    public enum NullableState
    {
        Unknown,
        NotNull,
        MaybeNull
    }

    public partial class FieldInfo
    {
        public virtual NullableState GetNullableState();
    }

    public partial class PropertyInfo
    {
        public virtual NullableState GetNullableState();
    }

    public partial class EventInfo
    {
        public virtual NullableState GetNullableState();
    }

    public partial class MethodBase
    {
        public virtual NullableState GetReturnTypeNullableState();
    }

    public partial class ParameterInfo
    {
        public virtual NullableState GetNullableState();
    }
}
```

### Sample usage

Below is a somewhat contrived usage for a serializer:

```C#
private void DeserializePropertyValue(PropertyInfo p, object instance, object? value)
{
    if (value == null)
    {
        var nullableState = p.GetNullableState();
        var allowsNull = nullableState != NullableState.NotNull;
        if (!allowsNull)
            throw new MySerializerException($"Property '{p.DeclaringType.Name}.{p.Name}' cannot be set to null.");
    }

    p.SetValue(instance, value);
}
```

@dotnet/nullablefc @dotnet/ldm @dotnet/fxdc @rynowak @divega @ajcvickers @roji @steveharter

api-needs-work area-System.Reflection

Most helpful comment

I do not think we should be trying to get these half-baked APIs in at the last minute. It should be fine for the few places that need to have the nullability decoder ring to use a local private copy of it for 3.0.

All 63 comments

What's the behaviour with a value type? Does this include reflecting the effect of Nullable<> as well?

Are these really static methods? Shouldn't they have a parameter then?

For PropertyInfo, we can have a NullableState on the setter and on the getter, depending on whether we have an attribute. I.e.:

[NotNull] string? Property { get; set; }

That will have a nullable input but a non-null output.

Also, what about generic constraints or interface implementations? Those can also have nullability information, and when getting the generic constraints or interface implementations we only get an array of Type.

@rynowak

What's the behaviour with a value type? Does this include reflecting the effect of Nullable<> as well?

They would return Unknown right now. We could add an enum member NotApplicable that we return for non-reference types or throw InvalidOperationException. We could also handle those, but this feels wrong because nullable value types are actually different types.

@mattwar

Are these really static methods? Shouldn't they have a parameter then?

Sigh. No, I wrote up a sample project to test this and I used extension methods and then I didn't fix it when copy and pasting. Fixed.

@safern

We need to think about the higher-level attributes. I only thought about NullableAttribute so far.

For, PropertyInfo, we can have a NullableState on the setter and on the getter depending if we have an attribute.

This wouldn't change the API shape. People would have to go to PropertyInfo.GetGetMethod() and ask the corresponding MethodInfo directly.
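For illustration, such a consumer might do something like this (hypothetical usage of the proposed GetNullableState/GetReturnTypeNullableState methods, which don't exist yet; the `Person` type is made up):

```C#
// Hypothetical: assumes the proposed nullability APIs exist.
PropertyInfo property = typeof(Person).GetProperty("Name");

var getter = property.GetGetMethod();
var getterState = getter?.GetReturnTypeNullableState();          // nullability of the value read

var setter = property.GetSetMethod();
var setterState = setter?.GetParameters()[0].GetNullableState(); // nullability of the value written
```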

Also what about generic constraints or interface implementations? Those can also have nullability information and to when getting the generic constraints or interface implementations we get an array of Type.

See intro. I've deliberately scoped them out.

We could also handle those, but this feels wrong because nullable value types are actually different types.

They are, but on a source level, C# abstracts the difference away. Maybe introduce something like ValueNull and ValueNotNull as additional enum values. That way, the distinction is still prominent, but it’s also not a runtime surprise (a.k.a. exception) that these won’t be handled.

Why not just make these extension methods and stash them away somewhere in the System.Diagnostics.CodeAnalysis namespace? Or better yet, in a NuGet package shipped out of the Roslyn repo - since their meaning is defined by the C# compiler?

These annotations don't have a meaning to the runtime and reflection itself doesn't respect them in MethodBase.Invoke or FieldInfo.SetValue - it feels odd to burn knowledge of custom attributes that are only meaningful to the C# compiler into the runtime reflection stack. These are not that different from e.g. the custom attributes for System.ComponentModel - they're just guidelines.

E.g. we added a IsByRefLike property to System.Type, for ref struct because byref-like types have a meaning to the runtime. We didn't add any API for unmanaged struct, because that constraint only means something to the C# compiler.

For, PropertyInfo, we can have a NullableState on the setter and on the getter depending if we have an attribute.

See https://github.com/dotnet/csharplang/issues/2384 for this at the language level. @terrajobst users could call PropertyInfo.GetGetMethod() and check there, but what would be the result for a call to GetNullableState() directly on a PropertyInfo which has different nullability settings for getter and setter?

it feels odd to burn knowledge of custom attributes that are only meaningful to the C# compiler into the runtime reflection stack.

I think the idea here is that these attributes are meaningful to various reflection-based consumers at runtime (EF, ASP.NET, 3rd-party products). If it were only for the compiler an API would probably not be needed at all.

These are not that different from e.g. the custom attributes for System.ComponentModel - they're just guidelines.

They are different in that they are emitted by the compiler and interpreted by it. They are also potentially much more complex to interpret (a context-based scheme may be introduced to combat the bloat they introduce), hence the desire to encapsulate that interpretation logic.

These annotations don't have a meaning to the runtime and reflection itself doesn't respect them in MethodBase.Invoke or FieldInfo.SetValue - it feels odd to burn knowledge of custom attributes that are only meaningful to the C# compiler into the runtime reflection stack. These are not that different from e.g. the custom attributes for System.ComponentModel - they're just guidelines.

These attributes are not only going to be respected by the C# compiler, they are also going to be understood and consumed by other nullable aware languages (such as F#).

I would also say that, regardless of whether or not the runtime treats these attributes specially, the overarching ecosystem ultimately will, which I believe is the important point here. These attributes (and others like the unmanaged constraint) convey additional metadata that is generally (but not always) enforced by some user code in the method and it is useful to have an easy/logical way of querying it so that you can perform the appropriate checks when/if needed.

We are already burning this information into all of our assemblies, because of its perceived benefit and exposing that metadata in a well-defined way in the reflection type system will only help fulfill that story and make it easier for users who want to utilize them.

I would also say that, regardless of whether or not the runtime treats these attributes specially, the overarching ecosystem ultimately will, which I believe is the important point here

To be clear, my question is not whether we need an API that provides this functionality - but whether such API should be on ParameterInfo and friends. Since this is adding non-virtual methods (not properties), my expectation is that this API is not going to do any caching - so it doesn't really _need_ to be an instance method.

We could easily stash this API away in e.g. System.Reflection.TypeExtensions and even make it available on lower versions of NetStandard (doesn't have to be exclusive to .NET Core 3.0).

We could easily stash this API away in e.g. System.Reflection.TypeExtensions and even make it available on lower versions of NetStandard (doesn't have to be exclusive to .NET Core 3.0).

Nullable reference types is a C# 8 feature which is only officially supported on .NET Standard 2.1 and higher.

I think discussion on where in the reflection stack it goes is reasonable, although I think most people would expect and look for it to be an instance (or extension) method

Nullable reference types is a C# 8 feature which is only officially supported on .NET Standard 2.1 and higher.

Not that I want to make this my main argument (my main argument is to not pollute the instance member surface area with convenience APIs), but runtime reflection is not the only flavor of reflection. A .NET Standard 2.1 assembly can be loaded for inspection in all sorts of ways on downlevel runtimes (reflection only load is one way, and things like our own System.Reflection.MetadataLoadContext is another). Some people might find this API in the form of an extension method that runs downlevel still valuable.

@chucker

They are, but on a source level, C# abstracts the difference away. Maybe introduce something like ValueNull and ValueNotNull as additional enum values. That way, the distinction is still prominent, but it’s also not a runtime surprise (a.k.a. exception) that these won’t be handled.

Syntactically, you could make the argument that they look similar. But C# doesn't abstract them away. int? and int are different types and this has real, observable language behavior. For example, this is valid:

```C#
public void M(int x) {}
public void M(int? x) {}
```

While this is not:

```C#
public void M(string x) {}
public void M(string? x) {}
```
In other words, you can't overload based on nullability of reference types (because they are fundamentally the same type at runtime). Also, the types have different members. So no, I don't think we should make the argument that "C# abstracts the differences away".

That being said, that doesn't mean we cannot expose the nullability information for nullable types. Boxing a nullable value type causes the non-nullable representation to be boxed. Which also means you can always cast a boxed non-nullable value type to its nullable representation. Given that the reflection API surface is exclusively around object, this might actually work just fine.

So I think I'm coming around agreeing with you 😄

@MichalStrehovsky

Why not just make these extension methods and stash them away somewhere in the System.Diagnostics.CodeAnalysis namespace? Or better yet, in a NuGet package shipped out of the Roslyn repo - since their meaning is defined by the C# compiler?

But they aren't specific to C#. In order for the C in CLR to be meaningful, other languages (assuming they support nullability) should use the same underlying encoding. Otherwise exchanging types no longer works. Hence, the reflection API can be considered general purpose.

These annotations don't have a meaning to the runtime and reflection itself doesn't respect them in MethodBase.Invoke or FieldInfo.SetValue - it feels odd to burn knowledge of custom attributes that are only meaningful to the C# compiler into the runtime reflection stack. These are not that different from e.g. the custom attributes for System.ComponentModel - they're just guidelines.

Reflection is about inspecting the program at runtime. While I understand that there are good reasons to separate concerns, and that C# semantics and runtime semantics are different, it totally makes sense to expose static information because people have an expectation that they can access it at runtime. And they already can, but it's error prone. As my examples above show, I'd argue these aren't contrived scenarios but fairly compelling ones, very much in line with expectations for the .NET development experience.

I don't like shipping core functionality like this on NuGet as this is yet another thing that can go out of sync between the various pieces. Also, given our strategy to stronger align C# with the release cycle of the .NET Core platform, this is also against what we want to achieve as a product.

To be clear, my question is not whether we need an API that provides this functionality - but whether such API should be on ParameterInfo and friends. Since this is adding non-virtual methods (not properties), my expectation is that this API is not going to do any caching - so it doesn't really need to be an instance method.

That is a good point -- I should probably mark these APIs as virtual so that other implementations (such as the new metadata based reflection layer) can provide specialized implementations.

We could easily stash this API away in e.g. System.Reflection.TypeExtensions and even make it available on lower versions of NetStandard (doesn't have to be exclusive to .NET Core 3.0).

I'd rather not. We don't want to optimize for out-of-band or downlevel. We want to foremost optimize for what's right for the .NET platform moving forward. I'd argue making them part of the reflection model to encapsulate them in a central way is the right thing to do.

Should we also see a need for this functionality downlevel, we could also ship a bridge pack that defines them as extension methods. But let's not default to frankendesigns just because.

Not that I want to make this my main argument (my main argument is to not pollute the instance member surface area with convenience APIs), but runtime reflection is not the only flavor of reflection

I'd argue that the reverse is true today: the reflection APIs are polluted with a ton of plumbing APIs when the vast majority of the consumers want simple methods to get stuff done. And moving forward I think reasoning about declared nullability will be important to many consumers of reflection.

@roji

For, PropertyInfo, we can have a NullableState on the setter and on the getter depending if we have an attribute.

See dotnet/csharplang#2384 for this at the language level. @terrajobst users could call PropertyInfo.GetGetMethod() and check there, but what would be the result for a call to GetNullableState() directly on a PropertyInfo which has different nullability settings for getter and setter?

Right now, this issue is still open so there is no spec. In general, at the metadata level the property itself has a signature (and thus a return type). Similarly, the C# compiler emits the nullable attribute for the property itself, as well as for all accessors:

```C#
// public string? Name { get; set; }

[Nullable(2)]
public string Name
{
    [return: Nullable(2)]
    get
    {
        return this.<Name>k__BackingField;
    }
    [param: Nullable(2)]
    set
    {
        this.<Name>k__BackingField = value;
    }
}
```

So in other words I don't think exposing this on the property is a bad idea; depending on context it might just not be sufficient for the consumer.

you can't overload based on nullability of reference types (because they are fundamentally the same type at runtime)

That’s an interesting edge case, yes.

But I stand by my assertion — the syntax of string? wasn’t just picked because other languages do that, but also because, by and large, it expresses a sufficiently similar intent to int?. And if someone were to design .NET today, those subtle behavioral differences would have been avoided. They exist by and large for legacy reasons.

(IANA C# language designer, though.)

I'd argue that the reverse is true today: the reflection APIs are polluted with a ton of plumbing APIs when the vast majority of the consumers want simple methods to get stuff done.

Yup. I think higher-level additions to Reflection are welcome.

But I stand by my assertion — the syntax of string? wasn’t just picked because other languages do that, but also because, by and large, it expresses a sufficiently similar intent to int?

I know. I was in the LDM meeting when the feature was discussed and many times when it was actively designed. Originally, we also considered doing string and string! where string meant nullable and string! meant non-nullable. The lack of symmetry with nullable value types was irking us, but the higher order bit was that we all felt nullable would be the wrong default. At the same time, it was also clear from the beginning that we can't actually make string and string! or string? different types. Which is very different from how many other languages have designed it. So yeah, the result is a compromise between what is doable and what introduces the least amount of concepts.

But they aren't specific to C#. In order for the C in CLR to be meaningful, other languages, so they support nullability, should use the same underlying encoding.

Nullability as it was implemented is a tooling concern - the CLR is not involved. The CLR and the metadata format provide a general purpose way to encode metadata, and the tools can have an agreement on what it means. E.g. JetBrains has their own set of attributes that deal with nullability and they're in the same boat as far as the runtime is concerned. So are PostSharp and Entity Framework. These are all things that already exist in the ecosystem.

It would be a different discussion if we gave these new annotations a meaning within the runtime, but as far as the runtime is concerned, the annotations that the C# compiler uses are no different from the JetBrains annotations.

The runtime reflection API surface area should ideally not be opinionated about things that don't have a meaning to the runtime. A way to achieve that is to put the various opinions into extension methods that people can opt into, depending on what they're doing.

The runtime reflection API surface area should ideally not be opinionated about things that don't have a meaning to the runtime.

As @terrajobst said, "Reflection is about inspecting the program at runtime."

Here's MS introducing the feature:

The classes in the System.Reflection namespace, together with System.Type, enable you to obtain information about loaded assemblies and the types defined within them, such as classes, interfaces, and value types.

It's not (just) about what's meaningful to the runtime. It's also about what's meaningful to _you_, the developer, _during_ runtime. This is already the case with MemberInfo.GetCustomAttributes() — those are largely meaningless to the runtime, but may be very meaningful to you (for example, to help with serialization).

I think @MichalStrehovsky isn't wrong; there is a balance to strike. If we put every little language convention into the reflection model, we'll create a mess. Also, not every language feature is relevant at runtime. However, I think we can agree that these things hold:

  1. Nullable reference types will be a popular feature
  2. There are sensible scenarios using the nullable annotations at runtime to make decisions
  3. Getting them right is hard

Given all three things I'd argue that it is a sensible addition to the reflection API surface.

+1 on having the new API recognize nullable value types. It's true the nullable reference and value types are completely different things under the hood, but it seems like a good idea not to surface that distinction where not strictly necessary. Ideally users shouldn't need to care more about this than absolutely necessary (see https://github.com/dotnet/csharplang/issues/1865 for an example issue for making nullable value types behave more like NRTs).

Also +1 on including the nullability interpretation logic in the reflection API as opposed to some extension method in a nuget package. This is where people will be looking for it, and if the representation ever changes it makes sense to couple this to the version of .NET Core.

Right now, this issue is still open so there is no spec. In general, at the metadata level the property itself has a signature (and thus a return type). Similar, the C# compiler emits the nullable attribute for the property itself, as well as for all accessors

Makes sense, although this could get very confusing to users, e.g. what does the property's nullability annotation even mean when it's at odds with the nullability of its getter/setter? But I guess that's a language question rather than a question for this API. It's still probably a good idea to understand where the language design is heading and make sure the API aligns with that.

@MichalStrehovsky

It would be a different discussion if we gave these new annotations a meaning within the runtime, but as far as the runtime is concerned, the annotations that the C# compiler uses are no different from the JetBrains annotations.

I'd expect (or rather hope) that the JetBrains annotations would eventually be replaced by the C# 8.0 ones. There's no real justification for the ecosystem to have multiple ways to say "this should not/cannot contain null", and had nullability been part of the original language, the JetBrains (and other) ways would probably not have existed. This is what justifies the reflection API being opinionated and exposing this specific representation.

The runtime reflection API surface area should ideally not be opinionated about things that don't have a meaning to the runtime. A way to achieve that is to put the various opinions into extension methods that people can opt into, depending on what they're doing

How far do you take that? Take optional values, params, constant fields, properties or events; no meaning to the runtime and only a tooling concern, but we still represent these.

Great point @davkean!

How far do you take that? Take optional values, params, constant fields, properties or events; no meaning to the runtime and only a tooling concern, but we still represent these.

Those are also represented with unique structures at the file format level. If we take those apis away, there's no way to get to them. Nullable annotations are custom attributes like any other. That's my line.

It's very common to look for the CompilerGeneratedAttribute to see if something was compiler generated. Would we put a top level IsCompilerGenerated property on FieldInfo and friends to make it discoverable and optimizable as well? Where do _you_ draw the line?

Where do you draw the line?

I think a key point - and what drove this issue in the first place - is that the current scheme of placing an attribute on every single member is causing significant bloat, and the compiler team is looking into a more elaborate and space-saving scheme (e.g. assembly-level and/or type-level defaults). That would make interpretation much more complicated, hence the need of an API.

Basically, if this were a simple boolean attribute on a member I don't think we'd be having this discussion.

I also think that if we end up going with a context-based interpretation, then some form of caching could be beneficial. For example, if we allow specifying default nullability at the type level for its members (to reduce repetition for its members), that information could be cached on Type to avoid performing constant expensive lookups. This would be another reason to put the methods in the reflection API rather than as extension methods.

@MichalStrehovsky

Those are also represented with unique structures at the file format level. If we take those apis away, there's no way to get to them. Nullable annotations are custom attributes like any other. That's my line.

That's a fair point.

It's very common to look for the CompilerGeneratedAttribute to see if something was compiler generated. Would we put a top level IsCompilerGenerated property on FieldInfo and friends to make it discoverable and optimizable as well? Where do _you_ draw the line?

I don't think it's that common for runtime reflection to ask that question. It's more common in static analysis tools, and AFAIK Roslyn does expose an API for that. If it were a common question to ask at runtime too, then yes, I would consider exposing a property on the reflection APIs as well.

I've outlined my bar above:

  • If it's a popular feature
  • If there are sensible scenarios at runtime to make decisions
  • If reading the attributes can be error prone

During ProjectN we've learned the hard way that reflection is a very popular feature among .NET libraries and applications. I'm somewhat surprised that we're having a conversation about whether there is value in exposing a convenience API for things we expect many developers to want to do. Having these APIs is the very reason why .NET is considered productive by developers.

In general, if the question could be answered by a one liner, I don't think we need a reflection API. But it's really not. Below is the code I wrote. The code currently assumes that the nullable state is encoded on the member directly, which won't be the case once we ship. Also, I'm sure you can find at least one bug in it, too 😉


Implementing GetNullableState()

```C#
namespace System.Reflection
{
public enum NullableState
{
    Unknown,
    NotNull,
    MaybeNull
}

public static class NullableExtensions
{
    public static NullableState GetNullableState(this FieldInfo f)
    {
        return GetTopLevelNullableState(f.GetCustomAttributesData());
    }

    public static NullableState GetNullableState(this PropertyInfo p)
    {
        return GetTopLevelNullableState(p.GetCustomAttributesData());
    }

    public static NullableState GetNullableState(this EventInfo e)
    {
        return GetTopLevelNullableState(e.GetCustomAttributesData());
    }

    public static NullableState GetReturnTypeNullableState(this MethodInfo m)
    {
        return GetTopLevelNullableState(m.ReturnTypeCustomAttributes.GetCustomAttributes(false).OfType<Attribute>());
    }

    public static NullableState GetNullableState(this ParameterInfo p)
    {
        return GetTopLevelNullableState(p.GetCustomAttributesData());
    }

    private static NullableState GetTopLevelNullableState(IEnumerable<CustomAttributeData> customAttributes)
    {
        foreach (var x in customAttributes)
        {
            if (x.AttributeType.FullName == "System.Runtime.CompilerServices.NullableAttribute" && x.ConstructorArguments.Count == 1)
                return TranslateObject(x.ConstructorArguments[0].Value);
        }

        return NullableState.Unknown;
    }

    private static NullableState GetTopLevelNullableState(IEnumerable<Attribute> customAttributes)
    {
        foreach (var x in customAttributes)
        {
            if (x.GetType().FullName == "System.Runtime.CompilerServices.NullableAttribute")
            {
                var field = x.GetType().GetField("NullableFlags");
                if (field != null)
                    return TranslateObject(field.GetValue(x));
            }
        }

        return NullableState.Unknown;
    }

    private static NullableState TranslateObject(object o)
    {
        if (o is byte[] bs && bs.Length > 0)
            return TranslateByte(bs[0]);
        else if (o is byte b)
            return TranslateByte(b);
        else if (o is ReadOnlyCollection<CustomAttributeTypedArgument> args && args.Count > 0)
            return TranslateObject(args[0].Value);
        else
            return NullableState.Unknown;
    }

    private static NullableState TranslateByte(byte singleValue)
    {
        switch (singleValue)
        {
            case 1: return NullableState.NotNull;
            case 2: return NullableState.MaybeNull;
            default: return NullableState.Unknown;
        }
    }
}

}

```

I don't think it's that common for runtime reflection to ask that question.

Here it's used in ASP.NET Core, here in EF, here in Newtonsoft JSON. There are also hits in CoreFX and CoreLib. Maybe mine and your definitions of "common" differ.

During ProjectN we've learned the hard way that reflection is a very popular feature among .NET libraries and applications. I'm somewhat surprised that we're having the conversation if there is value in exposing a convenience API for things we expect many developers wanting to do

To be clear, my concern is only about polluting the member surface area with something that could be in the System.Diagnostics.CodeAnalysis namespace. There are plenty other context-specific convenience methods we could turn into instance methods to make them discoverable (the one-liners that just check for the presence of an attribute), but we didn't, because asking on StackOverflow is discoverable enough. It seems like instead of focusing on that, the arguments here focus on "Michal, but this is a useful concept to expose", and now even "Michal, but reflection is useful". I'm not arguing on those.

Maybe mine and your definitions of "common" differ.

Good point, we should expose IsCompilerGenerated too.

It is not true that reflection does not represent the languages that consume it. The Binder itself tries to implement C#/VB binding rules. GetCustomAttributes takes into account overridden "properties" - a concept that doesn't even exist at runtime, nor is it represented in metadata.

Binder itself tries to implement C#/VB binding rules

I find this to be an egregious layering violation and a .NET 1.0 mistake. The reflection system has no business in doing binding.

I'd like to see reflection be "policy free" as much as reasonable, so inferring "nullability" from NullableAttribute and/or Nullable<> seems like it should be a higher-level API than reflection. Or at least until the implementation and adoption have proven to be stable.

@steveharter

I'd like to see reflection be "policy free" as much as reasonable, so inferring "nullability" from NullableAttribute and/or Nullable<>

The proposal here isn't inference -- it is reading the attribute that the compiler put in the metadata. The point is to get the result correct and agree with the compiler semantics. And we're in a much better position to make that efficient & correct than higher layers.

seems like it should be a higher-level API than reflection.

That layer doesn't exist today. In my book, creating such a layer (meaning a new assembly and a new set of types) has a much higher bar than adding the very targeted methods I'm proposing here. How many features would it take to make such a component necessary and warranted? There isn't enough data to make that case. And I'm asserting that waiting for that data isn't actionable.

Or at least until the implementation and adoption has proven to be stable.

We've put in quite a bit of effort to expose this information in the BCL. That includes the attributes in System.Runtime.CompilerServices and the annotations in the public API surface. We've taken the bet as a platform, and once we've shipped we can't easily remove any of it. I don't see a reason why we'd suddenly draw the line at five additional methods and one enum. We'll just be exposing the platform's state.

I think these should be static state-free decoder methods.

The point is to get the result correct and agree with the compiler semantics.

It is fine for these to be part of the platform, in System.Runtime if we wish, be kept in sync with the compilers, etc.

That layer doesn't exist today

These domain-specific layers do exist today, all over the place. For example, there is bool Marshal.IsTypeVisibleFromCom that reads a bunch of COM related attributes and gives you an answer.

One of the problems not yet mentioned with stuffing everything possible into the reflection object model itself is hard-to-get-right performance characteristics. Should the API implementations provide a fast cache for the result - it means that every user of reflection has to pay, because space for the cache needs to be allocated; or should the API implementations not cache the result and always crack it from scratch - it means that every other place that calls these APIs will end up caching it. There is no good answer.

When these APIs are state-free static decoders, it is pretty obvious what their performance is going to be and that one should better cache the result.
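To make that trade-off concrete, here is a minimal sketch of a caller layering its own cache over a state-free static decoder. All names are invented placeholders (`NullableState`, `CallerCache`, `DecodeTopLevelNullability`), not proposed APIs; the decoder body is a stub standing in for the real attribute-cracking logic.

```C#
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Reflection;

public enum NullableState { Unknown, NotNull, MaybeNull }

public static class CallerCache
{
    static readonly ConcurrentDictionary<ParameterInfo, NullableState> s_cache = new();

    public static int DecodeCalls; // exposed so the demo can show the cache working

    // The caller owns the caching policy; the decoder itself stays state-free.
    public static NullableState GetNullableState(ParameterInfo parameter) =>
        s_cache.GetOrAdd(parameter, DecodeTopLevelNullability);

    static NullableState DecodeTopLevelNullability(ParameterInfo parameter)
    {
        DecodeCalls++;
        // Placeholder: a real decoder would crack NullableAttribute /
        // NullableContextAttribute as discussed elsewhere in this thread.
        return NullableState.Unknown;
    }

    public static void Main()
    {
        ParameterInfo p = typeof(CallerCache).GetMethod(nameof(GetNullableState))!.GetParameters()[0];
        GetNullableState(p);
        GetNullableState(p); // second call is served from the cache
        Console.WriteLine(DecodeCalls); // 1
    }
}
```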

It is fine for these to be part of the platform, in System.Runtime if we wish, be kept in sync with the compilers, etc.

👍

Should the API implementations provide a fast cache for the result - it means that every user of reflection has to pay because space for the cache needs to be allocated; or should the API implementations not cache the result and always crack it from scratch - it means that every other place that calls these APIs will end up caching this?

The idiomatic pattern in .NET for indicating uncached and potentially expensive operations is exposing them as methods, which is why I didn't make them properties.

Should these APIs be called a lot, we may find ourselves in the position where we still want to cache the operation in order to speed up the rest of the stack. However, in that case I don't see how the proposed location makes that harder. The (internal) cache could still live wherever we want. Making them instance methods opens up the path for the results to be stored as fields, but they don't have to be. The cache could still, for example, be a conditional weak table on the side. We could decide to make these non-virtual in order to have more wiggle room.

Should these APIs be called a lot, we may find ourselves in the position where we still want to cache

The performance characteristics should be part of the API design. When we fundamentally change the API performance characteristics, it just causes confusion. If certain callers end up calling these APIs a lot, they should be responsible for inventing their own scheme for caching.

the 80% scenario is just answering the question whether the given return value or parameter can be null as a whole

If this is going to be popular, I am pretty sure somebody will come up with good reasons why they want to answer the fine-grained questions. What would the object model for answering the fine-grained questions be? We should do an exercise to see what that object model would look like, and make sure there is a natural transition from the 80% object model to the 99% object model. Are we going to add N more Nullable-related methods to all reflection *Info types to support the 99% object model?

The performance characteristics should be part of the API design.

Agreed, which is why I made them methods, as I expect them not to be cached.

What would the object model for answering the fine-grained questions be? We should do an exercise to see what that object model would look like, and make sure there is a natural transition from the 80% object model to the 99% object model. Are we going to add N more Nullable-related methods to all reflection *Info types to support the 99% object model?

Also agreed; I have a design in my head. I'll update the issue description with more details. The basic idea would be to introduce a new concept in System.Reflection (e.g. TypeWithAnnotations) that can be created from the *Info classes and that allows you to walk a type signature and expose information that is provided by custom attributes (I don't think it would be an actual System.Type but a class that holds a type and the custom attributes from the original member info). It would have a property that exposes the nullability state. If we were to expose other information (such as dynamic or tuple names), it could just become additional properties on this new type too. This type would also cache the answers, but since the creation of the type itself won't be cached, it wouldn't create any overhead for the core reflection model. But as I said, I don't expect to land this design for 3.0.
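To visualize that idea, here is a purely hypothetical sketch. The type name `TypeWithAnnotations`, the `NullableState` enum, and the flattened-byte representation are illustrative assumptions based on the description above, not a shipped or approved API.

```C#
#nullable enable
using System;
using System.Collections.Generic;

// Demo: IEnumerable<string?> flattens to { 1, 2 } in this sketch.
var ienum = new TypeWithAnnotations(typeof(IEnumerable<string>), new byte[] { 1, 2 });
Console.WriteLine(ienum.NullableState);                        // NotNull
Console.WriteLine(ienum.GetGenericArgument(0).NullableState);  // MaybeNull

public enum NullableState { Unknown = 0, NotNull = 1, MaybeNull = 2 }

// A side object pairing a type with flattened nullability bytes read from the
// member's custom attributes -- deliberately not a System.Type.
public sealed class TypeWithAnnotations
{
    readonly byte[] _flags;  // pre-order flattened NullableAttribute payload
    readonly int _index;     // this node's position in the flattened payload

    public TypeWithAnnotations(Type type, byte[] flags, int index = 0)
    {
        Type = type;
        _flags = flags;
        _index = index;
    }

    public Type Type { get; }

    public NullableState NullableState =>
        _index < _flags.Length ? (NullableState)_flags[_index] : NullableState.Unknown;

    // Walks into the signature, e.g. IEnumerable<string?> -> string?.
    public TypeWithAnnotations GetGenericArgument(int position)
    {
        Type[] args = Type.GetGenericArguments();
        int offset = _index + 1; // first child immediately follows this node
        for (int i = 0; i < position; i++)
            offset += Size(args[i]);
        return new TypeWithAnnotations(args[position], _flags, offset);
    }

    // Number of flattened slots a type occupies (itself plus all children).
    static int Size(Type type)
    {
        int size = 1;
        foreach (Type arg in type.GetGenericArguments())
            size += Size(arg);
        return size;
    }
}
```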

I'd like to see reflection be "policy free" as much as reasonable, so inferring "nullability" from NullableAttribute and/or Nullable<>

The proposal here isn't inference -- it is reading the attribute that the compiler put in the metadata. The point is to get the result correct and agree with the compiler semantics. And we're in a much better position to make that efficient & correct than higher layers.

Does the proposed API attempt to also tie in Nullable<> semantics?

Is MethodBase.GetReturnTypeNullableState() even necessary? It is just a shortcut for MethodInfo.ReturnParameter.GetNullableState().

@steveharter @GrabYourPitchforks are either of you driving this (as owners of area-Reflection)? Do you need help here to complete this by 7/1?

I do not think we should be trying to get these half-baked APIs in at the last minute. It should be fine for the few places that need to have the nullability decoder ring to use a local private copy of it for 3.0.

Video

  • Given that we don't have all the features in-place, we believe this feature doesn't make the cut for 3.0
  • What about the custom attributes, such as NullIfNotNullAttribute?
  • What about generic type parameters (e.g. for tuples)?
  • Namespace choice: should this be in the System.Reflection namespace or should this go into a nested namespace?
  • There is the general question whether MVC or EF should even handle nullable annotations.
  • How do we square making runtime decisions with being able to change annotations?
  • We should consider generalizing the feature to handle dynamic and tuple names.

@steveharter

Does the proposed API attempt to also tie in Nullable<> semantics?

Yes, that was the idea. See section "Unifying nullable-value types and nullable-reference types".

@jkotas

Is MethodBase.GetReturnTypeNullableState() even necessary? It is just a shortcut for MethodInfo.ReturnParameter.GetNullableState().

That's a good point. That didn't occur to me.

Since the nullability annotations are still in flux, and since it's likely that consumers will want to consume this from downlevel frameworks, we should start this work in _corefxlab_ and see where things go from there. That also allows us to iterate rapidly without waiting for a formal review to take place.

Edit: I also don't think _System.Reflection_ is the right place for this because it's not really a runtime or a reflection concern. It's really something meant for compiler consumption.

Edit: I also don't think System.Reflection is the right place for this because it's not really a runtime or a reflection concern. It's really something meant for compiler consumption.

Note that the original need for this arose from packages such as EF and MVC. In that sense it's just another piece of information retrieved about members, next to other things retrieved via reflection.

@GrabYourPitchforks

Since the nullability annotations are still in flux, and since it's likely that consumers will want to consume this from downlevel frameworks, we should start this work in _corefxlab_ and see where things go from there. That also allows us to iterate rapidly without waiting for a formal review to take place.

We should design the API/feature for in-box first and then figure out what downlevel looks like.

Edit: I also don't think _System.Reflection_ is the right place for this because it's not really a runtime or a reflection concern. It's really something meant for compiler consumption.

I disagree. Reflection supports runtime as well as design-time metadata via MetadataLoadContext. The nullable annotations will be an important part of our public API surface, both for the platform as well as the larger ecosystem. While the annotations are in flux the encoding itself is stable at this point and we should centralize the handling for all consumers.

Edit: I also don't think System.Reflection is the right place for this because it's not really a runtime or a reflection concern. It's really something meant for compiler consumption.

Reflection has lots of concepts that aren't runtime concerns; static method inheritance, default values, VB/C#-like binding (params), concepts of overridden properties. All needed for compilers.

A number of these - VB/C#-like binding in particular, which is largely outdated and unusable by now - are design mistakes of the past that we should not repeat.

On the binding side I very much agree, but I see nullable annotations as largely language independent. Yes, only C# has support for it right now, but there is no reason why it can't be used by other languages or static analyzers. For example, R# has used sidecar files for years and lights up warnings in both C# and VB code. They could just read the metadata as well.

there is no reason why it can't be used by other languages or static analyzers.

Agree. The reflection object model is largely unusable for building language compilers or static analyzers, so this API won't solve their problem.

There are a number of C# language features that are in a similar category: named tuple items, readonly, unmanaged constraints, and so on. We need a principle that we are going to use to decide whether a given C# language feature should be exposed in the reflection object model. What should this principle be?

We need a principle that we are going to use to decide whether a given C# language feature should be exposed in the reflection object model. What should this principle be?

To me, it's a combination of "how hard is the decoding" and "how likely is it that someone needs to answer that question using reflection APIs". Today, most tools that read metadata don't use reflection because it was tied to runtime loading. With MetadataLoadContext my hope would be that more tools could use the (familiar) reflection APIs, so I wouldn't want to focus reflection on runtime-only scenarios.

Regardless, in this case we have both satisfied, IMHO: decoding is non-trivial and there are runtime reflection scenarios for serializers, O/R mappers, and model binders. So I'd say that meets the bar.

Most of the listed features and more (NRT, named tuples, XML docs) need to be reflected over by serializers or schema generators (EF, JSON Schema, or OpenAPI). This is why I had to implement this myself (NRT, XML docs): https://github.com/RicoSuter/Namotion.Reflection

Ideally this would be provided by .NET so I don't have to maintain this myself. However, it would be hard to add these APIs to existing types in a nice way because, for example, Type is not context aware and is the same object for every occurrence (is this correct?) - where should the context-specific NRT information be stored? A sample: the ASP.NET Core API explorer only provides the Type of the response - with only this object and no context (i.e. the return parameter or attributes), it's impossible to retrieve the NRT info which is needed for the OpenAPI spec.

I think these advanced reflection APIs should be provided as a NuGet package (not part of the .NET Core/Standard API) so that they can easily be updated/improved/fixed and are not tied to the runtime version (C#/NRT can also be used in a .NET Standard 1.0 or .NET Core 2.1 library).
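The "Type is not context aware" point can be demonstrated directly: the runtime hands out one shared Type object regardless of annotation, so per-usage nullability cannot hang off Type itself. The class `C` and its method `M` below are illustrative.

```C#
#nullable enable
using System;
using System.Reflection;

// The parameter types of a and b are the *same* runtime Type object even
// though one was declared string? and the other string, so the nullability
// of a usage has to come from the parameter's attributes, not from the Type.
ParameterInfo[] ps = typeof(C).GetMethod("M")!.GetParameters();
Console.WriteLine(ReferenceEquals(ps[0].ParameterType, ps[1].ParameterType)); // True

class C
{
    public void M(string? a, string b) { }
}
```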

FYI, I just ran into the non-trivial problem of calculating nullability of generic type parameters in EF Core (https://github.com/dotnet/efcore/issues/20718). ASP.NET currently ignores this case (https://github.com/dotnet/aspnetcore/blob/master/src/Mvc/Mvc.DataAnnotations/src/DataAnnotationsMetadataProvider.cs#L492), just to reiterate the need for this feature.

/cc @rynowak

Due to 5.0 schedule constraints, this is being moved to Future.

Here's a strawman2 proposal (updated from strawman1 proposal)

  • Add extension methods to the existing System.Reflection.Extensions assembly.

    • This avoids runtime knowledge and pay-to-play for consumers.

  • These look up the nullable attributes and return a class that exposes the nullable information and possibly other attribute-based information such as [DoesNotReturn].

    • This is somewhat slow and no extra caching is performed (like similar cases elsewhere). Caching should be done at a higher level, if necessary.

    • First, a check is done to see whether the type is in a nullable context. If it is not, return the "CLR semantics" of NotNull or MaybeNull depending on whether the type is a reference type, value type, or Nullable<T>.

    • The more I look into this, the more complexity I find:

    • The [NullableAttribute] semantics are non-trivial

    • The nullable context attribute can be in several locations

    • It is non-obvious whether the current nullability attributes affect "in" or "out" nullability. For example, the names "AllowNull" and "MaybeNull" don't provide that context. This design tries to make that easier by having both "in" and "out" properties, each with their own enum.

    • We could add additional properties such as "bool AllowsNull {get;}" that can be used for simple scenarios that don't need to differentiate between in and out (especially true for properties; parameters differ more than properties here).

  • The extensions should work for both runtime-based and designtime-based reflection types (via MetadataLoadContext).

    • This means we can't return System.Attribute instances, because MLC can't create attributes, as doing so could execute code.

Sample usage

```C#
PropertyInfo propertyInfo = typeof(TestClass).GetProperty("MyProperty")!;
AttributedInfo attributeInfo = propertyInfo.GetAttributedInfo();
Assert.Equal(NullableInCondition.AllowNull, attributeInfo.NullableIn);
Assert.Equal(NullableOutCondition.MaybeNull, attributeInfo.NullableOut);
```

Proposed API (in System.Reflection namespace):

```diff
+public static class FieldInfoExtensions
{
+    public static AttributedInfo GetAttributedInfo(this FieldInfo fieldInfo);
}

public static class MethodInfoExtensions
{
+    public static AttributedInfo GetAttributedInfo(this MethodInfo method);
}

+public static class ParameterInfoExtensions
{
+    public static AttributedInfo GetAttributedInfo(this ParameterInfo parameter);
}

public static class PropertyInfoExtensions
{
+    public static AttributedInfo GetAttributedInfo(this PropertyInfo property);
}

+public sealed partial class AttributedInfo
+{
+    public bool HasNullableContext { get; }

+    // The two common properties:
+    public NullableInCondition NullableIn { get; }
+    public NullableOutCondition NullableOut { get; }

+    // Possibly a simple helper that ignores "in"/"out" attributes and uses [Nullable] and return type characteristics:
+    // public bool AllowNull { get; }

+    // We can expose non-nullability state as well:
+    public DoesNotReturnCondition DoesNotReturn { get { throw null; } }

+    // Set for [NotNullIfNotNull] and [DoesNotReturnIf]
+    public string[]? MemberReferences { get; }

+    // TODO: expose info for generic parameter types and array element type
+}

+public enum NullableInCondition
+{
+    NotApplicable,
+    AllowNull,
+    DisallowNull
+}

+public enum NullableOutCondition
+{
+    NotApplicable,
+    MaybeNull,
+    MaybeNullWhenFalse,
+    MaybeNullWhenTrue,
+    NotNull,
+    NotNullWhenTrue,
+    NotNullWhenFalse,
+    NotNullIfNotNull
+}

+public enum DoesNotReturnCondition
+{
+    NotApplicable,
+    Returns,
+    DoesNotReturn,
+    DoesNotReturnWhenTrue,
+    DoesNotReturnWhenFalse
+}
```

Prototype at https://github.com/dotnet/runtime/compare/master...steveharter:ReflectionExt

This is somewhat slow and no extra caching is performed (like similar cases elsewhere). Caching should be done at a higher level, if necessary.

For caching to be done effectively outside the API, wouldn't this have to expose [NullableContext] information (which it really shouldn't, since that's a metadata encoding detail)?

For example, say I ask for nullability info on method Foo. Internally, this would potentially walk all the way up to the assembly to find the nearest [NullableContext]. If I then ask for info on method Bar on that same type, that same walk up to the assembly would have to happen again - the user has no way of caching anything.

Given the complexity of nullability metadata, it seems important for NullableContext to be calculated and cached internally, e.g. at the type level. I'm also not sure why this can't be pay-per-play by doing things lazily...
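The walk described above can be sketched as follows. All names are invented for illustration (`NullableContextResolver`, `Annotated`); a real decoder would also consult per-parameter `[Nullable]` blobs first and, per the compiler's encoding rules, module-level context last. The point of the sketch is the internal per-type cache: asking about a second method on the same type reuses the already-computed context.

```C#
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Demo: both parameters of M are annotated uniformly, so the compiler records
// the value in a [NullableContext] on the method or an enclosing type rather
// than on each parameter; the resolver finds it either way.
Console.WriteLine(NullableContextResolver.Resolve(typeof(Annotated).GetMethod("M")!));

public static class NullableContextResolver
{
    const string ContextAttribute =
        "System.Runtime.CompilerServices.NullableContextAttribute";

    static readonly ConcurrentDictionary<Type, byte?> s_typeContext = new();

    // Nearest [NullableContext] value: check the method, then each declaring
    // type outward; per-type answers are cached so the walk runs once per type.
    public static byte? Resolve(MethodInfo method) =>
        Read(method.GetCustomAttributesData()) ?? ResolveForType(method.DeclaringType);

    static byte? ResolveForType(Type? type)
    {
        if (type is null)
            return null;
        return s_typeContext.GetOrAdd(type,
            static t => Read(t.GetCustomAttributesData()) ?? ResolveForType(t.DeclaringType));
    }

    static byte? Read(IList<CustomAttributeData> attributes) =>
        attributes.FirstOrDefault(a => a.AttributeType.FullName == ContextAttribute)
                  ?.ConstructorArguments[0].Value as byte?;
}

public class Annotated
{
    public void M(string? a, string? b) { }
}
```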

The approach I'd originally envisioned here is that the API would track only the annotation state of the types. Essentially: is the type oblivious, annotated, or not annotated? That is the key information that is missing from the reflection API today. The other values listed in the proposal here include attributes which are readily available today.

The other issue is that the attributes are being represented as enum values. It is reasonable to expect that the language will add more attributes in the future, the evidence being that we've done so already and had rejected a few other attributes under "let's see if the need is really there before doing this". Generally, I believe the BCL has shied away from using enums to represent sets of values that can expand in the future.

Cc @vitek-karas @marek-safar for linker

We have form factors (Blazor-WASM) where nullability information is dropped on deployment by default because it's a size concern.

Linker is not aware of these attributes right now - this information comes from an XML embedded into CoreLib. If APIs like this are added, it will likely require hardcoding the knowledge of the attributes and the new APIs into linker.

We may need new efficient and linker-friendly custom attribute querying APIs to build this feature on top of: https://github.com/dotnet/runtime/issues/44319#issuecomment-722588772 . The linker would need to understand these new custom attribute querying APIs, but it would not need to have hardcoded knowledge of every place that queries specific attributes.

Caching could be performed even if this API is part of the System.Reflection.Extensions assembly. There could be a NullableReflectionContext of sorts. It would be a class with query methods. An instance of that class would maintain a cache.

I personally like that the main reflection APIs are not polluted with C# concepts. This has been done in the past and it was, in my opinion, a design mistake. It is not true to the spirit of having a language-independent CLR.

How would properties with [AllowNull] and [DisallowNull] be expressed? I feel like there is different nullability for the setter and getter. Similar question for ref arguments. Also, should the nullable custom attributes just be exposed through IEnumerable<Attribute> NullableAttributes (or similar)? That would be more future-proof than trying to express them with an enum.

Caching could be performed even if this API is part of the System.Reflection.Extensions assembly. There could be a NullableReflectionContext of sorts.

That's certainly possible, but it would require consuming applications to manage the cache instance themselves, making it harder to reuse the information across different parts of the program etc. Re C# concepts, I am not sure to what extent nullability is (at least in theory) supposed to be a C#-only thing - it makes sense for F#/VB to be able to express and consume the same concepts at some point in the future.

To me, it's a combination of "how hard is the decoding" and "how likely is it that someone needs to answer that question using reflection APIs".
-- @terrajobst

@terrajobst How do you map out _what is hard to decode_ and _how likely it is that someone needs to answer that question_? Is there a list of user stories? For example, c# now has language design documents available: https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-7.1/async-main A lot of these have specific motivations for why something was introduced, beyond "totes brill upvotes".

I am curious, because I personally can think of several user stories (and large, widely circulated projects) for various aspects of reflection that are common to applications I write. I _also_ know how to rewrite some of these reflection calls to use more efficient expression trees, etc. But how common is this knowledge? It seems socially unjust and bad for .NET popularity for it to be kept in the heads of people with 18+ years of experience with .NET.

The approach I'd originally envisioned here is that the API would track only the annotation state of the types.

@jaredpar What do you mean by the "annotation state of the types"? I could not tell if you were responding to Immo or Steve's strawman API proposal.

I mostly just want to know if the properties can be nullable, non-nullable or unknowable. I did not really understand what value Steve's proposal brought to the table, other than forcing me to understand the literal difference between AllowNullAttribute and MaybeNullAttribute. If I were to leave .NET for 5 years and come back and read this API on docs.microsoft.com, I would just go back to Google and search StackOverflow for "how to use reflection to find nullable reference types" and I would find this post, briefly lament that I am not using the "canonical API" (and perhaps think about how many times code will call the reflection helper function), and then probably go eat a long lunch with the time I saved.

@krwq

How would properties with [AllowNull] and [DisallowNull] be expressed? I feel like there is different nullability for setter and getter

I've updated the proposal (strawman2) to have separate "in" and "out" state, each with a separate enum. For a property getter, the "out" enum is appropriate. For a setter, the "in" enum is appropriate.

Also, should the nullable custom attributes just be exposed through IEnumerable<Attribute> NullableAttributes (or similar)? That would be more future-proof than trying to express them with an enum.

Since the API is also intended to work with MetadataLoadContext, System.Attribute can't be returned: MLC doesn't support attribute instantiation, because creating an attribute executes code (the attribute's constructor). We could return CustomAttributeData, but that would be hard to use.

@jzabroski

I mostly just want to know if the properties can be nullable, non-nullable or unknowable.

Are your scenarios related to run-time or to design/build/compile-time? If there are real scenarios for run-time, then we have to be concerned about perf, plus any tooling that may strip out these attributes for run-time optimizations of assembly size.

If we do not have run-time scenarios, I wonder if using Roslyn makes more sense. The more I look at these APIs, the more complexity I am finding. Also, the metadata produced by C# has changed over releases, which is another factor in producing a reliable API.

It used to be simpler when there was just "?", which caused the compiler to add [Nullable]. This applied to the whole property and affected both the getter and the setter. Now, with the new nullability attributes, there is more complexity, including the ability for the getter and setter to have different nullability.

Here's an example of different ways to specify nullability on properties:

```C#
#nullable enable

using System;
using System.Diagnostics.CodeAnalysis;

namespace ConsoleApp87
{
    public class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello World!");

            var obj = new MyClass();

            object? nullableValue;
            object nonNullableValue;

            nullableValue = obj.MyProperty0; // ok
            nonNullableValue = obj.MyProperty0; // warning CS8600: Converting null literal or possible null value to non-nullable type.
            obj.MyProperty0 = nullableValue; // ok
            obj.MyProperty0 = nonNullableValue; // ok
            obj.MyProperty0 = null; // ok
            obj.MyProperty0 = 1; // ok

            nullableValue = obj.MyProperty1; // ok
            nonNullableValue = obj.MyProperty1; // warning CS8600: Converting null literal or possible null value to non-nullable type.
            obj.MyProperty1 = nullableValue; // warning CS8601: Possible null reference assignment.
            obj.MyProperty1 = nonNullableValue; // warning CS8601: Possible null reference assignment.
            obj.MyProperty1 = null; // warning CS8625: Cannot convert null literal to non-nullable reference type.
            obj.MyProperty1 = 1; // ok

            nullableValue = obj.MyProperty2; // ok
            nonNullableValue = obj.MyProperty2; // ok
            obj.MyProperty2 = nullableValue; // ok
            obj.MyProperty2 = nonNullableValue; // ok
            obj.MyProperty2 = null; // ok
            obj.MyProperty2 = 1; // ok

            nullableValue = obj.MyProperty3; // ok
            nonNullableValue = obj.MyProperty3; // ok
            obj.MyProperty3 = nullableValue; // ok
            obj.MyProperty3 = nonNullableValue; // ok
            obj.MyProperty3 = null; // ok
            obj.MyProperty3 = 1; // ok

            nullableValue = obj.MyProperty4; // ok
            nonNullableValue = obj.MyProperty4; // ok
            obj.MyProperty4 = nullableValue; // ok
            obj.MyProperty4 = nonNullableValue; // ok
            obj.MyProperty4 = null; // warning CS8625: Cannot convert null literal to non-nullable reference type.
            obj.MyProperty4 = 1; // ok
        }
    }

    class MyClass
    {
        public object? MyProperty0 { get; set; } // property has [Nullable]
        [DisallowNull] public object? MyProperty1 { get; set; } // setter has [DisallowNull]; property has [Nullable]
        [AllowNull] public object MyProperty2 { get; set; } // setter has [AllowNull]
        [AllowNull, NotNull] public object MyProperty3 { get; set; } // getter has [NotNull]; setter has [AllowNull]
        public object MyProperty4 { get; set; } = 1; // no attributes
    }
}
```