This is the API proposal and feature set for object (de)serialization. It covers the main API entry point (JsonSerializer
), the options (JsonSerializerOptions
) and the ability to ignore properties during (de)serialization.
_Review process update: due to the size of this issue and since new API issues are being added for new features, the overview and forward-looking information has been moved to https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/docs/SerializerProgrammingModel.md. Future API additions will have their own independent API issue created instead of re-using this issue._
The reviewed portion of the API is in corefx master and Preview 4. It consists of the JsonSerializer
and the JsonSerializerOptions
class and provides no extensibility features.
There is a previous prototype at CoreFxLab in the package System.Text.JsonLab.Serialization
. It contains non-reviewed APIs including extensibility features like using attributes to define custom value converters, property name policies (like camel-casing), etc. It is build against .NET Core 3.0 Preview 2.
The last status of the API review is shown below.
This static class is the main entry point.
Let's start with coding examples before the formal API is provided.
Using a simple POCO class:
public class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
public DateTime? BirthDay { get; set; }
}
To deserialize JSON bytes into a POCO instance:
ReadOnlySpan<byte> utf8= ...
Person person = JsonSerializer.Parse<Person>(utf8);
To serialize an object to JSON bytes:
Person person = ...
byte[] utf8 = JsonSerializer.ToBytes(person);
The string-based Parse()
and ToString()
are convenience methods for strings, but slower than using the <byte>
flavors because UTF8 must be converted tofrom UTF16.
namespace System.Text.Json.Serialization
{
public static class JsonSerializer
{
public static object Parse(ReadOnlySpan<byte> utf8Json, Type returnType, JsonSerializerOptions options = null);
public static TValue Parse<TValue>(ReadOnlySpan<byte> utf8Json, JsonSerializerOptions options = null);
public static object Parse(string json, Type returnType, JsonSerializerOptions options = null);
public static TValue Parse<TValue>(string json, JsonSerializerOptions options = null);
public static ValueTask<object> ReadAsync(Stream utf8Json, Type returnType, JsonSerializerOptions options = null, CancellationToken cancellationToken = default(CancellationToken));
public static ValueTask<TValue> ReadAsync<TValue>(Stream utf8Json, JsonSerializerOptions options = null, CancellationToken cancellationToken = default(CancellationToken));
public static byte[] ToBytes(object value, Type type, JsonSerializerOptions options = null);
public static byte[] ToBytes<TValue>(TValue value, JsonSerializerOptions options = null);
public static string ToString(object value, Type type, JsonSerializerOptions options = null);
public static string ToString<TValue>(TValue value, JsonSerializerOptions options = null);
public static Task WriteAsync(object value, Type type, Stream utf8Json, JsonSerializerOptions options = null, CancellationToken cancellationToken = default(CancellationToken));
public static Task WriteAsync<TValue>(TValue value, Stream utf8Json, JsonSerializerOptions options = null, CancellationToken cancellationToken = default(CancellationToken));
}
}
This class contains the options that are used during (de)serialization.
If an instance of JsonSerializerOptions
is not specified when calling readwrite then a default instance is used which is immutable and private. Having a globalstatic instance is not a viable feature because of unintended side effects when more than one area of code changes the same settings. Having a instance specified per threadcontext mechanism is possible, but will only be added pending feedback. It is expected that ASP.NET and other consumers that have non-default settings maintain their own global, thread or stack variable and pass that in on every call. ASP.NET and others may also want to read a .config file at startup in order to initialize the options instance.
An instance of this class and exposed objects will be immutable once (de)serialization has occurred. This allows the instance to be shared globally with the same settings without the worry of side effects. The immutability is also desired with a future code-generation feature. Due to the immutable nature and fine-grain control over options through Attributes, it is expected that the instance is shared across users and applications. We may provide a Clone() method pending feedback.
For performance, when a JsonSerializerOptions
instance is used, it should be cached or re-used especially when run-time attributes are added because when that occurs, caches are held by the instance instead of being global.
```cs
namespace System.Text.Json.Serialization
{
///
/// Provides options to be used with
///
public class JsonSerializerOptions
{
public JsonSerializerOptions();
/// <summary>
/// Defines whether an extra comma at the end of a list of JSON values in an object or array
/// are allowed (and ignored) within the JSON payload being read.
/// By default, it's set to false, and the reader will throw a <exception cref="JsonReaderException"/> if it encounters a trailing comma.
/// </summary>
/// <exception cref="InvalidOperationException">
/// Thrown if this property is set after serialization or deserialization has occurred.
/// </exception>
public bool AllowTrailingCommas {get; set;}
/// <summary>
/// The default buffer size in bytes used when creating temporary buffers.
/// </summary>
/// <remarks>The default size is 16K.</remarks>
/// <exception cref="System.ArgumentException">Thrown when the buffer size is less than 1.</exception>
/// <exception cref="InvalidOperationException">
/// Thrown if this property is set after serialization or deserialization has occurred.
/// </exception>
public int DefaultBufferSize { get; set; }
/// <summary>
/// Determines whether null values are ignored during serialization and deserialization.
/// The default value is false.
/// </summary>
/// <exception cref="InvalidOperationException">
/// Thrown if this property is set after serialization or deserialization has occurred.
/// </exception>
public bool IgnoreNullValues { get; set; }
/// <summary>
/// Determines whether readonly properties are ignored during serialization and deserialization.
/// A property is readonly if it contains a public getter but not a public setter.
/// The default value is false.
/// </summary>
/// <exception cref="InvalidOperationException">
/// Thrown if this property is set after serialization or deserialization has occurred.
/// </exception>
public bool IgnoreReadOnlyProperties { get; set; }
/// <summary>
/// Gets or sets the maximum depth allowed when reading or writing JSON, with the default (i.e. 0) indicating a max depth of 64.
/// Reading past this depth will throw a <exception cref="JsonReaderException"/>.
/// </summary>
/// <exception cref="InvalidOperationException">
/// Thrown if this property is set after serialization or deserialization has occurred.
/// </exception>
public int MaxDepth { get; set; }
/// <summary>
/// Defines how the <see cref="Utf8JsonReader"/> should handle comments when reading through the JSON.
/// By default the reader will throw a <exception cref="JsonReaderException"/> if it encounters a comment.
/// </summary>
/// <exception cref="InvalidOperationException">
/// Thrown if this property is set after serialization or deserialization has occurred.
/// </exception>
public JsonCommentHandling ReadCommentHandling { get; set; }
/// <summary>
/// Defines whether the <see cref="Utf8JsonWriter"/> should pretty print the JSON which includes:
/// indenting nested JSON tokens, adding new lines, and adding white space between property names and values.
/// By default, the JSON is written without any extra white space.
/// </summary>
/// <exception cref="InvalidOperationException">
/// Thrown if this property is set after serialization or deserialization has occurred.
/// </exception>
public bool WriteIndented { get; set; }
// several more options here for other features not covered here
}
/// <summary>
/// The base class of serialization attributes.
/// </summary>
public abstract class JsonAttribute : Attribute
{
}
/// <summary>
/// Prevents a property from being serialized or deserialized.
/// </summary>
[AttributeUsage(AttributeTargets.Property, AllowMultiple = false)]
public sealed class JsonIgnoreAttribute : JsonAttribute
{
public JsonIgnoreAttribute();
}
}
Thank you for describing your API and performance concepts.
API
JsonConverterSettings
, i.e. NoNulls, Int/String Enums etc?Performance
No objects created on the heap during (de)serialization after warm-up, with the exception of the POCO class itself and related properties during serialization.
This probably won't be true if you want to support things like ReadOnlyCollection, I probably would simply say "related properties and types".
Optimized hashtable lookup for property names using byte[] as the key.
This will be slow, I recommend talking to @rynowak about his work in ASP.NET Core Routing and the IL Matcher (the code is pretty much what you need, technically matching a route and matching a property name is quite similar), this is by far the fastest approach, in both the Routing and all the experiments I did for SpanJson https://github.com/Tornhoof/SpanJson/wiki/Technology-and-Internals#3-comparison-against-integers (which is pretty much identical to the approach in Routing).
I expect that you could probably reuse 75%+ of the Routing code (an estimation based on how long it took to port my Expression Tree Code from SpanJson to do the Routing API, which was trivial).
Where鈥檚 the async API?
Where鈥檚 the async API?
In what scenario would you require async (de)serialize APIs? Can you please provide a sample usage?
I'm going to be a bit obtuse here. If we're not making these APIs async then we need to have a long discussion again about ASP.NET Core usage of these APIs.
Input:
Output:
Just to throw out some more examples here's the HttpClient JSON formatter that's used today in most applications:
To mitigate sync reading, today we buffer the entire Stream into memory or disk (depending on the size):
Which is unfortunate and we'd really love to avoid that. The plan was to have to get that benefit from the new JSON reader/writer/serializer. We don't have a good solution for output today. You end up doing synchronous IO once you hit a buffer threshold which ends up doing sync over async and leans towards causing thread pool starvation.
See https://github.com/aspnet/AspNetCore/issues/6397 for more details on output. ASP.NET Core is going to disable all sync IO on the request and response by default in 3.0.
For the ASP.NET Core formatter you additionally need to support a bag of json properties, which is either serialized into the parent object or deserialized into the bag for RFC 7807 Error Details. See https://github.com/aspnet/Mvc/pull/8529 for more details, that one uses JSON.NET's https://www.newtonsoft.com/json/help/html/T_Newtonsoft_Json_JsonExtensionDataAttribute.htm
cc @KrzysztofCwalina, @jkotas, @glennc, @terrajobst as an FYI (note: this is still WIP)
My feedback in a rough stream-of-consciousness format.
On JsonConverter, we will need the ability to specify the type dynamically as an argument - in addition to the generic overloads. Without this, it will really cause a lot of folks to use a workaround to call the generic version with MakeGenericMethod
. It's OK to box in this case.
JsonConverterSettings
seems like it has a mashup of settings that generally global to an operation, and don't pertain to specific object or type (indent size for example). Is it expected that I can specify a global indent size, but then override the indent size for a specific type?
I'm really intrigued by how JsonConverterSettings
tries to maintain a loose coupling with the specific types of settings/options we provide support for. This is similar to a number of things we do in MVC and it scales horizontally pretty nicely.
I think it would be really neat to see a recipe guide or some side-by-side examples: "this is how you'd set the json property name with an attribute vs with code".
Another natural thing that users will want from JsonConverterSettings
is a copy constructor and a way to merge them.
The JsonClassMaterializer
seems like a god-object. This is where you control all object creation and property get/set operations for all objects and all types. If the intention is to use these as a replacement for JSON.Net's JsonConverter
then I think it's far off. This seems like something that would be intentionally hard to replace. Maybe that's OK depending on what other things we do.
Async is a higher-level bugaboo (I think that's a SFW replacement for what I was thinking 馃槅). I still think C# and .NET do async better than any other language out there, but we're still struggling to pull together an end-to-end here for the web that actually avoids synchronous I/O at all levels.
From my perspective, I'm still excited to see all of the work that's here. I think we're solving a very critical problem and these are all steps in the right direction.
@Tornhoof
I'd like to see non-generic FromJson methods, e.g. for the ASP.NET Core TextFormatter classes, assuming this will not change that much
Yes that makes sense. The API is still being defined, and this issue wasn't meant to be complete at this point.
What about global settings in JsonConverterSettings, i.e. NoNulls, Int/String Enums etc?
Currently those simple settings are in JsonPropertyAttribute which can be "added" at run-time. If too verbose we may just place those right on JsonConverterSettings.
Performance
No objects created on the heap during (de)serialization after warm-up, with the exception of the POCO class itself and related properties during serialization.
This probably won't be true if you want to support things like ReadOnlyCollection, I probably would simply say "related properties and types".
Yes, of course.
Optimized hashtable lookup for property names using byte[] as the key.
This will be slow, I recommend talking to @rynowak about his work in ASP.NET Core Routing and the IL Matcher (the code is pretty much what you need, technically matching a route and matching a property name is quite similar), this is by far the fastest approach, in both the Routing and all the experiments I did for SpanJson https://github.com/Tornhoof/SpanJson/wiki/Technology-and-Internals#3-comparison-against-integers (which is pretty much identical to the approach in Routing).
Thanks. I updated the description and implementation to use an interger-based array which is looking very promising at this point in time.
@rynowak
On JsonConverter, we will need the ability to specify the type dynamically as an argument - in addition to the generic overloads. Without this, it will really cause a lot of folks to use a workaround to call the generic version with MakeGenericMethod. It's OK to box in this case.
Yes that will be added. Thanks
JsonConverterSettings seems like it has a mashup of settings that generally global to an operation, and don't pertain to specific object or type (indent size for example). Is it expected that I can specify a global indent size, but then override the indent size for a specific type
As it is now, JsonConverterSettings is fairly clean and only contains settings that would need to be specified at run-time like MaxDepth and DefaultBufferSize (I'll update the API doc soon with new members). The mashup is JsonPropertyAttribute for now.
I'm really intrigued by how JsonConverterSettings tries to maintain a loose coupling with the specific types of settings/options we provide support for. This is similar to a number of things we do in MVC and it scales horizontally pretty nicely.
Yes there is some loose coupling with the extension points like the property casing where you can provide the implementation. Other simple settings will not really be extensible in that way but the values can be specified via attributes at the assembly, class and property level plus run-time overrides to each of these -- i.e. the values for these simple settings can be changed in many flexible ways, but not necessarily the code that uses them.
I think it would be really neat to see a recipe guide or some side-by-side examples: "this is how you'd set the json property name with an attribute vs with code".
Yes I will do this in the next update.
The JsonClassMaterializer seems like a god-object. This is where you control all object creation and property get/set operations for all objects and all types. If the intention is to use these as a replacement for JSON.Net's JsonConverter then I think it's far off. This seems like something that would be intentionally hard to replace. Maybe that's OK depending on what other things we do.
JsonClassMaterializer is just an enum. The implementation is internal. We may not even need the enum if we decide the default is ok. Currently the default will attempt to use the most efficient mechanism (currently IL gen for get,set,ctor) but if IL gen is not supported then use standard reflection. However, the enum is there currently for someone to change the default if for example they have a single object to deserialize and don't want any additional memory overhead (and initial perf hit) of IL so they want to use standard reflection instead.
@davidfowl
I'm going to be a bit obtuse here. If we're not making these APIs async then we need to have a long discussion again about ASP.NET Core usage of these APIs.
I added the async apis along with a brief description. Currently they only support a Stream, more if necessary. I have a prototype that verifies the feasibility.
Some quick feedback:
@steveharter If you care about integer tricks, you can optimize the json member writing for utf8 with the same trick. Assuming you will bake precalculated byte-arrays for the property names into the IL, you can use the same concept as for deserialization. Instead of copying the precalculated buffer to the output, just write appropriate integer values to the output.
Instead of writing Encoding.UTF8.GetBytes("\"message\":")
, you could also write
jsonWriter.WriteUtf8Verbatim(7306916068917079330UL /*"message*/, 14882 /*":*/);
Atleast in my SpanJson tests that was the last main difference in serialization speed between UTF16 and UTF8 as, atleast for expression trees, there is a large difference if I have
Expression.Constant("\"message:\"");
which is fast for utf16 or Expression.Constant(Encoding.UTF8.GetBytes("\"message\":"))
for UTF8 which was ~20% slower than the UTF16 version.
In SpanJson properties up to ~30 chars length are written via integers, for longer properties the byte[] copy (incl. expression tree) overhead is pretty much the same speed or faster
This might not be necessary if you can leverage those shiny new static byte arrays with public
ReadOnlySpan<byte> property => new byte[]{0xA,0xB,0xC}
, but it might be worth a try.
That's more or less the last larger performance trick in SpanJson as now the UTF8 and UTF16 serialization speeds are the same, still deserialization of UTF8 is slower, but that's mostly due to the utf8 to utf16 overhead for strings., especially since @stephentoub did all those nice TryFormat
``TryParse`` optimizations for both UTF8 and UTF16.
An additional thing we would need to support Azure SDK: version resilient serialization, including round-tripping.
We need to be able to add a property bag field (dictionary) to deserialized types. If deserialized payload has a property that does not exist in the type, the property would be saved in the dictionary. The serializer would serialize not only actual properties of the type, but also the dictionary (so all payload round trips).
cc: @SchaabS
@KrzysztofCwalina @Tornhoof's comment from here - https://github.com/dotnet/corefx/issues/34372#issuecomment-451928379 - goes in to this. JSON.NET supports it using ExtensionDataAttribute
, the data contract serializers support it using IExtensibleDataObject
.
I spent a little bit of time plugging in the deserializer in to MVC to see what comes out of it. This is based on version 0.1.1-e190205-1 of the package.
C#
var settings = new JsonConverterSettings();
settings.AddAttribute(modelType, new JsonCamelCasingConverterAttribute());
This works for the immediate properties of modelType
, but not sub-properties (e.g. Person.PersonAddress.StreetName
).
While we're here, would it make sense to have the method name indicate this was a policy being applied? AddAttribute
really doesn't convey that information.
JsonConvert
throws any time it finds a JSON value that does not have .NET property to serialize to. It's a fairly common scenario to want to bind a subset of properties. If this is a feature that needs to ship, could it sit behind a flag?
Is deserialization meant to perform case sensitive property name matches? This becomes problematic with MVC since the server is configured to do camel casing, while HttpClient.PostAsJsonAsync
defaults to pascal case. Case insensitive matches would circumvent this issue.
We spoke about this in the offline sync up, but it would be nice if FromJsonAsync
supported PipeReader
. In the most common case, servers in ASP.NET Core would default to using PipeReader
with Stream
wrappers for legacy \ convenience reasons. Using the former could possibly avoid some inefficiences.
Optionally including the path inJsonReaderException
errors would be very useful. When working with large JSON payloads, narrowing down what failed can be very useful in diagnosing the error. Knowing there's an error in (Line 1, BytePosition: 864), is not always useful. MVC uses Json.NET's path information to produce an error object pointing to individual things that went wrong. Although it's noteworthy in that with JSON.NET, deserialization also performs some validation, so it is more useful there.
Json.NET supports Javascript literals which makes both { Id: 10 }
and { 'Id': 10 }
legal inputs. While I can see this tripping up MVC users hand crafting json payloads, it matches the spec and browser behavior (JSON.parse
). If there are plans to support single quotes, perhaps you could also consider the other format.
Deserializing some types don't work. For instance, Dictionary<,>
didn't work. Some of the more exotic types such as byte?
didn't work. The converter spat out todo errors, so I'm guessing this is WIP.
There's a mix of InvalidOperationException and JsonReaderException thrown by the converter for the same kinds of exceptions. All parser \ formatting exceptions should uniformly be JsonReaderException
.
@pranavkm
Thanks for the feedback!
The attributes can be specified at an Assembly level for all types in a given Assembly. By using design-time attributes like this it helps to isolate the settings for a particular consumer's types assuming they belong to the same Assembly. I can revisit this andor make it easier to specify "globally" per the settings class if we want to do that.
settings.AddAttribute(modelType.Assembly, new JsonCamelCasingConverterAttribute());
Is deserialization meant to perform case sensitive property name matches
Not by default (current plan). I relialize this is different than Json.Net. However this helps with performance slightly and is the "right" json-spec thing to do. I haven't added the case-insensitive switch yet.
We spoke about this in the offline sync up, but it would be nice if FromJsonAsync supported PipeReader
Yes that is the current plan. I will be working on it this week.
Json.NET supports Javascript literals which makes both { Id: 10 } and { 'Id': 10 } legal inputs.
I'll look into this. @ahsonkhan has this been considered?
Deserializing some types don't work. For instance, Dictionary<,> didn't work.
Dictionary is not implemented yet. The next iteration (and current PR in corefxlab) support the Nullable
There's a mix of InvalidOperationException and JsonReaderException thrown by the converter for the same kinds of exceptions
Yes I plan on having the exceptions cleaned up and have the information regarding line number etc. The internal reader throws FormatException as well and I'll switch to using the TryParse* methods instead to avoid that.
has this been considered?
Yes, and given single quote strings are not JSON RFC compliant, we decided against supporting it at the lowest layer. Adding too many of these switches/options results in perf costs for the most common cases of using the Utf8JsonReader
, where people have valid, spec-complaint JSON payloads. If there is a strong requirement for this, especially if its affecting a lot more customers, we can reconsider adding an option for it at the reader layer.
That said, as an alternative solution, we should evaluate if this is something that occurs frequently at the higher layer (i.e. Deserializer
) and consider only supporting it there.
Optionally including the path in
JsonReaderException
errors would be very useful. When working with large JSON payloads, narrowing down what failed can be very useful in diagnosing the error. Knowing there's an error in (Line 1, BytePosition: 864), is not always useful.
Unfortunately, this has performance implications since it requires potentially storing the path in some re-sizable data structure which ends up allocating. We have made an effort to keep the reader non-allocating with emphasis on performance which made providing Path
info difficult.
Similarly I've tried out the deserializer on the SignalR side, version 0.1.1-e190201-1.
One of the biggest issues is that we use the Utf8JsonReader
up until we find an array of objects that we want to parse. Then we grab each object individually and pass them into the converter. The problem is that in order to get a usable chunk of data into the converter from the reader, you need to do a lot of manual work.
For example to serialize a string like type (String, DateTime, etc.):
var bytes = reader.HasValueSequence ? reader.ValueSequence.ToArray() : reader.ValueSpan;
if (reader.TokenType == JsonTokenType.String)
{
var b = new byte[bytes.Length + 2];
b[0] = 34; // "
bytes.CopyTo(b.AsSpan().Slice(1, bytes.Length));
b[bytes.Length + 1] = 34; // "
bytes = b.AsSpan();
}
var obj = Serialization.JsonConverter.FromJson(bytes, type);
(Yes I know this is unoptimal, but ignore that)
And to deserialize an array you'd need to do a similar thing and add the brackets around the payload.
Another thing is that the DateTime
parsing is far too strict. When testing I had to write my own deserialization for DateTime
so I wasn't even using the converter anymore. I know there is https://github.com/dotnet/corefx/issues/34576 for the writer doing "smallest output that round trips", and https://github.com/dotnet/corefx/issues/34690 for the reader getting DateTime
directly, so it will probably be fixed around the same time as those.
Because Json properties can be in any order we sometimes need to store a "JArray" like object to then be used later in the deserialzer. For now I store the BytesConsumed
from the reader, then run Skip()
and store the BytesConsumed
afterwards and use both of those later on the original ValueSequence
to get the bytes needed to pass into the deserializer. Some kind of first class feature for this would be amazing.
Looking to the future where we'll be using the Utf8JsonWriter
alongside the serializer, there needs to be a way of writing to the writer either from the serializer, or get the json out of the serializer and write raw bytes to the writer. I know @ahsonkhan has proposed some APIs in that area for exposing writing raw bytes to the writer. And was recently working on a change for writing JsonElement to the writer.
On a brigher note, when I did some microbenchmarks with the Utf8JsonReader
+ JsonConverter
they turned out ~77% faster than using the low level types of Newtonsoft.Json.
cc @ahsonkhan I probably missed some feedback from when I was talking with you.
The attributes can be specified at an Assembly level for all types in a given Assembly.
That wouldn't necessarily work for MVC since complex sub-properties could come multiple different assemblies.
I can revisit this andor make it easier to specify "globally" per the settings class if we want to do that.
That sounds perfect.
@pranavkm
That wouldn't necessarily work for MVC since complex sub-properties could come multiple different assemblies.
I see three way to add global settings:
1) Duplicate the properties from JsonPropertyAttribute (and perhaps others) to JsonConverterOptions
.
2) Support a null
for the first parameter in AddAttribute()
to represent "all" assemblies. Somewhat unconventional.
3) Add a property to JsonConverterOptions
to return a JsonPropertyAttribute
. Somewhat unconventional.
Basically options 2 and 3 prevent duplicating 10+ eventual properties from JsonPropertyAttribute.
However if there is a need to use JsonCamelCasingConverterAttribute or having a different DateTime converter, etc. then option 2 looks more appealing. Even if we do option 2 we could still do option 1 for JsonPropertyAttribute for better discoverability of common options.
Currently the API here is taking option 1 but currently just for JsonPropertyAttribute, not for camel-casing and others (pending API discussion).
Is there really a case to be made that configuring a setting per-assembly is more common that configuring a setting globally?
Put another way, I think there's a really strong case to be made for configuring a setting per-type, per-property and globally, and a weak case to be made for configuring settings per-assembly. Do you have use-cases in mind for that?
@rynowak
configuring a setting per-assembly... Do you have use-cases in mind for that?
Here you go:
1) My organization requires a couple hundred POCO Types to be used internally.
2) Various developers add POCO Types over time.
3) Various teams use these POCO Types in different settings so they want to use design-time attributes for all possible settings to avoid mismatches in applying the same run-time settings.
4) The organization requires all POCO types to default to camel-casing by default, and require the use of a custom DateTime converter for interop with other code. Since it is easy to forget to apply the attributes to evey POCO type (or property) the attribute can be applied to the assembly that contains the POCO types.
That seems like you'd just apply the defaults globally though not per-assembly...
@BrennanConroy
Because Json properties can be in any order we sometimes need to store a "JArray" like object to then be used later in the deserialzer... Some kind of first class feature for this would be amazing.
Is the feature you are asking for controlling the deserialization order of properties? And secondary I suppose specify the serialization order although that's not important.
That seems like you'd just apply the defaults globally though not per-assembly...
Perhaps if you don't use types from other assemblies that have their own contractschema.
However, currently there is not a way to specify design-time attributes globally (across assemblies). You could use the run-time AddAttribute(null, myattribute) however if we decide that will work.
Do you have something else in mind (config file support maybe)?
@steveharter We need a way of storing a json array for later use in the deserializer.
For example, the following two json payloads are equivalent and the target value is used to determine what types are in the arguments array, however, when using the forward reading only Utf8JsonReader
we don't know the target until later, so we need to store the array until we have the target.
{ "target": "name", "arguments": [ 1, "example", 2.3 ] }
{ "arguments": [ 1, "example", 2.3 ], "target": "name" }
With Newtonsoft.Json we can do:
c#
argumentsToken = JArray.Load(reader);
// ... later
for (var i = 0; i < length; i++)
{
argumentsToken[i].ToObject(type, serializer);
}
However, currently there is not a way to specify design-time attributes globally (across assemblies). You could use the run-time AddAttribute(null, myattribute) however if we decide that will work.
Right, I'm questioning why we need the ability to specify settings per-assembly. Existing JSON serializers don't have it and I've never see a user ask for it.
Specifying settings for the application as a whole (regardless of what assembly the data types live in) is exceedingly common for a web project.
Notes from today are here.
IMO, the deserialization calls should implement 'TryRead'. Since the json attempted to be deserialized typically comes from external sources, serialization exceptions are expected to be a common occurrence. If we place deserialization at the edge of a protocol, then the try/catch pattern will cause a noticeable amount of performance (time) to serve the request due to exception catching.
The APIs naming is so long and not clear
System.Text.Json.Serialization.JsonSerializer.Read<TValue>(ReadOnlySpan<byte> utf8Json);
System.Text.Json.Serialization.JsonSerializer.ReadString<TValue>(string json);
System.Text.Json.Serialization.JsonSerializer.Write<TValue>(TValue value);
System.Text.Json.Serialization.JsonSerializer.WriteString<TValue>(TValue value);
System.Json
, that's would make it more symmetrical with System.Xml
. I know that there is already some APIs under System.Json
but this old APIs is rarely used and kind of obsolete. https://github.com/dotnet/corefx/issues/33552#issuecomment-439730860So, I would recommend more simple naming that looks like this:
namespace System.Json
{
public static class JsonConvert
{
public static TValue Parse<TValue>(ReadOnlySpan<byte> utf8Json, JsonSerializerOptions options = null);
public static TValue Parse<TValue>(string json, JsonSerializerOptions options = null);
public static byte[] ToBytes<TValue>(TValue value, JsonSerializerOptions options = null);
public static string ToString<TValue>(TValue value, JsonSerializerOptions options = null);
...
}
}
Also would be more symmetrical with System.Convert
and Newtonsoft.Json.JsonConvert
@jtillman
IMO, the deserialization calls should implement 'TryRead`
A simple bool TryRead(...)
won't include error information. For the read case, the exception will include a message, line number, byte offset and property path. Is this important for your scenario, or is a bool sufficient? We could also provide an overload that takes an out
parameter for the error information.
@Waleed-KH thanks for your feedback.
The APIs naming is so long and not clear
There was an update based on an API review a few hours before you commented - the "ReadWrite" is now only for the async streampipe APIs and we use "ToXXXParse" terminology for the rest.
It would be more natural if these APIs under System.Json, that's would make it more symmetrical with System.Xml
@ahsonkhan can you comment on the history of using System.Text.Json
instead of System.Json
-- did we not use System.Json
because it already contained olderexisting APIs?
@steveharter The information provided by the exception would be beneficial, but throwing an exception on a hot code path is undesirable. Ideally, I would have a single root call that packages error information as a part of the return value.
public class JsonConverterResult<T>
{
public T Value { get; }
public JsonError? Error { get; }
public bool HasError => Error.HasValue;
}
public static class JsonConvert
{
public static JsonConverterResult<T> ToObject<T>(...);
public static T ToObject<T>(...)
{
JsonConverterResult<T> result = ToObject(...);
if (result.HasError)
throw new JsonException(result.Error);
return result.Value
}
}
Since this approach isn't typical in most of the existing serialization apis, I would settle for either of the solutions you suggested (boolean or out parameter).
@steveharter Thanks for your response.
There was an update based on an API review a few hours before you commented - the "ReadWrite" is now only for the async streampipe APIs and we use "ToXXXParse" terminology for the rest.
Yeah I thought of that, that's why I only mentioned the functions that work withReadOnlySpan<byte>
, byte[]
& string
. But I would like to suggest to name the class JsonConvert
and put it in the main namespace because this class will be the most used one in these APIs so it would be better if it is in the main namespace not in Serialization
.
I also hope you consider using the namespace System.Json
instead of System.Text.Json
for the reasons that I mentioned before, and because it will be sad not given this good APIs the namespace that it deserves, for an old obsolete APIs that no one use.
@BrennanConroy
We need a way of storing a json array for later use in the deserializer.
It seems you require three pending features:
1) Ability to (de)serialize a heterogeneous array: your arguments
array has multiple types
2) Ability to (de)serialize into temporary state for "overflow" "hidden" properties: your arguments
array is likely not a public property
3) Assign temporary state once deserialization is finished for the object: your arguments
array needs to be passed to the appropriate logic based upon the value of the target
property
For (1) we have had discussions on using the existing JsonElement
in some fashion, although currently that is coupled tightly to JsonDocument
(cc @bartonjs @ahsonkhan ) so we need to decide whether to extend JsonElement
or create a new type to handle this.
For (2) and (3) I am working on the explicit class (de)serialize feature where you can implement before- and after-serialize methods along with per-property callbacks including "overflow" properties that exist in the json but not the class.
Here's a mock-up using JsonElement
+ "overflow property" + callbacks.
// Either through an attribute or code, register your callback\converter:
settings.AddConverter<MyPocoType>(typeof(MyPocoTypeConverter));
...
// Implement your callbacks (an instance of this struct is created per POCO instance):
public struct MyPocoTypeConverter : ITypeConverterOnMissingPropertyDeserialized, ITypeConverterOnDeserialized
{
IList<JsonElement> _arguments;
void ITypeConverterOnMissingPropertyDeserialized.OnMissingPropertyDeserialized(
object obj,
JsonClassInfo jsonClassInfo,
string propertyName,
object propertyValue,
JsonTokenType tokenType,
JsonSerializerOptions options)
{
if (jsonPropertyInfo.PropertyName == "arguments")
{
Debug.Assert(tokenType == JsonTokenType.Array);
_arguments = (IList<JsonElement>)propertyValue;
}
}
void ITypeConverterOnDeserialized.OnDeserialized(
object obj,
JsonClassInfo jsonClassInfo,
JsonSerializerOptions options)
{
MyPocoType poco = (MyPocoType)obj;
if (poco.Target == "CallMethod2")
{
poco.Method2(
_arguments[1].GetString(),
_arguments[0].GetInt32(),
_arguments[2].GetDecimal()
}
}
}
@steveharter I think there is some misunderstanding of our scenario.
We use the Utf8JsonReader
to read all the properties both for perf and for custom logic while walking through the payload. One thing we need is the ability to store the "arguments" property (which is an array) for later use in the deserializer. Currently I have a bunch of hacks to try and store the position of the Utf8JsonReader
for the "arguments" property and then try to pass in each element to the deserializer one by one.
I showed the Newtonsoft example which is what we already use for the current Json parsing and works perfectly. I think what you're showing is assuming we just call the deserializer with the whole payload which is definitely not what we're doing.
We need interoperability with the Utf8JsonReader
and deserializer (similarly with the Utf8JsonWriter
and the serializer).
```c#
if (utf8JsonReader.ValueSpan.SequenceEqual("arguments"))
{
argumentsToken = JArray.Load(utf8JsonReader);
}
// ... later
for (var i = 0; i < argumentsToken.Size(); i++)
{
argumentsToken[i].ToObject(type, serializer);
}
```
@BrennanConroy thanks for the clarification. Yes I was assuming you wanted to use the (de)serializer for the entire scenario, at least for the flow-of-control because the (de)serializer will allow access to the lower-level readerwriter in specific callbacks.
We currently have JsonDocument
JsonElement
but since it doesn't currently accept an existing reader you can't do something like JArray.Load
. However, those types do seem to be the natural fit to hold the raw values, which then could be passed to the deserializer.
Should I open a new issue for that feature?
@BrennanConroy I'll discuss the feature offline with @bartonjs and @ahsonkhan and then reply back here.
One of the cool things about the reader (and writer) is that it supports a streaming model where the buffer doesn't have to contain all of the data. When the reader runs out of buffer for the current property, it returns false and expects to be called again with a fresh buffer (and to be passed the previous JsonReaderState object back in so it can pick up where it left off).
However this complicates scenarios where a more than one property must be returned like in a "JArray" case because every property in theory may have to ask for a fresh buffer.
The deserializer handles this scenario and returns the whole objectarray when it is finished including when it spans multiple buffer refreshes (which occurs when using the async pipestream APIs). However, the reader doesn't support this scenario internally (it leaves it up to the caller, like the serializer) and the documentelement supports this scenario but requires all of the data up-front in basically one large buffer (for stream cases it reads to the end up-front).
So we have to decide who should support this multi-property functionality - the readerwriter, the documentelement, the (de)serializer, andor the caller.
Is it really necessary for implementers of JsonValueConverter<TValue>
to implement both Write
overloads? The logic in them are going to be identical except for the call to Utf8JsonWriter
's WriteXXXValue
vs WriteXXX
method.
Could we add a method to Utf8JsonWriter
to prep it with a property name so that the next WriteXXX
call writes the value as a property instead of value?
Or alternatively, could we have a single Write(bool writeProperty, Span<byte> escapedPropertyName, TValue value, ref Utf8JsonWriter writer)
method where the implementer would then make the appropriate call to the Utf8JsonWriter
based on the writeProperty
flag?
@TylerBrinkley
Is it really necessary for implementers of JsonValueConverter
to implement both Write overloads?
I've been working with @ahsonkhan on addressing this by either pre-populating the property name and just calling WriteValue methods, wrapping the writer somehow, or by having the Write methods in the writer detect whether the property name is empty and call WriteValue if so (however unfortunately json allows empty property names so that may not work).
@BrennanConroy update: we are working on extending JsonElement
and JsonDocument
to make them more usable in your scenario and for scenarios where the (de)serializer has to support "overflow" properties and cases where a property is of type object
or JsonElement
.
Would it make sense to add a JsonSerializerReadOptions
class that either inherits or is composed with JsonReaderOptions
and include IgnoreNullPropertyValue
in the new class? Same applies for writer options.
It wouldn't surprise me if we would want to add more options related to reading and writing later and having some options in JsonSerializerOptions
' Reader/WriterOptions
property and others directly in JsonSerializerOptions
will be confusing to consumers.
Also it will be usefull to have something like custom Exression serializer builder
LambdaJsonClassConverterBuilder<TestClass2> builder2 = new LambdaJsonClassConverterBuilder<TestClass2>();
builder2.AddObjectProperty(x => x.A, "a");
builder2.AddIntegerProperty(x => x.B, "b");
builder2.AddObjectProperty(x=>x.C,"c");
It will be usefull when class is separated from json serializer logic
A big gap I see in the proposal as specified is constructor handling. For a domain-driven design I think it is common to instantiate required parameters in constructors so that the object is always in a valid state.
Since this current proposal only looks for a constructor without parameters the current state of deserialization is unfortunately limited to simple types like data transfer objects. Json.NET
invokes the default constructor or else the constructor specified with the JsonConstructorAttribute
applied however there's no equivalent option here.
Is more advanced constructor handling being considered?
~Please include a generic overload to JsonSerializerOptions.GetClassInfo(Type)
like this:~
namespace System.Text.Json.Serialization
{
partial class JsonSerializerOptions
{
public JsonClassInfo GetClassInfo<T>() => GetClassInfo(typeof(T));
}
}
Update: Retracted suggestion based on comment by @khellang (https://github.com/dotnet/corefx/issues/34372#issuecomment-475721641)
It was just covered in the review; https://www.youtube.com/watch?v=_CdV75tEsVk. TLDR; There's no need for it and it's easy enough for you to add it yourself as an extension method.
@ahsonkhan can you comment on the history of using
System.Text.Json
instead ofSystem.Json
-- did we not useSystem.Json
because it already contained olderexisting APIs?
@steveharter Have there been any updates/consideration to using System.Json
as the namespace rather than System.Text.Json
?
The types in System.Json
are, if I remember correctly, a throwback to Silverlight 3. They constitute a legacy API and don't conflict (in naming) to the types in the new JSON API. It seems like the new JSON APIs should be in a System.Json
namespace as well, since that seems (to me at least) to be the more natural container and is also consistent with the use of the System.Xml
namespace to contain all types that are XML related.
Have there been any updates/consideration to using System.Json as the namespace rather than System.Text.Json?
I believe that decision is closed. cc @terrajobst
However other feedback is whether System.Text.Json.Serialization
should be a sub-namespace of the readerwriterdocument that is in System.Text.Json
based on how to best factor higher-level functionality (serializer) vs lower-level (readerwriter).
Has there been considered or is it possible for the serializer to provide an additional "report" of deserialized properties?
I.e. which properties were set by the serializer and which properties were missing from the JSON payload. This would be really interesting information for web frameworks that do model binding.
Consider the following POCO:
public class Person
{
public string FirstName { get; set; }
public string LastName { get; set; }
public DateTime BirthDay { get; set; }
}
Since no property is nullable (assuming nullable reference types are enabled and considered), all properties are required by default, but if you call
ReadOnlySpan<byte> utf8= ...
Person person = JsonSerializer.Parse<Person>(utf8);
Where utf8
contains the following JSON payload
{
"firstName": "John",
"lastName": "Doe"
}
You'll end up with person.BirthDay == default(DateTime)
.
At that point, you have no clue whether the BirthDay
property was actually specified as 0001-01-01T00:00:00.0000000
, or if it's just the default value.
That's why you normally define all non-nullable properties as nullable and use validation (and additional metadata, like attributes) to make sure all required values are present.
The problem with that, is that now you have a DTO full of nullable properties, which all contain valid values (verified by validation). Every time you want pull a value from a property, you have to either guard against null (or use HasValue
+ Value
) or supress warnings by static analysis tools (including the new null-flow analysis by Roslyn). This really sucks.
@rynowak Is this something that MVC could use?
It sounds like what you're after here is input validation - validating what's on the wire based on metadata associated with properties. I don't think what you're proposing is going to be enough to do that in a good way - IMO if we want to have input validation support it would have to be based on callbacks built into the serializer, not layered on top.
Speaking of input validation, I'm not sure why you'd want to use that to specify a behaviour difference between explicit null
and omitted. It doesn't seem useful in consideration of the end to end - by that I mean it doesn't seem like a good way to specify a REST api.
It sounds like what you're after here is input validation - validating what's on the wire based on metadata associated with properties.
Yeah, but not any advanced validation, just present/not present.
I don't think what you're proposing is going to be enough to do that in a good way - IMO if we want to have input validation support it would have to be based on callbacks built into the serializer, not layered on top.
Why would it need to be callback-based? What is my proposal lacking to do it "in a good way"?
I'm not sure why you'd want to use that to specify a behaviour difference between
explicit null
and omitted.
That's not what I'm saying. For nullable properties, I think explicit null
or omitted entirely should have the exact same result. I'm talking about differentiating between non-nullable properties' default values or omitted properties. I.e. actual int 0
or int default
(omitted).
That's not what I'm saying. For nullable properties, I think explicit null or omitted entirely should have the exact same result. I'm talking about differentiating between non-nullable properties' default values or omitted properties. I.e. actual int 0 or int default (omitted).
OK thanks that helps.
I think your proposal is lacking because it solves the problem at the wrong layer. It assumes that we can get a result from the serializer that ASP.NET knows how to interpret - I don't think this goes far enough. A feature like this has be more built in than returning a list of properties.
Think about what would happen for a nested object - how would that work?
I think your proposal is lacking because it solves the problem at the wrong layer.
What layer is the correct layer? The only one that knows this information is the serializer, so it has to provide this information somehow. Are you saying you want it at an even lower level? 馃
Think about what would happen for a nested object - how would that work?
That's what you have JSON path for 馃槈
Anyway, I'm not dead set on a list of properties. I'm totally open for alternative solutions.
I just think it's a problem worth solving, cause today's "workarounds" are pretty bad. This seems (to me) like the best level to solve it, and a perfect opportunity as the APIs are being brought in.
Anyway, I'm not dead set on a list of properties. I'm totally open for alternative solutions.
I'm not sure how a nice API could possibly look like for this though. A list of property paths would be a bit annoying since then you'd spent most of the time parsing paths.
I think some callback during the binding of a single object would work best actually. Passing in the target object instance, the source JSON reader and maybe a path would probably work. And that callback could then allow you to customize the binding behavior, or it could be used for validation.
In regards to the top-level namespace...
I believe that decision is closed. cc @terrajobst
I think System.Text.Json
is really the wrong namespace, but if the decision is fixed, then so be it.
However other feedback is whether
System.Text.Json.Serialization
should be a sub-namespace of the readerwriterdocument that is inSystem.Text.Json
based on how to best factor higher-level functionality (serializer) vs lower-level (readerwriter).
Are you asking if the public types currently in System.Text.Json.Serialization
should be kept in that namespace or moved to be part of System.Text.Json
?
If so, and seeing that System.Text.Json.Serialization.JsonSerializer
is analogous to Newtonsoft.Json.JsonConvert
(at least in what I perceive to be its intent), I think it should be in System.Text.Json
. Since the majority of use cases will probably be serialization/deserialization, the natural instinct will probably be to reference "System.Text.Json", thinking that will get you access to everything. It seems counter-intuitive to me that I'd have to reference a sub-namespace to get the 90% use case scenario.
One reason of choosing System.Text.Json
was described in the PR that ported System.Json
to .NET Core: https://github.com/dotnet/corefx/pull/9897#discussion-diff-69974331. The idea was to keep the System.Json
implementation in a separate assembly as deprecated.
Maybe all public classes under System.Json
can be decorated with [Obsolete]
now, indicating we should consider newly added types under System.Text.Json
in .NET Core 3.0?
in reference to @khellang's comment, I definitely think there should be a way to differentiate whether a property was supplied if it is not a nullable type. I specifically ran into this situation in a Web API 2 API for my employer. We want the RequestModel to be validated but I also want the RequestModel to include primitive properties such as int, bool, etc. Example: I have a request model like this:
public class GetATMsRequestModel
{
[Required]
public decimal Latitude { get; set; }
[Required]
public decimal Longitude { get; set; }
}
Notice that on my decimal properties is have added the [Required]
attribute. What I was astounded to find was that, in a situation like this, if I neglected to send in one of the properties in my request JSON, the api _happily accepted the input and never raised an error_
Reason? The parser was assigning a default value _before model validation_. This required me to
To solve this problem I had write a custom ContractResolver for use with newtonsoft.json like this:
public class RequiredPropertyContractResolver : CamelCasePropertyNamesContractResolver
{
protected override JsonProperty CreateProperty(MemberInfo member, MemberSerialization memberSerialization)
{
JsonProperty property = base.CreateProperty(member, memberSerialization);
// if we are marking up the property as 'required', we need to make sure that the JSON serialization will
// respect that and force the JSON to contain the value we want. Setting the JsonProperty Required field
// to Required.Always causes the serializer to throw an exception if the JSON doesn't contain the property
if(member.CustomAttributes.Any(a => a.AttributeType == typeof(System.ComponentModel.DataAnnotations.RequiredAttribute)))
{
property.Required = Required.Always;
}
return property;
}
}
If we add the [Required]
attribute to non-nullable type... there should be a way for the serializer to respect that.
Most helpful comment
Some quick feedback: