Runtime: Add snake_case support for System.Text.Json

Created on 17 Jul 2019  Â·  45Comments  Â·  Source: dotnet/runtime

Contents below were taken from @khellang, thanks!

API Proposal

Add snake_case support for System.Text.Json

namespace System.Text.Json
{
    public abstract class JsonNamingPolicy
    {
        protected JsonNamingPolicy() { }
        public static JsonNamingPolicy CamelCase { get { throw null; } }
+       public static JsonNamingPolicy SnakeCase { get { throw null; } }
        internal static JsonNamingPolicy Default { get { throw null; } }
        public abstract string ConvertName(string name);
    }

+   internal sealed class JsonSnakeCaseNamingPolicy : JsonNamingPolicy
+   {
+       public override string ConvertName(string name) { throw null; }
+   }
}

Behavior Proposal

I propose the same behavior as Newtonsoft.Json, just like the existing camel case behavior. The implementation is here and the tests are here.

Other Comments

I think snake case naming is pretty common, especially in the Ruby world, and should be supported out of the box. GitHub's API is probably the most popular example using this naming convention off the top of my head.

Implements

See: dotnet/corefx#40003

api-approved area-System.Text.Json

Most helpful comment

Why is this isn't in the library already? This is a very basic feature for a JSON library. System.Text.Json is not usable as it is.

All 45 comments

If you have any designs in mind, could you update the top post with what the proposed APIs would look like / how commonly used snake case is / etc? Here's what I think is a good example: https://github.com/dotnet/corefx/issues/28933#issue-312415055. This should help people better understand the problem/importance of this issue, and may come up with improvements.

API Proposal

namespace System.Text.Json
{
    public abstract class JsonNamingPolicy
    {
        protected JsonNamingPolicy() { }
        public static JsonNamingPolicy CamelCase { get { throw null; } }
+       public static JsonNamingPolicy SnakeCase { get { throw null; } }
        internal static JsonNamingPolicy Default { get { throw null; } }
        public abstract string ConvertName(string name);
    }

+   internal sealed class JsonSnakeCaseNamingPolicy : JsonNamingPolicy
+   {
+       public override string ConvertName(string name) { throw null; }
+   }
}

Behavior Proposal

I propose the same behavior as Newtonsoft.Json, just like the existing camel case behavior. The implementation is here and the tests are here.

Other Comments

I think snake case naming is pretty common, especially in the Ruby world, and should be supported out of the box. GitHub's API is probably the most popular example using this naming convention off the top of my head.

UPDATE: Looks like dotnet/corefx#40003 implemented more or less this exact proposal.

@khellang I've pasted your proposal contents to the top post, thanks!

I've prepared a benchmark that compares other implementation with dotnet/corefx#40003 PR.

BenchmarkDotNet=v0.11.5, OS=Windows 10.0.18362
Intel Core i7-7700HQ CPU 2.80GHz (Kaby Lake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.0.100-preview8-013656
  [Host]     : .NET Core 3.0.0-preview8-28405-07 (CoreCLR 4.700.19.37902, CoreFX 4.700.19.40503), 64bit RyuJIT
  DefaultJob : .NET Core 3.0.0-preview8-28405-07 (CoreCLR 4.700.19.37902, CoreFX 4.700.19.40503), 64bit RyuJIT

|                          Method |        Mean |      Error |     StdDev | Rank |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|-------------------------------- |------------:|-----------:|-----------:|-----:|-------:|------:|------:|----------:|
|  ToSnakeCaseStringBuilderBySpan |    85.97 ns |  1.0500 ns |  0.9821 ns |    1 | 0.0637 |     - |     - |     200 B |
| ToSnakeCaseNewtonsoftJsonBySpan |    86.35 ns |  1.7816 ns |  1.4877 ns |    1 | 0.0484 |     - |     - |     152 B |
|       ToSnakeCaseNewtonsoftJson |    88.89 ns |  0.8417 ns |  0.7461 ns |    2 | 0.0484 |     - |     - |     152 B |
|               ToSnakeCaseBySpan |   122.01 ns |  1.9066 ns |  1.6902 ns |    3 | 0.0253 |     - |     - |      80 B |
|                 ToSnakeCaseLinq |   365.75 ns |  4.4839 ns |  3.5008 ns |    4 | 0.1450 |     - |     - |     456 B |
|                ToSnakeCaseRegex | 2,108.71 ns | 40.6829 ns | 31.7625 ns |    5 | 

I investigated a couple of weeks about this case and tried other approaches + dotnet/corefx#40003 that have better performance to having snake case naming policy. Finally, I got the best result and it have less allocation and fast execution. My last implementation is ToSnakeCaseBySpan . Shoud I open a new PR for this?

|                          Method |        Mean |      Error |     StdDev | Rank |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|-------------------------------- |------------:|-----------:|-----------:|-----:|-------:|------:|------:|----------:|
|               ToSnakeCaseBySpan |    27.47 ns |  0.4629 ns |  0.4330 ns |    1 | 0.0153 |     - |     - |      48 B |
|  ToSnakeCaseStringBuilderBySpan |    85.23 ns |  1.6495 ns |  1.3774 ns |    2 | 0.0637 |     - |     - |     200 B |
|       ToSnakeCaseNewtonsoftJson |    85.72 ns |  1.6418 ns |  1.4554 ns |    2 | 0.0484 |     - |     - |     152 B |
| ToSnakeCaseNewtonsoftJsonBySpan |    86.96 ns |  1.7060 ns |  1.5958 ns |    2 | 0.0484 |     - |     - |     152 B |
|                 ToSnakeCaseLinq |   353.42 ns |  3.9670 ns |  3.7108 ns |    3 | 0.1450 |     - |     - |     456 B |
|                ToSnakeCaseRegex | 2,056.69 ns | 29.5694 ns | 26.2125 ns |    4 | 0.1526 |     - |     - |     496 B |

cc @karelz

@xsoheilalizadeh I think the name of a property is usually not too long, so maybe you can use stackalloc to reduce memory allocation on heap to reduce GC pressure? (It may achieve zero allocation)

public unsafe ReadOnlySpan<char> ToSnakeCaseBySpan()
{
    ...
    Span<char> buffer = stackalloc char[bufferSize];
    ...
}

Nice suggestion, but here is a question how I should return buffer, I've got following:

Cannot use local 'buffer' in this context because it may expose referenced variables outside of their declaration scope

Should I open a new PR for this?

Nope, not yet 😅 not until the API is approved.

Because buffer now is allocated on the stack, you cannot return it directly since it will be disposed after return. To solve it, you can simply change return buffer to return buffer.ToString();

Sent from Mailhttps://go.microsoft.com/fwlink/?LinkId=550986 for Windows 10


From: Soheil Alizadeh notifications@github.com
Sent: Friday, September 13, 2019 3:48:38 PM
To: dotnet/corefx corefx@noreply.github.com
Cc: Steve hez2010@outlook.com; Author author@noreply.github.com
Subject: Re: [dotnet/corefx] Add UnderScoreCase support for System.Text.Json (#39564)

Nice suggestion, but here is a question how I should return buffer, I've got following:

Cannot use local 'buffer' in this context because it may expose referenced variables outside of their declaration scope

—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHubhttps://eur04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fdotnet%2Fcorefx%2Fissues%2F39564%3Femail_source%3Dnotifications%26email_token%3DADSENWIDJVGBMFT3T4V3U5LQJNAVNA5CNFSM4IEL7W4KYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD6UIBGI%23issuecomment-531136665&data=02%7C01%7C%7C68574ff9a79b43fbaf2a08d7381ecaaf%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637039577204910801&sdata=PhAc8%2BvEwr9AHgXI1vNdlLENvuKw1jDC6x%2F%2FNuc7PDk%3D&reserved=0, or mute the threadhttps://eur04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FADSENWIHQJ7DC64RZH2RQFDQJNAVNANCNFSM4IEL7W4A&data=02%7C01%7C%7C68574ff9a79b43fbaf2a08d7381ecaaf%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C637039577204920805&sdata=M9kuqi6W4P3Msx76loMm8q2laC%2FrNgb4ZwIozAwcDZ4%3D&reserved=0.

@ahsonkhan are we ready to review the API? @xsoheilalizadeh is eager to implement it ...

I've figured out that the ToSnakeCaseBySpan can't pass these tests, it's just compatible with pascal case properties, I've tried to make it compatible with other formats but unfortunately, I've failed and it'd be unreadable. I was hurrying to contribute and got a bad result, sorry.

This looks like a reasonable API addition. I was considering how common such formats are to warrant adding support built-in (rather than folks creating their own naming policy to support it), and looks like quite a few places use snake_case including Facebook, AWS, and Twitter APIs (aside from the fact that Json.NET supports it):
https://stackoverflow.com/questions/5543490/json-naming-convention

And of course, what @khellang said:

I think snake case naming is pretty common, especially in the Ruby world, and should be supported out of the box. GitHub's API is probably the most popular example using this naming convention off the top of my head.

cc @steveharter

Video

Makes sense. We also considered others (such as kebap casing) but it seems we should defer that until someone actually needs that.

@xsoheilalizadeh, are you still working on it? :D you should be able to open PRs for it.

Unfortunately no till I pass tests, I tried to pass tests but my implementation only support pascal-case naming for example FirstName will be first_name but in other cases like IsCIA it will be Is_c_i_a that isn't expected.

@Gnbrkm41 How about the existing PR dotnet/corefx#40003

I mean... sure, why not, as long as it has all the tests and it passes :D

@ahsonkhan Did the same thing already, so taking this.

Makes sense. We also considered others (such as kebap casing) but it seems we should defer that until someone actually needs that.

The lack of PascalCase option is a show stopper for our migration.
I believe there should be a "preserve case" option which will also work for us.

The lack of PascalCase option is a show stopper for our migration.
I believe there should be a "preserve case" option which will also work for us.

Isn't that the default? 🤔

https://github.com/dotnet/corefx/blob/296c0e71ddcc4885d30af18dea829c1923cfdc5c/src/System.Text.Json/src/System/Text/Json/Serialization/JsonDefaultNamingPolicy.cs#L9

The lack of PascalCase option is a show stopper for our migration.
I believe there should be a "preserve case" option which will also work for us.

Isn't that the default? 🤔

https://github.com/dotnet/corefx/blob/296c0e71ddcc4885d30af18dea829c1923cfdc5c/src/System.Text.Json/src/System/Text/Json/Serialization/JsonDefaultNamingPolicy.cs#L9

Yes - that worked ... but not easily!
You can not select the Default !!!
JsonSerializerOptions.PropertyNamingPolicy = JsonNamingPolicy.Default
and also PropertyNamingPolicy gets initialised in constructor to JsonCamelCaseNamingPolicy
new JsonOptions().JsonSerializerOptions.PropertyNamingPolicy

However you can trick it with :-)
JsonSerializerOptions.PropertyNamingPolicy = null;
and PascalCase goes through unobstructed!

the default is to use exact match IIRC. setting it to null also does that as well.

I'd prefer if we could just define our own naming strategies. We use our own naming method for snake_case because it occurred before Json.net came out with it and it's slightly different. So perhaps just allow us to set a naming policy or use the default one.

@niemyjski Of course you can. Just implement JsonNamingPolicy like the built-in ones and set that. Almost everything about the new serializer is configurable and/or extensible, these issues are usually just about including more stuff in the box.

Please could KebabCase also be added?

It should be a very minor addition - basically the same as SnakeCase but with - in place of _

E.g. requests sent to Apple Push Notification Service require kebab-case JSON fields: https://developer.apple.com/documentation/usernotifications/setting_up_a_remote_notification_server/generating_a_remote_notification#2943360

@sveinfid - There was precedence for snake casing (e.g. in newtonsoft json) and enough user need to provide built-in support for it. I am not sure that's the case for KebabCase. Please file a separate issue (with the API proposal) and we can consider it, but my hunch is, this is something that the caller would have to implement (like any other custom casing/naming strategy) and doesn't seem common enough to require built-in support.

Like you said, the impl could be relatively straight forward (take existing snake case impl and replace the delimiter), which means there is also little value in providing one built-in.

From @YohDeadfall in https://github.com/dotnet/runtime/issues/673:

Originally proposed by @hez2010 in dotnet/corefx#39564.

@ahsonkhan, could you add the original issue here and important comments from dotnet/corefx#41354?

@karelz We need your opinion here, since @khellang asked a question about altering the existing behavior of the camel case policy too follow common translator rules for the JS world as splitting a string by words and concatenating it back using some separator and applying a policy to change casing for each word. It's a breaking change and @stephentoub likes it. Me too. See dotnet/corefx#41354 (comment).

I still would like to drive this feature.

@steveharter @layomia

Only FYI

If someone reading this needs to implement this asap, you could find useful the next classes to use a naming policy for SnakeCase and KebabCase.

https://github.com/J0rgeSerran0/JsonNamingPolicy

I hope that helps.

Thank you, @J0rgeSerran0, for your JsonSnakeCaseNamingPolicy! Going to use it while we are waiting for the built-in support...

and looks like quite a few places use snake_case including Facebook, AWS, and Twitter APIs

It's also used in all OAuth 2.0 / OpenID Connect responses (token response, discovery document, etc.)
I think it's really important to support it out of the box; right now I have to fall back to Newtonsoft.Json to handle it.

right now I have to fall back to Newtonsoft.Json to handle it.

@thomaslevesque You've seen the various workarounds in this thread, right?

@khellang I haven't seen any real workaround mentioned in this thread... I could implement my own naming policy or use the one proposed by @J0rgeSerran0, but it's not worth the trouble in my scenario. Eventually I didn't use Newtonsoft.Json, I just put [JsonPropertyName] attributes on all my properties.

Thanks for comment @thomaslevesque
Anyway, I am curious to know which are the impediments that you have in your scenario or why System.Text.Json and/or the workarounds are not worth the trouble.
Do you have any sample code to check this?
It could help us to improve System.Text.Json or the workarounds that you have used.
Thanks a lot!

Anyway, I am curious to know which are the impediments that you have in your scenario or why System.Text.Json and/or the workarounds are not worth the trouble.

In a real application, I would probably use your policy implementation. But in this case it's just a demo app to illustrate a blog article, so I want to keep the code terse and simple. Having a whole snake case naming policy implementation would just add noise.

Hi Thomas.

Now I better understand the situation.
Good to know that you are talking about a special situation in this case, and no about a general situation.
This is an important point to avoid possible misunderstandings.

Thanks a lot!

@xsoheilalizadeh I would like to also add that you might want to change this line to remove the && name[i] != name[0] and start the for loop at i = 1. One test case is if the property name is FirstNameFirst would result in First_Name_Firs

https://github.com/xsoheilalizadeh/SnakeCaseConversion/blob/master/SnakeCaseConversionBenchmark/SnakeCaseConventioneerBenchmark.cs#L55

Unfortunately no till I pass tests, I tried to pass tests but my implementation only support pascal-case naming for example FirstName will be first_name but in other cases like IsCIA it will be Is_c_i_a that isn't expected.

@xsoheilalizadeh I've built off your solution to get it working. Let me know if you can think of any other test cases

public class UnitTest1
    {
        [Theory]
        [InlineData("FirstName", "First_Name")]
        [InlineData("FirstNameM", "First_Name_M")]
        [InlineData("StreetAddress1", "Street_Address_1")]
        [InlineData("StreetAddress1Two34", "Street_Address_1_Two_34")]
        [InlineData("IsCIA", "Is_Cia")]
        [InlineData("IsCIAAgent", "Is_Cia_Agent")]
        public void Test1(string input, string expected)
        {
            Assert.Equal(expected, ConvertNameSnakeCase(input));
        }

        public string ConvertNameSnakeCase(string name)
        {
            int numOfUnderscores = 0;

            int namePosition = 1;
            while (namePosition < name.Length - 1)
            {
                if (
                    (char.IsUpper(name[namePosition]) && (char.IsLower(name[namePosition + 1]) || !char.IsUpper(name[namePosition - 1]))) ||
                    (char.IsNumber(name[namePosition]) && !char.IsNumber(name[namePosition - 1]))
                   )
                {
                    numOfUnderscores++;
                }
                ++namePosition;
            }

            if (
                (char.IsUpper(name[namePosition]) && !char.IsUpper(name[namePosition - 1])) ||
                (char.IsNumber(name[namePosition]) && !char.IsNumber(name[namePosition - 1]))
               )
            {
                numOfUnderscores++;
            }

            int bufferSize = name.Length + numOfUnderscores;

            Span<char> buffer = new char[bufferSize];

            int bufferPosition = 0;
            namePosition = 0;

            buffer[bufferPosition++] = name[namePosition++];

            while (namePosition < name.Length - 1)
            {
                if (
                    (char.IsUpper(name[namePosition]) && (char.IsLower(name[namePosition + 1]) || !char.IsUpper(name[namePosition - 1]))) ||
                    (char.IsNumber(name[namePosition]) && !char.IsNumber(name[namePosition - 1]))
                    )
                {
                    buffer[bufferPosition] = '_';
                    buffer[bufferPosition + 1] = name[namePosition];
                    bufferPosition += 2;
                    namePosition++;
                    continue;
                }

                buffer[bufferPosition] = char.ToLower(name[namePosition]);
                bufferPosition++;

                namePosition++;
            }

            if (
                (char.IsUpper(name[namePosition]) && !char.IsUpper(name[namePosition - 1])) ||
                (char.IsNumber(name[namePosition]) && !char.IsNumber(name[namePosition - 1]))
                )
            {
                buffer[bufferPosition] = '_';
                buffer[bufferPosition + 1] = name[namePosition];
                return new string(buffer);
            }

            buffer[bufferPosition] = char.ToLower(name[namePosition]);
            return new string(buffer);
        }
    }

@jonathann92
There is a PR that already implemented built-in snake-case naming policy, you can us its tests here.

@YohDeadfall, re https://github.com/dotnet/runtime/issues/876#issuecomment-629428887:

Can you remind me what the hold-up/issue was on the implementation for this - https://github.com/dotnet/corefx/pull/41354? I see an approval, and the only unaddressed feedback seemed to be around caching conversion results for dictionary keys (which doesn't need to be addressed as part of this work).

While Stephen approved the PR, there are some open questions in https://github.com/dotnet/corefx/pull/41354#issuecomment-554958476. No one gave comments on them.

By the way, if my mind serves me Toub said that the camel casing policy should share the same logic with other policies, but right now it just takes the first character and lower cases it while other platforms do it in a different way.

Thanks @YohDeadfall. I'll digest this and follow up.

@layomia @terrajobst Do I correctly understand that the team isn't willing to introduce the feature in .NET 5? Am really interested in it since it intersects with some stuff in Npgsql, and we should have the standard way to convert strings using the snake case convention.

Why is this isn't in the library already? This is a very basic feature for a JSON library. System.Text.Json is not usable as it is.

Due to other commitments for 5.0, this will have to come in 6.0. Custom implementations can be utilized until then via JsonSerializerOptions.PropertyNamingPolicy.

One year for 6.0 is just way too long for minor stuff like this. I've mentioned this in other .net projects before, but I feel like you guys need to allow yourselves interim feature updates. No breaking changes or anything.

Think about how any other language/framework or ecosystems would do it.

cc @terrajobst -- Does "one year" satisfy any definition of being competitive in our industry? It looks more like a regression to what was comfortable in the Windows-only days.

Was this page helpful?
0 / 5 - 0 ratings