Runtime: Optimize number processing and validation while reading JSON token

Created on 7 Nov 2018 · 10Comments · Source: dotnet/runtime

Consider removing the ConsumeNumberResult internal enum and optimize the implementation of TryGetNumber. The way it is implemented currently is not fast enough, likely due to some redundant or unnecessary work. The helper methods may need to be marked as aggressively inlined.

As part of this work, try to avoid potential redundant search for end of number in ConsumeNumber.

area-System.Text.Json enhancement tenet-performance up-for-grabs

Source

ahsonkhan

Most helpful comment

Some minor possible optimizations (no game changers however):

Avoid the call to ConsumeSign if data[i] != (byte)'-' && data[i] != (byte)'+' as it will not be inlined:
https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.cs#L981

Avoid the call to ConsumeNegativeSign if data[i] != (byte)'-' as it will not be inlined:
https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.cs#L906

ConsumeSign and ConsumeNegativeSign could return bool as there are only two possible outcomes (besides an exception):
https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.cs#L1011
https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.cs#L1130

Consider replacing JsonConstants.Delimiters.IndexOf calls with a switch statement as follows:

public static bool IsEndOfNumber(byte b)
{
    switch(b)
    {
        case ListSeperator:
        case CloseBrace:
        case CloseBracket:
        case Space:
        case LineFeed:
        case CarriageReturn:
        case Tab:
        case Slash:
            return true;
    }

    return false;
}

Benchmark results:

| Method | Input | Mean | Error | StdDev |
|-------- |------ |----------:|----------:|----------:|
| IndexOf (best case) | , | 4.1159 ns | 0.0789 ns | 0.0738 ns |
| Switch (best case) | , | 0.7663 ns | 0.0611 ns | 0.0572 ns |
| IndexOf (worst case) | 0 | 4.8661 ns | 0.1193 ns | 0.1116 ns |
| Switch (worst case) | 0 | 1.4195 ns | 0.0353 ns | 0.0313 ns |

Occurrences:
https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.cs#L697

https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.cs#L1046

https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.cs#L1100

Consider storing whether the previously parsed number has an exponent. This already known fact, is recalculated when accessed by TryGetSingleValue, TryGetDoubleValue or TryGetDecimalValue.

Occurrences:
https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.TryGet.cs#L128

https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.TryGet.cs#L150

https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.TryGet.cs#L172

tkp1n on 17 Dec 2018

👍3

All 10 comments

Some insights regarding inlining of helper methods based on the ETW profiler applied to the benchmarks at https://github.com/dotnet/performance/pull/185:

Inliner | Inlinee | FailureReason
---|---|---|
Utf8JsonReader.TryGetNumber | Utf8JsonReader.Slice | success
Utf8JsonReader.TryGetNumber | Utf8JsonReader.Add | success
Utf8JsonReader.TryGetNumber | Utf8JsonReader..ctor | success
Utf8JsonReader.TryGetNumber | Utf8JsonReader.set_ValueSpan | success
Utf8JsonReader.TryGetNumber | Utf8JsonReader.ConsumeNegativeSign | too many il bytes
Utf8JsonReader.TryGetNumber | Utf8JsonReader.ConsumeSign | too many il bytes
Utf8JsonReader.TryGetNumber | Utf8JsonReader.ConsumeDecimalDigits | too many il bytes
Utf8JsonReader.TryGetNumber | Utf8JsonReader.ConsumeZero | too many il bytes
Utf8JsonReader.TryGetNumber | Utf8JsonReader.ConsumeIntegerDigits | unprofitable inline
Utf8JsonReader.TryGetNumber | ThrowHelper.ThrowJsonReaderException | does not return
Utf8JsonReader.TryGetNumber | ThrowHelper.ThrowJsonReaderException | does not return
Utf8JsonReader.TryGetNumber | Utf8JsonReader.ConsumeIntegerDigits | unprofitable inline
Utf8JsonReader.TryGetNumber | ThrowHelper.ThrowJsonReaderException | does not return
Utf8JsonReader.TryGetNumber | ThrowHelper.ThrowArgumentOutOfRangeException | does not return

tkp1n on 11 Dec 2018

👍1

A comparison of the current master vs. all helpers attributed with [MethodImpl(MethodImplOptions.AggressiveInlining)]:
bench

tkp1n on 11 Dec 2018

A comparison of the current master vs. all helpers attributed with

Generally speaking, we try to avoid adding the AggressiveInlining attribute excessively, especially since it ends up giving a false sense of improving performance on benchmarks measuring local performance, while potentially hurting real life usages (or larger/end-to-end perf tests). When measuring locally (only), almost any method marked as AggressiveInlining could show improvement. So, we try to reserve its usage for unique scenarios, or one where the gain is quite significant (or we validate that it actually improved the actual user workload). I don't see (from the results you shared), that we get much from marking them with AggressiveInlining and hence not worth it.

The issue here is trying to look for large performance improvements that come from actually doing less work in TryGetNumber either be finding and removing redundant checks or by finding ways to re-write it without using the ConsumeNumberResult enum in certain cases.

cc @jkotas

ahsonkhan on 11 Dec 2018

👍1

I agree, that it is probably not worth adding AggressiveInlining to the helper methods, given the rather insignificant gain in the benchmarks. Just wanted to share the results as you mentioned inlining of the helper methods in the issue.
I'll see if I come up with a cheaper overall solution to TryGetNumbers.

tkp1n on 12 Dec 2018

❤1

Some minor possible optimizations (no game changers however):

Consider replacing JsonConstants.Delimiters.IndexOf calls with a switch statement as follows:

public static bool IsEndOfNumber(byte b)
{
    switch(b)
    {
        case ListSeperator:
        case CloseBrace:
        case CloseBracket:
        case Space:
        case LineFeed:
        case CarriageReturn:
        case Tab:
        case Slash:
            return true;
    }

    return false;
}

Benchmark results:

Occurrences:
https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.cs#L697

https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.cs#L1046

https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.cs#L1100

Consider storing whether the previously parsed number has an exponent. This already known fact, is recalculated when accessed by TryGetSingleValue, TryGetDoubleValue or TryGetDecimalValue.

Occurrences:
https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.TryGet.cs#L128

https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.TryGet.cs#L150

https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Utf8JsonReader.TryGet.cs#L172

tkp1n on 17 Dec 2018

👍3

Consider storing whether the previously parsed number has an exponent. This already known fact, is recalculated when accessed by TryGetSingleValue, TryGetDoubleValue or TryGetDecimalValue.

We should definitely do this!

Consider replacing JsonConstants.Delimiters.IndexOf calls with a switch statement as follows:

I am not too sure about this one. I recommend measuring random "delimiters". So, try reading an integer array JSON with numbers ending in spaces and commas randomly (let's say you alternate between the two in your array). Switch statements tend to show great performance for predictable inputs (for instance, your constant input) but may not be that performant when the branch predictor is incorrect and the cases jump around. If it still proves to be useful, we should do it.

ahsonkhan on 17 Dec 2018

Just measured a bunch of "random delimiters and numbers". Even in the worst case I was able to produce, 'switch' was twice as fast as 'IndexOf'. I am still not too confident in this optimization either.

tkp1n on 17 Dec 2018

👍1

Comparing what shipped with the preview-1 against dotnet/corefx#34386 and dotnet/corefx#34500 merged yields the following perf improvements:

Lots of numbers...

| Method | NumberSet |-----------------|----------------------|-------- | GetInt32Value | positive numbers | 740,30 | GetInt64Value | positive numbers | 958,80 | GetSingleValue | positive numbers | 1768,60 | GetDoubleValue | positive numbers | 1787,90 | GetDecimalValue | positive numbers | 1809,20 | GetInt32Value | positive numbers E+2 | 1039,80 | GetInt64Value | positive numbers E+2 | 1225,20 | GetSingleValue | positive numbers E+2 | 2163,60 | GetDoubleValue | positive numbers E+2 | 2076,70 | GetDecimalValue | positive numbers E+2 | 2122,10 | GetInt32Value | positive numbers E-2 | 988,10 | GetInt64Value | positive numbers E-2 | 1225,50 | GetSingleValue | positive numbers E-2 | 2087,20 | GetDoubleValue | positive numbers E-2 | 2032,80 | GetDecimalValue | positive numbers E-2 | 2101,60 | GetInt32Value | positive numbers E2 | 1034,10 | GetInt64Value | positive numbers E2 | 1226,40 | GetSingleValue | positive numbers E2 | 2172,80 | GetDoubleValue | positive numbers E2 | 2043,10 | GetDecimalValue | positive numbers E2 | 2137,30 | GetInt32Value | positive numbers e+2 | 1010,50 | GetInt64Value | positive numbers e+2 | 1225,70 | GetSingleValue | positive numbers e+2 | 2102,70 | GetDoubleValue | positive numbers e+2 | 2004,10 | GetDecimalValue | positive numbers e+2 | 2112,70 | GetInt32Value | positive numbers e-2 | 970,90 | GetInt64Value | positive numbers e-2 | 1205,00 | GetSingleValue | positive numbers e-2 | 2098,20 | GetDoubleValue | positive numbers e-2 | 2021,70 | GetDecimalValue | positive numbers e-2 | 2134,60 | GetInt32Value | positive numbers e2 | 964,20 | GetInt64Value | positive numbers e2 | 1306,40 | GetSingleValue | positive numbers e2 | 2154,70 | GetDoubleValue | positive numbers e2 | 2051,20 | GetDecimalValue | positive numbers e2 | 2090,20 | GetInt32Value | negative numbers | 842,80 | GetInt64Value | negative numbers | 1037,40 | GetSingleValue | negative numbers | 1779,80 | GetDoubleValue | negative numbers | 1746,70 | GetDecimalValue | negative numbers | 1777,50 | GetInt32Value | negative numbers E+2 | 1029,10 | GetInt64Value | negative numbers E+2 | 1164,00 | GetSingleValue | negative numbers E+2 | 1988,00 | GetDoubleValue | negative numbers E+2 | 2033,10 | GetDecimalValue | negative numbers E+2 | 2126,50 | GetInt32Value | negative numbers E-2 | 928,10 | GetInt64Value | negative numbers E-2 | 1158,70 | GetSingleValue | negative numbers E-2 | 2038,40 | GetDoubleValue | negative numbers E-2 | 2073,00 | GetDecimalValue | negative numbers E-2 | 2377,70 | GetInt32Value | negative numbers E2 | 1011,90 | GetInt64Value | negative numbers E2 | 1231,90 | GetSingleValue | negative numbers E2 | 2116,10 | GetDoubleValue | negative numbers E2 | 2086,80 | GetDecimalValue | negative numbers E2 | 2172,80 | GetInt32Value | negative numbers e+2 | 1026,90 | GetInt64Value | negative numbers e+2 | 1256,20 | GetSingleValue | negative numbers e+2 | 2060,40 | GetDoubleValue | negative numbers e+2 | 2052,30 | GetDecimalValue | negative numbers e+2 | 2137,40 | GetInt32Value | negative numbers e-2 | 1016,00 | GetInt64Value | negative numbers e-2 | 1259,70 | GetSingleValue | negative numbers e-2 | 1996,20 | GetDoubleValue | negative numbers e-2 | 2053,30 | GetDecimalValue | negative numbers e-2 | 2040,20 | GetInt32Value | negative numbers e2 | 970,10 | GetInt64Value | negative numbers e2 | 1232,00 | GetSingleValue | negative numbers e2 | 2078,40 | GetDoubleValue | negative numbers e2 | 2004,70 | GetDecimalValue | negative numbers e2 | 2085,60 | Mean us (before) | Mean us (after) | Diff | --------|---------------|---------| | 656,80 | -11,28% | | 763,50 | -20,37% | | 1487,80 | -15,88% | | 1424,80 | -20,31% | | 1601,30 | -11,49% | | 824,20 | -20,73% | | 1100,40 | -10,19% | | 2033,60 | -6,01% | | 1932,80 | -6,93% | | 1830,90 | -13,72% | | 872,70 | -11,68% | | 1046,10 | -14,64% | | 1953,10 | -6,42% | | 1946,20 | -4,26% | | 1863,50 | -11,33% | | 888,20 | -14,11% | | 1098,60 | -10,42% | | 1917,90 | -11,73% | | 1796,80 | -12,06% | | 1772,00 | -17,09% | | 852,00 | -15,69% | | 1081,50 | -11,76% | | 1993,90 | -5,17% | | 1965,60 | -1,92% | | 1846,70 | -12,59% | | 865,00 | -10,91% | | 909,90 | -24,49% | | 1952,60 | -6,94% | | 1937,30 | -4,17% | | 1843,50 | -13,64% | | 838,20 | -13,07% | | 1081,70 | -17,20% | | 2000,50 | -7,16% | | 1978,70 | -3,53% | | 1821,70 | -12,85% | | 641,00 | -23,94% | | 756,10 | -27,12% | | 1541,50 | -13,39% | | 1512,00 | -13,44% | | 1638,20 | -7,84% | | 845,70 | -17,82% | | 1042,50 | -10,44% | | 1928,20 | -3,01% | | 1876,70 | -7,69% | | 1773,60 | -16,60% | | 835,30 | -10,00% | | 1046,00 | -9,73% | | 1932,90 | -5,18% | | 1888,30 | -8,91% | | 1766,60 | -25,70% | | 849,70 | -16,03% | | 900,00 | -26,94% | | 1862,80 | -11,97% | | 1804,00 | -13,55% | | 1770,60 | -18,51% | | 841,10 | -18,09% | | 1074,90 | -14,43% | | 1922,30 | -6,70% | | 1871,40 | -8,81% | | 1755,30 | -17,88% | | 834,90 | -17,82% | | 1100,40 | -12,65% | | 1961,40 | -1,74% | | 1888,10 | -8,05% | | 1765,70 | -13,45% | | 845,80 | -12,81% | | 1037,90 | -15,75% | | 1906,90 | -8,25% | | 1801,60 | -10,13% | | 1761,40 | -15,54% |

tkp1n on 18 Jan 2019

❤1

@ahsonkhan Seems like removing ConsumeNumberResult from TryGetNumber is still up-for-grabs? Ok if I take that up although it is marked for future? Because I want to take up something which I can spread out over some days.

WinCPP on 15 Aug 2019

@WinCPP - the intent of removing ConsumeNumberResult from TryGetNumber is only in an effort to improve performance. I wouldn't remove it unless you are trying to make perf optimizations to the utf8jsonreader's number parsing and simplifying the logic makes a difference (or is the same in terms of perf but results in cleaner code that is easier to understand/maintain).

Consider leveraging the perf analysis work by @tkp1n and use the benchmarks from https://github.com/dotnet/performance/pull/185 to measure the performance optimization.

Ok if I take that up although it is marked for future? Because I want to take up something which I can spread out over some days.

Certainly. Feel free to explore it and try out changes. However, I wouldn't recommend merging any changes until after 3.0 is done (~September) to avoid master and 3.0 diverging too much while we are still making code changes for 3.0.

ahsonkhan on 15 Aug 2019

Was this page helpful?

0 / 5 - 0 ratings