Json: Substantial performance penalty caused by polymorphic input adapter

Created on 25 Jan 2019 · 18 comments · Source: nlohmann/json

  • Describe the feature in as much detail as possible.

I ran a quick-and-dirty devirtualization test by

  1. templating the lexer and parser on the input adapter subclass (defaulting to the base class),
  2. marking the input adapter subclasses as final, and
  3. updating the benchmark code to use the correct subclass directly.
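The three steps above can be sketched roughly as follows (hypothetical type and member names for illustration; the actual nlohmann/json internals differ):

```cpp
#include <cstddef>

// Base adapter keeps the virtual interface for the default/fallback path.
struct input_adapter {
    virtual ~input_adapter() = default;
    virtual int get_character() = 0; // next character, or -1 on EOF
};

// Marking the concrete adapter `final` lets the compiler devirtualize
// calls whenever the static type is known.
struct buffer_adapter final : input_adapter {
    const char* cur;
    const char* limit;
    buffer_adapter(const char* b, std::size_t len) : cur(b), limit(b + len) {}
    int get_character() override {
        return cur < limit ? static_cast<unsigned char>(*cur++) : -1;
    }
};

// Templating the lexer on the adapter type (defaulting to the base class)
// turns adapter calls into direct, inlinable calls for concrete subclasses,
// while the default keeps the old polymorphic behaviour.
template <typename InputAdapterType = input_adapter>
struct lexer {
    InputAdapterType& ia;
    explicit lexer(InputAdapterType& a) : ia(a) {}
    int next() { return ia.get_character(); } // direct call if type is final
};
```

With `lexer<buffer_adapter>`, the call in `next()` needs no vtable lookup; with plain `lexer<>`, the virtual dispatch is unchanged.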

Before:

ParseFile/jeopardy            1024201192 ns 1022001731 ns          1   51.8404MB/s
ParseFile/canada                55346558 ns   55270439 ns         11   38.8412MB/s
ParseFile/citm_catalog          18555197 ns   18534212 ns         36    88.873MB/s
ParseFile/twitter                8912834 ns    8901508 ns         75   67.6581MB/s
ParseFile/floats               436844896 ns  436478028 ns          2   49.5332MB/s
ParseFile/signed_ints          320553309 ns  320362389 ns          2   72.5731MB/s
ParseFile/unsigned_ints        303152901 ns  302953363 ns          2   76.8034MB/s
ParseFile/small_signed_ints    169066171 ns  168948784 ns          4   69.9311MB/s
ParseString/jeopardy           795351806 ns  794811339 ns          1   66.6586MB/s
ParseString/canada              54493790 ns   54445449 ns         12   39.4297MB/s
ParseString/citm_catalog        19724418 ns   19707415 ns         36   83.5822MB/s
ParseString/twitter              9150619 ns    9144419 ns         77   65.8608MB/s
ParseString/floats             421472923 ns  421149121 ns          2   51.3361MB/s
ParseString/signed_ints        306999160 ns  306754574 ns          2   75.7924MB/s
ParseString/unsigned_ints      297069396 ns  296783050 ns          2   78.4002MB/s
ParseString/small_signed_ints  165137695 ns  164944718 ns          4   71.6287MB/s

After:

ParseFile/jeopardy             959820102 ns  958613968 ns          1   55.2684MB/s
ParseFile/canada                53412539 ns   53372580 ns         11   40.2223MB/s
ParseFile/citm_catalog          17567669 ns   17552345 ns         40   93.8444MB/s
ParseFile/twitter                8587726 ns    8581012 ns         80    70.185MB/s
ParseFile/floats               419401114 ns  419091029 ns          2   51.5882MB/s
ParseFile/signed_ints          315569793 ns  315206557 ns          2   73.7601MB/s
ParseFile/unsigned_ints        312478482 ns  312195920 ns          2   74.5296MB/s
ParseFile/small_signed_ints    157348675 ns  157226418 ns          4    75.145MB/s
ParseString/jeopardy           765028598 ns  764315312 ns          1   69.3183MB/s
ParseString/canada              52469064 ns   52414274 ns         13   40.9577MB/s
ParseString/citm_catalog        18649653 ns   18632984 ns         37   88.4018MB/s
ParseString/twitter              8899843 ns    8890877 ns         79    67.739MB/s
ParseString/floats             398706517 ns  398208512 ns          2   54.2936MB/s
ParseString/signed_ints        296954004 ns  296550211 ns          2   78.4005MB/s
ParseString/unsigned_ints      288880534 ns  288440238 ns          2   80.6678MB/s
ParseString/small_signed_ints  157066249 ns  156828785 ns          4   75.3355MB/s

This represents a 3-10% speedup across the board.

  • Include sample usage where appropriate.

Once correctly implemented, this should not cause any API change whatsoever.

Is this something worth pursuing? If there's interest, I'll draft a clean PR for it.

enhancement/improvement · release item: improvement · proposed fix

Most helpful comment

We've been testing this on our very busy implementation (JSON parsing time is about 20% of the total profile), and found it to do exactly as advertised; noticeable boost in performance, no code changes required. Would indeed be nice to see this PR accepted.

All 18 comments

Yes, such a PR would be greatly appreciated!!!

I've just spent a bit of time working on this, and I've hit a pretty big snag: making the json.parse({ptr, len}) syntax work is proving to be a massive pain.

In C++17, with template deduction guides, it would be much more straightforward. But for C++11, I think the best I can do is create a few specific overloads for common use cases like parse(string), parse(istream&), etc., and let the current polymorphic code take care of the rest.

This might be unrelated, but LALR parsers have been suggested in the contribution guidelines; that suggestion is probably outdated (e.g., the parser is not recursive anymore).

Can re-implementing the parser in some other form help improve performance? What is the reason to believe an alternative would improve performance compared to the current hand-crafted parser? Are such parsers theoretically more efficient, or would a faster parser just be a matter of chance/code generation?

Yes, the comment is outdated. The parser is no longer recursive. I used parser generators like Bison in the past, and I did not like the fact that this library parsed JSON in such a "manual" way. I guess parsing performance could now rather be improved by avoiding copies or allocations.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

Has any more attention been put into this? I was running some profiles recently, and the polymorphic calls to the input/output adapters were showing up higher in the profile than I would like.

No, unfortunately not. I will reopen and hope for a PR.

My PR (for the input part of the problem) is still almost ready to go. The remaining problem is still supporting json.parse({ptr, len}).

The issue is that json.parse() needs to be templated on its input parameters, but brace-enclosed lists cannot participate in template argument deduction.

Edit: I've managed to work around the issue without reverting to the polymorphic interface by adding a new overload to the main API functions, relying on the fact that the parse({A, B}) syntax only ever operates on contiguous containers.
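The shape of that workaround can be sketched as follows (hypothetical signatures and return values chosen for illustration; the actual PR's overload differs): a braced-init-list cannot bind to a deduced template parameter, but it can convert to a concrete parameter type on a non-template overload.

```cpp
#include <cstddef>
#include <utility>

// Templated path: handles any input type the adapters understand.
template <typename InputType>
int parse(InputType&&) { return 0; }

// Extra overload for the brace-enclosed form. The braced list {ptr, len}
// cannot deduce InputType above, so overload resolution falls through to
// this candidate instead. It relies on the fact that {ptr, len} only ever
// names a contiguous buffer. (std::pair is a stand-in for the real
// parameter type.)
int parse(std::pair<const char*, std::size_t> buf) {
    return static_cast<int>(buf.second); // would dispatch to a buffer adapter
}
```

A call like parse(ptr) takes the templated path, while parse({ptr, len}) converts the braced pair and takes the overload, so both spellings stay fast.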

For reference, in my opinion, a nicer fix would have been to deprecate the brace-enclosed syntax and add a two-parameter overload to parse() to match the from_*() functions. Aggressively relying on implicit conversion into a detail type, as the current interface does, is not exactly great in the first place.

Can we add the overload function even if not deprecating the old one? As a faster path for those who care?

My old comment about having a fast path vs. a slow path does not apply to my current PR. All code paths are "fast" while maintaining the current API; it's just that the user-facing interface is a little clumsy if you look at it too closely.

Yeah, I'm not a fan of that pattern. Okay, thanks. Would be nice to get this into the library. :)

We've been testing this on our very busy implementation (JSON parsing time is about 20% of the total profile), and found it to do exactly as advertised; noticeable boost in performance, no code changes required. Would indeed be nice to see this PR accepted.

One report: we're in the process of moving from clang 7.0.1 to 9.0.1 here. Apropos of this change, while 7.0.1 compiled cleanly, we see the following warning with 9.0.1:

In file included from Version.cpp:14:
nlohmann/json.hpp:4004:27: warning: explicitly defaulted move assignment operator is implicitly deleted [-Wdefaulted-function-deleted]
    input_buffer_adapter& operator=(input_buffer_adapter&&) = default;
                          ^
nlohmann/json.hpp:4021:23: note: move assignment operator of 'input_buffer_adapter' is implicitly deleted because field 'limit' is of const-qualified type 'const char *const'
    const char* const limit;
                      ^
1 warning generated.

And that's valid; the const data member prevents the type from being movable.

Good catch. It's unfortunate that no test caught this, despite coverage staying at 100%...

The latest update resolves the warning for me on the 9.0.1 compiler.

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

