Kotlinx.serialization: Serializer to get the raw json value of a key?

Created on 9 Sep 2020 · 7 Comments · Source: Kotlin/kotlinx.serialization

What is your use-case and why do you need this feature?

I am looking for a way to deserialize a specific field in my json into a raw byte array representing the value json sub-document.

Basically my json documents have large/complex json sub-trees that I would like to avoid parsing to save cpu/allocations. But I still need the value so I can re-create the original json if needed.

In golang, for example, this can be achieved with json.RawMessage: https://golang.org/pkg/encoding/json/#RawMessage

In gson, a type adapter can be used to regenerate json during parsing. This is not particularly cpu/gc efficient, but it works: https://github.com/google/gson/issues/1368

In moshi, there is work being done to be able to skip over the value and consume it into a raw value field: https://github.com/square/moshi/issues/675

Describe the solution you'd like

I'm not familiar enough with the kotlinx.serialization APIs to know if there is already a way to do this, or if it can be implemented within a custom serializer. Any pointers would be appreciated!

feature json

All 7 comments

Please, check out Json Elements: https://github.com/Kotlin/kotlinx.serialization/blob/master/docs/json.md#json-elements
Does it do what you are looking for?

If I understood correctly, you want to save some part of JSON to a String (RawJson) property so it would be parsed later? We do not currently support this concept. JsonElement is an untyped representation that does not map onto classes, although it still performs parsing to check that your JSON is valid.
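To make the JsonElement suggestion concrete, here is a minimal sketch of pulling one sub-document out as an untyped element and turning it back into JSON text. The function name `payloadAsJson` and the key `payload` are made up for illustration. As noted above, this still parses the whole sub-tree and allocates objects for it, so it does not give the deferred-parsing benefit the issue asks for.

```kotlin
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.jsonObject

// Hypothetical helper: returns the value stored under "payload" as JSON text,
// or null if the key is absent. The sub-tree is fully parsed into JsonElement
// objects first; toString() then re-creates compact JSON from that tree.
fun payloadAsJson(text: String): String? {
    val root = Json.parseToJsonElement(text).jsonObject
    return root["payload"]?.toString()
}

fun main() {
    val text = """{"id":1,"payload":{"a":[1,2,3]}}"""
    println(payloadAsJson(text))
}
```

Note that the text produced this way is a re-serialization of the parsed tree, not the original bytes, so whitespace and number formatting from the input are not guaranteed to be preserved.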

> If I understood correctly, you want to save some part of JSON to a String (RawJson) property so it would be parsed later?

Yes, exactly. I want to defer parsing for parts of the document.

JsonElement is not a good fit because it still parses and creates objects. So the cpu/memory benefits of lazy parsing are lost.

How does kotlinx.serialization handle ignored unknown keys when deserializing into an object? Are the keys skipped during or after parsing? (I'm wondering what the cpu/allocation overhead is in cases where keys are ultimately ignored.)

The unknown keys are skipped without parsing (tokenizing only). However, the skipped string is not saved anywhere, so supporting such a feature requires some additional work.
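The idea of skipping a value by tokenizing only, while also keeping the skipped text, can be sketched in plain Kotlin with no library at all. This is not how kotlinx.serialization is implemented; the names `skipJsonValue` and `rawValueOf` are hypothetical, the key lookup is a naive substring search, and the input is assumed to be compact JSON with no whitespace after the colon. The skipper only tracks string boundaries and bracket depth, so it does not validate the value it captures.

```kotlin
// Returns the index just past the string literal whose opening quote is at openQuote.
private fun skipString(json: String, openQuote: Int): Int {
    var i = openQuote + 1
    while (i < json.length && json[i] != '"') {
        // A backslash escapes the next character, so step over both.
        i += if (json[i] == '\\') 2 else 1
    }
    return minOf(i + 1, json.length)
}

// Returns the index just past the JSON value that starts at [start].
// No objects are created; only quotes and bracket depth are tracked.
fun skipJsonValue(json: String, start: Int): Int {
    var i = start
    var depth = 0
    while (i < json.length) {
        val c = json[i]
        if (c == '"') {
            i = skipString(json, i)
            if (depth == 0) return i          // the value was a top-level string
        } else if (c == '{' || c == '[') {
            depth++
            i++
        } else if (c == '}' || c == ']') {
            if (depth == 0) return i          // enclosing container ended a scalar
            depth--
            i++
            if (depth == 0) return i          // the value's own container closed
        } else if (depth == 0 && (c == ',' || c.isWhitespace())) {
            return i                          // a scalar ends at a separator
        } else {
            i++
        }
    }
    return i
}

// Naive lookup: finds the first occurrence of "key": and captures its raw value.
fun rawValueOf(json: String, key: String): String? {
    val marker = "\"$key\":"
    val at = json.indexOf(marker)
    if (at < 0) return null
    val start = at + marker.length
    return json.substring(start, skipJsonValue(json, start))
}

fun main() {
    val json = """{"id":7,"payload":{"text":"a } \" b","items":[1,{"x":null}]},"tail":true}"""
    println(rawValueOf(json, "payload"))
}
```

The only allocation per captured value is the final substring, which is the property the issue is after. A real implementation would also have to decide how much validation to do while skipping.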

The feature seems like a reasonable addition, though it still has some open questions.

> Are the keys skipped during or after parsing?

Could you please elaborate on your use-case here? Because "put all unknown keys in a separate String property with valid JSON string" and "Treat specifically marked property not as simple String, but as a valid JSON encoded in String" are completely different approaches.

> JsonElement is not a good fit because it still parses and creates objects. So the cpu/memory benefits of lazy parsing are lost.

I wonder if there are benchmarks (or maybe you have a relevant story to add?) showing that the performance boost is significant here. Even without allocating JsonElement objects, the parser still has to 1) parse the JSON and extract the relevant sub-object and 2) ensure that the whole sub-object is valid JSON. The second part is probably the slowest in the whole JSON decoding process, so I'm really interested in knowing how big the performance improvement is.

> "Treat specifically marked property not as simple String, but as a valid JSON encoded in String"

This is what I'm interested in and what is implemented by the other examples I provided.

I have one specific key in my json that contains a huge json subtree, with thousands of objects and several layers of nesting. I don't want to create these thousands of objects per json parse because it leads to severe GC pressure on many Android devices.

Thanks for the clarification and your input!

It's not something we are going to do right now (at least until 1.1.0 version), but thanks to your feedback, I've left the possibility to add this functionality in a backwards-compatible way both for custom serializers and regular JSON usages.
Let's see how it goes in Moshi and the demand on that.

Design idea: instead of using a @RawJson annotation, introduce an inline class RawString(value: String) with its own custom serializer, to provide better type safety and emphasize the user's intention in the type.
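A rough sketch of what that design idea could look like from the user's side, written as a user-level custom serializer rather than anything built into the library. `RawString`, `RawStringSerializer`, and `Envelope` are hypothetical names, and `value class` is used here as the current spelling of `inline class`. This version goes through JsonElement in both directions, so it only illustrates the type-safe API shape; it does not avoid parsing or allocation, which would need support inside the JSON decoder itself.

```kotlin
import kotlinx.serialization.KSerializer
import kotlinx.serialization.Serializable
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.encoding.Decoder
import kotlinx.serialization.encoding.Encoder
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.JsonDecoder
import kotlinx.serialization.json.JsonElement
import kotlinx.serialization.json.JsonEncoder

// Hypothetical wrapper type: the String is expected to hold valid JSON text.
@Serializable(with = RawStringSerializer::class)
@JvmInline
value class RawString(val value: String)

// Sketch only: delegates to JsonElement, so the sub-tree IS parsed here.
object RawStringSerializer : KSerializer<RawString> {
    override val descriptor: SerialDescriptor = JsonElement.serializer().descriptor

    override fun deserialize(decoder: Decoder): RawString {
        val jsonDecoder = decoder as? JsonDecoder
            ?: error("RawString can only be decoded from JSON")
        // Read the value as a tree, then store its compact JSON text.
        return RawString(jsonDecoder.decodeJsonElement().toString())
    }

    override fun serialize(encoder: Encoder, value: RawString) {
        val jsonEncoder = encoder as? JsonEncoder
            ?: error("RawString can only be encoded to JSON")
        // Re-parse the stored text so it is emitted as JSON, not as a quoted string.
        jsonEncoder.encodeJsonElement(Json.parseToJsonElement(value.value))
    }
}

@Serializable
data class Envelope(val id: Int, val payload: RawString)

fun main() {
    val text = """{"id":1,"payload":{"a":[1,2,3]}}"""
    val envelope = Json.decodeFromString(Envelope.serializer(), text)
    println(envelope.payload.value)
    println(Json.encodeToString(Envelope.serializer(), envelope))
}
```

Compared with an annotation on a plain String property, the wrapper type makes it visible at every use site that the value is JSON text rather than an arbitrary string.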
