Kotlinx.serialization: Field based redaction

Created on 26 Feb 2019 · 9 Comments · Source: Kotlin/kotlinx.serialization

It would be really nice to have a way to apply selective redaction (a.k.a. hidden / secret values) to fields of classes marked @Serializable. I realize that you can probably write your own custom serializers, but this seems like functionality that should be available "out-of-the-box".

Coming from the Java world, there are certain libraries that do this already (see the example in Java Immutables), so it led me to request this feature here.

feature

dotCipher

All 9 comments

What is your use-case for that with respect to serialization? You can always mark hidden / secret values with @Transient if you do not want them to leak to the outside world. What is the use of serializing them as *** if you cannot deserialize them later?

elizarov on 27 Feb 2019

Fair point about the deserialization step. I guess I was mainly referring to how the value would be displayed to the outside world for debugging purposes (i.e. the use case of just calling toString()).

Framing this FR as a question, is there a way to have "out-of-the-box" behavior for treating a "secret" field of a @Serializable object to be hidden from the outside world via one code path, while also allowing the object to serialize the said field unabridged via another distinct code path?

I realize that conceptually this might mean implementing your own version of toString() on a @Serializable object without marking the field @Transient, but that means hand-writing it for each field on each object instead of using an annotation-based approach.

I realize @Transient might be a fair fit for the solution, but to my understanding it completely removes the field from all serialization. So if I wanted to debug a class that was @Serializable and had a "secret" field marked @Transient, how would I be able to tell whether that field held a value when I serialized it?

dotCipher on 27 Feb 2019
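
For reference, the hand-written toString() approach described in the comment above could look roughly like the following. This is a minimal sketch in plain Kotlin with no serialization involved; `LoginRequest` and `mask` are hypothetical names invented for illustration and do not come from the thread or from the library.

```kotlin
// Hypothetical request type; the name and fields are illustrative only.
data class LoginRequest(val user: String, val password: String) {
    // Hand-written masking: every sensitive field has to be handled explicitly,
    // which is the per-class boilerplate the comment above wants to avoid.
    override fun toString(): String =
        "LoginRequest(user=$user, password=${mask(password)})"
}

// Hypothetical helper: shows whether a value is present without revealing it.
fun mask(value: String?): String =
    if (value.isNullOrEmpty()) "<empty>" else "***"

fun main() {
    println(LoginRequest("alice", "hunter2")) // prints LoginRequest(user=alice, password=***)
}
```

The drawback is visible even in this small case: adding a new sensitive field means remembering to update toString() by hand in every affected class.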

I'm lost. What's the relation between serialization and toString() on the object?

elizarov on 27 Feb 2019

Apologies, I can be terrible at explaining things sometimes. ☹️

I merely was referring to using toString() as a mechanism to achieve debug-ability as a separate code-path from an existing object using @Serializable.

To my understanding, @Transient currently doesn't serialize the underlying field for an object marked @Serializable, right? If I had an object that contained a sensitive field that I did, however, want to serialize via the default code path of something like Json.stringify(Data.serializer(), Data(42)) (from your README), but also wanted a separate code path to present a mutation or mask of the field (proving that it has a value, but hiding said value), then @Transient wouldn't be able to accomplish this, right?

If that's true, then the only other implementation path that I could think of (at least in terms of a design pattern for the @Serializable object) would be to implement a toString() with custom logic that applies the mask to that field of the object. Ideally, a native feature of kotlinx.serialization would provide something like that out-of-the-box, so that I wouldn't need any homespun logic in toString(), for example.

Effectively, this might look like using the same API with a slightly different call, like:

Json.stringify(Data.serializer(redaction = true), Data(42))

for a class defined like:

@Serializable
data class Data(@Redacted val a: Int, @Optional val b: String = "42")

This might not be in scope for the kotlinx.serialization library, so apologies if I am shooting in the dark here. The use case for me personally is to leverage the @Serializable mechanisms presented in this library, but hide fields through a single API when doing things like request call logging, where requests are mapped via @Serializable objects containing sensitive information like passwords, API keys, etc. So I would need both a code path that serializes the full object (with the sensitive info) and one that produces a redacted version of it (without the sensitive info).

Again, apologies if I am being naive here too and missing some obvious functionality that exists somewhere.

dotCipher on 27 Feb 2019

If I had an object that contained a sensitive field that I did, however, want to serialize via the default code path of something like Json.stringify(Data.serializer(), Data(42)) (from your README), but also wanted a separate code path to present a mutation or mask of the field (proving that it has a value, but hiding said value), then @Transient wouldn't be able to accomplish this, right?

Right. But what's the use-case? Why would you need to do that? And how is that use-case related to _serialization_? Why would you need masking in _serialization_? What is the use for it?

elizarov on 28 Feb 2019

My use case was that I wanted to serialize an object using the mechanisms presented in kotlinx.serialization in a 'masked field state' purely for debugging purposes in a distributed system.

To provide a concrete example of the use case, assume there are two microservices using the same serialization code for an object, both for their inter-service communication and for request logging. Suppose an administrator of one of the microservices wanted to know whether a request was received, and its content, but the object contained sensitive data. To my knowledge, it would be impossible to have the same serialization code that was used for the original request (in the unabridged form, all values serialized) also write the object to the logs with the sensitive data masked (so that it stays hidden from the administrator).

If this is out of scope for serialization, and there is something I am missing here, please let me know.

dotCipher on 28 Feb 2019

For this use-case you can write your own encoder that knows which fields should be masked (based either on an annotation or simply on the field name) and masks them appropriately.

elizarov on 28 Feb 2019
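
The encoder approach suggested in the comment above could be sketched roughly as follows. This is not code from the thread or from the library: it assumes the later 1.x API of kotlinx.serialization (`AbstractEncoder`, `@SerialInfo`, `EmptySerializersModule()`), which postdates this discussion, and it needs the serialization compiler plugin. `Redacted`, `RedactingEncoder`, and `encodeRedacted` are hypothetical names, and the output is a debug string rather than JSON. Only leaf values (primitives and strings) are masked; a nested object in a redacted field would need extra handling.

```kotlin
@file:OptIn(ExperimentalSerializationApi::class)

import kotlinx.serialization.*
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.encoding.AbstractEncoder
import kotlinx.serialization.encoding.CompositeEncoder
import kotlinx.serialization.modules.EmptySerializersModule
import kotlinx.serialization.modules.SerializersModule

// Hypothetical marker annotation. @SerialInfo makes the plugin record it
// in the generated SerialDescriptor so an encoder can see it.
@SerialInfo
@Target(AnnotationTarget.PROPERTY)
annotation class Redacted

// Hypothetical debug-only encoder that writes "name=value" pairs
// and replaces the values of @Redacted properties with "***".
class RedactingEncoder(private val out: StringBuilder) : AbstractEncoder() {
    // Assumes a library version where EmptySerializersModule() is a function.
    override val serializersModule: SerializersModule = EmptySerializersModule()

    private var maskNext = false

    override fun beginStructure(descriptor: SerialDescriptor): CompositeEncoder {
        out.append(descriptor.serialName).append('(')
        return this
    }

    override fun endStructure(descriptor: SerialDescriptor) {
        out.append(')')
    }

    // Called before each element: write its name and remember whether to mask it.
    override fun encodeElement(descriptor: SerialDescriptor, index: Int): Boolean {
        if (index > 0) out.append(", ")
        out.append(descriptor.getElementName(index)).append('=')
        maskNext = descriptor.getElementAnnotations(index).any { it is Redacted }
        return true
    }

    // All primitive and string values end up here in AbstractEncoder.
    override fun encodeValue(value: Any) {
        out.append(if (maskNext) "***" else value.toString())
        maskNext = false
    }

    // Nulls are shown as-is so the log still reveals that no value was set.
    override fun encodeNull() {
        out.append("null")
        maskNext = false
    }
}

// Hypothetical helper mirroring the shape of a format's encode call.
fun <T> encodeRedacted(serializer: SerializationStrategy<T>, value: T): String {
    val out = StringBuilder()
    RedactingEncoder(out).encodeSerializableValue(serializer, value)
    return out.toString()
}

@Serializable
data class Data(@Redacted val a: Int, val b: String = "42")

fun main() {
    // The regular format still serializes every field; this path is for logs only.
    println(encodeRedacted(Data.serializer(), Data(42)))
}
```

With this shape the class is annotated once, the normal format keeps producing the unabridged payload, and the redacting encoder gives the second, log-safe code path asked for earlier in the thread.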
