Runtime: Consider allowing Exception instances to be serialized without requiring BinaryFormatter

Created on 12 Sep 2020 · 15 Comments · Source: dotnet/runtime

Some remoting technologies use BinaryFormatter to serialize Exception instances across security boundaries, which puts them out of SDL compliance and potentially exposes consumers to security vulnerabilities. The current recommended way to serialize exception information safely is to call ToString on the exception instance, then transmit the resulting string across the wire. However, this does not create useful object models for consuming applications, as they can't interact with a simple string like they can a rich exception instance (accessing properties, using try / catch, etc.).

In an ideal world, an exception serialization tech would have the following characteristics:

The proper Exception-derived type will be instantiated after deserialization.
The human-readable message and stack trace are preserved.
The inner exceptions are preserved.
Vital information about the exception is preserved. This is normally information passed to the Exception ctor and exposed via properties, such as the argument name provided to an ArgumentOutOfRangeException.

To maintain SDL compliance and work with our linker technology, we'd need to enforce a few extra behaviors:

The payload cannot be the sole arbiter of type information. The deserializer needs to provide an allow list of legal Exception-derived types, and the payload cannot attempt to instantiate types outside that allow list.
The deserializer tech must go through normal instance validation and type safety checks, normally performed by the exception's ctor. This disallows using the existing SerializationInfo / StreamingContext infrastructure.
Cyclic or deeply-nested object graphs must not be constructed. This also disallows using the existing SerializationInfo / StreamingContext infrastructure.

It's possible that the deserialization tech would need to include special-case handling of each allowed Exception-derived type in order to fulfill these requirements. Perhaps this could be simplified by understanding canonical patterns like .ctor(string message, Exception innerException). But we'll cross that bridge when we come to it.
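
The requirements above (an allow list owned by the deserializer, construction through ordinary constructors, and a bounded object graph) might be sketched as follows. This is only an illustration under stated assumptions: ExceptionPayload, AllowListExceptionDeserializer, and the MaxDepth value are hypothetical names invented here, not an existing or proposed .NET API.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical wire shape: a flat description of one exception and its inner exception.
public sealed record ExceptionPayload(
    string TypeName,
    string Message,
    string? ParamName = null,
    ExceptionPayload? Inner = null);

public sealed class AllowListExceptionDeserializer
{
    // Arbitrary cap (an assumption) on inner-exception nesting.
    private const int MaxDepth = 8;

    // The deserializer, not the payload, decides which types may be created.
    // Each entry calls a public constructor, so the type's normal validation runs.
    private readonly Dictionary<string, Func<ExceptionPayload, Exception?, Exception>> _factories =
        new(StringComparer.Ordinal)
        {
            ["System.InvalidOperationException"] =
                (p, inner) => new InvalidOperationException(p.Message, inner),
            ["System.ArgumentOutOfRangeException"] =
                (p, inner) => inner is null
                    ? new ArgumentOutOfRangeException(p.ParamName, p.Message)
                    : new ArgumentOutOfRangeException(p.Message, inner),
        };

    public Exception Deserialize(ExceptionPayload payload) => Deserialize(payload, 0);

    private Exception Deserialize(ExceptionPayload payload, int depth)
    {
        if (depth >= MaxDepth)
            throw new InvalidOperationException("Exception payload is nested too deeply.");

        Exception? inner = payload.Inner is null ? null : Deserialize(payload.Inner, depth + 1);

        // A type name outside the allow list is never resolved or instantiated;
        // this sketch falls back to the base Exception type instead.
        return _factories.TryGetValue(payload.TypeName, out var create)
            ? create(payload, inner)
            : new Exception(payload.Message, inner);
    }
}
```

The per-type factory entries are the "special-case handling" mentioned above; a real design could presumably generate them for types that follow the canonical (string message, Exception innerException) constructor pattern.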

area-System.Runtime untriaged

GrabYourPitchforks

👍2

All 15 comments

For those who don't know what some TLAs mean, like me, I think SDL refers to Microsoft Security Development Lifecycle.

svick on 12 Sep 2020

👍3

The proper Exception-derived type will be instantiated after deserialization.

What would happen in the case where the deserializing assembly/service/app/whatever does not know about or have access to the appropriate Exception subclass?

yaakov-h on 12 Sep 2020

What would happen in the case where the deserializing assembly/service/app/whatever does not know about or have access to the appropriate Exception subclass?

The case of "not having access" shouldn't be possible with this design. The deserializer needs to have ahead-of-time knowledge about all possible Exception classes that might be instantiated. (This characteristic is true of any secure polymorphic deserialization technology; it's not unique to Exception and derived types.)

This means that deserialization scenarios fall into only two buckets: (1) the deserializer is aware of the requested Exception-derived type and knows how to instantiate it; or (2) the deserializer is not aware of the requested Exception-derived type. There's no middle ground where the deserializer says "well, I found a Type that corresponds to what the payload is requesting, but I don't know how to instantiate it."

In the case of the second bucket above, I imagine the deserializer would instantiate a placeholder Exception-derived type and include the original error message and the original stack trace.

GrabYourPitchforks on 12 Sep 2020

👍1
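
The placeholder type described in the comment above could look roughly like the sketch below. UnknownRemoteException and its members are hypothetical names chosen for illustration; nothing in the thread specifies the actual shape.

```csharp
#nullable enable
using System;

// Hypothetical placeholder for a payload that names a type the deserializer
// does not know. The original details are carried as plain strings.
public sealed class UnknownRemoteException : Exception
{
    private readonly string? _remoteStackTrace;

    public UnknownRemoteException(string originalTypeName, string message, string? remoteStackTrace)
        : base(message)
    {
        OriginalTypeName = originalTypeName;
        _remoteStackTrace = remoteStackTrace;
    }

    // The type name requested by the payload; kept as text and never loaded.
    public string OriginalTypeName { get; }

    // Report the sender's stack trace when one was supplied.
    public override string? StackTrace => _remoteStackTrace ?? base.StackTrace;
}
```

Callers could still catch this type and inspect OriginalTypeName, though they lose the strongly-typed catch that a known exception type would allow.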

It would be nice if the new mechanism exposed something like .NET Framework's Exception.PrepForRemoting method. That way exceptions could be rethrown while preserving the server-side stack trace.

AustinWise on 13 Sep 2020

That still relies on the Remoting infrastructure doing the serialization, and has been somewhat generalized with ExceptionDispatchInfo since .NET 4.5.

Perhaps being able to serialize an ExceptionDispatchInfo would solve this problem?

yaakov-h on 14 Sep 2020

👍1

@yaakov-h That's a good point. EDI gets you most of the way there. All that PrepForRemoting adds is the extra markers in the stack trace that show a remoting boundary was crossed.

AustinWise on 14 Sep 2020
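
For readers unfamiliar with ExceptionDispatchInfo, the following minimal sketch shows the in-process behavior the two comments above refer to: capturing an exception and rethrowing it later without losing the original stack trace. RethrowDemo and its method names are made up for this example, and it does not cross any serialization boundary.

```csharp
using System;
using System.Runtime.ExceptionServices;

public static class RethrowDemo
{
    // Captures an exception where it failed so that it can be rethrown later.
    public static ExceptionDispatchInfo CaptureFailure()
    {
        try
        {
            throw new InvalidOperationException("Simulated failure.");
        }
        catch (Exception ex)
        {
            return ExceptionDispatchInfo.Capture(ex);
        }
    }

    // Rethrows the captured exception and returns the stack trace seen by the catcher.
    public static string RethrowAndDescribe(ExceptionDispatchInfo info)
    {
        try
        {
            info.Throw();        // rethrows the same instance, keeping the captured trace
            return string.Empty; // not reached
        }
        catch (InvalidOperationException ex)
        {
            return ex.StackTrace ?? string.Empty;
        }
    }
}
```

The trace returned by RethrowAndDescribe still includes the CaptureFailure frame, which is the property a serializable equivalent would need to keep across the wire.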

Speaking as a third-party serializer author, my main issue with the existing design is that the Exception(SerializationInfo, StreamingContext) constructor provides the only viable mechanism for unmarshalling an exception with all of its runtime metadata intact (StackTrace, Data, etc.). Consequently, any serializer that needs to implement faithful exception deserialization also needs to support ISerializable. I would personally appreciate a serializer-independent mechanism that would let users augment a newly created exception instance with all necessary metadata, something like an ExceptionDispatchInfoBuilder.

The deserializer tech must go through normal instance validation and type safety checks, normally performed by the exception's ctor. This disallows using the existing SerializationInfo / StreamingContext infrastructure.

Assuming the highly experimental abstract static interface methods ever get adopted, one could conceive of a type-safe and linker-friendly successor to SerializationInfo/StreamingContext constructors. This is largely how languages with type classes implement type-safe serialization.

eiriktsarpalis on 14 Sep 2020

A 100% faithful reconstruction of the original Exception instance is not necessarily required. For example, perhaps it's good enough for our serializer to say that it doesn't deal with serializing the Data dictionary. (If we were to try to handle that, we'd become a serializer for arbitrary data, and the .NET ecosystem does not need yet another arbitrary data serializer.) This also means that we'd lose Watson bucket info. But we'd definitely want to keep useful information like the human-readable stack trace around. Restoring that data might involve the creation of new API surface within the runtime.

GrabYourPitchforks on 14 Sep 2020

👍1

Fully agree that an actual exception is not required. For years we have used a simple, stupid projection into a "Data" object (the factory method is here: https://github.com/Invenietis/CK-Core/blob/master/CK.Core/CKExceptionData.cs#L143). This data is "as serializable as possible", and it's only data.
And the information captured has appeared to be "good enough" for (as I said) numerous years.

olivier-spinelli on 15 Sep 2020
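
A projection of the kind described in the comment above can be sketched in a few lines. The ExceptionData record below is a hypothetical illustration and is not the CK-Core type linked in that comment; it ignores the Data dictionary and AggregateException.InnerExceptions for brevity.

```csharp
#nullable enable
using System;

// Hypothetical plain-data projection of an exception: strings only, no live objects.
public sealed record ExceptionData(
    string TypeName,
    string Message,
    string? StackTrace,
    ExceptionData? Inner)
{
    // Copies the exception chain into data, stopping after maxDepth inner exceptions.
    public static ExceptionData From(Exception ex, int maxDepth = 8)
    {
        if (ex is null) throw new ArgumentNullException(nameof(ex));

        ExceptionData? inner = ex.InnerException is not null && maxDepth > 0
            ? From(ex.InnerException, maxDepth - 1)
            : null;

        return new ExceptionData(
            ex.GetType().FullName ?? ex.GetType().Name,
            ex.Message,
            ex.StackTrace,
            inner);
    }
}
```

Because the result contains only strings and nested records, any general-purpose serializer can transmit it without special exception support.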

A 100% faithful reconstruction of the original Exception instance is not necessarily required. For example, perhaps it's good enough for our serializer to say that it doesn't deal with serializing the Data dictionary. (If we were to try to handle that, we'd become a serializer for arbitrary data, and the .NET ecosystem does not need yet another arbitrary data serializer.)

Agreed, although that should be a design decision preferably made by the serializer implementation, rather than something forced by the underlying exception reconstruction mechanism.

eiriktsarpalis on 15 Sep 2020

I am surprised that the requirements in the description do not include a provision for dropping sensitive information from the exception, so that it is safe to send to remote parties where there is an Information Disclosure risk.

AArnott on 19 Sep 2020

@AArnott I don't think that's a viable requirement. In general it's a bad idea to send any object that contains privileged information through a serializer, as there's too high a risk that the sensitive information might be transmitted. This also runs the risk that for any given Exception object, sensitive information may or may not be disclosed depending on the exact Exception type, which isn't always under the dev's control. This makes the system very difficult to reason about from a security perspective.

Any threat model of this serializer would absolutely include information disclosure as a threat. But I suspect the answer to that will be "this serializer is not intended for use in environments where the recipient is not trustworthy."

GrabYourPitchforks on 21 Sep 2020

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

Dotnet-GitSync-Bot on 16 Oct 2020

Move to runtime, this is not a serialization issue.

HongGit on 23 Oct 2020

We just had a discussion about this in the context of https://github.com/dapr/dotnet-sdk/issues/414

Actor frameworks (or other RPC frameworks) tend to use exceptions of different types to signal different business concerns. There might be a different exception type for a transaction that's rejected for low balance or rejected because the item is out of stock, etc.

Ultimately application code wants to call some functionality on another server and then interact with the result in a strongly-typed way, likely with multiple catch blocks.

For this use case the majority of exceptions would be user-defined. Ideally a solution would support serializing additional user-defined properties as well (can be opt-in).
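
The calling pattern described above might look like the following sketch. All names here (InsufficientBalanceException, OutOfStockException, OrderClient, Shortfall) are hypothetical, and the remoteCall delegate merely stands in for a proxy call whose failure would be rehydrated as a typed exception.

```csharp
using System;

// Hypothetical business exceptions, as a user of an RPC or actor framework might define them.
public sealed class InsufficientBalanceException : Exception
{
    public InsufficientBalanceException(string message, decimal shortfall)
        : base(message)
    {
        Shortfall = shortfall;
    }

    // A user-defined property that an opt-in serializer would need to carry across.
    public decimal Shortfall { get; }
}

public sealed class OutOfStockException : Exception
{
    public OutOfStockException(string message) : base(message) { }
}

public static class OrderClient
{
    // Handles each business failure in its own strongly-typed catch block.
    public static string PlaceOrder(Action remoteCall)
    {
        try
        {
            remoteCall();
            return "accepted";
        }
        catch (InsufficientBalanceException ex)
        {
            return $"rejected: short by {ex.Shortfall}";
        }
        catch (OutOfStockException)
        {
            return "rejected: out of stock";
        }
    }
}
```

For this to work across a process boundary, the deserializer would have to recreate both the exception type and the Shortfall value, which is the opt-in support for user-defined properties requested above.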