Orleans: Orleans Grains versioning

Created on 27 Jan 2017  ·  9 Comments  ·  Source: dotnet/orleans


Currently we assume that a given Grain implementation is the same across the whole cluster.

Why

This is an issue when deploying a new version of a Grain with some method added or modified (signature or body). #2379 removed the assumption that all silos support the same set of types, but we need to go further and enable a scenario where multiple grain versions for a given TypeCode can coexist in the cluster.

Out of scope (for now)

  • Grain state versioning
  • We still assume that Orleans binaries are compatible across versions

Design

So far, we have explored two different scenarios. They are not incompatible, and we might want to implement both in the long term.

1. Versioning by adding interfaces

In this option, we consider the grain interface (contract) as immutable (as it is right now). You cannot add or remove a method from an interface that is already implemented and deployed. To add a new method, you must introduce a new interface.

// In “V1” and “V2” silos
public interface IMyGrain : IGrainWithIntegerKey
{
    Task Method1();
}

// In “V1” only
public class MyGrain : Grain, IMyGrain
{
    public Task Method1() { … }
}

// In “V2” only
public interface IMyGrainV2 : IMyGrain
{
    Task Method2();
}

// In “V2” only
public class MyGrain : Grain, IMyGrainV2
{
    public Task Method1() { … }
    public Task Method2() { … }
}

Pros:

  • “Natural” solution for service versioning

Cons:

  • After n versions, you have n different interfaces for the same grain in your codebase
  • Not possible to build/use “upgrade rules” in/from Orleans

2. Adding version attribute on interfaces

In this option, interfaces are not immutable anymore: we add an attribute to each interface to specify its version when it changes:

// In “V1”
[Version(1)]
public interface IMyGrain : IGrainWithIntegerKey
{
    Task Method1();
}

// In “V2”
[Version(2)]
public interface IMyGrain : IGrainWithIntegerKey
{
    Task Method1();
    Task Method2();
}

Pros:

  • “Lighter” than the previous option
  • We can have a default version number for all grain interfaces (bumped on every new deployment to avoid compatibility issues), while leaving developers the option to override it
  • We could easily expose automatic upgrade/downgrade rules in Orleans (mix V1 and V2 for new placements for A/B testing, authorize direct calls from a V1 Grain to a V2 but not a V3, etc.)

Cons:

  • Developers must be careful when a grain implementation changes: they must decide whether the version number on the interface needs to be changed or not
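
To make the "upgrade rules" idea from option 2 concrete, here is a minimal sketch of how a version attribute could be read and checked. Everything in it is an assumption made for illustration: `VersionAttribute` is declared locally rather than taken from Orleans, `DefaultVersion` and `CanCall` are hypothetical names, and the "same version or exactly one newer" rule is just one possible policy.

```csharp
using System;

// Hypothetical attribute, declared here only for this sketch (not an Orleans API).
[AttributeUsage(AttributeTargets.Interface)]
public sealed class VersionAttribute : Attribute
{
    public VersionAttribute(int version) => Version = version;
    public int Version { get; }
}

[Version(2)]
public interface IMyGrain { }

// No attribute: falls back to the assumed default version.
public interface IUnversionedGrain { }

public static class VersionRules
{
    // Assumed default for interfaces that carry no [Version] attribute.
    public const int DefaultVersion = 0;

    // Reads the version declared on a grain interface via reflection.
    public static int GetVersion(Type grainInterface)
    {
        var attr = (VersionAttribute)Attribute.GetCustomAttribute(
            grainInterface, typeof(VersionAttribute));
        return attr?.Version ?? DefaultVersion;
    }

    // Example policy (an assumption): a caller may reach its own version
    // or the next one, so V1 -> V2 is allowed but V1 -> V3 is not.
    public static bool CanCall(int callerVersion, int targetVersion) =>
        targetVersion >= callerVersion && targetVersion - callerVersion <= 1;
}
```

With this policy, `VersionRules.CanCall(1, 2)` is allowed and `VersionRules.CanCall(1, 3)` is rejected, which mirrors the "V1 to V2 but not V3" example in the pros list.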


All 9 comments

@benjaminpetit I like option 2. I think we also need to specify the [TypeCodeOverride(val)], too, in order to make it work. We can come up with a better name for that attribute, since we should expect everyone to use it.

How does the MS-Bond serializer support versioning? It also uses C# interfaces to define the contract and must support versioning, so we might be able to learn from it.

I like option 2 also.

Slightly tangential, on why I prefer option 2: although grain state versioning is out of scope, the ADO.NET storage provider follows the same idea. It doesn't have an explicit version attribute, but there's a well-defined place for transforming data going into or out of the database (and a test transforming stored data from binary to XML/JSON, if I recall correctly). In the specific grain versioning context, it could mean a user-defined mapping rule from a tuple (grain instance, function, parameters) to some similar target tuple. Some of that can be automated. Technically, this might be a list of interceptors/proxies (taking all interceptors, with the versioning ones running first or last), with calls queued to ensure targets are available while an update is in progress.

I'm sure you already know a lot of what follows, @gabikliot, but I'm writing it here for everyone, in order to help frame this discussion.

Bond is similar to ProtoBufs: it has an IDL where message fields have explicit, user-defined IDs. The C# attribute-based description is a mirror of that IDL in C# (for convenience). Protobuf-net (the one from the StackOverflow folks) does the same thing as Bond C#.

Neither of those serialization formats has the notion of a 'version' for a message. On the wire, each is represented as a collection of key-value pairs where the key is the field id. Fields which are not understood are kept on the object as extension data (see the Failsafe Exception Serialization PR where we do the same thing: #2633). This allows for safe round-tripping. In the .NET world, the IExtensibleDataObject interface exists for this reason.
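
The round-tripping described above can be sketched as follows. This is a simplified illustration, not Bond's or protobuf-net's actual API: the wire format is modelled as a plain dictionary keyed by field id, and `PersonV1`, `Read`, and `Write` are hypothetical names.

```csharp
using System.Collections.Generic;

// A "V1" view of a message: it understands field id 1 (Name) only.
public sealed class PersonV1
{
    public string Name { get; set; } = "";

    // Fields this version does not understand, kept so they survive a round trip.
    public Dictionary<int, object> ExtensionData { get; } = new Dictionary<int, object>();

    // Reads a message modelled as field id -> value pairs.
    public static PersonV1 Read(IReadOnlyDictionary<int, object> wire)
    {
        var result = new PersonV1();
        foreach (var field in wire)
        {
            if (field.Key == 1)
                result.Name = (string)field.Value;
            else
                result.ExtensionData[field.Key] = field.Value; // unknown field: preserve it
        }
        return result;
    }

    // Writes the known field plus every preserved unknown field.
    public Dictionary<int, object> Write()
    {
        return new Dictionary<int, object>(ExtensionData) { [1] = Name };
    }
}
```

A V1 node that reads a message containing a newer field 2, changes `Name`, and writes it back still emits field 2 untouched, so a V2 reader downstream loses nothing.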

Backwards & forwards compatibility is ensured by having the user follow some basic rules, eg:

ProtoBuf & Bond both have counterparts for defining RPCs: gRPC and Bond Comm respectively.

Neither gRPC nor Bond Comm seems to have any explicit forwards/backwards compatibility support on the service interfaces: https://microsoft.github.io/bond/manual/compiler.html#service-definition
Bond Comm:

service Calculator
{
    Result Calculate(Operation);
    void Configure(Settings);
    Stats GetStats();
}

gRPC:

// The greeting service definition.
service Greeter {
  // Sends a greeting
  rpc SayHello (HelloRequest) returns (HelloReply) {}
}

I suspect (but don't know) that method overloading is not supported by gRPC and they seem to specify the method name in the request: http://www.grpc.io/docs/guides/wire.html#example. This differs from our handling, where a message consists of (among other things):

  • GrainId Target
  • int InterfaceId
  • int MethodId
  • object[] Arguments

Currently, we compute MethodId by hashing the method name and parameter type names. Maybe we should offer a way to override that id. We can sub-class the [TypeCodeOverride(X)] attribute so it has a better name like [Id(X)] and use that. Otherwise, superficial changes, such as renaming a method or a parameter type, will break interop.
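
A rough sketch of that idea follows. It is not the Orleans implementation: `IdAttribute` and `MethodIds.Compute` are hypothetical names declared only for this example, and FNV-1a stands in for whatever hash is actually used. The point is only that an explicit id takes precedence over the signature-derived hash.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Threading.Tasks;

// Hypothetical attribute for this sketch; not an existing Orleans API.
[AttributeUsage(AttributeTargets.Method)]
public sealed class IdAttribute : Attribute
{
    public IdAttribute(int id) => Id = id;
    public int Id { get; }
}

public static class MethodIds
{
    public static int Compute(MethodInfo method)
    {
        // An explicit id wins, so renaming the method no longer changes its id.
        var explicitId = method.GetCustomAttribute<IdAttribute>();
        if (explicitId != null) return explicitId.Id;

        // Fallback: hash "Name(ParamType1,ParamType2)".
        // FNV-1a is a stand-in here, chosen because it is stable across processes.
        var signature = method.Name + "(" +
            string.Join(",", method.GetParameters().Select(p => p.ParameterType.Name)) + ")";
        unchecked
        {
            uint hash = 2166136261;
            foreach (char c in signature)
            {
                hash ^= c;
                hash *= 16777619;
            }
            return (int)hash;
        }
    }
}

// Two hypothetical revisions of the same contract: the method was renamed,
// but the explicit id keeps both revisions wire-compatible.
public interface IBefore { [Id(7)] Task Method1(int x); Task Other(); }
public interface IAfter { [Id(7)] Task RenamedMethod1(int x); Task Other(); }
```

Here `IBefore.Method1` and `IAfter.RenamedMethod1` both resolve to id 7, while the unannotated `Other` method still depends on its name and parameter types.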

So if we can override InterfaceId & MethodId, then superficial signature changes are no longer an issue. We would still have issues with users adding or removing arguments or changing argument types.

In order to address that, we have at least two options:

  1. Document that users are not allowed to do that; it's not supported.
  2. Add some [Version(X)] attribute as suggested by @benjaminpetit above, allowing users to say "this consumer requires implementations of ISomeGrain to have a version >= X, so select or place an activation with that version (deactivating any old version) and throw if it cannot be done yet"
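
The second option can be sketched as a small selection routine. This is only an illustration under stated assumptions: `VersionedPlacement.SelectVersion` is a hypothetical helper, the list of available versions is assumed to come from the cluster somehow, and preferring the lowest satisfying version is an arbitrary policy choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VersionedPlacement
{
    // availableVersions: versions of the implementation currently offered by silos
    // (hypothetical input; how it is gathered is outside this sketch).
    // requiredMinimum: the version the consumer was compiled against.
    public static int SelectVersion(IEnumerable<int> availableVersions, int requiredMinimum)
    {
        var candidates = availableVersions.Where(v => v >= requiredMinimum).ToList();
        if (candidates.Count == 0)
        {
            // "Throw if it cannot be done yet": no new enough implementation is deployed.
            throw new InvalidOperationException(
                $"No implementation with version >= {requiredMinimum} is available yet.");
        }

        // Assumed policy: pick the lowest version that satisfies the caller.
        return candidates.Min();
    }
}
```

For example, with versions 1, 2, and 3 deployed, a consumer requiring version 2 is routed to version 2, while a consumer requiring version 4 gets an exception until a newer silo joins.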

All of this solves only a part of the problem, ignoring that messages themselves are also subject to change. The solution for the latter is either:

  • We officially push Bond/ProtoBufs as the serialization format of choice
  • We invest in supporting tagged serialization, similar to Bond & ProtoBuf.

    • Type 'name' serialization would also have to be re-addressed so that renaming a type doesn't break things.

    • Objects would need to include some field similar to IExtensibleDataObject mentioned earlier to support round-tripping. Because we do not generate the classes from an IDL, we can't do it ourselves without jumping through hoops.

I'm not advocating either of those avenues more than the other - and we can discuss it on another issue.

Our aim should be to give users confidence that as long as they follow a set of official, well-defined rules, then they can feel safe making changes to interfaces.

Just dropping my 2 cents.

I like option 2 as well.

@ReubenBond gRPC doesn't support overloads... At least not the last time I tried.

I think this TypeCode override is something we discussed offline in the past. We got to the point where, for either an interface type or state, we would have just an attribute, as protobuf does, in which we define a string or an int as the type/member name. And yes, we should follow the rule of never changing the attribute's type name, so that we keep compatibility.

I like option 1 for several reasons:

  1. The interface has until now served as the sole way that a silo and client agree on a contract. What is it about versioning that requires some additional construct when a naming convention seems to achieve the same result?

  2. Option 1 uses a very familiar C# construct, while option 2 uses an Orleans-specific one. One thing that makes Orleans powerful is that the mental model of objects and interfaces in a standard C# program carries over very well. Option 2 seems to diverge from that.

  3. For option 2, how do folks imagine a client would specify which version of the interface is being targeted? It seems to necessitate a client-side change as well as a server-side one. Somehow the old client needs to mention that it is targeting v1, and the new client needs to specify v2. How do non-.NET clients, such as Node.js, handle versioning?

  4. I would consider the possibility of having up to n versions of an interface a pro, rather than a con. At some point, developers will probably need to preserve previous versions of an interface within a codebase. Option 1 makes this trivial, while option 2 makes this basically impossible.

Option 1, plus issue #2379, basically gives Orleans interface-based service discovery, which I think is a great direction to move in.

That ended up being more than 2 cents, but there it is :)

Thanks for the input, @TrexinanF14!

For option 2 how do folks imagine a client would specify which version of the interface is being targeted?

Whichever version the client is linked to is the one which is targeted. If the client is linked to the assembly with [Version(2)], then it's version 2.

I'm not vehemently opposed to Option 1, I just think it will get messy.

Yeah, I later realized that, thanks for clarifying. Also I realized your [TypeCodeOverride(X)] would allow for two versions of an interface to co-exist, so ignore the last part of 4.

Resolved via #2837 and #2913.
