Orleans: Suggestion: consistent hash based placement strategy for grains

Created on 2 Jun 2016 · 6 Comments · Source: dotnet/orleans

I suggest adding a new placement strategy to Orleans that lets a user specify that grains should be placed on silos according to a consistent hash, i.e. deterministic placement that can be calculated from the grain's primary key, with an option to specify whether grains should be deactivated when the consistent ring changes and the current silo should no longer handle them.

The idea came up after talking with @sergeybykov about a way to use Orleans reminders as a framework for distributing scheduled jobs. Reminders are actively balanced between silos based on a consistent hash, and they can keep their corresponding grains always active to execute their work, but there's no placement strategy that can guarantee that the reminded grains will be actively balanced between silos as well. Imagine there are 20 reminders and 4 silos at startup: each silo owns 5 reminders, and with the random, activation-count-based or prefer-local placement policy we get, at best, 5 grains per silo. Once a fifth silo joins, the reminders are re-balanced to 4 per silo, but the grains remain active on their current silos, so the new silo doesn't get its share of the work. A placement policy that can be configured to force deactivation for the sake of re-balancing would solve this issue elegantly.
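To make the proposal concrete, here is a minimal sketch of consistent-hash placement, written in Python for brevity. Nothing in it is Orleans code: `HashRing`, `stable_hash`, the silo names and the `job-N` keys are all hypothetical, and the use of virtual nodes is an assumption of the sketch. It shows that placement is a pure function of the grain key and the silo set, and that when a silo joins, the only grains whose owner changes are those now owned by the new silo.

```python
import bisect
import hashlib


def stable_hash(key: str) -> int:
    """Hash that is stable across processes (unlike the built-in hash())."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical, not Orleans code)."""

    def __init__(self, silos, virtual_nodes=64):
        # Each silo contributes several points on the ring to smooth the split.
        self._points = sorted(
            (stable_hash(f"{silo}#{i}"), silo)
            for silo in silos
            for i in range(virtual_nodes)
        )
        self._hashes = [h for h, _ in self._points]

    def owner(self, grain_key: str) -> str:
        """Return the silo owning the first ring point at or after the key's hash."""
        idx = bisect.bisect_left(self._hashes, stable_hash(grain_key))
        return self._points[idx % len(self._points)][1]  # wrap around the ring


def placement(ring, grain_keys):
    """Map every grain key to its owning silo."""
    return {key: ring.owner(key) for key in grain_keys}


grains = [f"job-{n}" for n in range(20)]
before = placement(HashRing(["silo-1", "silo-2", "silo-3", "silo-4"]), grains)
after = placement(HashRing(["silo-1", "silo-2", "silo-3", "silo-4", "silo-5"]), grains)

# Grains whose owner changed are the ones a "deactivate on ring change"
# option would deactivate so that they can reactivate on their new owner.
moved = [key for key in grains if before[key] != after[key]]
```

With a real hash the split is only approximately even, so the 5-per-silo and 4-per-silo figures above are the idealized case.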

P2 enhancement

All 6 comments

Makes sense to have that as an option.

The reason not to use it by default is of course that it forces a bunch of grains to be deactivated when a new silo joins, which may be destructive to the app.
To counter the imbalance problem we have 2 options:
1) The ActivationCountBased placement strategy will try to balance by number of activations. It does not move the existing ones, and only preferentially places new ones on the new silo, with great care not to overload it or get out of balance.
2) If we do want to migrate, it's better to build a mechanism to move them in the background, in a controlled and slow manner, and not in one big "BOOM - let's move all grains at once, since we need to respond quickly to silo addition". I am talking about a mechanism similar to the one described in our Optimization paper in EuroSys. But here it can be much simpler - no need (as a first step at least) to do any graph partitioning. Just gradually migrate grains to maintain balance.
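As an illustration of option 2, a gradual rebalancer can be as simple as the following Python sketch. It is not an existing Orleans mechanism: `rebalance_step`, the `cluster` dictionary and the one-move-per-tick budget are assumptions made up for the example. Each tick moves a bounded number of activations from the most loaded silo to the least loaded one and stops once the counts are within one of each other.

```python
def rebalance_step(activations, max_moves):
    """Move up to max_moves grains from the most- to the least-loaded silo.

    activations maps silo name -> list of grain keys and is mutated in place.
    Returns the list of (grain, source, target) moves performed this tick.
    """
    moves = []
    for _ in range(max_moves):
        source = max(activations, key=lambda s: len(activations[s]))
        target = min(activations, key=lambda s: len(activations[s]))
        if len(activations[source]) - len(activations[target]) <= 1:
            break  # already balanced to within one activation
        grain = activations[source].pop()
        activations[target].append(grain)
        moves.append((grain, source, target))
    return moves


# Four silos with 5 activations each, then an empty fifth silo joins.
cluster = {f"silo-{i}": [f"grain-{i}-{n}" for n in range(5)] for i in range(1, 5)}
cluster["silo-5"] = []

# One move per tick: slow and controlled rather than one big bang.
ticks = 0
while rebalance_step(cluster, max_moves=1):
    ticks += 1
```

In this run the cluster settles at 4 activations per silo after 4 single-move ticks; a real implementation would also have to decide which activation is cheapest to move, which the sketch ignores.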

That all said, having yet another option, like the consistent hash based that you described above, cannot do any harm. I can imagine in some scenarios someone may prefer its simplicity.

I see a natural progression of complexity here.

1) Hash-based initial placement, no migration.
2) Eager deactivation after a cluster state change of grains that fell out of the ownership range of a silo. Reactivation will still be spread over time, based on the incoming requests/reminders for the deactivated grains.
3) Lazy migration of activations to the 'right' silo in the background.

I see potential scenarios and value in all three options. So I don't think we have to necessarily shoot for 3) right away, especially because this wouldn't be a default or even recommended option.
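The three steps can be sketched roughly as follows, in Python. All names here (`make_owner_of`, `out_of_range`, `in_batches`) are hypothetical, and the CRC-modulo owner function is only a stand-in for whatever deterministic hash-to-silo mapping is used; it is not a consistent hash and not how Orleans computes ownership.

```python
import zlib


def make_owner_of(silos):
    """Build a deterministic key -> silo function (stand-in for a hash ring)."""
    ordered = sorted(silos)

    def owner_of(grain_key):
        return ordered[zlib.crc32(grain_key.encode("utf-8")) % len(ordered)]

    return owner_of


def out_of_range(local_silo, active_grains, owner_of):
    """Grains this silo hosts but no longer owns under the current membership."""
    return [g for g in active_grains if owner_of(g) != local_silo]


def in_batches(items, batch_size):
    """Split work into small batches so it can be spread over time."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


grains = [f"job-{n}" for n in range(12)]

# 1) Hash-based initial placement, no migration.
old_owner = make_owner_of(["silo-1", "silo-2"])
hosted = {s: [g for g in grains if old_owner(g) == s] for s in ("silo-1", "silo-2")}

# Cluster state change: a third silo joins.
new_owner = make_owner_of(["silo-1", "silo-2", "silo-3"])

# 2) Eager deactivation: each silo drops everything it no longer owns at once;
#    reactivation happens later, on the next request or reminder tick.
eager = {s: out_of_range(s, hosted[s], new_owner) for s in hosted}

# 3) Lazy migration: the same set of grains, moved a few at a time in the background.
lazy_plan = {s: list(in_batches(eager[s], batch_size=2)) for s in hosted}
```

The difference between 2) and 3) in this sketch is only pacing: both act on the same out-of-range set, but 3) bounds how much moves per step.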

Agree.
And we can do all 1, 2, 3 because we have a distributed grain directory which allows us an arbitrary placement.
That is a big difference with other consistent-hash based placement schemes that not only place but also locate (lookup) via consistent-hashing (Service Fabric), thus they can only do 2, and cannot do 1 or 3.
Cc: @ReubenBond.

Just adding my 2 cents
In a scenario like gaming, load balancing can be important, since the work being done is kind of heavy compared to many other scenarios.
There are two valuable characteristics: one is having related grains in the same silo to avoid latency, the second is having balanced load across silos to divide the work between them evenly.
I understand that maximizing both without hand-picking the placement is too hard, so consistent hashing/count-based placement helps here. The question of moving grains in the background to rebalance the load is broader than consistent-hashing placement itself. In general I guess it's good to have the option to relocate grains after we add silos to the system.
What I am not sure of is how to parameterize that to get locality for grains which prefer to be close to each other, since latency is critical in scenarios like gaming and maybe financial software like stock trading.
Consistent hashing kind of groups together the grains which have the same key (but are of different types, right? or at least it can be like that), but in other scenarios we should have a way to say which grains we prefer to be near each other.

I understand this makes placement complex, moves toward hand-picking which grain goes where, and is not useful for all scenarios, but it can still open the door to some applications which currently might prefer akka.net / akka / Erlang in order to put latency-critical grains in the same place.

When I think about this, the easiest approach currently is to put everything that needs to be local to each other into a single grain, but that's not always possible.

Let's say I have a game world with dynamic zones whose size changes once we have more silos. I'd prefer all grains in a zone to be on the same silo, but that's hard to achieve currently in Orleans.

We can add special placement strategies as users, but maybe it's good to have pluggable relocation strategies which get notifications when silos are added to or removed from the cluster.

Expanding on the ideas above, I think there can be a placement interceptor. When a grain call is made to a non-activated grain, the interceptor will be used to get a placement strategy object. One option can be something like StickyPlacementStrategy, which holds a list of grains that the current activation should be "close" to. This kind of placement will create a placement graph that Orleans can try to eventually conform to. We can also add a priority to the strategy, for when several options exist.
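A rough Python sketch of what such a "sticky" decision could look like. `StickyPlacementStrategy` is only a name proposed in this thread, and `pick_silo`, `directory` and `load` are hypothetical; nothing here is an Orleans API. The idea is to place the new activation on the silo that already hosts most of the grains it wants to be close to, using current load as the tie-breaker and as the fallback when none of those grains is active.

```python
from collections import Counter


def pick_silo(close_to, directory, silos, load):
    """Choose a silo for a new activation.

    close_to:  grain keys the new activation should be near
    directory: grain key -> silo currently hosting it (active grains only)
    silos:     silos currently in the cluster
    load:      silo -> number of activations, used to break ties
    """
    votes = Counter(
        directory[g] for g in close_to
        if g in directory and directory[g] in silos
    )
    if votes:
        best = max(votes.values())
        candidates = [s for s, n in votes.items() if n == best]
    else:
        candidates = list(silos)  # no active neighbours: any silo will do
    # Prefer the least-loaded candidate; the name makes the choice deterministic.
    return min(candidates, key=lambda s: (load.get(s, 0), s))


silos = ["silo-1", "silo-2", "silo-3"]
load = {"silo-1": 10, "silo-2": 40, "silo-3": 5}
directory = {
    "zone-7/npc-1": "silo-2",
    "zone-7/npc-2": "silo-2",
    "zone-7/door": "silo-1",
}

near_zone = pick_silo(["zone-7/npc-1", "zone-7/npc-2", "zone-7/door"], directory, silos, load)
no_neighbours = pick_silo(["zone-9/npc-1"], directory, silos, load)
```

Here the zone-7 grain lands on silo-2 despite its higher load, because two of its three neighbours are there; the grain with no active neighbours goes to the least-loaded silo.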

I was browsing the oldest issues of Orleans and this is similar to #96
