Orleans: What's the real benefit of stateless workers?

Created on 31 Dec 2018 · 6 Comments · Source: dotnet/orleans

I was just leaving a comment on someone's code, and I told them they could do something with a static class instead of using stateless workers. This got me thinking: what's the real benefit of stateless workers? They provide two benefits as far as I can tell:

1. They'll limit the number of concurrent operations, but at the added cost of a grain method call (which I believe can be significant if the operation itself is simple).
2. They'll provide single-thread guarantees.

I don't really know about number 1 (someone with more knowledge could probably shed some light), but number two is rather meaningless if there really is no state. Rather, the main benefit I can think of is when a stateless worker has some state, but the state is transient. Then, you could have timers and do some work based on previous state and whatnot (take the aggregator pattern for example), which makes the name stateless worker rather counter-intuitive. I can only assume stateless actually means no persistent state.

Am I missing something? More importantly, am I correct in assuming that any truly stateless operation can be implemented (probably with better performance) as a static method on a static class rather than a method on a stateless worker grain?
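To make the comparison in the question concrete, here is a minimal sketch of the same stateless operation written both ways. The names `Checksum`, `IChecksumGrain` and `ChecksumGrain` are hypothetical, and the code assumes the Microsoft.Orleans packages are referenced.

```csharp
using System.Threading.Tasks;
using Orleans;
using Orleans.Concurrency;

// Option A: a plain static method. No messaging or scheduling involved.
public static class Checksum
{
    public static int Compute(byte[] data)
    {
        var sum = 0;
        foreach (var b in data)
        {
            sum = unchecked(sum * 31 + b);
        }
        return sum;
    }
}

// Option B: the same logic behind a stateless worker grain (hypothetical names).
public interface IChecksumGrain : IGrainWithIntegerKey
{
    Task<int> Compute(byte[] data);
}

[StatelessWorker]
public class ChecksumGrain : Grain, IChecksumGrain
{
    // Each call is a grain invocation, so it goes through the Orleans
    // runtime before reaching the same static method used in option A.
    public Task<int> Compute(byte[] data) => Task.FromResult(Checksum.Compute(data));
}
```

From inside another grain, option A is just `Checksum.Compute(data)`, while option B is `await GrainFactory.GetGrain<IChecksumGrain>(0).Compute(data)`. The result is the same; the second form adds the cost of a grain call.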

question

Arshia001

Most helpful comment

There are also scenarios for StatelessWorker grains that don't involve clients. For example, if you need to report an aggregate metric across many thousands of grains in the cluster, it won't work to have each grain send its report to a single aggregator grain, because that grain would become the obvious bottleneck. Instead, you can have those grains call a StatelessWorker 'pre-aggregator' grain. That would cause them to invoke 1 or N (depending on what you specify) local activations of the pre-aggregator, within the same silo.

The pre-aggregator grains can then periodically, on a timer, report their individual aggregates to the final aggregator grain. Since there will be far fewer of those calls, the final aggregator won't be overloaded with calls from the pre-aggregators.

This pattern has been successfully applied in multiple production systems.
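The pre-aggregator pattern described above could be sketched roughly as follows. This is an illustration under assumptions, not the code used in those production systems: the grain names, the `long` metric, the 5-second period and the key `0` are all hypothetical, and the parameterless `OnActivateAsync()` matches the Orleans 2.x/3.x API that was current when this thread was written.

```csharp
using System;
using System.Threading.Tasks;
using Orleans;
using Orleans.Concurrency;

// Hypothetical interfaces for illustration.
public interface IPreAggregatorGrain : IGrainWithIntegerKey
{
    Task Report(long value);
}

public interface IAggregatorGrain : IGrainWithIntegerKey
{
    Task AddPartial(long partialSum);
}

// At most one activation per silo; callers always reach a local one.
[StatelessWorker(1)]
public class PreAggregatorGrain : Grain, IPreAggregatorGrain
{
    private long _pending;       // transient, in-memory state only
    private IDisposable _timer;

    public override Task OnActivateAsync()
    {
        // Flush the local partial sum every 5 seconds (assumed period).
        _timer = RegisterTimer(_ => Flush(), null,
            TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(5));
        return base.OnActivateAsync();
    }

    public Task Report(long value)
    {
        _pending += value;
        return Task.CompletedTask;
    }

    private async Task Flush()
    {
        if (_pending == 0)
        {
            return;
        }

        // Reset before awaiting so reports arriving during the call
        // are counted in the next flush. A failed call loses this batch.
        var toSend = _pending;
        _pending = 0;
        await GrainFactory.GetGrain<IAggregatorGrain>(0).AddPartial(toSend);
    }
}
```

A reporting grain would call `GrainFactory.GetGrain<IPreAggregatorGrain>(0).Report(value)`. The final aggregator then receives roughly one call per silo per period instead of one call per reporting grain.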

sergeybykov on 2 Jan 2019

👍2

All 6 comments

Consider the following scenario, which is actually how StatelessWorker grains came to be.

Client sends encrypted messages to the cluster. Inside each message there's an identity of the target user (grain). Before the message can be sent to the user grain for processing, it needs to get decrypted and the user ID extracted, so that it can be passed to GetGrain<IUserGrain>(id). A StatelessWorker grain provides an 'endpoint' for clients to send such messages to. Activations of that grain are always local to the silo that receives a request (client gateway), and they automatically scale out with the load.
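A rough sketch of such an endpoint grain follows. Only `GetGrain<IUserGrain>(id)` comes from the description above; `IMessageEndpointGrain`, `Deliver`, `Process`, the Guid key and the `Decrypt` helper are hypothetical, and `Decrypt` is a placeholder that does no real decryption.

```csharp
using System;
using System.Threading.Tasks;
using Orleans;
using Orleans.Concurrency;

// Hypothetical interfaces for illustration.
public interface IMessageEndpointGrain : IGrainWithIntegerKey
{
    Task Deliver(byte[] encrypted);
}

public interface IUserGrain : IGrainWithGuidKey
{
    Task Process(byte[] payload);
}

// No worker limit given, so the runtime decides how many local
// activations to create as load grows.
[StatelessWorker]
public class MessageEndpointGrain : Grain, IMessageEndpointGrain
{
    public Task Deliver(byte[] encrypted)
    {
        // The CPU-bound work runs on the silo that received the request.
        var (userId, payload) = Decrypt(encrypted);

        // Only now is the target known, so route to the user grain.
        return GrainFactory.GetGrain<IUserGrain>(userId).Process(payload);
    }

    // Placeholder: treats the input as already-decrypted bytes whose first
    // 16 bytes hold the user ID. Real code would decrypt the message first.
    private static (Guid userId, byte[] payload) Decrypt(byte[] encrypted)
    {
        var idBytes = new byte[16];
        Array.Copy(encrypted, 0, idBytes, 0, 16);

        var payload = new byte[encrypted.Length - 16];
        Array.Copy(encrypted, 16, payload, 0, payload.Length);

        return (new Guid(idBytes), payload);
    }
}
```

A client would send every message to the same logical grain, for example `client.GetGrain<IMessageEndpointGrain>(0).Deliver(bytes)`, and leave the choice of activation to the runtime.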

sergeybykov on 2 Jan 2019

👍2

My nodes have always been symmetric, so I wasn't considering silo clients at all... Thanks!

Arshia001 on 2 Jan 2019

Yes, that's exactly the transient state I was thinking about. I think it's safe to assume that "putting stateless logic which is meant to be consumed from within silos inside static classes is a better choice". This implies that maybe two thirds of my stateless workers so far have only been slowing the system down... 🤦‍♂️

Arshia001 on 3 Jan 2019