I was just leaving a comment on someone's code, and I told them they could do something with a static class instead of using stateless workers. This got me thinking: what's the real benefit of stateless workers? They provide two benefits as far as I can tell:
I don't really know about number one (someone with more knowledge could probably shed some light), but number two is rather meaningless if there really is no state. Rather, the main benefit I can think of arises when a stateless worker has some state, but that state is transient. Then you could have timers and do work based on previous state (take the aggregator pattern, for example), which makes the name "stateless worker" rather counter-intuitive. I can only assume "stateless" actually means "no persistent state".
Am I missing something? More importantly, am I correct in assuming that any truly stateless operation can be implemented (probably with better performance) as a static method on a static class rather than a method on a stateless worker grain?
Consider the following scenario, which is actually how StatelessWorker grains came to be.
A client sends encrypted messages to the cluster. Inside each message is the identity of the target user (grain). Before the message can be sent to the user grain for processing, it needs to be decrypted and the user ID extracted, so that the ID can be passed to GetGrain<IUserGrain>(id). A StatelessWorker grain provides an 'endpoint' for clients to send such messages to. Activations of that grain are always local to the silo that receives a request (the client gateway), and they automatically scale out with the load.
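The scenario above could be sketched roughly as follows, assuming the Orleans 2.x/3.x-style C# API. `IMessageRouterGrain`, `MessageRouterGrain`, `Route`, and `IUserGrain.Process` are hypothetical names chosen for illustration, and the `userId:payload` parsing is only a stand-in for real decryption.

```csharp
using System;
using System.Text;
using System.Threading.Tasks;
using Orleans;
using Orleans.Concurrency;

// Hypothetical per-user grain that processes decrypted messages.
public interface IUserGrain : IGrainWithStringKey
{
    Task Process(string payload);
}

// Hypothetical entry point for clients; callers can use any key (for example 0).
public interface IMessageRouterGrain : IGrainWithIntegerKey
{
    Task Route(byte[] encrypted);
}

[StatelessWorker] // activations are created on the silo that receives the request
public class MessageRouterGrain : Grain, IMessageRouterGrain
{
    public Task Route(byte[] encrypted)
    {
        // Stand-in for decryption: treats the bytes as UTF-8 "userId:payload".
        string clear = Encoding.UTF8.GetString(encrypted);
        int split = clear.IndexOf(':');
        if (split < 0)
        {
            throw new ArgumentException("Expected 'userId:payload'.", nameof(encrypted));
        }

        string userId = clear.Substring(0, split);
        string payload = clear.Substring(split + 1);

        // Only now is the target known, so the call can be addressed to the user grain.
        return GrainFactory.GetGrain<IUserGrain>(userId).Process(payload);
    }
}
```

A client would call something like `client.GetGrain<IMessageRouterGrain>(0).Route(bytes)`; the grain holds no fields, so any local activation can serve any request.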
My nodes have always been symmetric, so I wasn't considering silo clients at all... Thanks!
There are also scenarios for StatelessWorker grains that don't involve clients. For example, if you need to report an aggregate metric across many thousands of grains in the cluster, it won't work to have each grain send its report to a single aggregator grain, because that grain would become the obvious bottleneck. Instead, you can have those grains call a StatelessWorker 'pre-aggregator' grain. Each call is then served by a local (within the same silo) activation of the pre-aggregator, of which there are 1 or N per silo, depending on what you specify.
The pre-aggregator grains can then periodically, on a timer, report their individual aggregates to the final aggregator grain. Since there will be far fewer of those calls, the final aggregator won't be overloaded with calls from the pre-aggregators.
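A minimal sketch of the pre-aggregation pattern described above, assuming the Orleans 2.x/3.x-style API (in Orleans 7.0 and later, `OnActivateAsync` takes a `CancellationToken`). The grain and method names, the `long` metric, and the 5-second flush period are all assumptions for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Orleans;
using Orleans.Concurrency;

// Hypothetical single, cluster-wide aggregator (always addressed with key 0 here).
public interface IFinalAggregatorGrain : IGrainWithIntegerKey
{
    Task AddPartial(long partialSum);
}

// Hypothetical per-silo pre-aggregator that the reporting grains call.
public interface IPreAggregatorGrain : IGrainWithIntegerKey
{
    Task Report(long value);
}

public class FinalAggregatorGrain : Grain, IFinalAggregatorGrain
{
    private long _total;

    public Task AddPartial(long partialSum)
    {
        _total += partialSum;
        return Task.CompletedTask;
    }
}

[StatelessWorker(1)] // at most one local activation per silo
public class PreAggregatorGrain : Grain, IPreAggregatorGrain
{
    // Transient state: lives only in memory and is lost if the activation goes away.
    private long _pending;

    public override Task OnActivateAsync()
    {
        // Flush the local partial sum every 5 seconds (arbitrary period).
        RegisterTimer(_ => Flush(), null, TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(5));
        return base.OnActivateAsync();
    }

    public Task Report(long value)
    {
        // No lock needed: calls on one activation are not executed in parallel.
        _pending += value;
        return Task.CompletedTask;
    }

    private Task Flush()
    {
        if (_pending == 0)
        {
            return Task.CompletedTask;
        }

        // Reset before the remote call so reports arriving meanwhile are not lost.
        long batch = _pending;
        _pending = 0;
        return GrainFactory.GetGrain<IFinalAggregatorGrain>(0).AddPartial(batch);
    }
}
```

With this shape, the final aggregator receives roughly one call per silo per flush period instead of one call per reporting grain. A value still held in `_pending` when an activation is deactivated would be dropped unless it is also flushed on deactivation.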
This pattern has been successfully applied in multiple production systems.
Yes, that's exactly the transient state I was thinking about. I think it's safe to assume that "putting stateless logic which is meant to be consumed from within silos inside static classes is a better choice". This implies that maybe two thirds of my stateless workers so far have only been slowing the system down... 🤦♂️
"putting stateless logic which is meant to be consumed from within silos inside static classes is a better choice".
Yes, unlike with statics, with stateless workers you get the single-threading guarantee.
Thanks!