All,
I was hoping to get some comments on our proposed design and get some insight into how others have handled similar use cases.
Within our silos there is a stateful grain representing each entity in our system (for example, an aircraft). We need some way of allowing the user to query for entities of a particular type. Since these grains may or may not be active, we were thinking of having the grains write out some metadata to a DB that can be queried to get the corresponding grain ID. Once we have the correct grain ID, we would then get the complete data for the entity via REST API > Orleans client > grain call.
We are not big fans of having two different DBs (a NoSQL DB for grain state and a different DB for metadata) due to consistency issues, so we have been trying to figure out some way to allow basic queries on the grain state DB. To do this, we were thinking about writing a Cassandra storage provider that would save the scalar values we want to query on, as well as the full grain state as a blob.
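To make the "queryable scalars plus blob" idea concrete, here is a minimal sketch of what one stored row might look like. Everything in it is an assumption for illustration: `AircraftState`, `GrainStateRow`, `GrainStateRowMapper`, and the column names are hypothetical, and `System.Text.Json` only stands in for whatever serializer the real storage provider would use. It does not touch Cassandra or the Orleans storage provider API.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical grain state for an aircraft entity.
public class AircraftState
{
    public string TailNumber { get; set; } = "";
    public string AircraftType { get; set; } = "";
    public double FuelKg { get; set; }
    public List<string> Waypoints { get; set; } = new List<string>();
}

// Hypothetical row shape: a few queryable scalar columns plus the full state as a blob.
public class GrainStateRow
{
    public string GrainId { get; set; } = "";
    public Dictionary<string, string> QueryColumns { get; set; } = new Dictionary<string, string>();
    public byte[] StateBlob { get; set; } = Array.Empty<byte>();
}

public static class GrainStateRowMapper
{
    // Copies the scalars we want to query on into columns and serializes the whole state.
    public static GrainStateRow ToRow(string grainId, AircraftState state)
    {
        return new GrainStateRow
        {
            GrainId = grainId,
            QueryColumns = new Dictionary<string, string>
            {
                ["tail_number"] = state.TailNumber,
                ["aircraft_type"] = state.AircraftType,
            },
            // JSON is an assumption here; a real provider would likely use its own serializer.
            StateBlob = JsonSerializer.SerializeToUtf8Bytes(state),
        };
    }

    // Rehydrates the full state from the blob; the query columns are not needed for this.
    public static AircraftState FromRow(GrainStateRow row)
    {
        return JsonSerializer.Deserialize<AircraftState>(row.StateBlob)
            ?? throw new InvalidOperationException("State blob did not contain a state object.");
    }
}
```

Because the columns and the blob are written in the same row, a query on `aircraft_type` and a later rehydrate both read one record, which is the consistency property the single-DB design is after.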
@galvesribeiro here is what we were talking about the other night on Gitter.
@dsarfati That sounds like what the relational provider would allow (see the script for an example). Unfortunately, I'm a bit stalled on working out how best to test it.
@dsarfati good question. Today we don't have support for querying grains from a catalog... The only way to do that is to have one grain behave as a catalog, with some methods on it that allow the query to happen. One of the major problems I see is that expressions can't be serialized today... (yes, there are some crazy NuGet packages claiming to do that with JSON, but please be careful)
I would love to see something like IGrainCatalog.Grains.OfType<IMyGrainType>().Where(g => g.SomeProperty == "someValue").ToList()
We had a discussion in the past on how to make that query possible, but we struggled with how to keep IGrainCatalog from becoming a bottleneck when querying...
@galvesribeiro We were thinking of creating our own DAO that could be serialized with the normal Orleans serializers. We were thinking of creating a grain catalog so we could treat it like a normal grain; however, I'm not sure how to store the grains within it. DB? Dictionary? We have a prototype that has a simple dictionary inside that keeps track of all the grain IDs.
Storing the class is not a problem. As long as it is serializable by your storage providers, it's OK... What I mean is how to ship a lambda expression across silo boundaries... That is where expression serialization comes up... For instance, if you have a grain like this:
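For reference, the dictionary-based prototype described above could look roughly like this. It is only a sketch under assumptions: `GrainRegistry` is a hypothetical name, grain IDs are assumed to be `Guid`s, and in the real prototype this state would sit inside a catalog grain rather than a plain class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory catalog that tracks grain IDs per entity type.
public class GrainRegistry
{
    private readonly Dictionary<string, HashSet<Guid>> _idsByType =
        new Dictionary<string, HashSet<Guid>>();

    // Records a grain ID under its entity type; returns false if it was already registered.
    public bool Register(string entityType, Guid grainId)
    {
        if (!_idsByType.TryGetValue(entityType, out var ids))
        {
            ids = new HashSet<Guid>();
            _idsByType[entityType] = ids;
        }
        return ids.Add(grainId);
    }

    // Returns a snapshot of the IDs registered for an entity type (empty if none).
    public IReadOnlyCollection<Guid> GetIds(string entityType)
    {
        return _idsByType.TryGetValue(entityType, out var ids)
            ? ids.ToList()
            : new List<Guid>();
    }
}
```

The open question in the thread still applies: if this whole dictionary is persisted as one piece of grain state, every registration rewrites all of it.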
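// Illustrative only: the Expression<...> argument cannot be serialized across silos today.
public interface ICategoryGrain : IGrain
{
    Task<List<IProductGrain>> SearchProducts(Expression<Func<IProductGrain, bool>> predicate);
}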
That would hit the first roadblock, which is getting the predicate (the expression tree) serialized... After you get that sorted out, there is still the problem of how to reliably query Orleans' grain catalog globally across the cluster and execute that predicate. That query can (and will) fan out to multiple nodes of the cluster and must respect the turn-based threading model of Orleans' task scheduler, which means that some side effects may happen, like slowing down new activations while a query is being executed...
@ReubenBond once mentioned Project Bonsai to me, which is supposed to make expression tree serialization easy, but I don't know what state it is in or when it would become usable.
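One way to sidestep expression serialization, in the spirit of the serializable DAO mentioned earlier in the thread, is to send the query as plain data instead of a lambda. The sketch below is an assumption, not an Orleans feature: `QueryCriterion`, `QueryOperator`, and `MetadataQuery` are hypothetical names, and the metadata index is just a dictionary standing in for whatever store holds the queryable values.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum QueryOperator { EqualTo, StartsWith }

// Hypothetical criterion made of plain data, so no expression tree has to cross a silo boundary.
[Serializable]
public class QueryCriterion
{
    public string Field { get; set; } = "";
    public QueryOperator Operator { get; set; }
    public string Value { get; set; } = "";

    // Evaluates this criterion against one entity's metadata.
    public bool Matches(Dictionary<string, string> metadata)
    {
        if (!metadata.TryGetValue(Field, out var actual)) return false;
        switch (Operator)
        {
            case QueryOperator.EqualTo:
                return string.Equals(actual, Value, StringComparison.Ordinal);
            case QueryOperator.StartsWith:
                return actual.StartsWith(Value, StringComparison.Ordinal);
            default:
                throw new NotSupportedException(Operator.ToString());
        }
    }
}

public static class MetadataQuery
{
    // Returns the IDs (sorted) of all entries whose metadata satisfies every criterion.
    public static List<string> Run(
        Dictionary<string, Dictionary<string, string>> index,
        IEnumerable<QueryCriterion> criteria)
    {
        var list = criteria.ToList();
        return index.Where(entry => list.All(c => c.Matches(entry.Value)))
                    .Select(entry => entry.Key)
                    .OrderBy(id => id, StringComparer.Ordinal)
                    .ToList();
    }
}
```

This trades expressiveness for portability: only the operators you enumerate are supported, but the same criteria could also be translated into a storage query rather than evaluated in memory.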
I'm not worried about storing the data. I'm more worried about the storage provider saving the entire collection of grains each time a new grain is registered.
Well... I don't see a way around it. In fact, maybe there is a misunderstanding about the catalog... It keeps a record of which grains are activated and where... not whether a grain "exists" or not... By definition, a grain always exists...
I agree that the catalog only contains activated grain info. I was just worried about writing a dictionary with hundreds of thousands of grain IDs each time a new grain was activated. With this model, does the grain need to register itself within OnActivateAsync?
@dsarfati, in case you're trying to query activated grains according to some criteria, then look at #602. It's not resolved yet, but maybe it'll help. If not, then what would be the purpose of a grain that "contains" the registered grains? Wouldn't it be better to have each grain write, as part of its state, some boolean that says it's "registered", and then query the underlying storage?
This can also be a solution for querying activated grains: on activation, set an "activated" boolean value in the state to true, and on deactivation set it to false. But when a silo crashes, you'll be left with "activated" grains that might not be activated anymore. In this case, I think that adding the silo's address (ip:port:epoch) to the grain state would be the solution.
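A rough sketch of how the silo address could be used to weed out stale "activated" flags follows. It assumes the set of live silo addresses is available from somewhere (for example, cluster membership); `ActivationRecord` and `ActivationFilter` are hypothetical names, and the address is kept as a plain "ip:port:epoch" string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical activation marker persisted as part of the grain state.
public class ActivationRecord
{
    public string GrainId { get; set; } = "";
    public bool Activated { get; set; }
    // "ip:port:epoch" of the silo that last activated the grain.
    public string SiloAddress { get; set; } = "";
}

public static class ActivationFilter
{
    // Keeps only records flagged as activated whose silo is still in the live set.
    // A silo restarted on the same ip:port has a new epoch, so its old records drop out.
    public static List<ActivationRecord> CurrentlyActive(
        IEnumerable<ActivationRecord> records,
        ISet<string> liveSilos)
    {
        return records
            .Where(r => r.Activated && liveSilos.Contains(r.SiloAddress))
            .ToList();
    }
}
```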
@shayhatsor perfect example! That comment has a very nice way to implement that by indexing grains: https://github.com/dotnet/orleans/issues/602#issuecomment-165957401
Maybe we can take that into consideration for further implementation...
After thinking about it for a while, our team has decided to try building a prototype of a DB-driven query that will allow us to find all grains that have ever been activated, regardless of whether they are currently active. Our plan is to write a custom storage provider that will write out the grain's state both as metadata that we want to be able to query on and as the full state for the grain to rehydrate from. There are some other requirements that are driving us towards Cassandra, so we are going to make each piece of metadata an individual column and see how that works.
@philbe has 2 PhD interns working on the indexing problem this summer. The target solution is supposed to support querying for either currently activated grains or for grains that were ever activated.
Great to hear others are working on this as well. We will definitely keep up to date, so we can share lessons learned.