Orleans: Implement Reminders for Consul

Created on 14 Jun 2016 · 11 Comments · Source: dotnet/orleans

Hi Folks,

I am planning to use Consul with Orleans #1267 and would really like to make use of the Reminders mechanism. @PaulNorth or @gabikliot, can you explain why an IReminderTable implementation didn't make it into the initial release, other than time and effort? (i.e. did you encounter an issue that remains a significant stumbling block?)

Assuming there is no massively unsolvable issue, I am going to spend some more time looking into implementing this. My initial thoughts are to implement something akin to what is in [InMemoryRemindersTable](https://github.com/dotnet/orleans/blob/cc90465640710d82b2fc40bc9749af992faf41cf/src/OrleansRuntime/ReminderService/InMemoryRemindersTable.cs) using the Consul KV store.

Any feedback on this is welcome.

enhancement hacktoberfest help wanted


All 11 comments

@normanhh3, I'm not an expert in Consul's KV store, but I suspect it doesn't provide range queries, which is a must for implementing Reminders. That's the reason I didn't implement Reminders after implementing ZooKeeper as a membership provider. Currently, Orleans has Reminders implementations for SQL Server, MySql and Azure. Where I work we use MySql in production.

Thanks for the pointer @shayhatsor.

Is the following definition from IReminderTable what you are referring to?

/// <summary>
/// Return all rows that have their GrainReference's.GetUniformHashCode() in the range (start, end]
/// </summary>
/// <param name="begin"></param>
/// <param name="end"></param>
/// <returns></returns>
Task<ReminderTableData> ReadRows(uint begin, uint end);

Consul's KV supports retrieving multiple _nested_ keys in one request using the ?recurse parameter.

Consider a candidate Key/Value storage scenario:

/v1/kv/[OrleansClusterIdentifier]/Reminders/[GrainRef.GetUniformHashCode]/[ReminderName]

With a single request:
curl http://localhost:8500/v1/kv/MyOrleansCluster/Reminders?recurse

I will get all of the reminders for the cluster; that data could then be filtered to return just the items in the requested range. The trade-off is that the implementation spends compute (reading and filtering the whole list on every call) to keep the storage layout simple, on the expectation that the total number of registered reminders is small.
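
To make the "read everything, filter locally" idea concrete, here is a minimal sketch in Python (a language-neutral illustration, not Orleans or Consul client code). It assumes the key layout proposed above, that each entry returned by ?recurse carries a "Key" and a base64-encoded "Value", and that a range with begin >= end wraps around the uint32 ring. The names in_range and filter_reminders are hypothetical.

```python
import base64


def in_range(h, begin, end):
    """True if h lies in (begin, end] on the uint32 ring (assumed wraparound rule)."""
    if begin < end:
        return begin < h <= end
    # begin >= end: the range wraps past the top of the ring back to 0.
    return h > begin or h <= end


def filter_reminders(entries, prefix, begin, end):
    """Keep entries keyed as <prefix>/<hash>/<name> whose hash is in (begin, end]."""
    rows = []
    for entry in entries:
        key = entry["Key"]
        if not key.startswith(prefix):
            continue
        parts = key[len(prefix):].strip("/").split("/", 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # folder keys or anything not matching the assumed layout
        grain_hash, name = int(parts[0]), parts[1]
        if in_range(grain_hash, begin, end):
            # Values come back base64-encoded; an empty value may be null.
            value = base64.b64decode(entry.get("Value") or "")
            rows.append((grain_hash, name, value))
    return rows
```

The entries argument would be the decoded JSON array from the recursive read shown above; every silo downloads the full list and discards whatever falls outside its own range, which is why this only suits a small reminder count.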

Another option would be to use range buckets in the key address space, something akin to /v1/kv/MyOrleansCluster/Reminders/Range-094800-094900/094801/MyCoolReminder

Here the width of each range bucket would be fixed at design time, although _ideally it would be managed dynamically at runtime_ instead.

Access to the Consul KV would then take two steps. First, list the child keys under Reminders to discover the range buckets. Second, identify the buckets that could overlap the requested range, read their sub-keys, and filter those down to the matching results.
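
The second step (choosing which range buckets to read) can be sketched as follows. This is a hypothetical Python illustration: it assumes bucket keys end in Range-<lo>-<hi>, that a bucket covers hashes from lo up to but not including hi, and that the requested range is (begin, end] with wraparound when begin >= end. The bucket key list would come from a key-only listing of the Reminders prefix (Consul's KV API has keys and separator query parameters for this); the helper names are made up.

```python
import re

UINT32_MAX = 0xFFFFFFFF
BUCKET_RE = re.compile(r"Range-(\d+)-(\d+)$")


def split_range(begin, end):
    """Express the ring range (begin, end] as inclusive integer intervals."""
    if begin < end:
        return [(begin + 1, end)]
    # Wrapping range: [0, end] plus (begin, UINT32_MAX] (assumed semantics).
    intervals = [(0, end)]
    if begin < UINT32_MAX:
        intervals.append((begin + 1, UINT32_MAX))
    return intervals


def buckets_to_read(bucket_keys, begin, end):
    """Return the bucket keys whose [lo, hi) span overlaps (begin, end]."""
    wanted = split_range(begin, end)
    selected = []
    for key in bucket_keys:
        match = BUCKET_RE.search(key.rstrip("/"))
        if not match:
            continue  # not a range bucket under the assumed naming scheme
        lo, hi = int(match.group(1)), int(match.group(2))
        # Two inclusive intervals overlap when max(starts) <= min(ends).
        if any(max(lo, a) <= min(hi - 1, b) for a, b in wanted):
            selected.append(key)
    return selected
```

Each selected bucket would then be fetched with its own recursive read, and the returned reminders filtered by their exact hash, since a bucket can straddle the edge of the requested range.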

Thoughts?

I think both are very good ideas.
I would start with the global list, filtered at the silos. That will allow you to start using reminders and test correctness, and it will actually work pretty well for a small number of silos (let's say 3). Once you want to scale, the second step could be buckets.

Here I would do the following:
Partition the whole space into X buckets. X should be fixed and an order of magnitude larger than the number of silos, let's say 1,000. Every bucket is the same size and the partitioning into buckets is static.
What is dynamic is the assignment of buckets to silos. There are multiple ways to do that assignment; one easy way is consistent hashing into the range that is given to every silo. Orleans gives each silo a range (uint begin, uint end in ReadRows); you hash the bucket number, figure out which buckets fall into this silo's range, and read those buckets. This requires only a single-step read: you work out the buckets without any reads at all (based on hashing into the range) and only need to read each bucket (with one recursive read).
I believe that strategy will work well since:
1) the ranges that Orleans gives to each silo are pretty even (each is actually 30 smaller sub-ranges that together form one multi-range, which makes the multi-ranges across silos pretty even)
2) your assignment of reminder buckets to silos can be even and proportional to the silo range
3) the number of reminders inside every bucket will be very even as long as you have a large number of reminders: one or more orders of magnitude more reminders than buckets.

This strategy will NOT work well if the number of reminders is close to the number of silos. In that case the reminders will not be spread evenly. But in this case who cares: you have a small number of reminders anyway, so there is no need to balance them evenly.
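
A rough sketch of the bucket assignment described above, in Python for illustration only. The bucket count of 1,000 comes from the comment. Everything else is an assumption: CRC32 stands in for whatever hash would really be used, the silo range is treated as (begin, end] with wraparound when begin >= end, and the function names are hypothetical.

```python
import zlib

UINT32_SPACE = 2 ** 32
BUCKET_COUNT = 1000  # "X" above: fixed, and much larger than the number of silos


def bucket_of(grain_hash):
    """Static partitioning: equal-width slices of the uint32 hash space."""
    return grain_hash * BUCKET_COUNT // UINT32_SPACE


def bucket_point(bucket):
    """Hash a bucket number onto the uint32 ring (CRC32 is only a stand-in)."""
    return zlib.crc32(str(bucket).encode("ascii")) & 0xFFFFFFFF


def in_range(h, begin, end):
    """True if h lies in (begin, end] on the ring (assumed wraparound rule)."""
    if begin < end:
        return begin < h <= end
    return h > begin or h <= end


def buckets_for_silo(begin, end):
    """Buckets whose hashed number falls in the silo's range; needs no reads."""
    return [b for b in range(BUCKET_COUNT) if in_range(bucket_point(b), begin, end)]
```

A writer would store a reminder under the bucket given by bucket_of, and a silo would call buckets_for_silo with the range passed to ReadRows, then issue one recursive read per bucket. A silo whose range is made of several sub-ranges would take the union over them.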

@normanhh3 For the record, you are correct: time, energy and the requirements of the project I was working on are why I never even looked at implementing reminders.

@PaulNorth I suspected that might be the case, but wanted to be aware of any major pitfalls to implementation that might have been encountered. Thanks for responding.

@gabikliot thanks for the detailed feedback, that was just the kind of insight I was looking for. I like your two stage approach as well.

Thanks again for the direction. :-)

I'm finally getting started on an implementation. My original plan is to implement a global reminder table facility. However, it looks like the internal class RangeFactory is used somewhat extensively in the current IReminderTable implementations. Because the class is internal, it isn't available outside the main Orleans codebase. Do you have any suggestions on how to effectively re-use that class's functionality, preferably without copy-paste?

At the moment, I'm tempted to just include that source in my codebase until a hypothetical release where RangeFactory becomes available as a utility.

I can't look at the code at the moment, but feel free to make RangeFactory public as part of your PR, at least initially.

Looking at the code, I see no reason for RangeFactory to be internal.

@normanhh3, we're always open to these kinds of changes. Keeping stuff internal and exposing them on first use helps us pinpoint the reason for the change.

On the other hand, if we already have 3 other implementations (Azure, ZK, SQL) and did not need that class to be public, there is probably a reason why there was no need.

@gabikliot if I added an implementation into the core of Orleans, then it wouldn't need to be public as that implementation would have access. However, my plan at the moment is to make a copy of it to use for testing an implementation and contribute that fully tested implementation back to the main repo.

Honestly, it sounds like something that should live in an Orleans "public" utility library. I'm not familiar enough with the codebase or core Orleans to suggest something at the moment.
