Orleans: DeploymentId and RemindersTable

Created on 9 Aug 2016 · 11 Comments · Source: dotnet/orleans

I have a few test clusters set up using SQL Server as the System Store. Each cluster has a different DeploymentId, but they all use the same database. I am running into an issue where reminders are being triggered across deployments. I believe this is because the OrleansRemindersTable does not have a DeploymentId column, so its rows cannot be filtered by DeploymentId.

Am I missing something obvious?
Are the SQL tables not designed to be used by multiple deployments/clusters? If they are not, then why do some tables include the DeploymentId?

I assume that DeploymentId is the same as ClusterId (i.e. any Silos I want to communicate with one another should have the same DeploymentId). Perhaps I am misunderstanding the intent of DeploymentId altogether.

I do see that the SQL RemindersTable has a ServiceId. Admittedly, I do not understand the intent behind ServiceId. In all of my testing, it has been an empty (zero) Guid.

question

mtdaniels

Most helpful comment

I'll use the calendar service again, just as an example of a simple service.
Its current deployment in production has the following properties:

ServiceID = some GUID that means CalendarService to us. It's the same for all deployments of the calendar service.
The DeploymentID is CalendarServiceV1.3.

Now, let's say we have a new version of the service that's ready for production, CalendarServiceV1.4.
We need to think of a strategy for the upgrade:

If we prefer consistency over availability, we must shut down cluster CalendarServiceV1.3 and only then start CalendarServiceV1.4.
If we prefer availability over consistency, we'll have CalendarServiceV1.3 and CalendarServiceV1.4 getting traffic simultaneously for a period as short as possible. This might introduce all sorts of concurrency issues, e.g. a user gets the same notification twice.

Notice that this isn't an Orleans issue, it's an availability vs. consistency issue. I think that the confusion stems from the fact that the DeploymentID is just a key silos use to decide they belong to the same cluster, and the ServiceID is just a key used by silos to query reminders that "belong" to them. So in that regard, Reminders are just another persistent storage shared between two versions of the same service.
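
The trade-off above can be illustrated with a toy Python model. This is not Orleans code: the dictionaries, the `fire_due_reminders` function, and the `"calendar"` placeholder for the ServiceID are all hypothetical, and the model simply assumes that every running cluster with a matching ServiceID fires the reminders it finds due.

```python
def fire_due_reminders(active_clusters, reminders, now):
    """Return (deployment_id, reminder_name) for every firing.

    Assumption of this toy model: each active cluster loads all rows
    whose service_id matches its own and fires those that are due.
    """
    sent = []
    for cluster in active_clusters:
        for reminder in reminders:
            if reminder["service_id"] == cluster["service_id"] and reminder["due"] <= now:
                sent.append((cluster["deployment_id"], reminder["name"]))
    return sent


# One stored reminder, shared by every deployment of the same service.
reminders = [{"service_id": "calendar", "name": "meeting-at-9", "due": 9}]
v13 = {"deployment_id": "CalendarServiceV1.3", "service_id": "calendar"}
v14 = {"deployment_id": "CalendarServiceV1.4", "service_id": "calendar"}

# Consistency first: V1.3 is already shut down when the reminder is due.
sequential = fire_due_reminders([v14], reminders, now=9)

# Availability first: both clusters are up when the reminder is due.
overlapping = fire_due_reminders([v13, v14], reminders, now=9)
```

In this model `sequential` holds a single firing, while `overlapping` holds one per running cluster, which corresponds to the "same notification twice" case.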

shayhatsor on 19 Aug 2016

👍3

All 11 comments

Am I missing something obvious?

You are missing something, but it's far from obvious. AFAIK it's undocumented. It is mentioned in the code:

ServiceId's are intended to be long lived Id values for a particular service which will remain constant
even if the service is started / redeployed multiple times during its operations life.

Consequently, you have to set different ServiceIDs for the two clusters to share the same table and have their own reminders. This is the case for any Reminders implementation.
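
A rough way to picture this is a shared table whose rows carry a ServiceId but no DeploymentId. The sketch below uses Python's `sqlite3` with a made-up three-column schema. The table layout, the `register` and `reminders_for` helpers, and the `service-a` / `service-b` values are assumptions for illustration, not the actual Orleans schema or API.

```python
import sqlite3

# Toy stand-in for a shared reminders table: rows have a service_id
# but no deployment_id column (hypothetical schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE reminders (service_id TEXT, grain_id TEXT, name TEXT)")


def register(service_id, grain_id, name):
    db.execute("INSERT INTO reminders VALUES (?, ?, ?)", (service_id, grain_id, name))


def reminders_for(service_id):
    """Rows a cluster would load: filtered by service_id only."""
    rows = db.execute(
        "SELECT grain_id, name FROM reminders WHERE service_id = ? ORDER BY grain_id",
        (service_id,),
    )
    return rows.fetchall()


ZERO_GUID = "00000000-0000-0000-0000-000000000000"

# Two test clusters that both left ServiceId at the empty Guid.
register(ZERO_GUID, "c1-grain", "poll")
register(ZERO_GUID, "c2-grain", "poll")
shared_view = reminders_for(ZERO_GUID)  # either cluster sees both rows

# Giving each cluster its own ServiceId keeps their rows apart.
register("service-a", "c1-grain", "poll")
register("service-b", "c2-grain", "poll")
view_a = reminders_for("service-a")
view_b = reminders_for("service-b")
```

With the empty Guid on both clusters, each one loads the other's reminder. With distinct ServiceIds, each query returns only that cluster's row.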

shayhatsor on 10 Aug 2016

So... let's say I have two clusters (C1, C2) each with 2 Silos (C1S1, C1S2, C2S1, C2S2).

I had envisioned the DeploymentId would be C1 / C2 respectively... and the ServiceId might be C1S1, C1S2, etc. Mind you, I am focusing more on the coverage of an Id not the format (ServiceId is a Guid I believe).

If that is correct, then a reminder would not persist should its original silo go down. If you are suggesting that the ServiceId would be the same for both silos within a cluster, then I'm not sure what the difference between a DeploymentId and a ServiceId is.

A final point worth mentioning - the implementation for Azure appears to include the DeploymentId in the ReminderTableEntry here https://github.com/dotnet/orleans/blob/master/src/OrleansAzureUtils/Storage/RemindersTableManager.cs

mtdaniels on 10 Aug 2016

It is the same ServiceID for all silos in a cluster. The difference between DeploymentID and ServiceID is that DeploymentIDs must change between deployments, whereas ServiceIDs shouldn't if you want to persist reminders. For example, let's say you want to provide a calendar service for your users. There would be many deployments of this service (different DeploymentIDs), but you don't want users to lose the reminders they have set (same ServiceID). Keep in mind that there should never be two running clusters of the calendar service at the same time.
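
The two roles can be written down as a pair of predicates. This is only a sketch of the distinction described here. The `Silo` class, the helper functions, and the ID strings are hypothetical and do not correspond to Orleans types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Silo:
    name: str
    deployment_id: str  # changes with every deployment
    service_id: str     # stays fixed for the life of the service


def same_cluster(a: Silo, b: Silo) -> bool:
    # Silos join the same cluster only if their DeploymentIDs match.
    return a.deployment_id == b.deployment_id


def shares_reminders(a: Silo, b: Silo) -> bool:
    # Reminders are looked up by ServiceID, independent of deployment.
    return a.service_id == b.service_id


CALENDAR = "calendar-service-guid"  # placeholder for a real Guid
old_1 = Silo("old-1", "calendar-deploy-1", CALENDAR)
old_2 = Silo("old-2", "calendar-deploy-1", CALENDAR)
new_1 = Silo("new-1", "calendar-deploy-2", CALENDAR)
```

Here `old_1` and `old_2` form one cluster, `new_1` belongs to a different one, and all three see the same reminders because the ServiceID never changed.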

shayhatsor on 10 Aug 2016

Thanks @shayhatsor - that makes a lot more sense, now!

mtdaniels on 10 Aug 2016

I hate to reopen this - but I would like to get a better sense of actual use cases / best practices.

After reading and re-reading @shayhatsor's last comment, the ServiceId should be the same between deployments. The DeploymentId would change (i.e. it is version specific). In order to prevent reminders from causing grain activations in multiple clusters,

there should never be two running clusters of the calendar service at the same time

I think that "calendar service" is intended to read "same service". If that is the case, then you'd only ever have one deployment of a service running at a time, which seems to negate the need for separate ServiceIds and DeploymentIds.

Based on this http://dotnet.github.io/orleans/Frequently-Asked-Questions

...you need to provision and start a new deployment, switch incoming traffic to it, and then shut down the old deployment.

That indicates that you in fact have overlapping deployments, which seems to contradict the "never be two running clusters" note above. Based on @shayhatsor's information and the bit from the FAQ, it seems like I would be changing the ServiceId and DeploymentId with every new version, deploy side-by-side, switch incoming traffic, then take down the old deployment when it is idle. This strategy also seems to make ServiceId/DeploymentId redundant.

I am very likely missing something (again)... just need a bit of guidance :-)

mtdaniels on 18 Aug 2016

It seems sort of arbitrary that reminders are scoped to the Service rather than the Deployment (and that there is no flexibility to choose). The decision could even be made for each individual reminder via an option which specified if the reminder survived across deployments.

I suspect we will change the DeploymentId and the ServiceId when we deploy a new version. The old version will stay up until it has completed any outstanding work. This decision is sort of made for the same reason (I think) that the DeploymentId changes between versions. There is no expected compatibility between deployments. If reminders are triggered across deployments, then there becomes some expected compatibility - which seems a bit odd.

Since you brought up persistent storage, I took a gander at the implementation I'm using ( https://github.com/OrleansContrib/Orleans.StorageProviders.SimpleSQLServerStorage ) - which uses GrainReference.ToKeyString() as the primary key. I'm not sure if this is consistent across other implementations - but that seems to indicate that persistent storage would get used across deployments and services. In our implementation - that is a non-issue (the grain ids wouldn't overlap), but worth noting since we're on the topic.

Forgive me - I'm not trying to beat a dead horse (I try not to beat live ones either) - just want to get all of these details out there. Hopefully it will prevent me (and maybe someone else) from doing something stupid.

mtdaniels on 19 Aug 2016

@mtdaniels

It seems sort of arbitrary that reminders are scoped to the Service rather than the Deployment (and that there is no flexibility to choose). The decision could even be made for each individual reminder via an option which specified if the reminder survived across deployments.

We originally scoped reminders to Deployment ID (like most other things), and that caused trouble. Because reminders are intended for longer-term timers, users were unexpectedly losing them after an upgrade. So we fixed that behavior by switching it to Service ID. Since reminders are part of persistent state, just like persistent grain state, they are meant to transcend deployments.

If somebody needs the opposite behavior, the potential workaround is to always set Service ID equal to Deployment ID. However, that won't work if at the same time one wants to scope persistent grain state to transcend deployments.
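
One possible way to keep the two IDs in lockstep, given that the thread says ServiceId is a Guid while DeploymentId is a string, is to derive the Guid deterministically from the DeploymentId. The sketch below does this with a name-based UUID (version 5) from Python's standard `uuid` module. The namespace value and the `service_id_for` helper are assumptions for illustration, not anything Orleans provides.

```python
import uuid

# Hypothetical namespace for this application; any fixed UUID works,
# as long as every silo uses the same one.
APP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "example.invalid")


def service_id_for(deployment_id: str) -> uuid.UUID:
    """Derive a stable Guid from a DeploymentId string (name-based UUID)."""
    return uuid.uuid5(APP_NAMESPACE, deployment_id)


v13 = service_id_for("CalendarServiceV1.3")
v14 = service_id_for("CalendarServiceV1.4")
v13_again = service_id_for("CalendarServiceV1.3")
```

Every silo in a deployment computes the same value, and a new deployment gets a different one. The caveat above still applies: anything else keyed on Service ID is then scoped per deployment as well.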

sergeybykov on 19 Aug 2016

@sergeybykov

reminders are part of persistent state, just like persistent grain state

I suspect you mean that conceptually. Practically, they seem to be independent (reminders are stored in the SystemStore, grain state is stored using the StorageProvider).

Scoping persistent grain state seems to be left up to the provider implementation. AzureTableStorage, for instance, includes the ServiceId in the key. I don't believe SqlStorageProvider includes ServiceId in the key.

mtdaniels on 19 Aug 2016

@mtdaniels The one in master uses ServiceId in the condition. Otherwise the key is the grain ID and grain type hashed to an index in a heap table, collisions solved by the actual names.

Also, as a disclaimer, this is a new one intended to replace the old version, which was rather complicated to set up and operate. That complexity was the reason SimpleSQLServerStorage was introduced originally, I believe.
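
As a loose illustration of that lookup shape (a hash used as the index, with the real grain ID, grain type, and ServiceId compared in the condition), here is a toy version using Python's `sqlite3`. The schema, the use of CRC32 as the hash, and the helper names are all assumptions. The provider's actual tables and hash function may differ.

```python
import sqlite3
import zlib


def key_hash(grain_id: str, grain_type: str) -> int:
    # CRC32 is an arbitrary stand-in for whatever hash the provider uses.
    return zlib.crc32(f"{grain_type}|{grain_id}".encode("utf-8"))


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE storage ("
    "key_hash INTEGER, grain_id TEXT, grain_type TEXT, service_id TEXT, payload TEXT)"
)
# Non-unique index: different keys may share a hash value.
db.execute("CREATE INDEX ix_storage_hash ON storage (key_hash)")


def write(grain_id, grain_type, service_id, payload):
    db.execute(
        "INSERT INTO storage VALUES (?, ?, ?, ?, ?)",
        (key_hash(grain_id, grain_type), grain_id, grain_type, service_id, payload),
    )


def read(grain_id, grain_type, service_id):
    # The hash narrows the search; the actual names and the ServiceId
    # in the WHERE clause resolve collisions and scope the row.
    row = db.execute(
        "SELECT payload FROM storage "
        "WHERE key_hash = ? AND grain_id = ? AND grain_type = ? AND service_id = ?",
        (key_hash(grain_id, grain_type), grain_id, grain_type, service_id),
    ).fetchone()
    return row[0] if row else None


write("42", "CalendarGrain", "service-a", "state-a")
found = read("42", "CalendarGrain", "service-a")
other_service = read("42", "CalendarGrain", "service-b")
```

The same grain ID and type read under a different ServiceId returns nothing, which is the scoping effect described for the provider in master.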

veikkoeeva on 19 Aug 2016

@mtdaniels

I suspect you mean that conceptually.

Yes, I meant conceptually.

Scoping persistent grain state seems to be left up to the provider implementation.

Providers are supposed to behave consistently, so that an app can switch them safely. We just have no way to enforce such consistency across providers. Not yet at least.

sergeybykov on 19 Aug 2016
