Azure-functions-durable-extension: DELETE for single entity rest API

Created on 10 Sep 2019 · 19 Comments · Source: Azure/azure-functions-durable-extension

It would be nice to support the DELETE method, which would destroy an instance of an entity.

Somewhat related to this
https://github.com/Azure/azure-functions-durable-extension/issues/930

enhancement

All 19 comments

Yes, I think that is a good idea. I think this also raises the question of whether we should add DeleteEntity/DeleteEntityAsync methods to IDurableEntityClient/IDurableOrchestrationClient.

This could avoid problems where users struggle to discover how to delete entities (see #932).

I agree - and think we should consider this for the final 2.0 release.

I think we need to simplify the current state management further. Currently, the entity state management (get/set state value or object) is separate from the entity lifetime management (an entity is created when called and deleted on destructOnExit).

I think it should all be just "state management". So we would just support (get/set/delete) on the entity state, without the "entity itself" (whatever that means) being subject to management operations.

To keep DELETE as simple as possible, I think I want to treat it exactly like a regular operation called "delete". Concretely, I plan to implement this as follows:

  • calling a DELETE http request on the entity is exactly equivalent to signaling an operation called 'delete'.

  • Inside DispatchAsync<T>, when the operation is called delete but T does not have a method called Delete, act as if there is a method:

void Delete() 
{
   Entity.Current.DeleteState();
}

I like the simplicity, but could this confuse customers who have defined a delete operation in their entity already? In that case it seems that a DELETE API call would invoke their operation rather than deleting the entity state.

Yes, that is the intent. This can be beneficial, for example to allow users to delete additional state when the entity is deleted. Of course we have to explain it.

The important point to get across is that DELETE is not some special management API, but just like any other user-defined operation, and has all the same behaviors (regarding locks, signal vs call, etc).
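A minimal, self-contained sketch of the dispatch rule discussed above. It does not use the Durable Functions runtime: `MiniDispatcher`, `Counter`, and `Cart` are hypothetical stand-ins, and returning `null` stands in for `Entity.Current.DeleteState()`. It only illustrates that a user-defined `Delete` method takes precedence over the implicit one.

```C#
using System;
using System.Reflection;

// Hypothetical entity class with no Delete method.
public class Counter
{
    public int Value { get; set; }
    public void Add(int amount) => Value += amount;
}

// Hypothetical entity class that defines its own Delete operation.
public class Cart
{
    public bool Archived { get; set; }
    public void Delete() => Archived = true;
}

public static class MiniDispatcher
{
    // Applies the named operation and returns the resulting state.
    // A null result stands in for "the entity state was deleted".
    public static T Dispatch<T>(T state, string operation, object input = null)
        where T : class, new()
    {
        if (state == null)
        {
            state = new T(); // first access creates the state
        }

        MethodInfo method = typeof(T).GetMethod(
            operation,
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.IgnoreCase);

        if (method == null)
        {
            // No user-defined method: only "delete" gets an implicit fallback.
            if (string.Equals(operation, "delete", StringComparison.OrdinalIgnoreCase))
            {
                return null;
            }
            throw new InvalidOperationException(
                $"No operation named '{operation}' on {typeof(T).Name}.");
        }

        // A user-defined method (including Delete) is invoked like any other operation.
        object[] args = method.GetParameters().Length == 0 ? null : new[] { input };
        method.Invoke(state, args);
        return state;
    }
}
```

With this sketch, dispatching "delete" to `Counter` clears the state, while dispatching it to `Cart` runs the user's `Delete` and leaves the state in place.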

BTW, I briefly considered changing the behavior of POST to make it more similar to DELETE: meaning that when you call POST on an entity without specifying the operation name, we invoke an entity operation called 'post'. It is a breaking change though.

Correct me if I'm wrong, but what I'm seeing here and in #932 is that the "delete" mechanism being moved towards deletes the state rather than removing the entire entity. Looking through the instances in the storage queue, I can see that "deleted" entities still exist and are shown as running, but they don't have any state (as intended).

What happens to those entities over time that have had their state deleted but still have an instance hanging around? Are they going to stay around indefinitely and create bloat or is there a way to get rid of them?

Yes, there is a distinction between the entity state (the user-defined state of the entity) and the entity scheduler data. The latter is used by the entity scheduler to implement in-order message delivery and distributed locking.

Currently, we offer no way to delete the metadata, so it will indeed create bloat.

This is somewhat undesirable. Perhaps we should consider automatically removing the metadata for deleted "idle" entities as soon as it is safe to do so? (We would need to ensure that the lock is released, no messages have been queued, and the last message was received more than DurableTaskOptions.EntityMessageReorderWindowInMinutes minutes ago, which is 30 by default).

Maybe this could be part of an automatic cleanup job that also removes completed orchestrations after some time.

I think automatic cleanup is a reasonable option, assuming that it could handle the race condition where the user hasn't written to the entity and starts writing again at the same time the cleanup job attempts to clean it up.

At present the expectation appears to be that users manage the cleanup of old orchestrations on their own, per the Azure documentation. If durable entities were exposed in a similar fashion I think that would work well.

As an example:
```C#
[FunctionName("CleanupDurableEntities")]
public static async Task Run(
    [DurableClient] IDurableOrchestrationClient client,
    [TimerTrigger("0 0 * * * *")] TimerInfo myTimer)
{
    // Proposed API (GetEntitiesAsync is hypothetical here): similar search as for
    // orchestrations, but with a flag to filter based on having state or not?
    var entities = await client.GetEntitiesAsync(
        DateTime.MinValue,
        DateTime.UtcNow.AddMinutes(-30),
        false);

    foreach (var entity in entities.Where(e => e.QueueLength == 0))
    {
        // Lock the entity, do a final check to make sure it still has no state, and destroy it
    }
}
```
This isn't running as an orchestration, so getting a proxy object and locking the entity isn't possible in this case, but it seems like those would be necessary to make sure we could destroy the entity and ensure nothing had written to it while doing so. Not sure what happens to messages that arrive during the destruction in a case like this.

I prefer this as an approach over a scheduled job as this is more visible as to how and when it is happening. It also would hopefully expose the ability to just destroy an entity directly without all of the locks and checks in the case that I was using it in a way that didn't cause issues to do so.

If an automated approach was preferred, would that be handled via some sort of TTL setting on the entity?

Giving an explicit delete option is slightly difficult because the precise conditions under which it is safe to delete the metadata are complicated and we don't necessarily want to expose those conditions to the application program.

> If an automated approach was preferred, would that be handled via some sort of TTL setting on the entity?

I think we would keep the deletion "logically invisible" - that is, it would not interfere with the programming model in any way. It's more like collecting garbage that has no relevance. In particular, we would never delete an entity that exists (i.e. has state), or has anything happening on it (locks, timers, message queues).

Probably being overly pedantic here, but at what point would it be safe to write to an entity that is being deleted? Based on what you've suggested, any writing to an entity prior to deletion would cause the deletion to be delayed because the entity became active again.

For new entities I'm assuming that there is some sort of check that creates the entries in the tables to track that entity for the first time. Assuming the entity was locked for deletion, and queues at that moment were empty, and the delete on the underlying tables was taking place, would a new message that came in for that entity still kick off the re-creation of that entity without an issue?

The important part to understand is that what we maintain in storage are two things: (1) the entity state, as defined by the application, and (2) metadata used by our scheduler, invisible to the application. These two are not deleted at the same time!

Let's keep the terminology clear for 'deletion' vs. 'garbage collection'.

  • 'deletion' means deletion of the application-defined entity state. It is performed explicitly by the application code, and always within the context of an entity operation. It happens at the same time as that operation. Note that it is perfectly safe to access this entity again after deletion; it will just be recreated, just like when accessing the entity the first time. There is no way to "permanently delete" an entity: the entity state can be created, deleted, and recreated as many times as desired.

  • 'garbage collection' means the runtime removes metadata it maintains as part of the entity scheduler. This metadata is not deleted immediately when the entity state is deleted (for various reasons). However, the application programmer does not really need to worry about this, because the metadata is not relevant to the application behavior. Also, we plan to remove it automatically after the entity has been deleted and not accessed anymore for some time.

This capability will be available starting in the v2.1 release.

@cgillum .... I cannot find this capability in the 2.1 release. Did it get de-scoped or am I missing it?

I'm trying to delete the entire storage record for a durable entity, not just the application-defined data. This is the same effect that await client.PurgeInstanceHistoryAsync(instanceId); has for an orchestration function, but for Durable Entities.

Entity.Current.DeleteState(); only removes the application-defined entity data but leaves the record in storage ("deletion" as described above).

Entity.Current.DestructOnExit(); does not seem to exist.

For additional context, you may want to see the later part of the discussion in #932, specifically this comment: https://github.com/Azure/azure-functions-durable-extension/issues/932#issuecomment-590205279

Any pointers greatly appreciated.

The functionality is a bit hidden at the moment. You can find mention of it in the SignalEntity HTTP API reference docs: https://docs.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-http-api#signal-entity. But as you mentioned (and this is being discussed here and in the other thread) this only deletes the application state and not the full table storage record. For the latter you would need to use our purge APIs until we come up with a more first-class story for entity cleanup.
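As a rough sketch of what such a call could look like from C#, the helper below builds an HTTP DELETE request against the entity route. The route shape (`/runtime/webhooks/durabletask/entities/{entityName}/{entityKey}`) and the `code` query parameter are assumptions taken from the linked HTTP API reference; `EntityHttp`, the base URL, and the key value are hypothetical.

```C#
using System;
using System.Net.Http;

public static class EntityHttp
{
    // Builds a DELETE request for one entity. Per the discussion above, this is
    // treated like signaling an operation named "delete" on that entity.
    public static HttpRequestMessage BuildDeleteRequest(
        string baseUrl, string entityName, string entityKey, string systemKey)
    {
        string url =
            $"{baseUrl.TrimEnd('/')}/runtime/webhooks/durabletask/entities/" +
            $"{Uri.EscapeDataString(entityName)}/{Uri.EscapeDataString(entityKey)}" +
            $"?code={Uri.EscapeDataString(systemKey)}"; // assumed auth parameter

        return new HttpRequestMessage(HttpMethod.Delete, url);
    }
}
```

The request could then be sent with `HttpClient.SendAsync`. As noted in the thread, this removes only the application-defined state, not the underlying table storage record.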

Thanks @cgillum .... am I correct in thinking that the purge API (PurgeInstanceHistoryAsync) only applies to orchestration state, not entity state?

For clarity: as of v2.1 (latest release on 26th Feb 2020) there is no built-in way to delete the full table storage record for an entity. Should we look to build something separate that operates at the storage layer to delete the underlying table row for entities that have already had their application state deleted?

Thanks

There is no need to go directly to storage. You can use the purge API to delete the full table storage record for entities (because they are internally implemented as a special kind of eternal orchestration).

However, note that there is a reason we do not immediately delete that metadata: we need it to guarantee in-order delivery of entity messages. Since an entity can be re-created after it is deleted, we cannot immediately delete that metadata (otherwise the message sorting logic does not work).

So, our recommendation is:

Do not purge an entity unless it has been idle (i.e. has not executed any operations) for at least 30 minutes, which is the default value of the setting DurableTaskOptions.EntityMessageReorderWindowInMinutes.

Otherwise, if you reuse this same entity id again soon afterwards, you may see some issues (such as messages not being delivered for up to 30 minutes after the last activity).

@sebastianburckhardt , thank you for your update and insight around delaying the purge.

What can I use for the instance ID if I want to use PurgeInstanceHistoryAsync for an entity? Presumably, this is the PartitionKey for the entity as seen in the underlying storage?

From what I can see the partition key is made up as follows: @entityclassname@entityid

Yes, that is right.
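Putting the recommendation and the instance ID format together, a minimal sketch might look like the following. `EntityPurge` and its members are hypothetical helpers; the purge call is passed in as a delegate so the sketch stays self-contained. Lower-casing the entity name is an assumption about how the extension normalizes names, and the 30-minute window is the default mentioned above.

```C#
using System;
using System.Threading.Tasks;

public static class EntityPurge
{
    // Default of DurableTaskOptions.EntityMessageReorderWindowInMinutes, per the thread.
    public static readonly TimeSpan ReorderWindow = TimeSpan.FromMinutes(30);

    // Instance ID format confirmed in the thread: @entityclassname@entityid.
    // Lower-casing the name is an assumption, not something stated in the thread.
    public static string ToInstanceId(string entityName, string entityKey)
        => $"@{entityName.ToLowerInvariant()}@{entityKey}";

    // Calls the purge delegate only if the entity has been idle for the full window.
    // Returns true when the purge was attempted, false when it was skipped.
    public static async Task<bool> PurgeIfIdleAsync(
        string entityName,
        string entityKey,
        DateTime lastUpdatedUtc,
        DateTime nowUtc,
        Func<string, Task> purge)
    {
        if (nowUtc - lastUpdatedUtc < ReorderWindow)
        {
            return false; // too recent: purging could break in-order delivery
        }

        await purge(ToInstanceId(entityName, entityKey));
        return true;
    }
}
```

In a function with a `[DurableClient] IDurableOrchestrationClient client` binding, the delegate could be `id => client.PurgeInstanceHistoryAsync(id)`. Where the last-activity timestamp comes from is left open here; using the `LastUpdatedTime` returned by `client.GetStatusAsync(id)` is one option, assuming it reflects the last entity operation.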
