"You are charged for RUs consumed, since data movement in and out of Cosmos containers always consumes RUs. You are charged for RUs consumed by the lease container."
Assuming that my container is billed as "manually provisioned throughput" and therefore will not automatically scale, what happens to the change feed if I exceed the provisioned RU/s?
Requests will fail for a pre-set amount of time. You can find out the "reset time" by examining the x-ms-retry-after-ms header returned on any such failed request.
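As an illustration only, here is a minimal sketch of inspecting that header, assuming the Python azure-cosmos SDK; the account, database, container, and item names are placeholders, and the assumption is that the throttled request's HTTP response (and its headers) is exposed on the raised error:

```python
# Sketch (Python azure-cosmos SDK assumed): detect throttling (HTTP 429)
# and inspect the x-ms-retry-after-ms header on the failed request.
from azure.cosmos import CosmosClient, exceptions

# Placeholder connection details -- substitute your own account, key, and names.
client = CosmosClient(url="https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("<database>").get_container_client("<container>")

try:
    container.read_item(item="<item-id>", partition_key="<partition-key>")
except exceptions.CosmosHttpResponseError as e:
    if e.status_code == 429:
        # Assumption: the underlying HTTP response headers are reachable via e.response.
        headers = e.response.headers if e.response is not None else {}
        print("Throttled; server suggests retrying after",
              headers.get("x-ms-retry-after-ms"), "ms")
```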
@JasonBSteele Please get back to us if you have any further questions, or we can close this thread.
Requests will fail for a pre-set amount of time. You can find out the "reset time" by examining the x-ms-retry-after-ms header returned on any such failed request.
What is the impact to a Function triggered by changes? Presumably it will stop receiving them for a while (how long?) and then start receiving them again with no changes lost?
Yes, but you WILL lose changes from the time span during which the service was throttled.
If I understand correctly, you are saying that one or more changes may be lost if I exceed the RU/s? That a change could be made but never raised?
@JasonBSteele - I cannot say that definitively. It depends on how soon after the change you are able to resume normal operations and how you have subscribed to the change notifications. Reading up on the time limitations of those mechanisms should help you further.
It is quite possible that your method allows for all changes to be retained durably for much longer periods of time.
My particular scenario is to use a Function trigger.
Function trigger, ok. But how? Directly? Through an EventHub? Via a queue? What?
Directly, using the change feed trigger:
https://docs.microsoft.com/en-us/azure/cosmos-db/change-feed-functions
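For context, a minimal sketch of such a trigger (Python v1 programming model; the cosmosDBTrigger binding itself -- database, container, lease container, connection setting -- is declared in the function's function.json as described in the linked doc, and all names here are placeholders):

```python
# Sketch of a change feed-triggered Azure Function (Python, v1 model).
# The cosmosDBTrigger binding is configured in function.json; the handler
# simply receives the batch of changed documents.
import logging
import azure.functions as func


def main(documents: func.DocumentList) -> None:
    if documents:
        logging.info("Change feed delivered %d document(s)", len(documents))
        for doc in documents:
            # Each Document behaves like a dict of the changed item's JSON.
            logging.info("Changed item id: %s", doc.get("id"))
```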
Ah! Wait wait. I completely misunderstood your question. Let me take it from the top.
These changes are stored for a long long time by the backing engine. I don't believe that you ever lose these changes. You can, for example, have a 5-10 year old dataset [i.e., created in CosmosDB 5-10 years ago] and read all the changes that happened to those elements from that time to the present moment.
The only time you LOSE changes is if an item is deleted. Somehow, when you delete an item, you lose its change history. This is odd, yes, but it is consistent across all Azure resources -- as in you can see changes, but never deletions of things.
So what I believe will happen when you exceed your provisioned RU throughput is that you should see some temporary throttling.
You can monitor this by examining the x-ms-retry-after-ms [there are other x-ms- headers too] HTTP header sent by Cosmos during the period of throttling. I am not sure if you can read them from your Change Feed Function, but if you can: read it, and loop-sleep until it reduces to zero / starts working again.
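Roughly, that loop-sleep idea might look like the sketch below (Python azure-cosmos SDK assumed; `do_work` is a hypothetical placeholder for whatever call is being throttled, and the header location on the error is an assumption):

```python
# Rough sketch of "read x-ms-retry-after-ms and loop-sleep until it works again".
import time
from azure.cosmos import exceptions


def call_with_backoff(do_work, max_attempts=5):
    """Retry do_work() while Cosmos DB reports throttling (HTTP 429)."""
    for _ in range(max_attempts):
        try:
            return do_work()
        except exceptions.CosmosHttpResponseError as e:
            if e.status_code != 429:
                raise  # not throttling; surface the error
            # Assumption: the suggested wait is exposed on the error's HTTP response headers.
            headers = e.response.headers if e.response is not None else {}
            retry_ms = float(headers.get("x-ms-retry-after-ms") or 1000)
            time.sleep(retry_ms / 1000.0)
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```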
Ok, this is more like what I would have expected! However instead of:
You can monitor this by examining the x-ms-retry-after-ms [there are other x-ms- headers too] HTTP header sent by Cosmos during the period of throttling. I am not sure if you can read them from your Change Feed Function, but if you can: read it, and loop-sleep until it reduces to zero / starts working again.
I would expect that after the throttling, my Change Feed function would be called for each event that occurred during the throttling. The Change Feed function is only triggered, it is not responsible for looping through historic events.
Thanks for persevering with this!
Why don't you try that out with a test Cosmos DB instance set to the lowest throughput and exhaust it manually?
PS: haha. out of curiosity more than anything else.
Sure, I could do that, and it might work fine that time! But I would rather see a statement in the docs that, by design, an event will never be lost in these circumstances and will be delivered eventually.
I need to rely on this functionality for keeping systems in sync, so I need some guarantee that changes won't be lost if we hit a spike of traffic.
Docs -- that's for the silent @NavtejSaini-MSFT's team to determine and update.
@JasonBSteele @sujayvsarma Really great to see the community helping each other. We are assigning this to our author to check the requirement and make the change to the doc if needed.
@markjbrown @SnehaGunda Please check the feedback and help with your insight.
Change Feed is not an op-log and does not contain every mutation of records within a container. It only contains the latest version of an item. It also does not contain deletes. If you want to track deletes, use a tombstone flag on your data. This and more details on Change Feed are explained in our docs here: https://docs.microsoft.com/en-us/azure/cosmos-db/change-feed.
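As an illustration, a soft-delete ("tombstone") might look like the following sketch, assuming the Python azure-cosmos SDK; marking the item rather than deleting it produces a change feed entry, and the optional per-item ttl (which requires TTL to be enabled on the container) lets the service purge it later:

```python
# Sketch of the tombstone approach: instead of deleting an item outright,
# mark it deleted so the change feed records the update.
from azure.cosmos import CosmosClient

client = CosmosClient(url="https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("<database>").get_container_client("<container>")

item = container.read_item(item="<item-id>", partition_key="<partition-key>")
item["deleted"] = True          # tombstone flag that change feed consumers can filter on
item["ttl"] = 7 * 24 * 60 * 60  # optional: purge after 7 days (needs TTL enabled on the container)
container.upsert_item(item)
```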
To answer the OP's question: Azure Functions uses our SDK automatically and will retry 10 times if rate limited on throughput. However, this scenario is highly unlikely. The initial call to read the change feed costs about 2 RUs, and each read out of the change feed is a 1 RU point read. As such, it is highly unlikely, if not impossible, for a Trigger Function to experience 429s, because writes cost much more. You are more likely to be rate limited on the write path than on the read path using Change Feed. In a scenario where writes are outpacing the reads in Change Feed, it will simply fall behind and catch up when writes slow down or stop.
Thanks.
As such, it is highly unlikely, if not impossible, for a Trigger Function to experience 429s
Well that's "fairly" reassuring, but I would have thought that this was worth calling out in the docs? (It's taken me a long time to get to this response, and I am sure other users would like to know the situation.)
Yeah sorry for the delay. Bit backlogged.
Thanks.
Hi @markjbrown, I quite understand, it really wasn't a complaint. I was attempting to point out that without this guidance in the docs, others (and MSFT associates) will also spend this time getting to the same conclusion. Many thanks