Azure-docs: reliability?

Created on 3 Jun 2019 · 6 Comments · Source: MicrosoftDocs/azure-docs

So based on these bullet points, is it safe to say that only Service Bus ensures reliable asynchronous message delivery, and that the other two do not ensure reliability?



Pri1 cxp event-grisvc product-question triaged

All 6 comments

Hi @jwisener Thank you for your feedback! We will review and update as appropriate.

Is the statement I made correct, that the other two do not ensure reliability?

Hi @jwisener, not quite. All three services are built on Service Fabric in order to be highly reliable in their own right. Whenever data is passed to any of them, it is replicated three times before being acknowledged, so it is reliably stored in the system. That's the first part of reliability.

Next is delivery, which all three also do reliably, but differently. Service Bus allows you to perform a peek-lock when receiving and processing a message, meaning that you may read the message, but Service Bus won't consider it delivered until you complete it. This gives you an at-least-once delivery guarantee.
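To make the peek-lock flow concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the Azure Service Bus SDK: the class `ToyQueue` and its `send`, `receive`, `complete`, and `abandon` methods are hypothetical names chosen for illustration, and lock expiry is only represented by an explicit `abandon` call.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a peek-lock queue (hypothetical, not the Azure SDK)."""

    def __init__(self):
        self._ready = deque()   # messages waiting to be received
        self._locked = {}       # lock token -> message handed out but not completed
        self._next_token = 0

    def send(self, body):
        self._ready.append(body)

    def receive(self):
        """Lock the next message and hand it out without deleting it."""
        if not self._ready:
            return None
        body = self._ready.popleft()
        self._next_token += 1
        self._locked[self._next_token] = body
        return self._next_token, body

    def complete(self, token):
        """The consumer finished processing: only now is the message removed."""
        del self._locked[token]

    def abandon(self, token):
        """The consumer failed: make the message available again."""
        self._ready.appendleft(self._locked.pop(token))


queue = ToyQueue()
queue.send("order-1")

token, body = queue.receive()
queue.abandon(token)            # simulate a crash during processing

token, body = queue.receive()   # the same message is delivered a second time
queue.complete(token)           # processing succeeded, the message is gone
remaining = queue.receive()     # nothing left to receive
```

Because the message is only removed on `complete`, a consumer that fails partway through sees it again, which is why the guarantee is at least once rather than exactly once.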

Event Grid is push-based rather than pull-based like Service Bus, so it's handled a little differently. When Event Grid pushes to an endpoint, it expects an HTTP status code in the 200-204 range in response. Anything else is considered a failure. In the event of a failure, Event Grid will retry delivery of the event with exponential backoff for up to 24 hours. After 24 hours, if a dead-letter destination is configured, the event will be put in the dead-letter destination for later consumption. In this way, Event Grid provides an at-least-once guarantee on delivery.
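The retry behavior described above can be sketched roughly as follows. This is a simulation under stated assumptions, not Event Grid's actual implementation: the function name `deliver_with_retry`, the 10-second first delay, and the plain doubling schedule are all made up for illustration, and time is tracked in a counter rather than by sleeping.

```python
def deliver_with_retry(event, endpoint, dead_letter=None,
                       first_delay_s=10, max_window_s=24 * 60 * 60):
    """Push `event` to `endpoint` until it returns 200-204 or the window elapses.

    `endpoint` is any callable that takes the event and returns an HTTP status.
    Returns the number of delivery attempts made. Time is simulated.
    """
    elapsed_s = 0
    delay_s = first_delay_s     # assumed starting delay, not the real schedule
    attempts = 0
    while elapsed_s <= max_window_s:
        attempts += 1
        status = endpoint(event)
        if 200 <= status <= 204:
            return attempts     # the endpoint acknowledged the event
        elapsed_s += delay_s    # pretend we waited before the next attempt
        delay_s *= 2            # exponential backoff
    if dead_letter is not None:
        dead_letter.append(event)   # retry window exhausted: keep it for later
    return attempts


# An endpoint that fails twice and then succeeds.
responses = iter([503, 500, 200])
attempts_ok = deliver_with_retry({"id": "1"}, lambda e: next(responses))

# An endpoint that never succeeds, with a dead-letter destination configured.
dead = []
attempts_failed = deliver_with_retry({"id": "2"}, lambda e: 500, dead_letter=dead)
```

An endpoint that responds slowly or fails after doing its work may receive the same event more than once, so handlers are usually written to tolerate duplicates.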

Finally, Event Hubs is different from both of the others. You can consider data written to an Event Hub to be immutable, and as such there is no such thing as a destructive read. In fact, you can think of an Event Hub as similar to a continuous tape or ledger that is being written to. You use an offset to know where you are in reading the data on an Event Hub, and "checkpointing" allows you to store, on the server, the position you most recently read successfully. This way, if you experience a failure, you can begin reading again at the checkpoint, which guarantees at-least-once delivery of all data in the Event Hub.
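As a rough illustration of offsets and checkpointing, the sketch below models the event stream as a Python list and the checkpoint store as a dict. None of this is the Event Hubs SDK: `run_reader`, `crash_at`, and `checkpoint_every` are hypothetical names, and a real checkpoint would live in durable storage rather than in memory.

```python
log = ["e0", "e1", "e2", "e3", "e4"]   # append-only; reading never removes anything
checkpoint = {"offset": 0}              # stand-in for the durable checkpoint store
processed = []                          # everything the reader has handled so far


def run_reader(crash_at=None, checkpoint_every=2):
    """Read from the last checkpoint; optionally crash before handling `crash_at`."""
    offset = checkpoint["offset"]
    while offset < len(log):
        if offset == crash_at:
            raise RuntimeError("reader crashed")
        processed.append(log[offset])
        offset += 1
        if offset % checkpoint_every == 0:
            checkpoint["offset"] = offset   # record the next offset to read


try:
    run_reader(crash_at=3)   # handles e0, e1, e2; the last checkpoint is offset 2
except RuntimeError:
    pass

run_reader()                 # restarts at offset 2, so e2 is handled again
```

After the restart every event has been handled, and anything read after the last checkpoint is handled twice. Checkpointing more often narrows that replay window at the cost of more writes to the checkpoint store.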

I hope that helps explain how all three services ensure reliability, but in different ways.

Thanks for adding your valuable input here, @banisadr!

@jwisener Apologies for the delay in response. I hope the comment above answers your question. Since there isn't any change required to the documentation from this issue, we will now proceed to close this thread. If there are further questions regarding this matter, please open a thread on MSDN forums and we will gladly continue the discussion.

Hi @banisadr
Good explanation.
But I think peak-lock must be peek-lock.

Good catch @peterstouw, I must have been daydreaming about climbing mountains ;)
