Azure-sdk-for-net: [Design Discussion] Should Event/Message Batches be Scoped to a Single Send or Reusable?

Created on 15 Apr 2020 · 16 Comments · Source: Azure/azure-sdk-for-net

Summary

The current design of the EventDataBatch in the Event Hubs client and the ServiceBusMessageBatch in the Service Bus client considers them scoped to a single send operation; once a batch is full and has been published, it is intended to be disposed and a new batch created for any subsequent send operations. As a result, batches are append-only entities: events/messages may be added to a batch and are then held by it until it is disposed.

Scope of Discussion

  • Should the batch support a Clear operation that could be called to allow it to be cleared after a send operation and reused?

  • Does allowing the batch to be emptied and reused cause confusion about when it should be disposed?

  • Would including the ability to clear a batch be perceived as awkward without the ability to peek at events in a batch and/or remove a specific item?

Out of Scope

  • Access to individual events in a batch; some implementations do not hold a reference to the event/message when added to the batch. Instead, the event/message is translated to the resulting wire format (such as an AMQP message) and membership in the batch is based on the wire format.

Considerations

  • The maximum allowable size for a batch is determined by the service and communicated to the client. Different service SKUs allow different sizes for a single send operation; the reason that batch creation is asynchronous is to allow the service to be queried. The service value is cached and used for any subsequent batch creations. Only the first batch creation or send operation pays the tax of making the query.

  • To determine the size of a batch, the events/messages in the batch must be serialized to the wire format of their protocol (for example, an AMQP message); adding a message to the batch requires paying the serialization cost to measure the resulting size in bytes. It is not possible to defer that action and accurately predict the batch size.

  • Allowing visibility or manipulation of individual messages/events in a batch would potentially double the memory needed for some batch implementations, depending on the language, or would come with a performance cost. For this reason, manipulating and exposing individual events in the batch is not currently open to consideration.

  • Historically, one of the reasons that a Clear operation was not considered is that the batch was attempting some "clever" optimizations when translating events/messages to their respective wire format. Early previews of Event Hubs surfaced corner cases that resulted in a more straightforward implementation.

  • There are some potential enhancements that are under consideration for the internal batch implementation to improve performance and lower resource costs. It will be important to consider the impact of any batch API changes against the potential improvements.
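As a rough illustration of the size-measurement consideration above, the Python sketch below uses a hypothetical `ToyBatch` class (not the SDK's EventDataBatch), with compact JSON standing in for the real wire format such as AMQP. It only shows why an add has to serialize first: the byte size is unknown until the event is encoded, and only the encoded form is kept by the batch.

```python
import json


class ToyBatch:
    """Hypothetical append-only batch; an assumption for illustration only."""

    def __init__(self, max_size_in_bytes):
        self.max_size_in_bytes = max_size_in_bytes
        self.size_in_bytes = 0
        self._encoded = []  # only the serialized form is retained

    def try_add(self, event):
        # Serialize up front: this is the only way to learn the size.
        # JSON is a stand-in here; a real client would encode to AMQP.
        payload = json.dumps(event, separators=(",", ":")).encode("utf-8")
        if self.size_in_bytes + len(payload) > self.max_size_in_bytes:
            return False  # event does not fit; batch is left unchanged
        self._encoded.append(payload)
        self.size_in_bytes += len(payload)
        return True

    @property
    def count(self):
        return len(self._encoded)


batch = ToyBatch(max_size_in_bytes=32)
small_fits = batch.try_add({"id": 1})          # 8 bytes once encoded
large_fits = batch.try_add({"body": "x" * 40})  # too large for the remainder
print(small_fits, large_fits, batch.count, batch.size_in_bytes)
```

Because only the encoded payload is stored, exposing or removing individual events would require keeping a second copy of each one, which is the memory concern raised in the list above.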

Labels: Client, Event Hubs, Service Bus, design-discussion

All 16 comments

//cc: @richardpark-msft, @chradek, @conniey, @hemanttanwar, @srnagar, @KieranBrantnerMagee, @yunhaoling, @YijunXieMS, @JoshLove-msft, @ShivangiReja, @lilyjma

I can see why wanting to hold onto and re-send a batch might be useful independent of whether a Clear operation is exposed. For instance, if an app needs to send the same set of messages periodically on the same link. What are the benefits of having a Clear method? Is it just that we think this is something users will expect?

Another issue with exposing the messages in a batch (I know this is out of scope, but just wanted to mention this as we ran into this in the initial design for SB) is that users could change the message properties which would invalidate our size calculations.

> I can see why wanting to hold onto and re-send a batch might be useful independent of whether a Clear operation is exposed. For instance, if an app needs to send the same set of messages periodically on the same link. What are the benefits of having a Clear method? Is it just that we think this is something users will expect?

We've received feedback that, in some cases, creating a batch is awkward due to the asynchronous nature of the call, and developers would prefer to have a flow something like:

- Create a batch

- While there are events/messages to publish:
    - Add events/messages to the batch
    - Publish the batch
    - Clear the batch

Once a batch is created and messages/events have been added, it can be sent multiple times, though events/messages are typically transient in nature and intended to be published once. I don't know of any scenarios off-hand where the same set of messages/events would be useful to send periodically. Even something as cyclical as a heartbeat event would normally take a snapshot of current state (even if that is just a timestamp) to offer context when it was received.
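To make the comparison concrete, here is a minimal Python sketch of the two flows: creating a new batch per send (the current design) and the requested create/add/publish/clear loop. `ToyBatch`, `ToyProducer`, and every method on them are hypothetical stand-ins, not the client library's API, and the batch limit is a simple item count rather than a byte size.

```python
import asyncio


class ToyBatch:
    """Hypothetical batch; `clear` is the operation under discussion."""

    def __init__(self, max_count):
        self.max_count = max_count
        self.items = []

    def try_add(self, event):
        if len(self.items) >= self.max_count:
            return False
        self.items.append(event)
        return True

    def clear(self):
        self.items.clear()


class ToyProducer:
    """Hypothetical producer that records what was published."""

    def __init__(self):
        self.published = []

    async def create_batch(self):
        return ToyBatch(max_count=2)

    async def send(self, batch):
        self.published.append(list(batch.items))


async def publish_with_new_batches(producer, events):
    # Current design: each send gets its own batch.
    pending = list(events)
    while pending:
        batch = await producer.create_batch()
        while pending and batch.try_add(pending[0]):
            pending.pop(0)
        await producer.send(batch)


async def publish_with_reuse(producer, events):
    # Requested flow: create once, then add / publish / clear.
    pending = list(events)
    batch = await producer.create_batch()
    while pending:
        while pending and batch.try_add(pending[0]):
            pending.pop(0)
        await producer.send(batch)
        batch.clear()


first, second = ToyProducer(), ToyProducer()
asyncio.run(publish_with_new_batches(first, ["a", "b", "c"]))
asyncio.run(publish_with_reuse(second, ["a", "b", "c"]))
print(first.published)
print(second.published)
```

In this toy model both flows publish the same groups of events; the only difference is whether the loop awaits a batch creation or calls `clear` between sends.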

> We've received feedback that, in some cases, creating a batch is awkward due to the asynchronous nature of the call, and developers would prefer to have a flow something like:
>
> - Create a batch
>
> - While there are events/messages to publish:
>     - Add events/messages to the batch
>     - Publish the batch
>     - Clear the batch

Hmm, but publishing the batch is still asynchronous. And I guess creating the batch is only asynchronous on the first call - subsequent calls should complete synchronously.

I have never seen a customer ask for that; however, I don't think it will hurt to support a Clear API on the batch. Does CreateBatchAsync cause I/O on every call or only on the first call? I assume that once the link is established, max-message-size will be available for subsequent calls without a need to go over the wire again.

@serkantkaraca It's only the first call that goes over the wire to get the max batch size.
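The behavior described here can be sketched in Python as follows. This is a hypothetical model, not the client library's implementation: `ToyProducer`, `_query_max_batch_size`, and the size value are assumptions made for illustration. The point is only that the creation method is asynchronous because it may need the service limit, while the cached limit means later calls do no I/O.

```python
import asyncio


class ToyProducer:
    """Hypothetical producer; the service query is simulated."""

    def __init__(self):
        self._max_batch_size = None
        self.service_queries = 0

    async def _query_max_batch_size(self):
        # Stand-in for the network round trip to the service.
        self.service_queries += 1
        await asyncio.sleep(0)
        return 1024 * 1024  # illustrative value, not a documented limit

    async def create_batch(self):
        # Only the first call reaches the service; later calls use the cache.
        if self._max_batch_size is None:
            self._max_batch_size = await self._query_max_batch_size()
        return {"max_size_in_bytes": self._max_batch_size, "events": []}


async def main():
    producer = ToyProducer()
    batches = [await producer.create_batch() for _ in range(3)]
    return producer.service_queries, len(batches)


queries, created = asyncio.run(main())
print(queries, created)
```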

The request comes up from time to time, but it's not something that I've heard a large outcry for. I remember seeing a mention or two elsewhere, but here's a recent discussion that I was involved in:

https://github.com/Azure/azure-sdk-for-net/issues/9217

If I'm voting (and this is only because it's already been _released_) I don't think Clear() is a good change now.

When I designed my app, EventDataBatch was an "append-only" thing. So if I'm passing it around, I had a guarantee that it could only be added to.

Now it's possible somebody might call Clear on it. I.e., I've broadened the interface in a way that breaks a prior constraint.
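A small Python sketch of that concern, using a hypothetical `ToyBatch` and a hypothetical helper `add_audit_event` (neither is part of any client library): once `clear` exists, any code that is handed the batch can discard events that the owner assumed were safely held.

```python
class ToyBatch:
    """Hypothetical batch used only to illustrate the constraint."""

    def __init__(self):
        self.items = []

    def try_add(self, event):
        self.items.append(event)
        return True

    def clear(self):
        # The proposed addition; without it, items can only grow.
        self.items.clear()


def add_audit_event(batch):
    # A helper that receives the batch from elsewhere in the app.
    # Before `clear` existed, the worst it could do was add events.
    batch.try_add("audit")
    batch.clear()  # now it can also drop everything the caller added


batch = ToyBatch()
batch.try_add("order-1")
add_audit_event(batch)
remaining = len(batch.items)
print(remaining)
```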

> If I'm voting (and this is only because it's already been _released_) I don't think Clear() is a good change now.
>
> When I designed my app, EventDataBatch was an "append-only" thing. So if I'm passing it around, I had a guarantee that it could only be added to.
>
> Now it's possible somebody might call Clear on it. I.e., I've broadened the interface in a way that breaks a prior constraint.

You're certainly not wrong, and this is an important point of consideration. The follow-up question that I'd ask is how many application authors are exposing Event Batches in a manner where a caller outside of their control would be able to interact with it. If we're discussing changes internally to a single application, I think that weakens the concern a bit.

> The follow-up question that I'd ask is how many application authors are exposing Event Batches in a manner where a caller outside of their control would be able to interact with it.

Good question. I'm not sure how to answer. I guess we'll have to start crawling through GitHub source code to see what's all out there being used. :D

> > I can see why wanting to hold onto and re-send a batch might be useful independent of whether a Clear operation is exposed. For instance, if an app needs to send the same set of messages periodically on the same link. What are the benefits of having a Clear method? Is it just that we think this is something users will expect?
>
> We've received feedback that, in some cases, creating a batch is awkward due to the asynchronous nature of the call, and developers would prefer to have a flow something like:
>
> - Create a batch
>
> - While there are events/messages to publish:
>     - Add events/messages to the batch
>     - Publish the batch
>     - Clear the batch
>
> Once a batch is created and messages/events have been added, it can be sent multiple times, though events/messages are typically transient in nature and intended to be published once. I don't know of any scenarios off-hand where the same set of messages/events would be useful to send periodically. Even something as cyclical as a heartbeat event would normally take a snapshot of current state (even if that is just a timestamp) to offer context when it was received.

@jsquire do the users want to pass an EventDataBatch from one function to another? The .clear() method will be helpful in this scenario. If adding and publishing events are in one function, the clear method doesn't help.

> @jsquire do the users want to pass an EventDataBatch from one function to another? The .clear() method will be helpful in this scenario. If adding and publishing events are in one function, the clear method doesn't help.

That was the scenario that made sense to me.

The other potential win, at least in .NET, is that people tend to assume "async means I/O calls" and therefore equate each call to create a batch with "this slows down my code."

Though the reality is that only the first call pays the network tax and subsequent calls complete synchronously, it may be appealing from an optics perspective to Clear rather than await an async call to create the batch.

This may already have been discussed internally, but there are differences in batch usage between Event Hubs and Service Bus customers. Event Hubs is more telemetry-focused, where reusing the batch would be very logical. Service Bus is more associated with business applications, where events are discrete and batch sending takes place under certain conditions rather than continuously. There, a batch is used once and disposed of rather than reused.

We have come to a point where the consensus opinion is that we don't feel this is enough of a pain point for those using the client libraries to prioritize it over other features. As a result, closing this out. When the time comes to revisit, a new discussion should be opened.
