Azure-functions-host: [ServiceBus] Adding AutoComplete option to host.json file

Created on 25 Oct 2017 · 25Comments · Source: Azure/azure-functions-host

Hi there,

I'm not really sure about where I should put this kind of request. Please, let me know if I'm in the wrong place.

We have moved some code dealing with ServiceBus messages to Azure Function. In our domain, when a message failed to be processed, we move it to the DeadLetter queue. Today, given the implementation of Microsoft.Azure.WebJobs.ServiceBus.MessageProcessor supposed an Azure Function receiving the BrokeredMessage, if we move the message to the DeadLetter queue, it calls the BrokeredMessage.CompleteAsync() resulting in Exception and retries.

Here is a simple sample code to reproduce the issue:

public static class MyDequeuer
{
    [FunctionName("SampleMessageExecutor")]
    public static async Task Run(
            [ServiceBusTrigger("test-queue", AccessRights.Listen, Connection = "ServiceBusConnectionString")]
            BrokeredMessage brokeredMessage, ILogger log, ExecutionContext ctx)
    {
        await brokeredMessage.DeadLetterAsync();
    }
}

If you look at the execution, you will get the following after the Function execution.

Exception thrown: 'System.ServiceModel.FaultException`1' in Microsoft.ServiceBus.dll
Exception thrown: 'System.ServiceModel.FaultException`1' in Microsoft.ServiceBus.dll
Exception thrown: 'System.ServiceModel.FaultException`1' in Microsoft.ServiceBus.dll
Exception thrown: 'System.ServiceModel.FaultException`1' in Microsoft.ServiceBus.dll
Exception thrown: 'System.ServiceModel.FaultException`1' in Microsoft.ServiceBus.dll
Exception thrown: 'Microsoft.ServiceBus.Messaging.MessageLockLostException' in Microsoft.ServiceBus.dll
Exception thrown: 'Microsoft.ServiceBus.Messaging.MessageLockLostException' in Microsoft.ServiceBus.dll
Exception thrown: 'Microsoft.ServiceBus.Messaging.MessageLockLostException' in Microsoft.ServiceBus.dll
Exception thrown: 'Microsoft.ServiceBus.Messaging.MessageLockLostException' in Microsoft.ServiceBus.dll
Exception thrown: 'Microsoft.ServiceBus.Messaging.MessageLockLostException' in Microsoft.ServiceBus.dll
Exception thrown: 'Microsoft.ServiceBus.Messaging.MessageLockLostException' in Microsoft.ServiceBus.dll

Is there any chance to add a way to customize the AutoComplete property from MessageOptions as we can do for AutoRenewTimeout and MaxConcurrentCalls from the host.json file in an Azure Function context.

Thanks,

help wanted improvement

Source

sescandell

Most helpful comment

@mathewc,
basically what i want to do is manage the state myself and not rely on the function's result. I am getting the same "The lock supplied is invalid" error that @kunalwadhwa is getting. That's because for certain exceptions, I don't want to retry it, but mark it as completed, while needing the function to fail so that I have a count of my failures in AppInsights.

The MessageProcessor code relies heavily on the function result. There is a chance to override that for result.Succeeded but we don't check that failures.

Like @sescandell suggested, can't we expose MessageOptions.AutoComplete through host.json and change to code to check that during failures too? That way, you leave the control to the user fully if they choose to do so using the MessageOptions.AutoComplete setting.

nebiyou-gebretatios on 24 Jun 2018

👍4

All 25 comments

I'm not 100% sure if the autocomplete property would help here, if our SDK is calling .Complete() after function execution completes.. thoughts?

paulbatum on 25 Oct 2017

From my point of view, if a user set the AutoComplete property to false, (s)he knows it's up to him(her) to call the Complete() function. Given the current MessageProcessor implementation it would work.

Maybe I didn't understand well your answer?

sescandell on 26 Oct 2017

Its possible I'm misunderstanding the MessageProcessor implementation. Here's the part I'm worried about:
https://github.com/Azure/azure-webjobs-sdk/blob/ebaed384f85c52a75f8aaf3fd65fda5245d865d7/src/Microsoft.Azure.WebJobs.ServiceBus/MessageProcessor.cs#L63-#L69

It checks to see if autocomplete is off, and if so it calls .Complete(). So the function would have to fail (such as throwing an exception) for .Complete() to not be called. I don't know this code super well so let me know if theres something I'm missing.

paulbatum on 26 Oct 2017

👍1

Throwing an exception makes logical sense but I agree I'm misunderstanding this also.

dcarr42 on 26 Oct 2017

The thing is, if we throw an Exception from the Executor (that would look weird), the Abandon() method would be call (neither AutoComplete is true or false). So, we still face the "same issue".

sescandell on 27 Oct 2017

They are marked as virtual can we override this default behaviour. a thought.

dcarr42 on 27 Oct 2017

@dallancarr If you set Max Delivery Count to 1 and then throw an exception instead of DeadLetter-ing, won't it do what you want?

mikhailshilkov on 12 Nov 2017

But the OP wants to call DeadLetter and not have it throw. So I think the OP needs to be assured of the correct behaviour.

dcarr42 on 14 Nov 2017

👍1

Hi @mikhailshilkov

Actually no it won't. If we throw an exception, the default behavior (the one coded currently) is to Abandon the message, not Deadletter-ing it.

The request here is to let the consumer to be responsible of the "message lifecycle" (completing, abandonning or deadleter-ing it)

sescandell on 15 Nov 2017

👍1

@sescandell Abandoning with Max Delivery Count = 1 will Dead Letter it

mikhailshilkov on 15 Nov 2017

@paulbatum

Maybe calling deadletter inside the function could be handled more gracefully? Or better still handle Application explicit dead lettering.

dcarr42 on 15 Nov 2017

@mikhailshilkov Oh, my mistake, I read Max Delivery Count, but thought Prefetch Count.

You're right, it should work... I'll try it.

That being said, if we can customize the AutoComplete option from the host file, that would be great too :)

sescandell on 17 Nov 2017

Setting MaxDeliveryCount is not a viable workaround i.m.h.o.

If a technical issue is causing the message not to be processed succesfully (e.g. network problem), I would like to throw the exception and have the runtime retry later (the default behavior).

But if a message has invalid content, there is no point in retrying delivery, that won't change anything, so I want to deadletter it myself, and the function runtime should not retry at a later time, nor throw an exception.

A simple sollution could be to check message.DeadLetterSource before calling Complete or Abandon.

mrwatts on 17 Apr 2018

👍1

+1 for an option where the function can be responsible of the "message lifecycle". Maybe the consumer wants to manually defer, or forgo abandoning and letting the message stay locked for 5 minutes before being retried. The way it's currently implemented, I've seen Azure functions burn through numerous retries in seconds causing messages to dead letter immediately.

kevin-meyer on 20 Apr 2018

👍4

Has there been any update to this open item? My customer is now running into the same exception which was not happening prior to June 12th.

JaDevGuy on 14 Jun 2018

Recently, I have started facing the same issue. Is there a work around for it or any solution. It throws the below exception only after completing/abandoning the message:

"_Microsoft.ServiceBus: The lock supplied is invalid. Either the lock expired, or the message has already been removed from the queue._"

kunalwadhwa on 14 Jun 2018

I believe there were some service bus related to updates the most recent release that would roughly line up to the mentioned June 12th date - @mathewc can comment further

paulbatum on 15 Jun 2018

The ServiceBus changes were simply to improve logging of background exceptions - likely that's what you are seeing now. The changes don't cause the exceptions, just surfaces them. See issue https://github.com/Azure/azure-webjobs-sdk/issues/1703.

Likely you had errors occurring before these changes but weren't aware - now they are logged, which is what you want really. Situatiouns like the message lock errors mentioned above by @kunalwadhwa are exactly what we made the logging changes for.

Usually the message locks are lost due to functions running longer than the AutoRenewTimeout which is configurable. See the discussion in https://github.com/Azure/azure-webjobs-sdk/issues/1697.

mathewc on 15 Jun 2018

The MessageProcessor code relies heavily on the function result. There is a chance to override that for result.Succeeded but we don't check that failures.

nebiyou-gebretatios on 24 Jun 2018

👍4

I still agree with this request, but wanted to share with the community a couple other issues that have occurred that's convinced me to stop using Service Bus triggers in Azure Functions.

I'll occasionally see a short delay (30 seconds - minutes) between message publication and our function code handling the message. This isn't a show stopper and I haven't debugged it much, but it feels like the function is is either waiting for a batch, or is being spun up to handle the message.
More concerning: we experienced a strange issue when we scaled our ASP the function lived in. Essentially, the function processed a couple of our messages multiple times, 20 hours apart, and the Portal and Service Bus explorer showed "-2" messages were on the bus for a little while. Microsoft premier support had no insight as to how that could happen or what "-2" messages even means. My hypothesis is that the functions serialize their state or something, so when we scaled, we ended up getting multiple instances processing the exact same messages with the same lock tokens. I have no proof of that and it doesn't really explain why the second processing happened 20 hours later. Either way, it's strange.
I'm pretty sure we've had messages vanish where it seems like the Service Bus trigger ate the message without being delivered to our function. It was another developer that shared it with our team so I can't be certain that it wasn't a bug in our custom function logic.

There's still some fuzziness in my mind regarding these issues. Whether they were problems with the service bus, the black box code the functions are running before our code runs, or if they were a result of a bug in our code. Regardless, we never experience issues with our message consumers that run as docker service or windows services, and we've had multiple issues in the short time using Service Bus triggered Azure Functions.

kevin-meyer on 24 Jun 2018

Sent PR https://github.com/Azure/azure-functions-host/pull/3179 to address this. Basically exposing AutoComplete for host.json config, and also fixing the WebJobs SDK AutoComplete handling issue pointed out by @paulbatum and others above.

@kevin-meyer if you're having issues, recommend opening new issues in our repo so they get attention, rather than tacking them on to this issue :)

mathewc on 24 Jul 2018

👍2

Did this already find its way into the current version of Functions?
Is it just for 1.x? What's the status for 2.x?
Shall we update the docs?

mikhailshilkov on 17 Aug 2018

👍3

Got directed here (by @mikhailshilkov) from my StackOverflow q regarding the same functionality / error others have been reporting.

https://stackoverflow.com/questions/51881103/calling-complete-or-deadletter-on-brokeredmessage-with-azure-functions-service/51889907

Would also love to know the status of this!