Azure-functions-durable-extension: Azure Durable Functions - Fan-Out - Requeueing ad infinitum?

Created on 19 Jun 2018 · 9 Comments · Source: Azure/azure-functions-durable-extension

Issue: Online, the orchestration seems to take forever to run. Locally, I can see it requeueing functions over and over again (RIP API call rate limiting).

The issue is with the DownloadSamples_Orchestrator. I have the feeling that the loop is re-executed and the tasks are requeued ad infinitum.

Starting code:

[FunctionName("DownloadSamples_HttpStart")]
public static async Task HttpStart(
    [HttpTrigger(AuthorizationLevel.Anonymous, "get","post", Route = null)]HttpRequestMessage req,
    [OrchestrationClient]DurableOrchestrationClient starter,
    TraceWriter log)
{
    string instanceId = await starter.StartNewAsync("DownloadSamples_Orchestrator", null);

    log.Info($"Started orchestration with ID = '{instanceId}'.");
}

[FunctionName("DownloadSamples_TimerStart")]
public static async Task TimerStart([TimerTrigger("0 30 9 * * *")]TimerInfo myTimer, [OrchestrationClient]DurableOrchestrationClient starter,
    TraceWriter log)
{
    string instanceId = await starter.StartNewAsync("DownloadSamples_Orchestrator", null);

    log.Info($"Started orchestration with ID = '{instanceId}'.");
}

[FunctionName("DownloadSamples_Orchestrator")]
public static async Task<string> RunOrchestrator([OrchestrationTrigger] DurableOrchestrationContext context, TraceWriter log)
{
    log.Info("Orchestrator starting...");
    List<SampleRepository> repositories = await context.CallActivityAsync<List<SampleRepository>>("DownloadSamples_GetAllPublicRepositories", null);

    var tasks = new Task<SampleRepository>[repositories.Count];
    for (int i = 0; i < repositories.Count; i++)
    {
        tasks[i] = context.CallActivityAsync<SampleRepository>("DownloadSamples_GetRepositoryData", repositories[i]);
    }
    await Task.WhenAll(tasks);


    var sampleRepositories = tasks.Select(x => x.Result).ToList();
    await context.CallActivityAsync<List<SampleRepository>>("DownloadSamples_SaveAllPublicRepositories", sampleRepositories);
    return context.InstanceId;
}
needs-discussion

All 9 comments

This code was based on this doc article.

So they all get rescheduled but not re-executed. The code basically dies on await Task.WhenAll.

I tried using ContinueWith, but without success. The repositories variable contains 900 items.

Can you clarify a bit further? What do you mean when you say they are getting rescheduled but not re-executed?

The code goes through the loop again, recreating the array with all tasks in a pending state. The console shows that functions are being scheduled. It hits the await; then, after a little while, the orchestrator function is executed again.

Rinse. Repeat.

So it recreates 900 tasks, and I have the impression it does that quite a lot. After running for over an hour, it finally hit the breakpoint after the await.

So since this task completed successfully, I've decided to restart it from scratch again.

Nothing changed to the code. Just a fresh "F5" with the breakpoints at the same place and it seems to go through now. I will continue to investigate and report on it.

900 loop iterations took 4 hours to run to completion locally.

I forget where we landed with this issue, but a couple things to point out:

  • Orchestrator functions get re-executed multiple times by design. This is called out in a couple places in the documentation but it's worth calling out since sometimes people miss this. The durable tasks (activity function calls, etc.) do not get re-executed - only the orchestrator code. That means the log statement will get hit many times.
  • The Azure Storage Emulator is insanely slow and CPU-heavy, so that may factor into the end-to-end execution time of the orchestration.
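Since the first point above means the log statement fires on every replay, one way to quiet it is to check the context's IsReplaying property before logging. The following is a minimal sketch against the 1.x API used in this thread; the function name and the "ReplaySafeLogging_SayHello" activity are hypothetical and only there to give the orchestrator something to await.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;

public static class ReplaySafeLoggingSketch
{
    // Hypothetical orchestrator; not part of the code reported in this issue.
    [FunctionName("ReplaySafeLogging_Orchestrator")]
    public static async Task<string> Run(
        [OrchestrationTrigger] DurableOrchestrationContext context,
        TraceWriter log)
    {
        // IsReplaying is true while the orchestrator re-runs code whose
        // results are already recorded in its history, so this logs only
        // the first time execution reaches this point.
        if (!context.IsReplaying)
        {
            log.Info("Orchestrator starting...");
        }

        // Hypothetical activity name, assumed to return a string.
        string greeting = await context.CallActivityAsync<string>(
            "ReplaySafeLogging_SayHello", "world");

        if (!context.IsReplaying)
        {
            log.Info("Activity finished.");
        }

        return greeting;
    }
}
```

Without the check, both messages would appear once per replay, which is what makes the console output look like the work is being repeated.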

To reiterate, you should be able to find that all activity functions execute exactly once, regardless of the orchestrator function replay behavior. If that's not the case, let's reactivate this issue and investigate further.
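To make the "replayed many times, executed once" claim concrete, here is a toy simulation of a fan-out. It is not the Durable Task Framework: every name in it is hypothetical, and it assumes exactly one activity completion is processed per replay, whereas the real runtime may process several at once. It only illustrates why the orchestrator loop runs again and again while each activity still runs a single time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReplayDemo
{
    // Simulates a fan-out over `count` activities and returns how often the
    // orchestrator body ran and how often each activity actually executed.
    public static (int OrchestratorRuns, int[] ActivityRuns) Simulate(int count)
    {
        var history = new Dictionary<int, string>(); // completed activity results
        var scheduled = new HashSet<int>();          // activities already queued
        var queue = new Queue<int>();                // pending activity messages
        var activityRuns = new int[count];
        int orchestratorRuns = 0;

        while (true)
        {
            // One orchestrator episode: the code runs again from the top.
            orchestratorRuns++;
            int completed = 0;
            for (int i = 0; i < count; i++)
            {
                if (history.ContainsKey(i))
                {
                    completed++;          // result is replayed from history
                }
                else if (scheduled.Add(i))
                {
                    queue.Enqueue(i);     // queued only the first time it is seen
                }
            }

            // Equivalent of Task.WhenAll completing.
            if (completed == count) break;

            // The orchestrator is unloaded; one activity runs and its result
            // is appended to the history, which triggers the next replay.
            int next = queue.Dequeue();
            activityRuns[next]++;
            history[next] = $"result-{next}";
        }

        return (orchestratorRuns, activityRuns);
    }

    public static void Main()
    {
        var (runs, perActivity) = Simulate(5);
        Console.WriteLine($"Orchestrator runs: {runs}");
        Console.WriteLine($"Max executions per activity: {perActivity.Max()}");
    }
}
```

Under these assumptions, five activities produce six orchestrator runs and one execution per activity. With 900 items the same model gives 901 passes through the loop, which matches the "recreates 900 tasks quite a lot" observation earlier in the thread without any activity being run twice.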

Nothing to complain about anymore here. 😉

The code runs as expected, and it was faster with 1.5 and the latest version of the Functions host.
