Azure-functions-durable-extension: Durable activities not launching?

Created on 11 Sep 2017 · 2 Comments · Source: Azure/azure-functions-durable-extension

Hello,

First, I completely understand this is very much prerelease and that issues are to be expected. This is a fantastic addition to Azure Functions and I'm excited to see it mature. That said, I am seeing certain problems with my activities not launching and would like to make sure I'm not doing anything wrong.

My orchestrator code runs a workflow with many dependencies between tasks. Here's some pseudocode:

public static async Task Run([OrchestrationTrigger] DurableOrchestrationContext context, TraceWriter log)
{
     var taskA = DoTaskA();
     var taskB = DoTaskB();
     var taskC = await Task.Factory.StartNew(async () =>
                {
                    await taskA;
                    await DoTaskC();
                }, CancellationToken.None, TaskCreationOptions.DenyChildAttach, TaskScheduler.FromCurrentSynchronizationContext());
     var taskD = await Task.Factory.StartNew(async () =>
                {
                    await Task.WhenAll(taskA, taskB);
                    await DoTaskD();
                }, CancellationToken.None, TaskCreationOptions.DenyChildAttach, TaskScheduler.FromCurrentSynchronizationContext());
     await Task.WhenAll(taskC, taskD);
}

private static async Task DoTaskA()
{
     // prepare ActivityFunctionA input...

     await context.CallFunctionAsync("ActivityFunctionA", input);
}

private static async Task DoTaskB()
{
     // prepare ActivityFunctionB inputs...

     var taskList = new List<Task>();
     foreach (var input in inputs)
     {
          taskList.Add(context.CallFunctionAsync("ActivityFunctionB", input));
     }
     await Task.WhenAll(taskList);
}
// private static async Task DoTaskC()
// private static async Task DoTaskD()
// etc.

Here are some issues I'm seeing:

1. It sometimes takes a couple of tries to launch the orchestrator function. The launching client function always gets a successful launch message with a returned instanceId, but the orchestrator doesn't always start. I don't think it's ever taken me more than two consecutive tries though.
2. Activities don't always launch consistently. Order of operations is always correct, but sometimes the workflow gets stuck at an activity and never moves forward after its completion.
3. DoTaskB() in the example above shows the orchestrator calling an activity function for several inputs. Sometimes the activity functions don't get triggered for all inputs.

Is the code above bad practice for orchestrator functions? I read about restrictions like no IO, no async calls other than to durable activities, etc. Am I violating those restrictions?

Thanks!

guythetechie

Most helpful comment

@cgillum - thanks a lot for the feedback. I applied the suggested changes with some minor tweaks, and the workflow now works beautifully.

I'll keep an eye on the orchestration function launches - will create a new GitHub issue if needed. So far, so good though.

guythetechie on 12 Sep 2017

All 2 comments

Hi @guythetechie, thanks for providing this well-written feedback.

Based on your pseudocode my guess is that your usage of Task.Factory.StartNew is causing the orchestration to get "stuck" as you mentioned in your second and third points. This API is problematic because it violates the "no async calls" rule by executing code on new worker threads. Sometimes we are able to catch this and throw useful exceptions, but not always. I'll make a note to call out this API explicitly in the docs. Over time you can expect we'll improve our detection of cases like this.

Anyways, this is some interesting task coordination you are doing. :) If I understand your pseudocode logic correctly, I think you can rewrite your main orchestrator function more simply as follows to comply with the rules:

public static async Task Run([OrchestrationTrigger] DurableOrchestrationContext context)
{
     var taskA = DoTaskA();
     var taskB = DoTaskB();

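     // ContinueWith with a Task-returning delegate yields a Task<Task>;
     // Unwrap() makes the result complete only when DoTaskC/DoTaskD complete.
     var taskC = taskA.ContinueWith(a => DoTaskC()).Unwrap();
     var taskD = Task.WhenAll(taskA, taskB).ContinueWith(ab => DoTaskD()).Unwrap();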

     await Task.WhenAll(taskC, taskD);
}

Try applying that code change to see if it resolves your issues (and please let me know either way).

I don't have any clues about why it would take multiple attempts to launch the orchestrator function. This has always been very reliable for me. Note that sometimes it may take several seconds between starting an orchestrator and having it actually run. Anyways, if you still see this issue after fixing your orchestrator code as I suggested, please open a new GitHub issue for just the startup problem and we can debug that in more depth.
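
For reference, the same dependency graph (C after A, D after both A and B) can also be written with a small async helper instead of ContinueWith. This is only a sketch and not code from the extension: `ThenAsync`, `OrchestrationSketch`, and the `callActivity` delegate are hypothetical names, and `callActivity` stands in for `context.CallFunctionAsync` so the example runs as a plain console program. It assumes that awaiting durable tasks inside an ordinary async helper is acceptable in an orchestrator, as DoTaskA and DoTaskB already do. One practical point it sidesteps: `ContinueWith` with a delegate that returns a Task produces a `Task<Task>`, which completes as soon as the inner task has been started unless it is unwrapped.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class OrchestrationSketch
{
    // callActivity is a hypothetical stand-in for context.CallFunctionAsync(name, input).
    public static async Task RunAsync(Func<string, Task> callActivity)
    {
        Task taskA = callActivity("A");
        Task taskB = callActivity("B");

        // C waits for A; D waits for both A and B.
        Task taskC = ThenAsync(taskA, () => callActivity("C"));
        Task taskD = ThenAsync(Task.WhenAll(taskA, taskB), () => callActivity("D"));

        await Task.WhenAll(taskC, taskD);
    }

    // Awaits the prerequisite, then starts and awaits the next step.
    // The returned Task completes only when the next step has completed.
    private static async Task ThenAsync(Task prerequisite, Func<Task> next)
    {
        await prerequisite;
        await next();
    }
}

public static class Program
{
    public static void Main()
    {
        var started = new List<string>();
        var gateA = new TaskCompletionSource<bool>();

        // Fake activity: "A" stays pending until gateA is released; the others finish at once.
        Func<string, Task> fakeActivity = name =>
        {
            lock (started) { started.Add(name); }
            return name == "A" ? gateA.Task : Task.CompletedTask;
        };

        Task run = OrchestrationSketch.RunAsync(fakeActivity);
        Console.WriteLine(string.Join(",", started)); // only A and B have started so far

        gateA.SetResult(true); // A completes, which unblocks C and D
        run.Wait();
        Console.WriteLine(started.Count); // all four activities have now started
    }
}
```

The helper keeps everything on plain `await`, with no `Task.Factory.StartNew` and no explicit scheduler, so the control flow stays in the form the orchestrator rules describe.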