This bug has been occurring randomly for a few months in one of our Function app and has been really hard to pin down.
We use a fan-out/fan-in pattern in order to split a job into multiple sub-orchestrations. We use Task.WhenAll(...) to wait for all the sub-orchestrations to complete and gather the results.
Apparently, sometimes one of the sub-orchestrations will return an unexpected result, which is actually the same result as another sub-orchestration. So at the end of the fan-in, we end up with one duplicate result, and one missing result.
We split a job in 17 parallel sub-orchestrations, and expect to receive 17 unique results.
We receive 17 results, but one is a duplicate, and therefore one result is missing.
Our results are large (a few KBs) so they are stored in the optimizerhub-largemessages blob container.
For the faulty sub-orchestration, if I correlate the message-{someid}-suborchestrationinstancecompleted.json.gz blob with the corresponding history-{rowkey}-suborchestrationinstancecompleted-result.json.gz blob, the content don't match.
The content of the history-{rowkey}-suborchestrationinstancecompleted-result.json.gz blob is actually a copy of another sub-orchestration result.
Retry the orchestration.
N/A
Azure storage account name: Can't share
Orchestration instance ID(s): 043e8885b43b4c41ae8655fbf6945d54
More details:
These blobs should contain the same result but they don't.
043e8885b43b4c41ae8655fbf6945d54/message-000000000000006c-suborchestrationinstancecompleted.json.gz
043e8885b43b4c41ae8655fbf6945d54/history-000000000000002c-suborchestrationinstancecompleted-result.json.gz
Instead, the result blob is a duplicate of:
043e8885b43b4c41ae8655fbf6945d54/history-000000000000002f-suborchestrationinstancecompleted-result.json.gz
We put a mechanism in place to detect the problem. We pass a SubOrchestrationIndex property to each sub-orchestration and expect to receive the same value in the output. If we receive a different value, it means there was a mix-up in the sub-orchestration results.
This has not happened yet. We will monitor the Function App and I'll update this issue if we detect the problem again.
Here's the code of we put in place:
var subOrchestrationTasks = new Task<Optimizer.Output>[problems.Length];
for (var i = 0; i < problems.Length; i++)
{
var problem = problems[i];
var task = context.CallSubOrchestratorWithRetryAsync<Optimizer.Output>("Optimization_SubOrchestrator",
new RetryOptions(TimeSpan.FromSeconds(10), 3),
new SubOrchestrator_Optimization.Input
{
SubOrchestrationIndex = i,
Problem = problem,
...
});
subOrchestrationTasks[i] = task;
}
var optimizerResults = await Task.WhenAll(subOrchestrationTasks);
// We detect sub-orchestration results mix-up in the following loop
for (var i = 0; i < problems.Length; i++)
{
var result = optimizerResults[i];
if (result.SubOrchestrationIndex != i)
{
throw new Exception($"Wrong SubOrchestration result! Expected {nameof(result.SubOrchestrationIndex)} to be {i}, but was {result.SubOrchestrationIndex} instead");
}
}
Thanks for the detailed analysis. Just to clarify, the problem that's impacting you specifically is that you're getting the wrong set of results (as described) when fanning in on your parent orchestration, right?
Thanks for the detailed analysis. Just to clarify, the problem that's impacting you specifically is that you're getting the wrong set of results (as described) when fanning in on your parent orchestration, right?
Yes, that is correct.
As mentionned above, we put a piece of code in place that just throws if the problem is detected.
In the last 24h, we detected this problem 12 times for 3 of our tenants, for a total of ~280 orchestrator executions.
Sample error messages:
'Optimization_Orchestrator' failed: Wrong SubOrchestration result! Expected SubOrchestrationIndex to be 7, but was 3 instead. TenantId: [tenant1]
'Optimization_Orchestrator' failed: Wrong SubOrchestration result! Expected SubOrchestrationIndex to be 30, but was 27 instead. TenantId: [tenant2]
'Optimization_Orchestrator' failed: Wrong SubOrchestration result! Expected SubOrchestrationIndex to be 0, but was 15 instead. TenantId: [tenant3]
@Costo
Can you share how you are generating the value of problems? Generally, when we see errors where the values of different Task objects are swapped, it indicates a non-deterministic code pattern, where non-deterministic behavior in the orchestrator is causing the code to replay differently then it did the on the execution when the Tasks were originally scheduled.
@ConnorMcMahon
The problems array is the result of a previous activity. Is it possible that the order of the items in the array is non-deterministic ?
@Costo
I have found a bug in the DurableTask.AzureStorage project in the way that large message blob names are created. The sequence number is what is used to construct the someid section of the blob name message-{someid}-suborchestrationinstancecompleted.json.gz. This unfortunately not guaranteed to be unique, as each instance of your application keeps track of it's own sequence number.
This means that it's possible for messages sent from different VMs to have the same naming scheme for blobs, which can cause the suborchestration that finishes later to overwrite the already finished message of another suborchestration before it uploads its content in the history table. This is why you are seeing this mismatched behavior.
I have a PR already to address this, and it will go into our next release.
Thanks for your patience with this issue.
Removed comment, after reviewing it appears that the issue is different enough IMO to open a separate issue.
Most helpful comment
@Costo
I have found a bug in the DurableTask.AzureStorage project in the way that large message blob names are created. The sequence number is what is used to construct the
someidsection of the blob namemessage-{someid}-suborchestrationinstancecompleted.json.gz. This unfortunately not guaranteed to be unique, as each instance of your application keeps track of it's own sequence number.This means that it's possible for messages sent from different VMs to have the same naming scheme for blobs, which can cause the suborchestration that finishes later to overwrite the already finished message of another suborchestration before it uploads its content in the history table. This is why you are seeing this mismatched behavior.
I have a PR already to address this, and it will go into our next release.
Thanks for your patience with this issue.