I have created an Durable Function following the structure of the Fan-out / Fan-In example, (using the recommended VS 2017 development, not the Azure Portal approach)
What I am experiencing is that very the first Azure Function that is called from the OrchestrationTrigger method is called over and over again, and never continues to the next statement. Now while I have read the Event Sourcing, Checkpointing, and Orchestrator Replay documentation and the Orchestrator Code Constraints, I am still a bit confused as debugging locally is very tricky due to the replay.
[FunctionName("ExportLogs")]
public static async Task ExportLogs([OrchestrationTrigger] DurableOrchestrationContext context)
{
SinceDate since = context.GetInput<SinceDate>();
var sinceDate = DateTime.Parse(since.since);
var conversationInfos = await context.CallActivityAsync<IEnumerable<ConversationInfo>>("GetConversations", sinceDate);
// !! I never get to this statement
var tasks = new List<Task<CallInfo>>(conversationInfos.Count());
foreach (var ci in conversationInfos)
{
Task<CallInfo> task = context.CallActivityAsync<CallInfo>("ProcessCall", ci);
task.Start();
var notused = task.ContinueWith(t => context.CallActivityAsync("PersistCallInStorage", t.Result));
tasks.Add(task);
}
await Task.WhenAll(tasks);
// Now fan-in and persist all results.
var callInfos = tasks.Select(t => t.Result);
var connectionString = ConfigurationManager.ConnectionStrings["DefaultConnection"].ConnectionString;
var repo = new DwRepository(connectionString);
repo.AddCallInfos(callInfos);
}
The first activity called is:
[FunctionName("GetConversations")]
public static IEnumerable<ConversationInfo> GetConversations([ActivityTrigger] DateTime sinceDate, TraceWriter log)
{
var exportService = ConstuctExportService();
var conversations = exportService.GetAllConversations(sinceDate);
return conversations;
}
Now, exportService.GetAllConversations(sinceDate); is calling upon an SDK that is fetching some call information from a cloud service via REST. It does need to loop and fetch the data in pages. So what is happening is that this get called over and over and never continues on, even though I am awaiting the call.
Here is the output from the Function Runtime output window:
[10/23/2017 7:25:43 PM] Function started (Id=8e62e6bd-34a9-49b7-a814-3a0e9e5f9f8e)
[10/23/2017 7:25:43 PM] Executing 'ExportRscCallLogs' (Reason='', Id=8e62e6bd-34a9-49b7-a814-3a0e9e5f9f8e)
[10/23/2017 7:25:43 PM] b350f681b17b4ac8a772fff8fdca3aad: Function 'ExportRscCallLogs (Orchestrator)', version '' started. IsReplay: False. Input: (22 bytes). State: Started. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.0.0.0.
[10/23/2017 7:25:44 PM] b350f681b17b4ac8a772fff8fdca3aad: Function 'GetConversations (Activity)', version '' scheduled. Reason: ExportRscCallLogs. IsReplay: False. State: Scheduled. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.0.0.0.
[10/23/2017 7:25:44 PM] Function completed (Success, Id=8e62e6bd-34a9-49b7-a814-3a0e9e5f9f8e, Duration=114ms)
[10/23/2017 7:25:44 PM] Executed 'ExportRscCallLogs' (Succeeded, Id=8e62e6bd-34a9-49b7-a814-3a0e9e5f9f8e)
[10/23/2017 7:25:44 PM] b350f681b17b4ac8a772fff8fdca3aad: Function 'ExportRscCallLogs (Orchestrator)', version '' awaited. IsReplay: False. State: Awaited. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.0.0.0.
[10/23/2017 7:25:44 PM] b350f681b17b4ac8a772fff8fdca3aad: Function 'GetConversations (Activity)', version '' started. IsReplay: False. Input: (23 bytes). State: Started. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.0.0.0.
[10/23/2017 7:25:44 PM] Function started (Id=3c33ff3b-5fe5-4e6b-a7d4-ac6c3e146633)
[10/23/2017 7:25:44 PM] Executing 'GetConversations' (Reason='', Id=3c33ff3b-5fe5-4e6b-a7d4-ac6c3e146633)
[10/23/2017 7:25:47 PM] Function completed (Success, Id=3c33ff3b-5fe5-4e6b-a7d4-ac6c3e146633, Duration=2824ms)
[10/23/2017 7:25:47 PM] Executed 'GetConversations' (Succeeded, Id=3c33ff3b-5fe5-4e6b-a7d4-ac6c3e146633)
[10/23/2017 7:25:47 PM] b350f681b17b4ac8a772fff8fdca3aad: Function 'GetConversations (Activity)', version '' completed. ContinuedAsNew: False. IsReplay: False. Output: (435216 bytes). State: Completed. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.0.0.0.
[10/23/2017 7:25:54 PM] b350f681b17b4ac8a772fff8fdca3aad: Function 'GetConversations (Activity)', version '' started. IsReplay: False. Input: (23 bytes). State: Started. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.0.0.0.
[10/23/2017 7:25:54 PM] Function started (Id=d89b2fa4-f4d7-43e5-95f1-5cf436ebc85e)
[10/23/2017 7:25:54 PM] Executing 'GetConversations' (Reason='', Id=d89b2fa4-f4d7-43e5-95f1-5cf436ebc85e)
[10/23/2017 7:25:56 PM] Function completed (Success, Id=d89b2fa4-f4d7-43e5-95f1-5cf436ebc85e, Duration=1397ms)
[10/23/2017 7:25:56 PM] Executed 'GetConversations' (Succeeded, Id=d89b2fa4-f4d7-43e5-95f1-5cf436ebc85e)
[10/23/2017 7:25:56 PM] b350f681b17b4ac8a772fff8fdca3aad: Function 'GetConversations (Activity)', version '' completed. ContinuedAsNew: False. IsReplay: False. Output: (435216 bytes). State: Completed. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.0.0.0.
[10/23/2017 7:26:05 PM] b350f681b17b4ac8a772fff8fdca3aad: Function 'GetConversations (Activity)', version '' started. IsReplay: False. Input: (23 bytes). State: Started. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.0.0.0.
[10/23/2017 7:26:05 PM] Function started (Id=fb240d9c-90f0-4e4d-86db-b3d2c175e0eb)
[10/23/2017 7:26:05 PM] Executing 'GetConversations' (Reason='', Id=fb240d9c-90f0-4e4d-86db-b3d2c175e0eb)
[10/23/2017 7:26:06 PM] Function completed (Success, Id=fb240d9c-90f0-4e4d-86db-b3d2c175e0eb, Duration=1177ms)
[10/23/2017 7:26:06 PM] Executed 'GetConversations' (Succeeded, Id=fb240d9c-90f0-4e4d-86db-b3d2c175e0eb)
[10/23/2017 7:26:06 PM] b350f681b17b4ac8a772fff8fdca3aad: Function 'GetConversations (Activity)', version '' completed. ContinuedAsNew: False. IsReplay: False. Output: (435216 bytes). State: Completed. HubName: DurableFunctionsHub. AppName: . SlotName: . ExtensionVersion: 1.0.0.0.
As you can see, the logs are showing that it is Executing GetConversations over and over again, and it even reports that it was completed successfully each time. The IsReplay parameter is False - so I am not sure what's going on?
Thanks for filing the issue. I've not seen this kind of behavior before. One possible explanation is that there is some kind of internal failure that happens after the GetConversions function completes, causing the Durable Task Framework to retry indefinitely.
Can you share the list of nuget packages you are referencing in your project (or even your whole .csproj file)? I'd like to rule out a weird package conflict issue.
Also, what kind of storage account are you using for your local development?
Here is the function project .csproj:
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net461</TargetFramework>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.Azure.WebJobs.Extensions.DurableTask" Version="1.0.0-beta" />
<PackageReference Include="Microsoft.NET.Sdk.Functions" Version="1.0.6" />
<PackageReference Include="Serilog" Version="2.5.0" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\..\X.DataAccess.DWH\X.DataAccess.DWH.csproj" />
<ProjectReference Include="..\..\X.RSC.Domain\X.RSC.Domain.csproj" />
<ProjectReference Include="..\X.RSC.Export\X.RSC.Export.csproj" />
</ItemGroup>
<ItemGroup>
<Reference Include="Microsoft.CSharp" />
<Reference Include="System.Configuration" />
</ItemGroup>
<ItemGroup>
<None Update="host.json">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Update="local.settings.json">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
<CopyToPublishDirectory>Never</CopyToPublishDirectory>
</None>
</ItemGroup>
</Project>
Here is the csproj for X.RSC.Export:
<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="15.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<Import Project="$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props" Condition="Exists('$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props')" />
<PropertyGroup>
<Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
<Platform Condition=" '$(Platform)' == '' ">AnyCPU</Platform>
<ProjectGuid>{94736929-1BAC-4C2A-9C52-995C93F1483D}</ProjectGuid>
<OutputType>Library</OutputType>
<AppDesignerFolder>Properties</AppDesignerFolder>
<RootNamespace>X.RSC.Export</RootNamespace>
<AssemblyName>X.RSC.Export</AssemblyName>
<TargetFrameworkVersion>v4.6.1</TargetFrameworkVersion>
<FileAlignment>512</FileAlignment>
</PropertyGroup>
<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Debug|AnyCPU' ">
<DebugSymbols>true</DebugSymbols>
<DebugType>full</DebugType>
<Optimize>false</Optimize>
<OutputPath>bin\Debug\</OutputPath>
<DefineConstants>DEBUG;TRACE</DefineConstants>
<ErrorReport>prompt</ErrorReport>
<WarningLevel>4</WarningLevel>
</PropertyGroup>
<PropertyGroup Condition=" '$(Configuration)|$(Platform)' == 'Release|AnyCPU' ">
<DebugType>pdbonly</DebugType>
<Optimize>true</Optimize>
<OutputPath>bin\Release\</OutputPath>
<DefineConstants>TRACE</DefineConstants>
<ErrorReport>prompt</ErrorReport>
<WarningLevel>4</WarningLevel>
</PropertyGroup>
<ItemGroup>
<Reference Include="ININ.PureCloudApi, Version=0.107.13.416, Culture=neutral, processorArchitecture=MSIL">
<HintPath>..\..\packages\PureCloudApiSdk.0.107.13.416\lib\ININ.PureCloudApi.dll</HintPath>
</Reference>
<Reference Include="Newtonsoft.Json, Version=9.0.0.0, Culture=neutral, PublicKeyToken=30ad4fe6b2a6aeed, processorArchitecture=MSIL">
<HintPath>..\..\packages\Newtonsoft.Json.9.0.1\lib\net45\Newtonsoft.Json.dll</HintPath>
</Reference>
<Reference Include="Polly, Version=5.3.1.0, Culture=neutral, processorArchitecture=MSIL">
<HintPath>..\..\packages\Polly.5.3.1\lib\net45\Polly.dll</HintPath>
</Reference>
<Reference Include="RestSharp, Version=105.2.3.0, Culture=neutral, processorArchitecture=MSIL">
<HintPath>..\..\packages\RestSharp.105.2.3\lib\net46\RestSharp.dll</HintPath>
</Reference>
<Reference Include="Serilog, Version=2.0.0.0, Culture=neutral, PublicKeyToken=24c2f752a8e58a10, processorArchitecture=MSIL">
<HintPath>..\..\packages\Serilog.2.5.0\lib\net46\Serilog.dll</HintPath>
</Reference>
<Reference Include="System" />
<Reference Include="System.Core" />
<Reference Include="System.Runtime.Serialization" />
<Reference Include="System.Xml.Linq" />
<Reference Include="System.Data.DataSetExtensions" />
<Reference Include="Microsoft.CSharp" />
<Reference Include="System.Data" />
<Reference Include="System.Net.Http" />
<Reference Include="System.Xml" />
</ItemGroup>
<ItemGroup>
<Compile Include="Models\ConversationInfo.cs" />
<Compile Include="ExportService.cs" />
<Compile Include="Properties\AssemblyInfo.cs" />
<Compile Include="Services\PureCloudService.cs" />
</ItemGroup>
<ItemGroup>
<None Include="packages.config" />
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\..\X.RSC.Domain\X.RSC.Domain.csproj">
<Project>{ee5639fe-d5e4-4135-baa9-01b51e147c1a}</Project>
<Name>X.RSC.Domain</Name>
</ProjectReference>
</ItemGroup>
<Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />
</Project>
and it's packages.config:
<?xml version="1.0" encoding="utf-8"?>
<packages>
<package id="Newtonsoft.Json" version="9.0.1" targetFramework="net461" />
<package id="Polly" version="5.3.1" targetFramework="net461" />
<package id="PureCloudApiSdk" version="0.107.13.416" targetFramework="net461" />
<package id="RestSharp" version="105.2.3" targetFramework="net461" />
<package id="Serilog" version="2.5.0" targetFramework="net461" />
</packages>
As for storage, I am using the same Azure Storage account that is linked to the Function app - (not using local emulator)
Thanks - nothing obvious jumps out at me. Is the problem reproducible in Azure? If so, could you tell me the name of your function app either directly or indirectly. I may be able to get more details if you have a repro in Azure.
Region = USW
2017-10-24T17:33:48 Welcome, you are now connected to log-streaming service.
2017-10-24T17:33:55.483 Function started (Id=4e295585-5c7f-43d5-be2a-a5a68d190868)
2017-10-24T17:33:55.607 Function completed (Failure, Id=4e295585-5c7f-43d5-be2a-a5a68d190868, Duration=122ms)
2017-10-24T17:34:19.920 Function started (Id=b078f874-1689-4ef7-9ac7-e5c7d06c25ca)
2017-10-24T17:34:20.092 Started orchestration with ID = 'c3d5eb4136634dcd8a337cbbaa8182c2'.
2017-10-24T17:34:20.107 Function completed (Success, Id=b078f874-1689-4ef7-9ac7-e5c7d06c25ca, Duration=177ms)
I don't think its even reaching my OrchestrationTrigger method... something else is choking..
What region is your function app located in?
USW
I am getting the same issue here. The first activity function call is run nth times and doesn't return to the orchestration function.
Thanks, I was able to locate it.
One thing I noticed is that the GetConversations seems to be returning back a relatively large amount of data (482564 bytes, or 471 KB). Return values get serialized into queue messages, and Azure storage queues only support 64 KB messages. I suspect the runtime is not handling this gracefully and is resulting in it running continuously. Can you try rewriting the GetConversions function to return smaller payloads and see if that resolves the issue?
FYI, the issue tracking support for larger payloads is here: Large Message Support.
Thanks @cgillum - great find! I will modify the function, but I am wondering if I can yield return instead of returning the whole list in one shot. Would that work in this activity called by the OrchestrationTrigger ?
Phew, glad we identified the issue.
I'm afraid it's not possible to stream results back to the orchestrator. Activity functions operate just like regular functions so we don't have any ability to do tricks like yield return. You could temporarily work around this by manually serializing to JSON and compressing the results to get under the 64 KB limit, but longer term I think we'll just need to add support for the larger messages. At a minimum, I think we need to change the behavior to fail quickly and with an error message rather than have this infinite looping behavior.
The return value validation has been added in the 1.1.0-beta2 release.
Most helpful comment
Phew, glad we identified the issue.
I'm afraid it's not possible to stream results back to the orchestrator. Activity functions operate just like regular functions so we don't have any ability to do tricks like
yield return. You could temporarily work around this by manually serializing to JSON and compressing the results to get under the 64 KB limit, but longer term I think we'll just need to add support for the larger messages. At a minimum, I think we need to change the behavior to fail quickly and with an error message rather than have this infinite looping behavior.