Azure-webjobs-sdk: Concurrency issue in CompositeTraceWriter

Created on 24 Jan 2018  路  7Comments  路  Source: Azure/azure-webjobs-sdk

Seeing the following stack trace in the wild:

System.InvalidOperationException : Collection was modified; enumeration operation may not execute.
   at System.ThrowHelper.ThrowInvalidOperationException(ExceptionResource resource)
   at System.Collections.Generic.List`1.Enumerator.MoveNextRare()
   at System.Collections.Generic.List`1.Enumerator.MoveNext()
   at Microsoft.Azure.WebJobs.Host.CompositeTraceWriter.InvokeTraceWriters(TraceEvent traceEvent)
   at Microsoft.Azure.WebJobs.Host.CompositeTraceWriter.Trace(TraceEvent traceEvent)
   at Microsoft.Azure.WebJobs.Host.Loggers.TraceWriterFunctionInstanceLogger.LogFunctionCompletedAsync(FunctionCompletedMessage message,CancellationToken cancellationToken)
   at async Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.TryExecuteAsync(IFunctionInstance functionInstance,CancellationToken cancellationToken)
   at async Microsoft.Azure.WebJobs.Host.Executors.ShutdownFunctionExecutor.TryExecuteAsync(IFunctionInstance instance,CancellationToken cancellationToken)
   at async Microsoft.Azure.WebJobs.JobHost.CallAsync(??)
   at async Microsoft.Azure.WebJobs.Script.Script

This is coming from the code here.

Most helpful comment

@brettsam @mathewc There is no code in CompositeTraceWriter that mutates the collection but if you look carefully you'll see we pass a collection in and we don't make a copy of it:

https://github.com/Azure/azure-webjobs-sdk/blob/7dd9681609613f79cb73cbc13d28c5161ca1d7c3/src/Microsoft.Azure.WebJobs.Host/CompositeTraceWriter.cs#L21-L25

Is that deliberate or by accident? If its by accident, then perhaps simply copying the collection here would be enough to prevent this.

All 7 comments

Interesting. I don't see us modifying that collection anywhere so I'm not sure where this is happening. Warrants investigation.

We're running into a potentially similar issue. We only noticed this when our app service got into a funky state prior to crashing. Here's the stack trace:

"Exception": "Exception: Probable I/O race condition detected while copying memory. The I/O package is not thread safe by default. In multithreaded applications, a stream must be accessed in a thread-safe way, such as a thread-safe wrapper returned by TextReader's or TextWriter's Synchronized methods. This also applies to classes like StreamWriter and StreamReader.\n 聽 at System.Buffer.InternalBlockCopy(Array src, Int32 srcOffsetBytes, Array dst, Int32 dstOffsetBytes, Int32 byteCount)
聽 聽at System.IO.StreamWriter.Write(Char[] buffer, Int32 index, Int32 count)
聽 聽at System.IO.TextWriter.WriteLine(String value)
聽 聽at Microsoft.Azure.WebJobs.Host.CompositeTraceWriter.InvokeTextWriter(TraceEvent traceEvent)
聽 聽at Microsoft.Azure.WebJobs.Host.CompositeTraceWriter.Trace(TraceEvent traceEvent)
聽 聽at Microsoft.Azure.WebJobs.Host.CompositeTraceWriter.InvokeTraceWriters(TraceEvent traceEvent)
聽 聽at Microsoft.Azure.WebJobs.Host.CompositeTraceWriter.Trace(TraceEvent traceEvent)
聽 聽at Microsoft.Azure.WebJobs.Script.CompositeTraceWriter.Trace(TraceEvent traceEvent) in C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script\Diagnostics\CompositeTraceWriter.cs:line 35
聽 聽at Microsoft.Azure.WebJobs.Script.InterceptingTraceWriter.Trace(TraceEvent traceEvent) in C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script\Diagnostics\InterceptingTraceWriter.cs:line 31
聽 聽at Microsoft.Azure.WebJobs.Host.TraceWriter.Info(String message, String source)

@brettsam @mathewc There is no code in CompositeTraceWriter that mutates the collection but if you look carefully you'll see we pass a collection in and we don't make a copy of it:

https://github.com/Azure/azure-webjobs-sdk/blob/7dd9681609613f79cb73cbc13d28c5161ca1d7c3/src/Microsoft.Azure.WebJobs.Host/CompositeTraceWriter.cs#L21-L25

Is that deliberate or by accident? If its by accident, then perhaps simply copying the collection here would be enough to prevent this.

@brettsam Is this change included in Microsoft.Azure.WebJobs v.2.2.0?

Yes -- https://github.com/Azure/azure-webjobs-sdk/releases/tag/v2.2.0 shows it as included in that release.

Are you seeing the same error? If so -- can you share the stack trace with us here?

I can still see the error with v.2.2.0

Probable I/O race condition detected while copying memory. The I/O package is not thread safe by default. In multithreaded applications, a stream must be accessed in a thread-safe way, such as a thread-safe wrapper returned by TextReader's or TextWriter's Synchronized methods. This also applies to classes like StreamWriter and StreamReader.

See below the stacktrace, parsed by AppInsights ;-)

[{"parsedStack":[{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":0,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35","method":"Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor+d__16.MoveNext","level":1,"line":0},{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":2,"line":0},{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification","level":3,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35","method":"Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor+d__13.MoveNext","level":4,"line":0}],"severityLevel":"Error","outerId":"0","message":"Exception while executing function: Check4KustoIngestionFailures","type":"Microsoft.Azure.WebJobs.Host.FunctionInvocationException","id":"33566856"},{"parsedStack":[{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.Buffer.InternalBlockCopy","level":0,"line":0},{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.IO.StreamWriter.Write","level":1,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35","method":"Microsoft.Azure.WebJobs.Host.CompositeTraceWriter.InvokeTextWriter","level":2,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35","method":"Microsoft.Azure.WebJobs.Host.CompositeTraceWriter.Trace","level":3,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35","method":"Microsoft.Azure.WebJobs.Host.CompositeTraceWriter.InvokeTraceWriters","level":4,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35","method":"Microsoft.Azure.WebJobs.Host.CompositeTraceWriter.Trace","level":5,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Script, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null","method":"Microsoft.Azure.WebJobs.Script.CompositeTraceWriter.Trace","level":6,"line":38,"fileName":"C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script\Diagnostics\CompositeTraceWriter.cs"},{"assembly":"syngo.Cloud.Monitoring.Common, Version=5.4.1.0, Culture=neutral, PublicKeyToken=null","method":"syngo.Cloud.DevOps.Common.AzureFunctions.KeyVaultHelper.GetSecret","level":7,"line":0},{"assembly":"syngo.Cloud.Monitoring.KustoIngestion, Version=5.4.1.0, Culture=neutral, PublicKeyToken=null","method":"syngo.Cloud.Monitoring.KustoIngestion.Check4KustoIngestionFailures.GetKustoIngestConnectionSb","level":8,"line":0},{"assembly":"syngo.Cloud.Monitoring.KustoIngestion, Version=5.4.1.0, Culture=neutral, PublicKeyToken=null","method":"syngo.Cloud.Monitoring.KustoIngestion.Check4KustoIngestionFailures.Run","level":9,"line":0},{"assembly":"Anonymously Hosted DynamicMethods Assembly, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null","method":"lambda_method","level":10,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35","method":"Microsoft.Azure.WebJobs.Host.Executors.VoidMethodInvoker2.InvokeAsync","level":11,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35","method":"Microsoft.Azure.WebJobs.Host.Executors.FunctionInvoker2+d__9.MoveNext","level":12,"line":0},{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":13,"line":0},{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification","level":14,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35","method":"Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor+d__24.MoveNext","level":15,"line":0},{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":16,"line":0},{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification","level":17,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35","method":"Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor+d__23.MoveNext","level":18,"line":0},{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":19,"line":0},{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification","level":20,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35","method":"Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor+d__22.MoveNext","level":21,"line":0},{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw","level":22,"line":0},{"assembly":"mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089","method":"System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification","level":23,"line":0},{"assembly":"Microsoft.Azure.WebJobs.Host, Version=2.3.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35","method":"Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor+d__16.MoveNext","level":24,"line":0}],"severityLevel":"Error","outerId":"33566856","message":"Probable I/O race condition detected while copying memory. The I/O package is not thread safe by default. In multithreaded applications, a stream must be accessed in a thread-safe way, such as a thread-safe wrapper returned by TextReader's or TextWriter's Synchronized methods. This also applies to classes like StreamWriter and StreamReader.","type":"System.IndexOutOfRangeException","id":"3782646"}]

Thanks for the stack trace. I was able to write a test to see this. PR to fix it is: #2003

Was this page helpful?
0 / 5 - 0 ratings