Runtime: Unreproducible InvalidProgramExceptions at random places

Created on 6 Jun 2018  ·  76 Comments  ·  Source: dotnet/runtime

Our .NET Core Web Apps running on Azure App Service occasionally start throwing InvalidProgramExceptions at seemingly random places in code. For example, see the following stack traces:

On .NET Core 2.0:

On .NET Core 2.1 (so the issue doesn't seem to be fixed):

Due to the unpredictable nature of these exceptions, I suspect the problem is in the runtime and not in the ASP.NET Core libraries used. In all of the cases the exception continued to be thrown every time the affected code was executed. The problem disappears after restarting the process, so I don't have steps to reproduce it. I do have memory dumps that were captured by the Application Insights Snapshot Collector. If a Microsoft employee is interested in analysing those, in the understanding that they probably contain sensitive data, please contact me using the email in my profile.

Most helpful comment

Good news: We believe we have solved the issue causing these failures and we are testing a proposed fix right now. The underlying cause for the issue we've been debugging was an uninitialized variable deep within Microsoft Instrumentation Engine (actually one of its helper DLLs). As soon as this fix hits a release that can be used, I will update this thread with a link to the fixed version for everyone to install.

The Mixed-ish news: We've also seen reports (from this thread and elsewhere) that many people believe there's another failure that looks exactly like this one (InvalidProgramException at seemingly random places). In all cases where we've had a crash dump to investigate this potential second issue, we've found that it was actually Microsoft Instrumentation Engine in the process all along (even when it wasn't expected to be).

With that said though: EF Core in particular is worrisome as I _believe_ they use a lot of dynamic code, and if IL is generated incorrectly we will also throw the same kind of exception. I'd like to get the MIE fix out for everyone first. If you are still hitting the problem after that we are happy to take a look at the crash dumps and go from there.


Lastly I wanted to say: Sorry this issue took forever to solve. This was a non-deterministic failure that was REALLY tough to track down because the thing that put us in a bad state happened a long time ago in the process and it wasn't clear who was at fault.

If there's a problem in EF Core or other components we should be able to track it down a lot faster next time for two reasons: First, it's rare for a bug to be this tough for us to solve, a lot of things have to go wrong. Second, we've built a bunch of diagnostic tools to help solve this MIE issue, so we'll be able to use them if we are still seeing other InvalidProgramExceptions.

I'll update the thread as soon as I have details on where to pick up the fix. Thank you all for your patience.

All 76 comments

Looking at the 2.1 stack trace, this happens when a collection is being iterated and something modifies it in the middle of the iteration. That is something that's not allowed. Based on the unpredictable nature of the issue, the "something" is most likely another thread. It can be a problem in asp.net core or in your application, it is hard to say (I am no expert on asp.net).
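
The failure mode described above (a collection modified while it is being iterated) can be illustrated with a minimal sketch. It uses a single thread for determinism, whereas the comment suspects a second thread; the class and method names are hypothetical and not taken from the issue.

```csharp
using System;
using System.Collections.Generic;

public static class CollectionModifiedDemo
{
    // Returns true if mutating a List<T> during enumeration raised
    // InvalidOperationException ("Collection was modified...").
    public static bool ThrowsWhenModifiedDuringEnumeration()
    {
        var items = new List<int> { 1, 2, 3 };
        try
        {
            foreach (var item in items)
            {
                // Mutating the list invalidates the active enumerator;
                // the next MoveNext call detects the version change.
                items.Add(item);
            }
        }
        catch (InvalidOperationException)
        {
            return true;
        }
        return false;
    }
}
```

Note that this produces an InvalidOperationException, not an InvalidProgramException, so it only illustrates the scenario this comment describes.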

There are more reports of the same thing with EF Core on 2.1: https://github.com/aspnet/EntityFrameworkCore/issues/12242

Original issue filed by @coffeymatt:

We had an outage of our identity server in production in Azure. The site is running the latest .NET Core 2.1 and the most up to date stable version of EF Core as of 05 June 2018.

The exception caught by application insights does not give us a clear root cause, but comes from inside entity framework and I'm posting the error here to see if anyone has any insight.

(Restarting our app restored the application).

Exception message:

Message | Connection id "0HLEA7A8JGC55", Request id "0HLEA7A8JGC55:00000004": An unhandled exception was thrown by the application.
-- | --
Exception type | System.InvalidProgramException
Failed method | Microsoft.EntityFrameworkCore.Query.QueryCompilationContext.get_QueryAnnotations

Stack trace:

System.InvalidProgramException: Common Language Runtime detected an invalid program.
 at Microsoft.EntityFrameworkCore.Query.QueryCompilationContext.get_QueryAnnotations()
 at Microsoft.EntityFrameworkCore.Query.Internal.IncludeCompiler..ctor(QueryCompilationContext queryCompilationContext, IQuerySourceTracingExpressionVisitorFactory querySourceTracingExpressionVisitorFactory)
 at Microsoft.EntityFrameworkCore.Query.EntityQueryModelVisitor.OptimizeQueryModel(QueryModel queryModel, Boolean asyncQuery)
 at Microsoft.EntityFrameworkCore.Query.RelationalQueryModelVisitor.OptimizeQueryModel(QueryModel queryModel, Boolean asyncQuery)
 at Microsoft.EntityFrameworkCore.Query.EntityQueryModelVisitor.CreateAsyncQueryExecutor[TResult](QueryModel queryModel)
 at Microsoft.EntityFrameworkCore.Storage.Database.CompileAsyncQuery[TResult](QueryModel queryModel)
 at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.CompileAsyncQueryCore[TResult](Expression query, INodeTypeProvider nodeTypeProvider, IDatabase database)
 at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.<>c__DisplayClass24_0`1.<CompileAsyncQuery>b__0()
 at Microsoft.EntityFrameworkCore.Query.Internal.CompiledQueryCache.GetOrAddQueryCore[TFunc](Object cacheKey, Func`1 compiler)
 at Microsoft.EntityFrameworkCore.Query.Internal.CompiledQueryCache.GetOrAddAsyncQuery[TResult](Object cacheKey, Func`1 compiler)
 at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.CompileAsyncQuery[TResult](Expression query)
 at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler.ExecuteAsync[TResult](Expression query, CancellationToken cancellationToken)
 at Microsoft.EntityFrameworkCore.Query.Internal.EntityQueryProvider.ExecuteAsync[TResult](Expression expression, CancellationToken cancellationToken)
 at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.ExecuteAsync[TSource,TResult](MethodInfo operatorMethodInfo, IQueryable`1 source, Expression expression, CancellationToken cancellationToken)
 at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.ExecuteAsync[TSource,TResult](MethodInfo operatorMethodInfo, IQueryable`1 source, LambdaExpression expression, CancellationToken cancellationToken)
 at Microsoft.EntityFrameworkCore.EntityFrameworkQueryableExtensions.FirstOrDefaultAsync[TSource](IQueryable`1 source, Expression`1 predicate, CancellationToken cancellationToken)
 at Microsoft.EntityFrameworkCore.Internal.EntityFinder`1.FindAsync(Object[] keyValues, CancellationToken cancellationToken)
 at Microsoft.EntityFrameworkCore.Internal.InternalDbSet`1.FindAsync(Object[] keyValues, CancellationToken cancellationToken)
 at Microsoft.AspNetCore.Identity.EntityFrameworkCore.UserStore`9.FindByIdAsync(String userId, CancellationToken cancellationToken)
 at Microsoft.AspNetCore.Identity.UserManager`1.FindByIdAsync(String userId)
 at Microsoft.AspNetCore.Identity.UserManager`1.GetUserAsync(ClaimsPrincipal principal)
 at Microsoft.AspNetCore.Identity.SignInManager`1.<ValidateSecurityStampAsync>d__32.MoveNext()
 --- End of stack trace from previous location where exception was thrown ---
 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
 at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()
 at Microsoft.AspNetCore.Identity.SecurityStampValidator`1.<ValidateAsync>d__4.MoveNext()
 --- End of stack trace from previous location where exception was thrown ---
 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
 at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
 at Microsoft.AspNetCore.Authentication.Cookies.CookieAuthenticationHandler.<HandleAuthenticateAsync>d__20.MoveNext()
 --- End of stack trace from previous location where exception was thrown ---
 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
 at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
 at Microsoft.AspNetCore.Authentication.AuthenticationHandler`1.<AuthenticateAsync>d__47.MoveNext()
 --- End of stack trace from previous location where exception was thrown ---
 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
 at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
 at Microsoft.AspNetCore.Authentication.AuthenticationService.<AuthenticateAsync>d__10.MoveNext()
 --- End of stack trace from previous location where exception was thrown ---
 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
 at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
 at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.<Invoke>d__6.MoveNext()
 --- End of stack trace from previous location where exception was thrown ---
 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
 at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
 at Microsoft.AspNetCore.Server.IISIntegration.IISMiddleware.<Invoke>d__11.MoveNext()
 --- End of stack trace from previous location where exception was thrown ---
 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
 at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter.GetResult()
 at Microsoft.AspNetCore.Hosting.Internal.RequestServicesContainerMiddleware.<Invoke>d__3.MoveNext()
 --- End of stack trace from previous location where exception was thrown ---
 at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
 at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
 at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
 at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.Frame`1.<ProcessRequestsAsync>d__2.MoveNext()

Steps to reproduce

Issue occurred out of the blue and has been resolved by restart. Don't know how to reproduce. Identity server is currently operating as expected.


Further technical details

EF Core version: 2.1.0
Database Provider: Microsoft.EntityFrameworkCore.SqlServer
Operating system: Azure App Service

Comment from EF post from @flagbug

This exact same thing is also something we're seeing very, very sporadically (once every few months) with EF Core versions before 2.1.0 hosted on Azure App Service.

Sadly it's super hard to gain any more information for this because we have to immediately restart our instances so our users aren't impacted.

Comment from EF issue by @jarz

I've started seeing the InvalidProgramException today within my API that uses ASP.NET Core 2.1.0 & EF Core 2.1.0.

System.InvalidProgramException:
   at Microsoft.EntityFrameworkCore.Metadata.Internal.EntityMaterializerSource.TryReadValue (Microsoft.EntityFrameworkCore, Version=2.1.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at lambda_method (Anonymously Hosted DynamicMethods Assembly, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null)
   at Microsoft.EntityFrameworkCore.Query.ExpressionVisitors.Internal.UnbufferedEntityShaper`1.Shape (Microsoft.EntityFrameworkCore.Relational, Version=2.1.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at Microsoft.EntityFrameworkCore.Query.Internal.AsyncQueryingEnumerable`1+AsyncEnumerator+<BufferlessMoveNext>d__12.MoveNext (Microsoft.EntityFrameworkCore.Relational, Version=2.1.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy+<ExecuteAsync>d__7`2.MoveNext (Microsoft.EntityFrameworkCore.SqlServer, Version=2.1.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.EntityFrameworkCore.Query.Internal.AsyncQueryingEnumerable`1+AsyncEnumerator+<MoveNext>d__11.MoveNext (Microsoft.EntityFrameworkCore.Relational, Version=2.1.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Linq.AsyncEnumerable+<SingleOrDefault_>d__381`1.MoveNext (System.Interactive.Async, Version=3.0.3000.0, Culture=neutral, PublicKeyToken=94bc3704cddfc263)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.EntityFrameworkCore.Query.Internal.AsyncLinqOperatorProvider+TaskResultAsyncEnumerable`1+Enumerator+<MoveNext>d__3.MoveNext (Microsoft.EntityFrameworkCore, Version=2.1.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.EntityFrameworkCore.Query.Internal.AsyncLinqOperatorProvider+ExceptionInterceptor`1+EnumeratorExceptionInterceptor+<MoveNext>d__5.MoveNext (Microsoft.EntityFrameworkCore, Version=2.1.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryCompiler+<ExecuteSingletonAsyncQuery>d__21`1.MoveNext (Microsoft.EntityFrameworkCore, Version=2.1.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification (System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)
   at OnlineOrdering.WebAPI.Services.UserService+<GetUser>d__8.MoveNext (OnlineOrdering.WebAPI, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null: D:\a\1\s\OnlineOrdering.WebAPI\Services\UserService.cs: 33)

Other comments in EF thread:

@danmosemsft @karelz Can we get some guidance here? There is nothing special about the methods or stack traces here. For example, the property at the top of the stack trace in the first example is very simple code:

    private readonly List<IQueryAnnotation> _queryAnnotations = new List<IQueryAnnotation>();
    public virtual IReadOnlyCollection<IQueryAnnotation> QueryAnnotations => _queryAnnotations;

Also, the stack trace for the second case is very different. So based on these things I would be surprised if this is actually an EF issue.

So, couple of questions:

  • Have you seen anything similar to this on your side?
  • What's the best thing to ask for that will help get a root cause on this?

/cc @divega

Can the customers attach before the issue and break on the exception (e.g. with SOS) and give us a full dump file at that point? It would need to include the jitted code, i.e. not just a minidump.

I assume that's the next step, if we can't get a repro -- @jkotas anything to add?

@ajcvickers

> So based on these things I would be surprised if this is actually an EF issue.

While I'm not saying that this is an EF Core issue, what I've noticed is that when our app fails with an InvalidProgramException, it's always somewhere in the EF code. The program also never recovers from this and the only way to mitigate it is to restart the Azure App Service instance.

@danmosemsft At least for me it's basically impossible to reproduce or catch this exception, since it happens sporadically every few weeks on a web app hosted on Azure App Service. Any ideas how I can give you more information?

Same for me. Happy to help in any way I can, but the error is one of those tricky sporadic rare types. It's happened just the once in production so far for me.

Closing the EF issue as a duplicate of this one.

@ajcvickers: Just a thought. These are all pretty deep call stacks with async/await inside them. Could the thread have run out of stack space at the wrong place to cause this? Memory dumps should give more clues about what went wrong.

@AndyAyersMS do you have any suggestions for making progress on an InvalidProgramException that only sporadically occurs in production? What causes such an exception, bad IL?

Could be invalid IL, yes. Given the variety of stack traces I wonder if the underlying problem is something low level.

Do we know if these runs had profilers attached when they crashed?

> Do we know if these runs had profilers attached when they crashed?

We have the Application Insights extension installed in App Service, which attaches a profiler occasionally, so I guess it could be. I am not seeing any profile traces from around the last time we saw the crash, however.

Just upgraded one of our web apps from 2.0 to 2.1. Locally all runs fine. After deploying to Azure App Service the following error occurs on startup:

2018-06-10 14:57:18.622 +00:00 [Error] Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware: An unhandled exception has occurred while executing the request.
System.InvalidProgramException: Common Language Runtime detected an invalid program.
   at Microsoft.AspNetCore.Mvc.ModelBinding.Metadata.DefaultModelMetadata.get_Properties()
   at Microsoft.AspNetCore.Mvc.ModelBinding.ModelMetadataProviderExtensions.GetMetadataForProperty(IModelMetadataProvider provider, Type containerType, String propertyName)
   at Microsoft.AspNetCore.Mvc.Internal.DefaultApplicationModelProvider.CreatePropertyModel(PropertyInfo propertyInfo)
   at Microsoft.AspNetCore.Mvc.Internal.DefaultApplicationModelProvider.OnProvidersExecuting(ApplicationModelProviderContext context)
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionDescriptorProvider.BuildModel()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionDescriptorProvider.GetDescriptors()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionDescriptorProvider.OnProvidersExecuting(ActionDescriptorProviderContext context)
   at Microsoft.AspNetCore.Mvc.Internal.ActionDescriptorCollectionProvider.UpdateCollection()
   at Microsoft.AspNetCore.Mvc.Internal.ActionDescriptorCollectionProvider.get_ActionDescriptors()
   at Microsoft.AspNetCore.Mvc.Internal.AttributeRoute.GetTreeRouter()
   at Microsoft.AspNetCore.Mvc.Internal.AttributeRoute.RouteAsync(RouteContext context)
   at Microsoft.AspNetCore.Routing.RouteCollection.RouteAsync(RouteContext context)
   at Microsoft.AspNetCore.Builder.RouterMiddleware.Invoke(HttpContext httpContext)
   at Microsoft.AspNetCore.StaticFiles.StaticFileMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.StatusCodePagesMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.Invoke(HttpContext context)
2018-06-10 14:57:18.631 +00:00 [Error] Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware: An exception was thrown attempting to execute the error handler.
System.InvalidProgramException: Common Language Runtime detected an invalid program.
   at Microsoft.AspNetCore.Mvc.ModelBinding.Metadata.DefaultModelMetadata.get_Properties()
   at Microsoft.AspNetCore.Mvc.ModelBinding.ModelMetadataProviderExtensions.GetMetadataForProperty(IModelMetadataProvider provider, Type containerType, String propertyName)
   at Microsoft.AspNetCore.Mvc.Internal.DefaultApplicationModelProvider.CreatePropertyModel(PropertyInfo propertyInfo)
   at Microsoft.AspNetCore.Mvc.Internal.DefaultApplicationModelProvider.OnProvidersExecuting(ApplicationModelProviderContext context)
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionDescriptorProvider.BuildModel()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionDescriptorProvider.GetDescriptors()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionDescriptorProvider.OnProvidersExecuting(ActionDescriptorProviderContext context)
   at Microsoft.AspNetCore.Mvc.Internal.ActionDescriptorCollectionProvider.UpdateCollection()
   at Microsoft.AspNetCore.Mvc.Internal.ActionDescriptorCollectionProvider.get_ActionDescriptors()
   at Microsoft.AspNetCore.Mvc.Internal.AttributeRoute.GetTreeRouter()
   at Microsoft.AspNetCore.Mvc.Internal.AttributeRoute.RouteAsync(RouteContext context)
   at Microsoft.AspNetCore.Routing.RouteCollection.RouteAsync(RouteContext context)
   at Microsoft.AspNetCore.Builder.RouterMiddleware.Invoke(HttpContext httpContext)
   at Microsoft.AspNetCore.StaticFiles.StaticFileMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.StatusCodePagesMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.Invoke(HttpContext context)
2018-06-10 14:57:18.632 +00:00 [Error] Microsoft.AspNetCore.Server.Kestrel: Connection id "0HLEES05VKEFG", Request id "0HLEES05VKEFG:00000001": An unhandled exception was thrown by the application.
System.InvalidProgramException: Common Language Runtime detected an invalid program.
   at Microsoft.AspNetCore.Mvc.ModelBinding.Metadata.DefaultModelMetadata.get_Properties()
   at Microsoft.AspNetCore.Mvc.ModelBinding.ModelMetadataProviderExtensions.GetMetadataForProperty(IModelMetadataProvider provider, Type containerType, String propertyName)
   at Microsoft.AspNetCore.Mvc.Internal.DefaultApplicationModelProvider.CreatePropertyModel(PropertyInfo propertyInfo)
   at Microsoft.AspNetCore.Mvc.Internal.DefaultApplicationModelProvider.OnProvidersExecuting(ApplicationModelProviderContext context)
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionDescriptorProvider.BuildModel()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionDescriptorProvider.GetDescriptors()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionDescriptorProvider.OnProvidersExecuting(ActionDescriptorProviderContext context)
   at Microsoft.AspNetCore.Mvc.Internal.ActionDescriptorCollectionProvider.UpdateCollection()
   at Microsoft.AspNetCore.Mvc.Internal.ActionDescriptorCollectionProvider.get_ActionDescriptors()
   at Microsoft.AspNetCore.Mvc.Internal.AttributeRoute.GetTreeRouter()
   at Microsoft.AspNetCore.Mvc.Internal.AttributeRoute.RouteAsync(RouteContext context)
   at Microsoft.AspNetCore.Routing.RouteCollection.RouteAsync(RouteContext context)
   at Microsoft.AspNetCore.Builder.RouterMiddleware.Invoke(HttpContext httpContext)
   at Microsoft.AspNetCore.StaticFiles.StaticFileMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.StatusCodePagesMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Diagnostics.ExceptionHandlerMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Localization.RequestLocalizationMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at Microsoft.AspNetCore.Server.IISIntegration.IISMiddleware.Invoke(HttpContext httpContext)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1 application)

Seems related to the other errors reported above. After manually restarting the App Service, the error went away. If this happens again, I'd be happy to help debug this issue. Could anyone please advise on how to produce a proper memory dump in Azure?

Well, a quick repro has got to help. @avickers do you know how to debug in Azure App Service? Also, @jkotas any suggestion of what to break on?

@jcemoller if you can repro immediately at startup, is any reduction feasible, e.g. removing extensions, dependencies, simplifying code in a binary search fashion? If we could get this to repro over here it would obviously help.

@ajcvickers see issue comment above

Sorry @avickers. I wish autocomplete of names worked on the mobile site. Usually I switch to the desktop site just for this purpose.

> @jkotas any suggestion of what to break on?

Break when InvalidProgramException is thrown and capture a dump at that point that can be investigated. If running under a debugger is not an option, you can subscribe to AppDomain.CurrentDomain.FirstChanceException and call Environment.FailFast immediately if the exception thrown is an InvalidProgramException.

https://blogs.msdn.microsoft.com/kaushal/2017/05/04/azure-app-service-manually-collect-memory-dumps/ lists different options you can use to capture the crash dumps.
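
The FirstChanceException suggestion above could look roughly like the following. This is a minimal sketch, not code from the thread: the class name `InvalidProgramTrap`, its method names, and the FailFast message are hypothetical, and it assumes crash dump collection is already configured for the process so that the fail-fast produces a dump.

```csharp
using System;
using System.Runtime.ExceptionServices;

public static class InvalidProgramTrap
{
    // Call once at startup (for example at the top of Main) so the handler
    // is in place before any request code runs.
    public static void Install()
    {
        AppDomain.CurrentDomain.FirstChanceException += OnFirstChance;
    }

    // Kept separate so the filtering rule is easy to adjust.
    public static bool ShouldFailFast(Exception ex)
    {
        return ex is InvalidProgramException;
    }

    private static void OnFirstChance(object sender, FirstChanceExceptionEventArgs e)
    {
        if (ShouldFailFast(e.Exception))
        {
            // Terminates the process immediately, before any catch block
            // can wrap or swallow the exception.
            Environment.FailFast(
                "InvalidProgramException observed; terminating so a dump can be captured.",
                e.Exception);
        }
    }
}
```

Because the handler runs at first chance, the process stops at the original throw site rather than after frameworks have caught and rethrown the exception. The trade-off is that every InvalidProgramException takes the process down, which may or may not be acceptable in production.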

@jkotas as I mentioned I have dumps. How do I go from there?

Sorry for the delay. @chrarnoldus, can you please file an issue via https://developercommunity.visualstudio.com ? Go to the website, click ".NET". Then "Report a Problem", and create an issue. Please upload the crash dump you have collected as an attachment to that ticket. (Be sure to zip the .dmp file as they tend to compress well.)

Also note that the attachment size through the website is limited to 2 GB. I think the "report an issue" function via Visual Studio does not have a size limit, so let me know if you need those instructions. Hopefully the compressed crash dump will not be too large.

If you reply back here with a link to the issue you created I will take a look as soon as we get it.

Thanks!

@leculver the link is https://developercommunity.visualstudio.com/content/problem/271142/invalidprogramexceptions-in-net-core.html. The dumps are inside the .diagsession files. Please e-mail me for the password (these dumps are from a production app), address is in my GitHub profile.

I started taking a look at these last night, I'm continuing to work on them today. I'll give you an update as soon as I have progress. Thanks!

Sorry for the delay here, some of our diagnostics tools (SOS in particular) aren't playing nice with these .diagsession files (and the crash dumps in them) and that's made progress very difficult. I've asked the diagnostics team to help take a look at that issue in parallel.

The crashes here are interesting, you have two crashes that are InvalidProgramExceptions. These types of failures are usually due to a problem in the JIT, where it cannot compile a piece of code (or in some way hits failures when trying to do so). The other two crashes come from different sources: One CryptographicException and one InvalidOperationException.

These second two exceptions may be unrelated to the InvalidProgramExceptions you are getting (the process that snaps crash dumps might have gotten overzealous and grabbed them too, even though they are not the same issue). Or it could be that something is corrupting memory in the process and is causing odd failures in places that shouldn't be throwing. I won't be able to tell until I have fixed the problem with our diagnostics tools. I'll see what the diagnostics team has to say about it.

@leculver thanks for looking into this.

> The other two crashes come from different sources: One CryptographicException and one InvalidOperationException.

Their inner exceptions are InvalidProgramExceptions too, so maybe they are related. I believe it is just that ASP.NET Core caught and wrapped them before rethrowing.

> the process that snaps crash dumps might have gotten overzealous and grabbed them too, even though they are not the same issue

I haven't configured it to be selective. Just to make sure we are on the same page, the dumps were collected by the Application Insights Snapshot Collector (documented here: https://docs.microsoft.com/en-us/azure/application-insights/app-insights-snapshot-debugger).

@leculver: One thing that might be worth mentioning is that EF does a lot of expression compilation and execution of the resulting delegates. The stack traces are not in that code, so I don't think that the crash is a direct result of this, but it might be relevant if something in the expression compilation is causing the corruption that later causes the crash. Other parts of ASP.NET do similar things, and I believe there is even some Ref.Emit code in ASP.NET, which might be another source of corruption.
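
To make the "bad IL" case concrete, here is a minimal sketch of how incorrectly generated dynamic code can surface as an InvalidProgramException. It is illustrative only and unrelated to what EF Core or ASP.NET actually emit; the class and method names are hypothetical. The dynamic method is declared to return `int` but executes `ret` with nothing on the evaluation stack, which the JIT is expected to reject.

```csharp
using System;
using System.Reflection.Emit;

public static class InvalidIlDemo
{
    // Returns true if compiling or invoking the malformed dynamic method
    // raised InvalidProgramException.
    public static bool BadIlThrowsInvalidProgram()
    {
        var method = new DynamicMethod("ReturnsNothing", typeof(int), Type.EmptyTypes);
        ILGenerator il = method.GetILGenerator();

        // Deliberately malformed: the method promises an int, but the
        // evaluation stack is empty when ret executes.
        il.Emit(OpCodes.Ret);

        try
        {
            // Depending on the runtime, the JIT may run at delegate creation
            // or at first invocation, so both are inside the try block.
            var func = (Func<int>)method.CreateDelegate(typeof(Func<int>));
            func();
        }
        catch (InvalidProgramException)
        {
            return true;
        }
        return false;
    }
}
```

Unlike the failures reported in this thread, this one is deterministic and points directly at the offending method, which is part of why the reported stack traces suggest a different cause.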

We have the same exception after an automated deployment to production (through VSTS) of an ASP.NET Core 2.1 (Web API) application that also uses EF Core 2.1. The build definition uses Core SDK 2.1.300.
The solution that this API is part of goes live today, so we are in an uncomfortable situation.

Here is our stack trace, in case it helps with troubleshooting:

System.InvalidProgramException: Common Language Runtime detected an invalid program.
   at DisplayMetadata Microsoft.AspNetCore.Mvc.ModelBinding.Metadata.DefaultModelMetadata.get_DisplayMetadata()
   at int Microsoft.AspNetCore.Mvc.ModelBinding.Metadata.DefaultModelMetadata.get_Order()
   at ModelPropertyCollection Microsoft.AspNetCore.Mvc.ModelBinding.Metadata.DefaultModelMetadata.get_Properties()+(ModelMetadata p) => { }
   at void System.Linq.EnumerableSorter<TElement, TKey>.ComputeKeys(TElement[] elements, int count)
   at int[] System.Linq.EnumerableSorter<TElement>.ComputeMap(TElement[] elements, int count)
   at int[] System.Linq.EnumerableSorter<TElement>.Sort(TElement[] elements, int count)
   at List<TElement> System.Linq.OrderedEnumerable<TElement>.ToList()
   at List<TSource> System.Linq.Enumerable.ToList<TSource>(IEnumerable<TSource> source)
   at ModelPropertyCollection Microsoft.AspNetCore.Mvc.ModelBinding.Metadata.DefaultModelMetadata.get_Properties()
   at ModelMetadata Microsoft.AspNetCore.Mvc.ModelBinding.ModelMetadataProviderExtensions.GetMetadataForProperty(IModelMetadataProvider provider, Type containerType, string propertyName)
   at PropertyModel Microsoft.AspNetCore.Mvc.Internal.DefaultApplicationModelProvider.CreatePropertyModel(PropertyInfo propertyInfo)
   at void Microsoft.AspNetCore.Mvc.Internal.DefaultApplicationModelProvider.OnProvidersExecuting(ApplicationModelProviderContext context)
   at ApplicationModel Microsoft.AspNetCore.Mvc.Internal.ControllerActionDescriptorProvider.BuildModel()
   at IEnumerable<ControllerActionDescriptor> Microsoft.AspNetCore.Mvc.Internal.ControllerActionDescriptorProvider.GetDescriptors()
   at void Microsoft.AspNetCore.Mvc.Internal.ControllerActionDescriptorProvider.OnProvidersExecuting(ActionDescriptorProviderContext context)
   at void Microsoft.AspNetCore.Mvc.Internal.ActionDescriptorCollectionProvider.UpdateCollection()
   at ActionDescriptorCollection Microsoft.AspNetCore.Mvc.Internal.ActionDescriptorCollectionProvider.get_ActionDescriptors()
   at TreeRouter Microsoft.AspNetCore.Mvc.Internal.AttributeRoute.GetTreeRouter()
   at Task Microsoft.AspNetCore.Mvc.Internal.AttributeRoute.RouteAsync(RouteContext context)
   at async Task Microsoft.AspNetCore.Routing.RouteCollection.RouteAsync(RouteContext context)
   at async Task Microsoft.AspNetCore.Builder.RouterMiddleware.Invoke(HttpContext httpContext)
   at async Task Microsoft.AspNetCore.Builder.Extensions.MapMiddleware.Invoke(HttpContext context)
   at async Task Microsoft.AspNetCore.Cors.Infrastructure.CorsMiddleware.Invoke(HttpContext context)
   at async Task Microsoft.AspNetCore.ResponseCaching.ResponseCachingMiddleware.Invoke(HttpContext httpContext)
   at async Task Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
   at async Task <OMMITED>.Middleware.ExceptionCatchMiddleware.Invoke(HttpContext context) in C:/agent/_work/19/s/<OMMITED>/Middleware/ExceptionCatchMiddleware.cs:line 47

EDIT: I am implementing Application Insights Snapshot Collector now for the API. If the exception returns, we will have a Snapshot ready for curious Microsoft employees.

@danmosemsft unfortunately I had already tried restarting the app service and in the four days since then the error has not been reproducible. I’m expecting the same error when pushing to our live servers and am trying to prep the solution with debugging snapshot support beforehand. Problem was that Application Insights never caught the error at all, guess it was thrown before AI had booted.

@zuckerthoben did a restart eliminate the error for you too?

@jcemoller Yes, a restart eliminated the error. This means for us that we have to check the general state of the application every time we deploy.

I apologize for the delay. We've fixed the diagnostics issue that was blocking deeper investigation. Still working on it.

@ChrisAhna on my team took a deeper look at this. Unfortunately, there wasn't enough data in these crash dumps to track down the problem. The state of the process has already been unwound by the time the dump is taken. Is it possible to put this under a debugger (such as windbg) as Chris mentions below? Alternatively, can you ask Azure to snap crash dumps on a particular exception code?

Here is what he had to say:


I looked at the three dumps and unfortunately don't think they shed any light on the root cause.

We need insight into "what went wrong" in the JIT, which basically equates to needing to know where the JIT originated the native exception that eventually propagated into the InvalidProgramException. The origination point is often/always the following statement in the fatal(int) function in clr\src\jiterror.cpp:

RaiseException(FATAL_JIT_EXCEPTION, EXCEPTION_NONCONTINUABLE, 1, &exceptArg);

where FATAL_JIT_EXCEPTION is

error.h:13:#define FATAL_JIT_EXCEPTION 0x02345678

In each of the three dumps, this origination point information is long gone by the time the dump is snapped (since it has been stomped over by EH dispatch for the initial FATAL_JIT_EXCEPTION, then creation of the InvalidProgramException, then EH dispatch for the InvalidProgramException itself).

If we could repro under the debugger, we'd use "sxe 02345678" to break in at the origination point. Since we can't do that, we need something like:

  • Get the Azure diagnostics environment to snap dumps whenever 02345678 exceptions originate (no idea if this is possible, and wouldn't be surprised if it isn't).
  • Get the JIT team to add a mode to .NET Core that failfasts (instead of throwing) in FATAL_JIT_EXCEPTION cases, and then get the customer to pick up the new .NET Core and run their service in the new mode.

@leculver this site extension looks like it could be useful: https://blogs.msdn.microsoft.com/asiatech/2016/01/14/tips-of-using-crash-diagnoser-on-azure-web-app/

I'll configure it to catch first chance non-managed exceptions with code 02345678 in dotnet.exe. The issue with non-reproducibility remains of course.

I've encountered a similar exception (System.BadImageFormatException: Bad IL format) fairly consistently, but in random places, and was able to stop it from occurring by disabling the Application Insights Service Profiler. I got suspicious when I was seeing these random image format errors in my own test apps and noticed this profiler hanging around, since I believe profilers are able to mess around with the IL.

@chrarnoldus and others on this thread, is the App Insights Profiler enabled on your App Service? I believe it is enabled by default. Instructions for enabling it are [here](https://docs.microsoft.com/en-us/azure/application-insights/app-insights-profiler#installation); you can follow them to check whether it's enabled on your site, and disable it if it is.

If disabling it stops these errors from occurring, that may be the source of the problem. We'll have to reach out to someone on the App Insights Profiler team to see if they can investigate anything.

I have also seen the BadImageFormatException once yesterday. There is an open issue for that https://github.com/dotnet/coreclr/issues/18448

I will try disabling the AI Service Profiler.

@anurse I've also seen BadImageFormatException occur in addition to InvalidProgramException. The Application Insights profiler is indeed enabled and I fairly regularly see errors in the Application Events tab of App Service that I think may be caused by it (like .NET Runtime version 4.0.30319.0 - Loading profiler failed. Failed trying to receive from out of process a request to attach a profiler. HRESULT: 0x8007006d. Process ID (decimal): 12272. Message ID: [0x250d].). I will try disabling it.

What's the status on this?
Today our Identity Server in production (.NET Core / ASP.NET Core / EF Core 2.1.1) got hit by the InvalidProgramException rendering the solution broken until an engineer analyzed the situation and restarted the app in Azure.

@leculver @anurse who owns the next action on this one?

We were not able to make progress with the crash dumps provided. The next step here is for someone who is encountering the issue to capture a crash dump at the first chance 0x02345678 exception, as I listed above. We'll only be able to track down what is causing these failures when the callstack is not unwound past where all of the useful state is.

If anyone is able to come up with a crash dump caught there we'd be happy to take a look.

Lastly, I strongly suspect that a lot of these are caused by profilers generating bad IL. This is only a theory until I can get a crash dump that shows us what's going on, but if you are hitting this in production and need immediate relief, the best course of action is to try turning off the App Insights service, Microsoft Instrumentation Engine, etc. (Well, the absolute best course of action is to do that after sending us a crash dump captured at the exception code above...)

@noahfalk would https://github.com/dotnet/coreclr/issues/18448 always give BadImageFormatException, or could it possibly give InvalidProgramException?

@danmosemsft - I'm not aware of a way that dotnet/coreclr#18448 would generate IPE.

I'm still having this intermittent error on our production sites. Application insights doesn't detect the app is down so we aren't aware until customers start complaining.

I've just had to restart two sites just now, what is the status of this issue?

@coffeymatt Use Application Insights Availability Tests with alerts to test your sites. That will reduce the reaction time until the solution is available.

We see a similar issue and can reproduce it consistently

Repro steps

  1. Controller calls Repo.FindItems (async call await here)
  2. Repo calls SQL DB with a .ToListAsync(); (await NOT here)
  3. SQL returns null when the model expects non null type

Throws an exception that we cannot recover from unless we restart the azure web app. Any future sql calls without an app restart hang even though the SQL is no longer returning null (i.e. fixed)
We have seen System.InvalidOperationException, System.InvalidProgramException and most recently System.BadImageFormatException (Bad IL format). All from the same code on the same azure web app instance.

Observations

  1. This does not happen when debugging on a local dev machine. This does not happen on all webapps. It occurs on only 1 of our 4 slots.

  2. Re-deploying code does not 'fix' the issue.

  3. Before we added a top level exception handler the process would exit. After the process does not exit BUT the EF calls to SQL are hung.

  4. We are running .NET core 2.1 & EF core 2.1

  5. When running in remote debugger mode for the webapp we see the error thrown and it's not in our code. It's exactly as in the example stack shown below.

  6. I have not created dmp file and used windbg to analyze it as I suspect I will find more exceptions from either .net core or EF. This is my next step as time permits.

  7. All our webapps have the same dependencies as they run the same app, but only this slot has the problem. We are currently chalking this one up to a corrupt web app image (vm?) or .net core installation (installed by azure). We are doing a crash recovery test on all our web apps before deploying to PROD.

  8. A side note which is a REAL bummer. We cannot use azure webapp for 64 bit .net core out of the box since azure windows does not support .net core 64 bit (but the linux version does ??, I understand the why, it's the principle). Yes, there are work-arounds but, yeah... This makes it fun when someone provides a dmp file and all the stars were not aligned. Thank goodness for extensions such as this.

_Example stack_
System.InvalidOperationException:
at Microsoft.EntityFrameworkCore.Metadata.Internal.EntityMaterializerSource.ThrowReadValueException (Microsoft.EntityFrameworkCore, Version=2.1.1.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
at Microsoft.EntityFrameworkCore.Metadata.Internal.EntityMaterializerSource.TryReadValue (Microsoft.EntityFrameworkCore, Version=2.1.1.0, Culture=neutral, PublicKeyToken=adb9793829ddae60)
at lambda_method (Anonymously Hosted DynamicMethods Assembly, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null)

I've been having this issue also. I may have just worked around it, but I can't be sure.

Our project is for dotnet core 2.1.0 running on Azure. The webapp started blowing up in random places during a database call. I have some dumps from insights if anyone wants them.

We were deploying from TeamCity using a custom script, the TeamCity server was running dotnet cli 2.1.401, I deployed from my machine running 2.1.402 and the issue seems to have gone away. So maybe there's a difference there? Maybe not. Just thought I'd add some details to the issue.

We are also seeing this issue very intermittently like once a month, but when it happens our entire app service is dead in the water until we restart it.

What version of the Application Insights extension is installed on the web app with the repro? I'm pretty sure we fixed one of those a year or two ago.

We've uninstalled the Application Insights extension and I don't believe we've seen issues like these since. I don't remember exactly which version we were using, but I'm sure it was a recent version at the time I posted the issue.

Usually this error occurs deep down inside of EF Core somewhere after running a while. However, today, it happened during app startup from the looks of the trace. The SiteName property on our CeaOptions class is a simple get/set. Restarting the App Service fixed the error.

Environment Notes:

  • AspNetCore 2.1.2
  • App Insights extension not installed for app service

Exception:
System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.InvalidProgramException: Common Language Runtime detected an invalid program.
at Cea.Services.Models.CeaOptions.get_SiteNamel()
--- End of inner exception stack trace ---
at System.RuntimeMethodHandle.InvokeMethod(Object target, Object[] arguments, Signature sig, Boolean constructor, Boolean wrapExceptions)
at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
at System.Reflection.RuntimePropertyInfo.GetValue(Object obj, BindingFlags invokeAttr, Binder binder, Object[] index, CultureInfo culture)
at System.Reflection.RuntimePropertyInfo.GetValue(Object obj, Object[] index)
at System.Reflection.PropertyInfo.GetValue(Object obj)
at Microsoft.Extensions.Configuration.ConfigurationBinder.BindProperty(PropertyInfo property, Object instance, IConfiguration config, BinderOptions options)
at Microsoft.Extensions.Configuration.ConfigurationBinder.BindNonScalar(IConfiguration configuration, Object instance, BinderOptions options)
at Microsoft.Extensions.Configuration.ConfigurationBinder.BindInstance(Type type, Object instance, IConfiguration config, BinderOptions options)
at Microsoft.Extensions.Configuration.ConfigurationBinder.BindProperty(PropertyInfo property, Object instance, IConfiguration config, BinderOptions options)
at Microsoft.Extensions.Configuration.ConfigurationBinder.BindNonScalar(IConfiguration configuration, Object instance, BinderOptions options)
at Microsoft.Extensions.Configuration.ConfigurationBinder.BindInstance(Type type, Object instance, IConfiguration config, BinderOptions options)
at Microsoft.Extensions.Configuration.ConfigurationBinder.Bind(IConfiguration configuration, Object instance, Action`1 configureOptions)
at Microsoft.Extensions.Options.NamedConfigureFromConfigurationOptions`1.<>c__DisplayClass1_0.<.ctor>b__0(TOptions options)
at Microsoft.Extensions.Options.ConfigureNamedOptions`1.Configure(String name, TOptions options)
at Microsoft.Extensions.Options.OptionsFactory`1.Create(String name)
at Microsoft.Extensions.Options.OptionsManager`1.<>c__DisplayClass5_0.<Get>b__0()
at System.Lazy`1.ViaFactory(LazyThreadSafetyMode mode)
at System.Lazy`1.ExecutionAndPublication(LazyHelper executionAndPublication, Boolean useDefaultConstructor)
at System.Lazy`1.CreateValue()
at System.Lazy`1.get_Value()
at Microsoft.Extensions.Options.OptionsCache`1.GetOrAdd(String name, Func`1 createOptions)
at Microsoft.Extensions.Options.OptionsManager`1.Get(String name)
at Microsoft.Extensions.Options.OptionsManager`1.get_Value()
at Cea.WebApi.Middleware.AppServiceRouteAuthorizationMiddleware..ctor(RequestDelegate next, IOptions`1 options, IHostingEnvironment environment, ILogger`1 logger) in E:\BuildAgents\vsts-agent-1_work\32\s\DotNetCore\Source\Cea.WebApi\Middleware\AppServiceRouteAuthorizationMiddleware.cs:line 28
--- End of stack trace from previous location where exception was thrown ---
at Microsoft.Extensions.Internal.ActivatorUtilities.ConstructorMatcher.CreateInstance(IServiceProvider provider)
at Microsoft.Extensions.Internal.ActivatorUtilities.CreateInstance(IServiceProvider provider, Type instanceType, Object[] parameters)
at Microsoft.AspNetCore.Builder.UseMiddlewareExtensions.<>c__DisplayClass4_0.b__0(RequestDelegate next)
at Microsoft.AspNetCore.Builder.Internal.ApplicationBuilder.Build()
at Microsoft.AspNetCore.Hosting.Internal.WebHost.BuildApplication()

@chrarnoldus we did the same and deactivated the extension. Haven't seen the error in a week or so. This was .net core 2.1 and version 2.6.5 of the extension. Is it possible that there might be a conflict with the following line in the code?

services.AddApplicationInsightsTelemetry(Configuration);
2.4.1 is the version of Microsoft.ApplicationInsights.AspNetCore that is referenced.

I'm experiencing this issue on Azure App Service, with Core 2.0, and version 2.6.5 of the app insights extension. A restart resolves it.

Link to stack trace: https://gist.github.com/craigpopham/0b656cb0afbcc4f1a3c420d5421192a0
I have a debug snapshot for this stacktrace from the AI Snapshot debugger I can share privately if that's useful.

@tonygomez Thanks! We had a column that was nullable in SQL but non-nullable in the model, and that triggered the cascade. The first error is reported correctly, and pointed us to the culprit.

@leculver I have a dump from one of our app services that is hopefully what you need. First chance exception on 02345678. Uploaded zip to https://developercommunity.visualstudio.com/content/problem/358361/invalidprogramexception-in-net-core.html.

Please shoot me an email for the zip password. Email is in my GitHub profile.

I wanted to give a quick update here and some details for anyone curious. The crash dump that @jsheetzati provided has helped immensely. We've been working on the issue since we received it, but this problem isn't easy or straightforward to solve.

The underlying problem (in the crash dump that we have from jsheetzati) is that the profiler API to replace IL method bodies was used to replace an incorrect method. So instead of replacing a method with IL code for instrumentation purposes, we are apparently replacing the _wrong_ method. The challenge is to figure out why we replaced the wrong method with the wrong IL now that we understand what is crashing.

This has been assigned to the proper teams and we are working on it. In parallel I'm working on some internal tooling to help us detect and diagnose this issue faster next time. Thank you for your patience here!

@leculver Would the likely suspect be the AppInsights profiler?

We've had issues with the InvalidProgramException on Azure with our production instance and haven't seen it happening since disabling the application insights profiler.

Haven't seen the issue since we disabled it from Kudu, so that solved our issue as well.

We had to disable it as well.

@leppie There's definitely something going wrong with the interaction between AppInsights and CLR's profiler layer. At this point we aren't sure if the problem is in the profiler (AppInsights) itself, or if they are giving us a valid IL edit and CLR is mis-applying that edit to the wrong method.

Right now, the AppInsights team is working on new instrumentation to try to collect more information about the problem. This is unfortunately one of the most difficult problems we've seen in a while. Huge thank you to @jsheetzati who has been working with us via email. Thanks to him we've narrowed the problem down greatly, but we haven't yet been able to solve the issue. I'm hoping our next round of instrumentation will catch it red-handed.

@leculver One way we have seen it manifest quite 'regularly' is using EF Core. Create a nullable DateTimeOffset column in the DB, but make it not nullable in the model. Then perform some CRUD on it. It will first throw the correct exception, but cascades after in the same app service. Hope this can help you repro it easier.

@leppie Can you generate a full crash dump of the issue you describe? If so, can you create a new issue via VS developer community? Be sure to password protect the zip if it's sensitive, you can email me the password to it at "leculver" {at} microsoft.com, or I'll reach out via github.

Good news: We believe we have solved the issue causing these failures and we are testing a proposed fix right now. The underlying cause for the issue we've been debugging was an uninitialized variable deep within Microsoft Instrumentation Engine (actually one of its helper DLLs). As soon as this fix hits a release that can be used, I will update this thread with a link to the fixed version for everyone to install.

The Mixed-ish news: We've also seen reports (from this thread and elsewhere) that many people believe there's another failure that looks exactly like this one (InvalidProgramException at seemingly random places). In all cases where we've had a crash dump to investigate this potential second issue, we've found that it was actually Microsoft Instrumentation Engine in the process all along (even when it wasn't expected to be).

With that said though: EF Core in particular is worrisome as I _believe_ they use a lot of dynamic code, and if IL is generated incorrectly we will also throw the same kind of exception. I'd like to get the MIE fix out for everyone first. If you are still hitting the problem after that we are happy to take a look at the crash dumps and go from there.


Lastly I wanted to say: Sorry this issue took forever to solve. This was a non-deterministic failure that was REALLY tough to track down because the thing that put us in a bad state happened a long time ago in the process and it wasn't clear who was at fault.

If there's a problem in EF Core or other components we should be able to track it down a lot faster next time for two reasons: First, it's rare for a bug to be this tough for us to solve, a lot of things have to go wrong. Second, we've built a bunch of diagnostic tools to help solve this MIE issue, so we'll be able to use them if we are still seeing other InvalidProgramExceptions.

I'll update the thread as soon as I have details on where to pick up the fix. Thank you all for your patience.

Same issue here I think

System.InvalidProgramException: Common Language Runtime detected an invalid program.
   at Microsoft.EntityFrameworkCore.Update.Internal.CommandBatchPreparer.get_StateManager()
   at Microsoft.EntityFrameworkCore.Update.Internal.CommandBatchPreparer.CreateModificationCommands(IReadOnlyList`1 entries, Func`1 generateParameterName)
   at Microsoft.EntityFrameworkCore.Update.Internal.CommandBatchPreparer.BatchCommands(IReadOnlyList`1 entries)+MoveNext()
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(DbContext _, ValueTuple`2 parameters, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(IReadOnlyList`1 entriesToSave, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
   at Omx.Data.DataService`1.<>c__DisplayClass4_0.<<Save>b__0>d.MoveNext()

Probably related:

Microsoft.WindowsAzure.Storage.StorageException: Common Language Runtime detected an invalid program. ---> System.InvalidProgramException: Common Language Runtime detected an invalid program.
  at System.Net.WebHeaderCollection.get_AllowHttpRequestHeader()
  at System.Net.WebHeaderCollection.get_Item(HttpRequestHeader header)
  at Microsoft.WindowsAzure.Storage.Core.Auth.SharedKeyTableCanonicalizer.CanonicalizeHttpRequest(HttpWebRequest request, String accountName) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Auth\SharedKeyTableCanonicalizer.cs:line 74
  at Microsoft.WindowsAzure.Storage.Auth.Protocol.SharedKeyAuthenticationHandler.SignRequest(HttpWebRequest request, OperationContext operationContext) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Auth\Protocol\SharedKeyAuthenticationHandler.cs:line 76
  at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ProcessStartOfRequest[T](ExecutionState`1 executionState, String startLogMessage) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 932
  at Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ExecuteSync[T](RESTCommand`1 cmd, IRetryPolicy policy, OperationContext operationContext) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:line 664

We have the same issue. Seemed it coincided with Azure Insights Autoscaling and Updating the hosting plan.

We had this weird situation where if a browser connected to the site successfully it would continue to work until cookies were cleared, then it would stop working again.

Turns out that the site working or not depended on which instance of the app the browser connected to.

We reduced the instances down to 1 and turned off autoscale. Problem fixed for now.

We are seeing this starting today for a Full Framework .NET App using Windows Azure Storage

Message: Common Language Runtime detected an invalid program.
System.Net.WebHeaderCollection.get_AllowHttpResponseHeader():-1
System.Net.WebHeaderCollection.get_Item(HttpResponseHeader header):0

Microsoft.WindowsAzure.Storage.Core.Executor.ExecutionState`1.set_Resp(HttpWebResponse value) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\Common\Core\Executor\ExecutionState.cs:303
Microsoft.WindowsAzure.Storage.Core.Executor.Executor.ExecuteSync[T](RESTCommand`1 cmd, IRetryPolicy policy, OperationContext operationContext) in c:\Program Files (x86)\Jenkins\workspace\release_dotnet_master\Lib\ClassLibraryCommon\Core\Executor\Executor.cs:677

I got this today :(
I don't have Application Insights switched on. The only change we made was migrating our TeamCity and Octopus Deploy to a different server; the SDK installed on the build agent may be different, but I'm not sure. Hope this helps you find the root cause asap.

This is the final update here before I close the issue. New versions of ProductionBreakpoints_x86.dll and ProductionBreakpoints_x64.dll are being deployed; the version with the fix for this issue is 2018110902.

If you are encountering InvalidProgramExceptions here are the steps you need to follow:

  1. Obtain a crash dump of this crash. This is the only way to make progress here, so you must do this.
  2. Find the version number of ProductionBreakpoints*.dll in the crash dump. (See below for instructions.)
  3. If ProductionBreakpoints*.dll is not in the process, then you are hitting a DIFFERENT ISSUE than the one we've root caused. (See below for "What do I do...?".)
  4. If ProductionBreakpoints*.dll IS loaded into the process but its version number is less than 2018110902, then you are simply hitting this issue and haven't received the update yet.
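The decision in steps 3 and 4 can be written down as a small helper. This is only a sketch of the rules stated in this comment: the function name and the returned labels are made up for illustration, and the only value taken from the thread is the fixed version number 2018110902.

```python
FIXED_VERSION = 2018110902  # version carrying the fix, as stated above


def triage(loaded_version):
    """Classify a crash dump by its ProductionBreakpoints*.dll version.

    loaded_version is the integer version found in the dump, or None when
    the module is not loaded in the process at all.
    """
    if loaded_version is None:
        # Step 3: module absent, so this is not the root-caused bug.
        return "different issue (module not loaded)"
    if loaded_version < FIXED_VERSION:
        # Step 4: module present but older than the fixed build.
        return "known issue (update not received yet)"
    # Fixed build present and the exception still occurs.
    return "different issue (fix already installed)"
```

Both "different issue" outcomes lead to the same follow-up: file a new report with the crash dump attached.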

How to check the version number of ProductionBreakpoints in a crash dump:

If you are unsure how to check the version number, here's how to do it with Debugging Tools for Windows:

cdb -z c:\path\to\crash.dmp -c "lmvm ProductionBreakpoints*;q"

This will produce output similar to this:

00000001`80000000 00000001`800c9000   ProductionBreakpoints_x64 
        [snip]
    Information from resource tables:
        [snip]
        ProductVersion:   2018110902 45311b8ef93b29719923a44c283ddac3eafec06d ProductionBreakpoints-Release
        FileVersion:      2018110902 45311b8ef93b29719923a44c283ddac3eafec06d ProductionBreakpoints-Release

The important number being here: "ProductVersion: 2018110902 45311b8ef93b29719923a44c283ddac3eafec06d ProductionBreakpoints-Release"
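If there are many dumps to check, the version can be pulled out of saved debugger output with a few lines of Python. This is a minimal sketch that assumes the `lmvm` output has already been captured as text and looks like the sample above; `product_version` is a hypothetical helper name.

```python
import re

FIXED_VERSION = 2018110902  # version carrying the fix, as stated above


def product_version(lmvm_output):
    """Return the leading number of the ProductVersion line, or None.

    None means no ProductVersion line was found, for example because
    ProductionBreakpoints*.dll is not loaded in the dumped process.
    """
    match = re.search(r"^\s*ProductVersion:\s*(\d+)", lmvm_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def has_fix(lmvm_output):
    """True only when a version was found and it is at least the fixed one."""
    version = product_version(lmvm_output)
    return version is not None and version >= FIXED_VERSION
```

A `None` result only says that no version line was present in the text, so it is worth confirming by eye that the module really is absent before treating the crash as a different issue.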

What do I do if I have the fix or ProductionBreakpoints_*.dll is not loaded into the process?

If you are still encountering InvalidProgramExceptions when you either have the fix or ProductionBreakpoints*.dll is not loaded into the process, then you are encountering a different bug than the one we've been tracking down. If that happens, follow these steps:

  1. Compress your crash dump into a .zip file (or .7z).
  2. Set a password on it. Email this password to [email protected]
  3. Create an issue on VS Developer Community and upload the zip there. Be sure to put this text in the issue: "Please ping Lee Culver internally to look at this issue".
  4. File a new issue in this repo about the bug [I will update this text if there's a new bug]. Be sure to @leculver so I'll get an email notifying me about it.

We'll take a look from there. (Also sorry it's a bit of a process, VS Dev Community is the only way we have to reliably get files right now.)

Alternatively if you have a way to share files (onedrive, your own webserver, etc) please feel free to use that and ping me with the details.

Last Notes

One thing to note, we will NOT be able to help without a crash dump on this kind of issue. Your first two steps are getting a crash dump, then checking the ProductionBreakpoints*.dll version.

Please see the reply below from @WilliamXieMSFT for more info.

@leculver Is a new version of the Application Insights site extension being published with the fixed dlls too or is that not necessary? Thanks!

@jsheetzati ApplicationInsights has moved towards a Preinstalled Site Extension model and does not have the offending bits anymore. In addition, the private Site Extension will not be updated as it is estimated to be deprecated within a month. You should be able to upgrade to the preinstalled site extension through the ApplicationInsights configuration in the AppService blade in Azure.

However, if you want an immediate solution for the private site extension, you can work around the issue by removing the following two files in the site extension and restarting the site:

  • Instrumentation32\ProductionBreakpoints_x86.config
  • Instrumentation64\ProductionBreakpoints_x64.config

This will effectively remove the part of AppInsights extension causing issues.
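For anyone applying this workaround to several sites, the file removal can be scripted. The sketch below is an illustration only: the two relative paths come from the list above, while the extension root directory is an assumption that the caller must supply, and the function name is hypothetical. The site still has to be restarted afterwards.

```python
from pathlib import Path

# Relative paths of the two config files named in the workaround.
CONFIG_FILES = (
    Path("Instrumentation32") / "ProductionBreakpoints_x86.config",
    Path("Instrumentation64") / "ProductionBreakpoints_x64.config",
)


def remove_production_breakpoints_configs(extension_root):
    """Delete the two config files under extension_root if they exist.

    Returns the list of paths that were actually removed, so the caller
    can tell whether a restart is needed.
    """
    removed = []
    for relative in CONFIG_FILES:
        target = Path(extension_root) / relative
        if target.is_file():
            target.unlink()
            removed.append(target)
    return removed
```

Calling it a second time on the same directory returns an empty list, which makes it safe to run repeatedly.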

Hope that helps!

@jsheetzati you can find more about the new Application Insights enablement here: https://docs.microsoft.com/en-us/azure/application-insights/app-insights-azure-web-apps#run-time-instrumentation-with-application-insights
ProductionBreakpoints will be enabled if you turn on "Show local variables for your application when an exception is thrown"

Just out of curiosity I would like to know if you @leculver see that this could also reproduce on .Net framework.

We have been having issues with periodic InvalidProgramExceptions in some of our web applications. With one application as often as once a week or so. We are targeting .net framework 4.5 with the specific application, but the Azure App Service naturally uses 4.7 runtime.

Today I found out that the Application Insights site extension was enabled on the application with the most problems just a few days before the issues started occurring. Most of the exceptions are coming from Entity Framework related calls. EF version is 6.1.3. It was really hard to find any recent information on issues like this on .NET Framework.

On the other hand we are using ApplicationInsights as nuget package in some or our applications and with instrumentations done to the application code directly. These applications are also targeting more recent .net framework. And with those we have not seen the issue.

I also ran into this. Why is this feature enabled by default in production? Modifying other people's IL without their permission seems dangerous. I mean I don't let anyone do that because they get it wrong (I've seen it a lot in my career) but Microsoft got it wrong which is worrisome.

Anyway thanks to all of you who figured it out, it's easy for me to disable this and move on.

I created a separate issue for .NET Framework to verify that it is the same issue seen there:
421032 - System.InvalidProgramException - Common Language Runtime detected an invalid program

So far I have not seen the issue again after applying the fix mentioned here (two weeks ago), and I could not reproduce it in the test environment, although I was able to do so twice before applying the fix.

@aviita Sorry I missed your previous reply here. Yes, the same issue can happen with desktop CLR (the issue is with ProductionBreakpoints*.dll, and that works on both .NET Core and desktop CLR). I also verified that the crash dumps you sent do have a version of AppInsights with the bug in them. So disabling the plugin or updating these DLLs will fix the issue.

Running into this issue deploying a Web Api as an API App on Azure. An initial request to any endpoint would result in the expected response; however, subsequent requests would return the same Common Language Runtime error. I figured out the problem started when I enabled the Recommended collection level on the Application Insights blade on my web app. I set Recommended and enabled all radio buttons. Reverting this change stopped the error. For reference, the API uses Microsoft.ApplicationInsights 2.8.1.

Running into this issue deploying a Web Api as an API App on Azure.

@Adrian10988, please provide:
1) repro steps
2) How did you create the webapi app
3) Are there any details about the error

Could you also try setting "Recommended" mode, but disabling all the features below (Profiler, Snapshot debugger, etc.), and see if the error reproduces?

PS: That could be a separate issue, so we might need to open a new one.

I had an InvalidProgramException issue one year ago. It went away for a while, but these days it came back.
Remarks:

  • The error happens on a query that is not done through EF; it's done with Dapper
  • One of the latest things that I've changed in the code is adding _telemetryClient.TrackTrace(requestBody); _telemetryClient.TrackException(ex); to log exception details to Insights
  • I use .NET Core 2.2

Now I've turned on the new application insights extension and the server was restarted. Now everything works fine and I will wait to see if it comes again.

Are you (@leculver) still interested in more occurrences of this issue, even after ApplicationInsights switched to the Preinstalled Site Extension model and a lot of time has passed?

I ran into this dratted InvalidProgramException two days ago and it took me those two days to find this thread... 🤦‍♂️ In my situation we switched the App Service from x86 to x64 and ran into some unrelated problems. To diagnose those, we turned on all Application Insights features in the App Services config blade - and we soon forgot about that, because we were deep into those other problems, which basically showed a really messy project setup (that we had to clean up but which was, as it turned out, not part of the problem), and we were confident that activating some SQL statement logging in Application Insights should "just work". 😉

Long story short: we cleaned up the solution and deployed our app as a 64-bit self-contained .NET Core 2.2 app. It works on our local machines (of course 🥳) but we could never get it to run in the App Service environment. Switching off the Application Insights Profiler did the trick - the app now works fine.

So there might still be some other issue connected to the Application Insights extension. Hence the question: would you still be interested in a crash dump to investigate this?

Sure, if you have a crash dump I'd be happy to take a look to see if there's a new issue or if you are hitting the old one. Please don't share crash dumps of production services publicly, as they may contain sensitive info on the heap. You can email me offline at my github account name @microsoft.com.
