Pomelo.entityframeworkcore.mysql: InvalidOperationException: A second operation started on this context before a previous operation completed

Created on 6 Feb 2020 · 11Comments · Source: PomeloFoundation/Pomelo.EntityFrameworkCore.MySql

Steps to reproduce

I recently migrated from Core 2.2 to Core 3.1. In the past, I used MySql.Data.EntityFrameworkCore. However, after upgrading to Core 3.1 I switched to Pomelo.EntityFrameworkCore.MySql 3.1.1 But now I am getting the following error

InvalidOperationException: A second operation started on this context before a previous operation completed. This is usually caused by different threads using the same instance of DbContext. For more information on how to avoid threading issues with DbContext, see https://go.microsoft.com/fwlink/?linkid=2097913.

Here is a sample of my code which explains at what point in time I get the error

public TestController : Controller
{
    private readonly AppDbContext Context;

    public TestController(AppDbContext context)
    {
        Context = context;
    }

    public async Task<IActionResult> Test()
    {
        var task1 = Context.Users.Where(x => x.Authenticated).ToListAsync();

        var task2 = Context.Schools.Where(x => x.Visible).ToListAsync();

        Task.WhenAll(task1, task2);

        return View(new SomeViewModel
        { 
            Users = task1.Result,
            Schools = task2.Result,
        });
    }
}

As you can see then calling Task.WhenAll(task1, task2); I get the error listed above. However, the same code previously worked fine with MySql.Data.EntityFrameworkCore on .NET Core 2.2.

The issue

InvalidOperationException: A second operation started on this context before a previous operation completed. This is usually caused by different threads using the same instance of DbContext. For more information on how to avoid threading issues with DbContext, see https://go.microsoft.com/fwlink/?linkid=2097913.

Here is the stack trace

Microsoft.AspNetCore.Mvc.Infrastructure.ActionMethodExecutor+TaskOfIActionResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, object controller, object[] arguments)
System.Threading.Tasks.ValueTask<TResult>.get_Result()
System.Runtime.CompilerServices.ValueTaskAwaiter<TResult>.GetResult()
Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeActionMethodAsync>g__Awaited|12_0(ControllerActionInvoker invoker, ValueTask<IActionResult> actionResultValueTask)
Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeNextActionFilterAsync>g__Awaited|10_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, object state, bool isCompleted)
Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Rethrow(ActionExecutedContextSealed context)
Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.Next(ref State next, ref Scope scope, ref object state, ref bool isCompleted)
Microsoft.AspNetCore.Mvc.Infrastructure.ControllerActionInvoker.<InvokeInnerFilterAsync>g__Awaited|13_0(ControllerActionInvoker invoker, Task lastTask, State next, Scope scope, object state, bool isCompleted)
Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeNextResourceFilter>g__Awaited|24_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, object state, bool isCompleted)
Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Rethrow(ResourceExecutedContextSealed context)
Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.Next(ref State next, ref Scope scope, ref object state, ref bool isCompleted)
Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeFilterPipelineAsync>g__Awaited|19_0(ResourceInvoker invoker, Task lastTask, State next, Scope scope, object state, bool isCompleted)
Microsoft.AspNetCore.Mvc.Infrastructure.ResourceInvoker.<InvokeAsync>g__Logged|17_1(ResourceInvoker invoker)
Microsoft.AspNetCore.Routing.EndpointMiddleware.<Invoke>g__AwaitRequestTask|6_0(Endpoint endpoint, Task requestTask, ILogger logger)
Microsoft.AspNetCore.Authorization.AuthorizationMiddleware.Invoke(HttpContext context)
Microsoft.AspNetCore.Authentication.AuthenticationMiddleware.Invoke(HttpContext context)
Microsoft.AspNetCore.Diagnostics.StatusCodePagesMiddleware.Invoke(HttpContext context)
Microsoft.AspNetCore.Diagnostics.DeveloperExceptionPageMiddleware.Invoke(HttpContext context)

Further technical details

MySQL version: 8.0.16
Operating system: Windows 7 64bit
Pomelo.EntityFrameworkCore.MySql version: 3.1.1
Microsoft.AspNetCore.App version: .Net Core 3.1.1

closed-question type-question

Source

magic-john

All 11 comments

The problem with you code is exactly what the exception tells you:

InvalidOperationException: A second operation started on this context before a previous operation completed. This is usually caused by different threads using the same instance of DbContext. For more information on how to avoid threading issues with DbContext, see https://go.microsoft.com/fwlink/?linkid=2097913.

When you take a look at the link that the exception provides, you find further information about the underlying cause:

Entity Framework Core does not support multiple parallel operations being run on the same DbContext instance. This includes both parallel execution of async queries and any explicit concurrent use from multiple threads. Therefore, always await async calls immediately, or use separate DbContext instances for operations that execute in parallel.

EF Core has never supported multi-threaded access and there have been checks added in recent versions, to ensure that users don't accidentally violate this rule.

It might have worked before with the Oracle provider under EF Core 2.2, because there were fewer checks and Oracles underlying ADO.NET provider isn't really asynchronous (it implements the appropriate asynchronous method contracts, but all async methods internally just call their synchronous counterparts).

Pomelo uses MySqlConnector, which is a high performance and truly asynchronous database driver.

The easiest way to fix your code is to ensure, that you access DbContext only from one thread at a given time and in sequence, which can still be asynchronous as in the following example:

c# return View(new SomeViewModel { Users = await Context.Users.Where(x => x.Authenticated).ToListAsync(), Schools = await Context.Schools.Where(x => x.Visible).ToListAsync(), });

lauxjpn on 6 Feb 2020

@lauxjpn hmmm. Changing the code to the suggested code will block the main thread. It will defeat the purpose of calling async. The idea of async calls is to do more work in parallel. In this case he is trying to retrieve multiple datasets asynchronously/or in parallel.

I am thinking that ADO.NET has connection pool which allowed EntityFrameworkCore to make multiple calls to the database using multiple connections from the pool which would explain why it worked previously. Whereas the MySqlConnector does not have a connection pool rather a single connection per context which also explain why it is failing now.

CrestApps on 6 Feb 2020

@CrestApps

Changing the code to the suggested code will block the main thread. It will defeat the purpose of calling async.

No it doesn't. It will run asynchronously (which is different than running in parallel). Basically, it means that instead of blocking/waiting for a pending operation to complete (like a database call over the network or accessing a file), it will return the execution context back to the calling thread (for example the main thread in UI applications or one of the thread pool threads responsible for request processing in an ASP.NET Core application). When the operation later completes, it will restore the context and continue execution from there.
It is exactly the case async/await has been introduced for.

For UI applications this means, that the UI will not block, but stay responsive, because the message pump is continued to be processed. For ASP.NET Core applications it means that the current thread can handle another request in the meantime.

The idea of async calls is to do more work in parallel.

Async/await is for running code asynchronously, not in parallel. It is possible to use it to run code not just asynchronously, but also in parallel, but this is just a case for which asynchronous programming can be leveraged for. It is not the main case. What the application can do in the meantime, depends on its type and the framework.
This is a common misconception.

ASP.NET Core's architecture is optimized for asynchronous operations, not for parallel operations.

EF Core does encourage you to use asynchronous operations as much as you want as well, but it does not allow parallel operations, because it is not thread-safe.
You can think of it in a similar way, as UI applications only allow UI altering operations from the main thread (which is just an implicit way to ensure, that no multi-threaded access can corrupt the internal state).
EF Core's DbContext is more flexible here, as it does not care about the thread it is called from. But it still only allows one call at a time.

Whereas the MySqlConnector does not have a connection pool rather a single connection per context which also explain why it is failing now.

Both, the Oracle database driver and also MySqlConnector do support connection pooling. MySqlConnector has it enabled by default with a max. of 100 connections as the default.

DbContext will only use one connection. This is true for previous versions of EF Core like 2.2 and also current versions like 3.0+. This makes sense, because EF Core allows only one database query at the same time anyway (see previous post), so it does not need more than one connection.

When using ASP.NET Core, one DbContext per request will be created by default. When the DbContext needs a database connection (usually when the first query gets executed), it will request one from the database driver, which will reuse one from the connection pool if a free connection is available, or create a new one otherwise.

I am thinking that ADO.NET has pool connection which allowed EntityFrameworkCore to make multiple calls to the database using multiple connections from the pool which would explain why it worked previously.

No, that is not correct. I explained the reason why it worked previously above. You can also do some performance tracing on 2.2 with and without running the queries in parallel. They should result in similar performance.

The reason for that is, that MySQL does not support MARS (multiple active result sets), so the database driver must either block until the first operation completes or it must throw an exception (this exception would have nothing to do with the one described above; EF Core threw that exception, because it detected that it was used it in a multi-threaded way, which is not allowed).

Because of that, the fixed code I posted should run with similar performance as the original code did on 2.2.

Parallel queries are dangerous to run on any EF Core version, including 2.2, because they can corrupt the DbContext or the internal EF Core state and produce unpredictable behavior and bugs, which are hard to track down due to their non-deterministic nature. So it's good that the EF Core team has tightened their violation detection to prevent theses scenarios as best as possible.

lauxjpn on 6 Feb 2020

👍1

Thank you for explaining the behavior.

Question Would it be better to create a new AppDbContext instance for every call so I am not waiting for one query to finish before I start the second?

For example, instead of registering my AppDbContext with the scope lifetime, I would register it using intransient lifetime and I would change my code to this

Using the await as recommended the code will take the following steps

At 1:00:00.000 PM the first query is sent to the server
At 1:00:10.000 PM the results from the first query are back
At 1:00:10.005 PM the second query is sent to the server
At 1:00:15.000 PM the results from the second query are back

This took 15 seconds to execute both queries using a single context/connection as a pipeline.

But if we create a new connection for every call "as the code below explains" the code will take the following steps

At 1:00:00.000 PM the first query is sent to the server
At 1:00:00.005 PM the second query is sent to the server
At 1:00:05.000 PM the results from the second query are back
At 1:00:10.000 PM the results from the first query are back

Both queries are sent to the server roughly at the same time, the results from the second query came first and 5 seconds later the results from the first query came back. This took 10 seconds to execute both queries using 2 separate connections.

public TestController : Controller
{
    private readonly IServiceProvider Sp;

    public TestController(IServiceProvider sp)
    {
        Sp = sp;
    }

    public async Task<IActionResult> Test()
    {
        var task1 = GetUsers();

        var task2 = GetSchools();

        Task.WhenAll(task1, task2);

        return View(new SomeViewModel
        { 
            Users = task1.Result,
            Schools = task2.Result,
        });
    }

    private async Task<List<User>> GetUsers()
    {
        using var context = Sp.GetService<AppDbContext>();

        return await context.Users.Where(x => x.Authenticated).ToListAsync();
    }

    private async Task<List<School>> GetSchools()
    {
        using var context = Sp.GetService<AppDbContext>();

        return await Context.Schools.Where(x => x.Visible).ToListAsync();
    }
}

Obviously create a new dbcontext per query will consume a lot more connection from the pool so the connection pool will need to be increased to accommodate a higher traffic website.

magic-john on 6 Feb 2020

@magic-john Yes, using different DbContext instances for individual queries will work in the way you describe. Be aware however, that because you retrieve the entities from different DbContext instances, they do not navigate to each other, unless you manually detach them from one context and attach them to the other.

Generally, you would implement such an approach only when really necessary, because moving entities between contexts is cumbersome. Examples for this would be, if either the network performance is slow (i.e. the database server is on the other side of the world so any round trip is expensive), or the query itself performs poorly and cannot be optimized on a LINQ or database level (indices).

In case of network performance, it might be easier to just use a database server that is physically nearby (or use a master-slave replication for read operations with a read-only server nearby). If that isn't possible, than the only other way to optimize would be to reduce round trips and execute all-in-one queries, though they need to be carefully crafted, to not slow down the database server. So bad network performance can be a valid scenario, where the approach you propose makes sense.

In case the queries perform slow, it's most of the time much easier and better in the long run to just optimize the queries on a LINQ and database level.

If your ASP.NET Core applications returns responses in a reasonable/acceptable amount of time, than I wouldn't bother optimizing the database access times at all, because while the async operations happen, ASP.NET Core will scale by handling other requests with the current thread (so CPU time does not go to waste). And if there are not constant/enough requests coming in, then there is no need for optimizations anyway.

So i think that optimizing in the way you propose only make sense, if the response time of the ASP.NET Core application is unacceptably slow and other optimization strategies have already been exhausted.

As a real-life example, I had an EF Core 2.2 WebApi application, that performed bad in a way that every request took over a second to be handled. While the web servers were hosted on site, the database servers used RDS from AWS as a high availability solution and database round trips took about 15 ms each. The database model of the app had a lot of relational entities and because EF Core 2.2 does not query them using joins but sends individual SELECT statements, just because so many different entities needed to be queried, over 400 ms were just network latency.
Upgrading to EF Core 3.0 cut that time down to 150 ms due to using JOIN statements. The remaining 600 ms could have been cut down to almost nothing by introducing a local cache. I did not do that however, but instead I just optimized three individual queries (one was missing an index and the other two just needed some LINQ fine-tuning).

lauxjpn on 6 Feb 2020

If your ASP.NET Core applications returns responses in a reasonable/acceptable amount of time, than I wouldn't bother optimizing the database access times at all, because while the async operations happen, ASP.NET Core will scale by handling other requests with the current thread (so CPU time does not go to waste). And if there are not constant/enough requests coming in, then there is no need for optimizations anyway.

I am not sure that I understand what you're are suggesting here.

Assume I have the following three method calls.

public async Task<IActionResult> Test()
{
    var users = await GetUsers();
    CallApi();
    var schools = await GetSchools();

    return View(new SomeViewModel
    { 
        Users = users,
        Schools = schools,
    });
}

CallApi() will not be invoked until GetUsers() completely finish. How would ASP.NET Core scale by handling other requests? Are you saying that the main thread will be available to other HTTP request "from other users"? So if I am not using async calls at all, will each HTTP request block one thread?

A side note, the whole reason I used Task.WhenAll() instead of individual await was an attempt to reduce the TTFB. The app runs fairly fast, but I am trying to fine-tune anything where possible. It would be nice if EF supports async call over the same connection. In theory, that would boost the app performance.

magic-john on 6 Feb 2020

Are you saying that the main thread will be available to other HTTP request "from other users"?

Yes, that is what happens when you await an async call. The thread that issued the call is free to do something else until the call returns. So ASP.NET Core will use the thread to handle another request (usually from another user).

So if Bob sends a request against your Test() method and the method awaits GetUsers(), then ASP.NET Core can use this thread to handle a completely different request from Alice against your Test() method (or any other method) and so on.

So if I am not using async calls at all, will each HTTP request block one thread?

Exactly. Every request is handled by a dedicated thread. If that request synchronously calls a method that blocks/waits for a network operation to complete, the thread blocks as well until the operation completes and finally the response is returned to the calling user. Then the thread will be returned to the thread pool and reused.

So by using asynchronous programming, you can conserve server resources (threads are expensive, as are thread context switches). You will need fewer threads for the same amount of requests. This will only make a difference if enough requests are hitting the server. Your server will scale better an a way, that it will be able to handle more requests. However, the individual requests will not be handled significantly faster.

A side note, the whole reason I used Task.WhenAll() instead of individual await was an attempt to reduce the TTFB.

That is basically the argument I am making. If your TTFB is unacceptably slow, then you need to optimize, though using multiple DbContext instances would not be my first, but a later optimization strategy, as it is likely to make the code much more complex than it needs to be. Optimizing the database itself and the LINQ queries should usually be done first, because they add little complexity and offer high potential for performance improvements.

The app runs fairly fast, but I am trying to fine-tune anything where possible.

If there is no need to optimize, then I wouldn't do it. Most optimizations add complexity. Having a fast and simple app is better than having an ultra fast but complex app.
Just define the limits the app should run within and make sure that these get measured. When you start hitting these limits, it's time to optimize. Otherwise, I would spend my time where it is more valuable.

lauxjpn on 6 Feb 2020

@lauxjpn Thank you for your amazing explanation and help with this request. I'll take your advice and change my code to await each query with one DB context and go from there.

magic-john on 6 Feb 2020

@CrestApps BTW, there are two great videos on the whole Async/Wait, Parallel Tasks, Multi-Threading confusion:

The more recent The promise of an async future awaits by @BillWagner
and the very fundamental Performing Asynchronous I/O Bound Operations by @JeffreyRichter

The first one has the best analogy I ever came across about the differences between sync, async and parallel execution (within the first 10 minutes), while the second one explains to someone who is unfamiliar with Win32 and Windows kernel programming, how a complete async operation through all the different OS layers can actually look like, and why there is no need for another thread to be involved, in a way and tempo that really anybody can follow.

On a side note, careful about the many videos featuring @brminnick. While he has great practical advice about the async/await subject (about the do's and don't and how to handle common cases in the field), the background or context information and explanations he provides are often imprecise or just wrong (my skin craws every time when he mentions a thread being spun-up inside of an async operation).

lauxjpn on 18 Feb 2020

👍1

Thanks for the feedback @lauxjpn.

Async/Await is a complex subject. When teaching complex topics and condensing them into practical actions, it’s best to teach the information in layers.

Let’s use Physics as an example:

When learning physics, we are first taught Newtonian Physics. This teaches us about force and velocity, giving us practical equations that we can use to understand how the world works.

Next, we are taught Einsteinian Physics, where we learn about relativity. It builds onto Newtonian Physics, teaching us that everything is relative to the observer.

(Continuing on, we are taught Particle Physics, String Theory, Spacetime etc.)

Each layer we learn expands on the previous. Does learning about Relativity mean that v=Δx/Δt is no longer valid? No, but it expands on it, demonstrating that velocity is relative to the observer.

For async/await, the concepts I teach are at the Newtonian-layer. It is practical advice that will help you learn best practices for async/await in .NET; similar to how you don't need to know Particle Physics to calculate the speed/acceleration of a car.

You are correct that not every async method generates a thread, and to understand "why" requires going a layer deeper which is difficult to condense into a 20-minute video or a 45-minute conference session. Do share your blog posts and videos on the topic, because I'd love to promote them in my sessions as a way to go deeper and learn more about async/await!

brminnick on 18 Feb 2020

I fully agree @brminnick . Subjects like async/await are probably best told in multiple sessions. The ones I am referencing here are probably more on the "Applied Physics for Engineers" level, like the Correcting Common Mistakes When Using Async/Await in .NET conference talk, that is directed at developers who already had some/first experiences with async/await (and probably encountered some issues).

I truly believe, that the practical advice in the talk is a great collection of the common issues people will run into when actually using async/await in real-word applications. So kudos for making this into a talk, because finding all the pieces by yourself as a developer is cumbersome, and will otherwise probably only happen after encountering those bugs yourself and after long debugging sessions.

Just be careful when talking about threads or multi-threading in this context, because they have only very little to do with the actual async/await concept.

The only part, where a thread needs to be mentioned in my opinion is, when talking about the current thread that executes the operation and when talking about ConfigureAwait. Otherwise, in a pure asynchronous talk, I don't think you need to mention threads at all, because who really knows if the async operation (meaning some 3rd library or Windows code down the stack) did just create a thread, is asynchronously waiting for a process or did just send an IRP to some driver that is now doing work on its own circuit.

Actually, making this clear in the talk might be a good idea, so that the audience is on the same page and understands, that there are a lot of different scenarios that can result in asynchronous operations.

Multi-Threading should probably not mentioned at all. Async/await can be used as a way to access multi-threaded behavior (because Task.Run can be called async, and it might be simpler this way), but this should be a separate talk about Multi-Threading then (like two physics concepts can be connected to each other, but are part of different main areas of physics).

If you take this into consideration in the future, I think you have a great talk and I am looking forward to upcoming iterations with new practical stuff.

Also, while we are chatting about the talk here, If you want to expand on one of the background areas, the whole "why can calling SomeMethodAsync().Wait()" (and its derivations) under specific circumstances and with specific project types in specific C# versions lead to a deadlock, might help a lot of people to understand why and how to avoid this issue, and where you will actually be able to get away with it. (Just be sure you know the bits of it, if you do this.)

lauxjpn on 18 Feb 2020

Was this page helpful?

0 / 5 - 0 ratings