Divide the whole case into the following parts:
Good idea. I'll take care of this in a couple weeks when I get back from my honeymoon :)
@caleblloyd I tried writing 10,000 records. Our lib (1.1.0) takes 18s, but Oracle's takes 9s. :sob:
Pomelo 1.0.0 takes 2s. It seems 1.0.0 is much faster.
I'm surprised you got Oracle's to work at all! They are not async, they will always consume more resources until they go async
I don't know why....
Write 10,000 records:
| Pomelo 1.1.0 | Pomelo 1.0.0 | Oracle |
|-|-|-|
|18s|2s|9s|
```c#
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

namespace ConsoleApp2
{
    public class Program
    {
        public static void Main(string[] args)
        {
            Task.Run(async () =>
            {
                var DB = new Models.PertContext();
                //await DB.Database.EnsureCreatedAsync();
                Console.WriteLine("Initing DbContext...");
                for (var i = 0; i < 10000; i++)
                {
                    DB.Tests.Add(new Models.Test { Title = "Pert test #" + i });
                }
                Console.WriteLine("Writing...");
                var time = DateTime.Now;
                await DB.SaveChangesAsync();
                Console.WriteLine("Finished in " + (DateTime.Now - time));
            });
            Console.Read();
        }
    }
}
```

```c#
using Microsoft.EntityFrameworkCore;
//using MySQL.Data.EntityFrameworkCore.Extensions;

namespace ConsoleApp2.Models
{
    public class PertContext : DbContext
    {
        public DbSet<Test> Tests { get; set; }

        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            base.OnConfiguring(optionsBuilder);
            optionsBuilder.UseMySql("server=localhost;uid=root;pwd=123456;database=perftest");
        }
    }
}
```
Just ran that code and I'm also seeing 18s with 1.1.0 using async. It seems that large numbers of async writes are slow. If you change Pomelo 1.1.0 to sync, it is much faster:
Pomelo 1.1.0 Async:
Finished in 18.84 seconds
Pomelo 1.1.0 Sync:
Finished in 5.75 seconds
Since Oracle always runs in Sync, they are benefiting from this workload being faster in Sync. I agree that Async should be at least as fast as sync here though. I'll see what I can find, this is likely related to the ADO.NET driver though.
In MySqlConnector, 10,000 straight line Async inserts inside of a transaction take on average 5-6 seconds on my machine. Sync is slightly faster, but not by much; it takes on average 4 seconds.
Pomelo.EntityFrameworkCore.MySql takes around 18 seconds to do 10,000 async straight line inserts vs 6 seconds when using sync. When using a batch size of 100, this goes down to 6 seconds for async inserts and 2-3 seconds for sync.
It seems that running SaveChangesAsync with a high number of changes has a slowdown compared to SaveChanges. I think the performance slowdown is somewhere in our library because MySqlConnector holds steady.
It's always better to use a large batch size in high-volume operations triggered by SaveChanges or SaveChangesAsync, for example:
optionsBuilder.UseMySql("connection string", options => options.MaxBatchSize(100));
I'll need to continue to investigate cause for slowdown in our library.
@caleblloyd @bgrainger I still think we should optimize MySqlConnector, because Pomelo.Data.MySql 1.0.0 with EF Core was able to insert 10,000 records in 2s.
I haven't read the MySqlConnector source carefully, but I think we should make sure async is used only for I/O. Not overusing async will improve performance.
Besides, I've seen a lot of ConfigureAwait(false) calls in MySqlConnector; switching threads and execution contexts also wastes a lot of time. Reducing the number of thread switches would improve async performance too.
I hope the sync methods can be implemented independently. Making everything async compiles a lot of code into async state machines, and the state transitions add up.
Consider a scenario where I want to do something exclusively: I don't care about CPU utilization, but the sync operations must run without thread switches or state-machine transitions. In other words, when fetching small, short records I don't want to end up invoking the async method anyway (through a sync method that calls the async one).
Wouldn't implementing the sync methods independently be better?
If there is no plan to implement the sync methods independently, then maybe the way the sync methods wait is what causes the low performance. I don't know whether .GetAwaiter().GetResult() suspends the thread or not (I saw that it invokes InternalWait); perhaps SpinWait or SpinLock could solve the thread-suspension issue.
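To make the `.GetAwaiter().GetResult()` question concrete, here is a minimal sketch contrasting a sync method that wraps an async one with an independently implemented sync method. It is not MySqlConnector code: `SyncOverAsyncDemo`, `ReadAsync`, `ReadSyncOverAsync` and `ReadSync` are hypothetical names, and `Task.Delay` / `Thread.Sleep` only stand in for a network read. As far as I know, blocking on a task that has not completed yet spins briefly and then parks the calling thread until the task finishes, while a task that is already complete returns its result immediately.

```c#
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SyncOverAsyncDemo
{
    // Hypothetical async I/O call; Task.Delay stands in for a network read.
    public static async Task<int> ReadAsync(int delayMs)
    {
        await Task.Delay(delayMs).ConfigureAwait(false);
        return 42;
    }

    // Sync wrapper over the async method: the calling thread blocks
    // until the task is completed by another thread.
    public static int ReadSyncOverAsync(int delayMs) =>
        ReadAsync(delayMs).GetAwaiter().GetResult();

    // Independent sync implementation: no state machine and no
    // continuation scheduled on another thread.
    public static int ReadSync(int delayMs)
    {
        Thread.Sleep(delayMs);
        return 42;
    }
}
```

Timing `ReadSyncOverAsync` against `ReadSync` in a tight loop should show what the wrapper costs on a given machine; the sketch makes no claim about how large that difference is.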
Here are more scientific results: 3 runs of 10,000 straight line inserts on each. These times are on a Core i7 Ultrabook.
MySqlConnector Sync
Finished in 00:00:03.9685150
Finished in 00:00:03.9135100
Finished in 00:00:03.5935250
MySqlConnector Async
Finished in 00:00:05.4760860
Finished in 00:00:05.4317970
Finished in 00:00:04.9882900
MySql.Data 7.0.6-IR31 Sync
Finished in 00:00:03.8172580
Finished in 00:00:03.9439550
Finished in 00:00:03.7664650
MySql.Data 7.0.6-IR31 Async
Finished in 00:00:03.9189610
Finished in 00:00:03.8368190
Finished in 00:00:03.7065780
Pomelo.Data.MySql 1.0.0 Sync
Finished in 00:00:02.9926240
Finished in 00:00:03.0759430
Finished in 00:00:03.0032930
Pomelo.Data.MySql 1.0.0 Async
Finished in 00:00:03.4852800
Finished in 00:00:03.2989580
Finished in 00:00:03.2217890
MySql.Data and Pomelo.Data.MySql don't implement real async; they just map to sync calls, which is why their sync and async numbers match up. Real async is ~20% slower, but supports far higher concurrency. MySqlConnector's results are in line with that expectation.
I think that Pomelo.Data.MySql is slightly faster because it is speaking latin1:

MySql.Data and MySqlConnector speak utf8 by default, which I believe has a little more overhead.
So, I want to see the CPU utilization ratio when invoking async methods.
It won't be much different between sync and async in a straight line test. None of async's concurrency benefits are triggered here.
The real benefit is fewer threads in high concurrency situations. I blogged about it a while back, using Thread.Sleep (sync) vs Task.Delay (async).
I've written code to avoid the async state machine in the past (e.g., https://github.com/mysql-net/MySqlConnector/commit/0449b822fa9b6ab7c3da30be18fa9e4ef2b15492) but it's been a while since I've done performance profiling (and it may have regressed since then). It makes me think that some automated performance regression tests (e.g., like arewefastyet.com) would be worthwhile.
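A rough sketch of that Thread.Sleep vs Task.Delay comparison follows. This is not the code from the blog post; `ConcurrencyDemo`, `RunBlocking` and `RunAsync` are made-up names for illustration. The blocking version ties up one thread-pool thread per waiting operation, while the async version waits on timers and holds no thread during the delay.

```c#
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrencyDemo
{
    // Blocking version: every operation occupies a thread-pool thread
    // for the whole duration of its wait.
    public static TimeSpan RunBlocking(int count, int delayMs)
    {
        var sw = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, count)
            .Select(_ => Task.Run(() => Thread.Sleep(delayMs)))
            .ToArray();
        Task.WaitAll(tasks);
        return sw.Elapsed;
    }

    // Async version: the waits are timer-based, so no thread is held
    // while the delays are pending.
    public static async Task<TimeSpan> RunAsync(int count, int delayMs)
    {
        var sw = Stopwatch.StartNew();
        await Task.WhenAll(Enumerable.Range(0, count).Select(_ => Task.Delay(delayMs)));
        return sw.Elapsed;
    }
}
```

With a few hundred operations, `RunAsync` should finish in roughly one delay interval, while `RunBlocking` depends on how quickly the thread pool grows. With a single straight line of operations, as in the insert test above, the two behave about the same.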
I'm still looking for the magic bullet that would get us over the 1.5K RPS ceiling in performance tests. golang can do 3K RPS easily and I'm at a loss for why .NET Core can't.
I'm starting to think it may just be a limitation of the .NET async event loop. Microsoft.AspNetCore.Server.Kestrel can serve way above 1.5K RPS, but they import libuv which makes me think they are running their own async event loop.
Just posted some experimental LibUV performance results over at https://github.com/mysql-net/MySqlConnector/issues/163
I found some issues with the async I/O in MySqlConnector; details at mysql-net/MySqlConnector#164.
I haven't read the full thread, nor have I reviewed your socket usage/implementation, but I have a recommendation.
For high performance networking you shouldn't use any of the new awaitable async socket APIs, since in .NET they currently add extra overhead on top of the former sync calls. Async is still the way to go, but you must create a custom layer on top of the SocketAsyncEventArgs class for high performance networking.
Please read this for more details https://blogs.msdn.microsoft.com/pfxteam/2011/12/15/awaiting-socket-operations/#comments
I have implemented something similar with good results.
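For readers who have not seen the pattern, here is a minimal sketch of an awaitable wrapper around SocketAsyncEventArgs, modeled on the approach described in the linked pfxteam post. It is not the commenter's implementation and not MySqlConnector's SocketAwaitable.cs; the extension method name `ReceiveAwaitable` is an assumption chosen for this example, and the sketch supports only one outstanding operation per awaitable.

```c#
using System;
using System.Net.Sockets;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Reusable awaitable: one instance (and one SocketAsyncEventArgs) can be
// awaited many times, so no Task is allocated per receive.
public sealed class SocketAwaitable : INotifyCompletion
{
    private static readonly Action Sentinel = () => { };

    internal bool WasCompleted;
    internal Action Continuation;
    internal readonly SocketAsyncEventArgs EventArgs;

    public SocketAwaitable(SocketAsyncEventArgs eventArgs)
    {
        EventArgs = eventArgs ?? throw new ArgumentNullException(nameof(eventArgs));
        eventArgs.Completed += delegate
        {
            // Run the continuation if one is registered; otherwise leave a
            // marker so OnCompleted knows the operation already finished.
            var prev = Continuation
                ?? Interlocked.CompareExchange(ref Continuation, Sentinel, null);
            prev?.Invoke();
        };
    }

    internal void Reset()
    {
        WasCompleted = false;
        Continuation = null;
    }

    public SocketAwaitable GetAwaiter() => this;

    public bool IsCompleted => WasCompleted;

    public void OnCompleted(Action continuation)
    {
        // If the I/O completed before the continuation was registered,
        // schedule the continuation ourselves.
        if (Continuation == Sentinel ||
            Interlocked.CompareExchange(ref Continuation, continuation, null) == Sentinel)
        {
            Task.Run(continuation);
        }
    }

    public void GetResult()
    {
        if (EventArgs.SocketError != SocketError.Success)
            throw new SocketException((int)EventArgs.SocketError);
    }
}

public static class SocketAwaitableExtensions
{
    // Hypothetical name: starts a receive and returns the awaitable.
    public static SocketAwaitable ReceiveAwaitable(this Socket socket, SocketAwaitable awaitable)
    {
        awaitable.Reset();
        // A false return value means the operation completed synchronously
        // and the Completed event will not be raised.
        if (!socket.ReceiveAsync(awaitable.EventArgs))
            awaitable.WasCompleted = true;
        return awaitable;
    }
}
```

A caller would create one SocketAsyncEventArgs, assign a buffer with `SetBuffer`, wrap it in a `SocketAwaitable`, and then `await socket.ReceiveAwaitable(awaitable)` in a loop, reading `BytesTransferred` from the event args after each await.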
@YandyZaldivar See SocketAwaitable.cs in MySqlConnector (the underlying MySQL library).
Seems you are on the right path, +1. If combined with a good buffering implementation to minimize read calls and with SocketAsyncEventArgs reuse, everything should work fine.
MySqlConnector 0.19.0 includes a lot of performance fixes across all areas of the library: mysql-net/MySqlConnector#245
@caleblloyd Could you re-run the pomelo performance test, I want to compare our lib to others.
I can do it in a few days. I'll try to get it working against Oracle and Sapient.
I built a new concurrency benchmark that I believe does a better job at exhibiting the benefits of this library. I could not get it to work with MySql.Data.EntityFrameworkCore because there were exceptions when trying to run the code. I did get it working with SapientGuardian.EntityFrameworkCore.MySql however, which uses a modified version of MySql.Data underneath and does not implement truly asynchronous methods.
I've posted the results on a wiki page here: https://github.com/PomeloFoundation/Pomelo.EntityFrameworkCore.MySql/wiki/Concurrency-Benchmark-Results
In ASP.NET Core, EF DbContext objects use a scoped approach, meaning a new DbContext object is created and disposed per request. Would it be practical with Pomelo to use a singleton approach, associating every DbContext object with a connection? In this case, with 1000 clients we will have 1000 open db connections, but because requests are scheduled to use an appropriate number of concurrent threads, the 1000 db connections won't hurt. Is this thinking correct, or is there a better reason to use a scoped DbContext?
@YandyZaldivar scoped DbContext is fine. The underlying DbConnection is only opened when a query is being executed and then it is closed when the query is done executing. 1000 DbContexts does not mean 1000 open connections since a connection is only used for the short life of the query. Scoped ensures the proper tracking when loading and saving entities. Scoped also ensures that the DbContext is always cleaned up when the user request is finished, and this is good.
@caleblloyd Thank you for the precise answer. I have another question. If the same user makes several requests, isn't it a good thing to keep tracking the entities to avoid extra queries to the database? As I see it right now, the scoped DbContext only tracks entities for a single request; new requests start with an empty cache (a new DbContext). Wouldn't it be better to reuse the same DbContext for more than one request from the same user? I know the current design of scoped DbContext is right, but I don't see the whole picture yet. I am implementing my own system with your EF provider (outside of ASP.NET) and I have proper dependency injection and IoC in place, but I am still figuring out the internal workings of DbContext and how to use it better.
@YandyZaldivar yes, you can shard DbContext to handle one set of entities and use it from request to request. You are talking about sharding on a User key. Keep in mind that the user and all of the user data you load in the sharded DbContext should not be written by another DbContext, otherwise you will have stale data. You must also implement Read/Write locking on access to data in that DbContext by using ReaderWriterLockSlim or some other means of Read/Write locking (ReaderWriterLockSlim is not async-friendly, so an async alternative would be better).
Keep in mind that if you have more than one App Server, you must always route a user's requests to the same App Server (User Affinity). Otherwise you will end up with multiple App Servers holding multiple out-of-sync versions of that user in their DbContext.
This is somewhat involved to get right, but I have done it in apps before and it is extremely fast and efficient. ASP.NET uses a per-request scoped DbContext by default because it is simpler to implement: no worries about locking, stale data, etc.
@caleblloyd thanks again, great explanation. My server processes requests from different users simultaneously, but requests from a single user are queued and processed sequentially. Because of that, I am assuming thread-safe usage of the DbContext with no need for locks (I ensure the DbContext is used only by the same user session and by a single thread at a time).
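As an illustration of the async-friendly locking mentioned above, here is a minimal sketch that keeps one long-lived resource per user key and serializes access to it with a SemaphoreSlim. It uses an exclusive lock rather than a true reader/writer lock, which is a simplification. `PerUserGate` and `UseAsync` are hypothetical names, and the resource is generic so the sketch compiles without EF Core; in the scenario discussed here it would be a DbContext.

```c#
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: one resource per user key, guarded by an
// async-friendly exclusive lock.
public sealed class PerUserGate<TResource> where TResource : class
{
    private sealed class Entry
    {
        public readonly SemaphoreSlim Lock = new SemaphoreSlim(1, 1);
        public readonly TResource Resource;
        public Entry(TResource resource) { Resource = resource; }
    }

    private readonly ConcurrentDictionary<string, Lazy<Entry>> _entries =
        new ConcurrentDictionary<string, Lazy<Entry>>();
    private readonly Func<string, TResource> _factory;

    public PerUserGate(Func<string, TResource> factory)
    {
        _factory = factory ?? throw new ArgumentNullException(nameof(factory));
    }

    // Runs 'work' against the user's resource while holding that user's lock.
    public async Task<TResult> UseAsync<TResult>(
        string userKey, Func<TResource, Task<TResult>> work)
    {
        // Lazy ensures the factory runs at most once per key, even when
        // several requests for the same user arrive at the same time.
        var entry = _entries.GetOrAdd(
            userKey, key => new Lazy<Entry>(() => new Entry(_factory(key)))).Value;

        await entry.Lock.WaitAsync().ConfigureAwait(false);
        try
        {
            return await work(entry.Resource).ConfigureAwait(false);
        }
        finally
        {
            entry.Lock.Release();
        }
    }
}
```

With the context from earlier in the thread, this could be constructed as `new PerUserGate<PertContext>(_ => new PertContext())`. The sketch leaves out eviction and disposal of idle contexts, and it does nothing about the stale-data and user-affinity concerns described above.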