Efcore.pg: Idle Connections remaining open after DbContext is disposed.

Created on 2 May 2018  路  6Comments  路  Source: npgsql/efcore.pg

I've run into an issue that's hard to track down. We have an app that needs to connect to 70 different databases on the same server to perform migrations, insert data, and generally bootstrap our dbs. We are running into the postgres exception FATAL: sorry, too many clients already. And when we do, I see that there are 100 idle connections open on different databases. Because these are different databases I don't think Pooling will help, however we are not turning pooling off.

I switched our code to wrap all DbContexts in using statements to ensure they would be disposed, this had no effect.

I created a little test app, and debugged it to see if the idle connection was killed after the context was disposed, it was not. I checked this by running select * from pg_stat_activity on my database.
This seems like a bug to me

Microsoft.NETCore.App: 2.1.0-preview2
Microsoft.EntityFrameworkCore: 2.1.0-preview2-final
Npgsql.EntityFrameworkCore.PostgreSQL: 2.1.0-preview2

Here is my experiment to try to isolate this problem:

public class Program
{
    public static int Main(string[] args)
    {
        var optionsBuilder = new DbContextOptionsBuilder<PrivateDataContext>();
        optionsBuilder.UseNpgsql("Host=127.0.0.1;Username=foo;Port=5432;Password=bar;Database=baz", cfg =>
        {
            cfg.EnableRetryOnFailure(15);
        });

        using (var context = new DbContext(optionsBuilder.Options))
        {
            context.Database.Migrate();
            // DB Connection open.
        }

        // DB Connection left open but Idle.
        return 0;
    }
}

All 6 comments

This is the expected behavior.

Internally, DbContext instances create and open instances of NpgsqlConnection. Unless specifically disabled, NpgsqlConnection pools internally - closing an NpgsqlConnection doesn't actually close the physical connection, but rather keeps it internally in a pool in case a new NpgsqlConnection is opened to the same database. Since the default Max Pool Size is 100, if at any point you have 100 active DbContext instances, it makes sense for 100 physical connections to exist to that server, and they will stick around after the DbContexts are disposed.

You have several options for dealing with this:

  1. Turn off pooling entirely by adding Pooling=false in your connection string. This will ensure that whenever an NpgsqlConnection is closed or disposed, the physical connection to PostgreSQL will be closed as well. This means that every DbContext will start up (and destroy) its own physical connection, which is going to be very slow if you have lots of DbContext instances. This option usually makes sense only if you use a specific DbContext instance to do a lot of work, and so the time to establish and destroy a physical connection is marginal.
  2. Reduce the Connection Idle Lifetime parameter, which defaults to 300 seconds. This parameter controls how long an idle physical connection sticks around in the pool before it's destroyed. Reducing this will allow you to keep enjoying the benefits of pooling, while at the same time making sure unneeded connections are closed relatively quickly.

Am closing this since nothing seems to be wrong, but if you have any more questions and need advice don't hesitate to post back here.

@roji Thanks so much!

After I turned off pooling for the Bootstraper app it fixed my issue. This way the many dbContexts I create in that app don't leave dangling connections.

I left pooling on for our aspnet service though, I'll probably need to toggle the connection lifetime and max pool a bit to see what's the sweet spot.

Given that we have many single tenant databases all on the same postgres server each with an average of 5-10 users, do you have any recommendations on what max pool size I should set?

Thanks!

I don't have a clear picture of your setup and needs, so it's hard to say... But it's important to understand that Max Pool Size is a hard limit of how many physical connections your application will have. If you set Max Pool Size to 5 and then try to open 6 connections, the 6th attempt will block until one of the other 5 releases (closes) the connection, this could impose a big delay in your application.

You also need to take into account how many instances of your ASP.NET program are going to be running, since all of them will be accessing your database(s). In some situations it makes sense to use a single, centralized pool (take a look at pgbouncer) in addition to (or instead of) Npgsql's built-in pool.

Make sure you understand exactly what Max Pool Size and Connection Idle Lifetime do - these should allow you to write a low-latency, performing application that doesn't overload your database.

If you have any other specific questions don't hesitate to ask.

Since Idle connections are not reused, does that mean Max Pool Size is database specific? Since we have 70 DBs on the same server, I would have thought idle connections would be re-purposed for other database connections, but that does not happen.

Thanks so much for your help on this.

Since Idle connections are not reused, does that mean Max Pool Size is database specific? Since we have 70 DBs on the same server, I would have thought idle connections would be re-purposed for other database connections, but that does not happen.

That's a good question. A pool is specific to a connection string - each connection string has its own pool, and physical connections never move around different pools. That means that if you're connecting to two difference connection strings, you have two pools, each with its own Max Pool Size, managed completely apart. Note that this holds for any change in the connection string, not just the database - even a change in Application Name will trigger a different pool being created and managed separately.

For some background this, it's useful to understand that in PostgreSQL, a physical connection is always to a specific database, and you cannot change databases for an existing connection (unlike with some other databases). PostgreSQL actually spins up a backend process for each physical connection, tied to a specific database.

I hope this provides more context. It means that from a client perspective you can treat your 70 databases as truly separate - it does not matter whether they're running on the server or not. On the server side, the PostgreSQL max_connections setting in your postgresql.conf defines maximum total physical connections, across all databases on the server (or more precisely, on what PostgreSQL calls the "cluster"). This probably means that you should decrease your Max Pool Size considerably (and probably also Connection Idle Lifetime) to prevent fragmentation, where one database pool has lots of idle connections hogging server resources, and another pool is starved.

@roji Excellent explanation. Thanks so much!

Was this page helpful?
0 / 5 - 0 ratings

Related issues

bikeladam picture bikeladam  路  3Comments

fernandolguevara picture fernandolguevara  路  3Comments

win32nipuh picture win32nipuh  路  4Comments

macon picture macon  路  4Comments

austindrenski picture austindrenski  路  5Comments