Sqlclient: System.Data.SqlClient.SqlConnection does not roll back open transaction on dispose

Created on 26 Sep 2016 · 18Comments · Source: dotnet/SqlClient

SQL Server version is 2014 if it matters. Trivial reproduction causes timeout unless ;pooling=false is appended to the connection string.

    using System;
    using System.Data;
    using System.Data.SqlClient;

    /*
    CREATE TABLE Scratch (ID INTEGER NOT NULL);
    INSERT INTO Scratch VALUES (1);
    */

    namespace ConsoleApplication
    {
        public class Program
        {
            public static void Main(string[] args)
            {
                if (args.Length == 0) {
                    Console.WriteLine("Usage: dotnet disposeconnection.dll connectionstring");
                    return ;
                }
                using (var c1 = new SqlConnection(args[0]))
                {
                    c1.Open();
                    using (var c2 = new SqlConnection(args[0]))
                    {
                        c2.Open();
                        Exec(c2, "BEGIN TRANSACTION;");
                        Exec(c2, "UPDATE Scratch SET ID = 2;");
                    }
                    Exec(c1, "UPDATE Scratch SET ID=3;");
                }
            }

            private static void Exec(SqlConnection c, string s)
            {
                using (var m = c.CreateCommand())
                {
                    m.CommandText = s;
                    m.ExecuteNonQuery();
                }
            }
        }
    }

We got here through a much larger chunk of code handling thrown exceptions. We have no idea what the connection state is and it might have an executing data reader on it so we have no good way of passing the rollback command ourselves.

Source

jhudsoncedaron

Most helpful comment

This issue happens because unmanaged transaction is not rolled back before the connection moves back the pool. That means this error only happens when user initiates unmanaged transaction by executing BEGIN TRANSACTION command manually in code, and the transaction is not finished properly before the connection moves back to the pool.

Recommended (managed) way of initiating transaction is using SqlTransaction object by beginning with SqlConnection.BeginTransaction() as following:

using (var sqlConnection = new SqlConnection(connString))
{
    sqlConnection.Open();

    using (var trans = sqlConnection.BeginTransaction())
    {
        using (var cmd = new SqlCommand("UPDATE Scratch SET ID=1;", sqlConnection, trans))
        {
            cmd.ExecuteNonQuery();
        }
    }
}

In this case, incomplete transaction is rolled back automatically when SqlTransaction object is disposed in out of using scope. Even if user does not specify using block for transaction object, SqlTransaction is disposed by GC eventually, and rollback for incomplete transaction happens automatically.

But if user initiates transaction by executing SQL command BEGIN TRANSACTION manually (unmanaged) :

using (var sqlConnection = new SqlConnection(connString))
{
    sqlConnection.Open();

    using (var cmd = new SqlCommand("BEGIN TRANSACTION;", sqlConnection, trans))
    {
        cmd.ExecuteNonQuery();
    }

    using (var cmd = new SqlCommand("UPDATE Scratch SET ID=1;", sqlConnection, trans))
    {
        cmd.ExecuteNonQuery();
    }
}

In this case, the transaction is never rolled back when the connection moves back to the pool because we do not have control for transaction in runtime since the transaction runs without SqlTransaction object we can access. We cannot even know if it is in the middle of transaction or not in runtime when the connection moves back to the pool.

One possible way to resolve this issue is putting following line in SqlConnection.CloseInnerConnection() method, which rolls back whatever remains as incomplete transaction in SqlConnection object before the connection moves back to pool :

this.BeginTransaction().Dispose();

This line resolves the issue because rollback occurs when Dispose() is called. However, we decided not to do this as following reasons:

This produces extra rollback request to server every time pooled connection is closed, which will lead significant performance decrease.
It is more consistent to keep it unmanaged, and let user finish the transaction properly before closing connection if the transaction is initiated by manual SQL command by user.

geleems on 29 Apr 2017

👍2

All 18 comments

I've been studying the problem, and even though I can intercept all instances of SqlConnection.Dispose() I'm not sure of the workaround. The connection object knows internally whether or not it can run commands but does not expose it, and even if it did, it's really hard to convince it to not go back into the connection pool.

jhudsoncedaron on 26 Sep 2016

@jhudsoncedaron What is the OS on which the Client code is running?

saurabh500 on 10 Nov 2016

Windows 10.

Incidentally this can be worked around if the ConnectionState property is made to work in all cases. (The documentation says states Executing and Fetching aren't implemented "This value is reserved for future versions of the product.".), in which case I can implement my own connection pool.

jhudsoncedaron on 10 Nov 2016

What's the status on this?

krwq on 17 Apr 2017

I was able to repro the timeout exception with this code in Windows 10.

using System.Data.SqlClient;

namespace ConnectionRepro
{
    class Program
    {
        public static void Main(string[] args)
        {
            string connString = "Server=tcp:<hostname>,1433;User ID=<id>;Password=<password>;Connect Timeout=600;";

            using (var c1 = new SqlConnection(connString))
            {
                c1.Open();
                using (var c2 = new SqlConnection(connString))
                {
                    c2.Open();
                    Exec(c2, "BEGIN TRANSACTION;");
                    Exec(c2, "UPDATE Scratch SET ID=2 WHERE COL=11;");
                    //Exec(c2, "COMMIT;");
                }
                Exec(c1, "UPDATE Scratch SET ID=4 WHERE COL=22;");
            }
        }
        private static void Exec(SqlConnection c, string s)
        {
            using (var m = c.CreateCommand())
            {
                m.CommandText = s;
                m.ExecuteNonQuery();
            }
        }
    }
}

I tested with following SQL queries:

drop table Scratch;
create table Scratch
(
    ID int,
    col int
);

insert into Scratch(ID, col) values (22, 11);
insert into Scratch(ID, col) values (33, 22);

Is there any particular reason to leave c2 transaction incomplete, and move back to c1?
Because it seems it was intentionally failed since the transaction of c2 was not completed by either commit or rollback before moving back to c1 in connection pool environment.
If I put Exec(c2, "COMMIT;"); after c2's update clause to finish the transaction properly, the program runs successfully.
If I take out Exec(c2, "BEGIN TRANSACTION;"); beforec2's update clause to avoid making transaction, the program runs successfully as well.
This behavior is exactly the same as what .NET framework does. I am not sure if this exception should be considered as a bug or not.

geleems on 27 Apr 2017

The why behind this code is hard to understand due to reduction. The actual problem is the second connection got sent back to the connection pool with an open transaction. In the real code, there was a thrown exception after the Exec("UPDATE ..."); line, and another thread picked up the connection with an open transaction.

The reason there's no rollback in a finally block is because if a data reader were open, trying to execute the rollback would itself throw.

jhudsoncedaron on 27 Apr 2017

👍1

Recommended (managed) way of initiating transaction is using SqlTransaction object by beginning with SqlConnection.BeginTransaction() as following:

using (var sqlConnection = new SqlConnection(connString))
{
    sqlConnection.Open();

    using (var trans = sqlConnection.BeginTransaction())
    {
        using (var cmd = new SqlCommand("UPDATE Scratch SET ID=1;", sqlConnection, trans))
        {
            cmd.ExecuteNonQuery();
        }
    }
}

But if user initiates transaction by executing SQL command BEGIN TRANSACTION manually (unmanaged) :

using (var sqlConnection = new SqlConnection(connString))
{
    sqlConnection.Open();

    using (var cmd = new SqlCommand("BEGIN TRANSACTION;", sqlConnection, trans))
    {
        cmd.ExecuteNonQuery();
    }

    using (var cmd = new SqlCommand("UPDATE Scratch SET ID=1;", sqlConnection, trans))
    {
        cmd.ExecuteNonQuery();
    }
}

this.BeginTransaction().Dispose();

This line resolves the issue because rollback occurs when Dispose() is called. However, we decided not to do this as following reasons:

This produces extra rollback request to server every time pooled connection is closed, which will lead significant performance decrease.
It is more consistent to keep it unmanaged, and let user finish the transaction properly before closing connection if the transaction is initiated by manual SQL command by user.

geleems on 29 Apr 2017

👍2

So how do you write a function that does not care if it is in a transaction or not to create a command?

This is the reason we don't use the managed transaction API.

jhudsoncedaron on 29 Apr 2017

@jhudsoncedaron Could you provide a little more information about? :

a function that does not care if it is in a transaction or not to create a command

In our understanding, if there is a function that creates a command without knowing if it is in a transaction or not, then the caller of the function should manage the transaction flow.
If the command is adding transaction as TSQL, then it should also make sure that there is clean up logic involved there as well.

geleems on 29 Apr 2017

We discovered that Connection.CreateCommand() returns a command that doesn't work if a transaction is open. It's easy enough to fix if you know you need to do this, but the fix code crashes if there is no transaction open. We have tons of helper functions that read stuff from the database that literally do not care if the caller opened a transaction or not.

The connection pool is kind of broken anyway due to inheriting the isolation level of the previous transaction (which would be immediately fixed on opening a new one, but many read-only paths don't) and I'd actually rather you fix ConnectionState to let me know if the connection can accept commands or not and I'll implement my own pool.

jhudsoncedaron on 29 Apr 2017

I have discovered that an actual fix is easy and doesn't depend on adding any more round trips.

Instead of "EXEC sp_resetconnection;" on return to pool do "BEGIN TRANSACTION;ROLLBACK;EXEC sp_resetconnection;". Because BEGIN TRANSACTION/COMMIT stack and ROLLBACK doesn't, any pending transaction is rolled back.

They tell me that sp_resetconnection is ran from Open() rather than Close() but from the number of bad connections I've gotten out of the pool in a row this is not the case. Also, the client is going to have to do set the transaction isolation level immediately on receiving the connection from the pool anyway. This is well known to the internet if you look for it but not documented in a way that is discoverable if you don't already know. (The connection object inherits its previous isolation level from the pool, which is likely random in any large application.)

jhudsoncedaron on 5 Jun 2017

As recently announced in the .NET Blog, focus on new SqlClient features an improvements is moving to the new Microsoft.Data.SqlClient package. For this reason, we are moving this issue to the new repo at https://github.com/dotnet/SqlClient. We will still use https://github.com/dotnet/corefx to track issues on other providers like System.Data.Odbc and System.Data.OleDB, and general ADO.NET and .NET data access issues.

divega on 16 May 2019

@jhudsoncedaron

So how do you write a function that does not care if it is in a transaction or not to create a command?

This is the reason we don't use the managed transaction API.

We discovered that Connection.CreateCommand() returns a command that doesn't work if a transaction is open. It's easy enough to fix if you know you need to do this, but the fix code crashes if there is no transaction open. We have tons of helper functions that read stuff from the database that literally do not care if the caller opened a transaction or not.

If I'm following, you're saying, in essence, that using the managed API for transactions, which would solve this problem, doesn't work because of how CreateCommand() works, right?

Would the proposal in https://github.com/dotnet/SqlClient/issues/28 resolve/bypass this issue? I think it would fix the CreateCommand() challenge you've encountered, allowing you to use managed transactions.

bgribaudo on 3 Jun 2019

@bgribaudo : Yeah that would; but I call via IDbConnection.CreateCommand()

jhudsoncedaron on 3 Jun 2019

Thanks, @jhudsoncedaron! Maybe the fact that there are two of us who have an interest in #28 will get a little more traction under it. :-)

bgribaudo on 3 Jun 2019

@bgribaudo : Don't hope too much. I've a pull request up for fixing a data loss bug that's probably going to be declined just because hitting the race condition is rare.

jhudsoncedaron on 3 Jun 2019

@geleems

@jhudsoncedaron Could you provide a little more information about? :

a function that does not care if it is in a transaction or not to create a command

In our understanding, if there is a function that creates a command without knowing if it is in a transaction or not, then the caller of the function should manage the transaction flow.

Agreed! The challenge is puling this off when CreateCommand() is used.

Let's say I have a caller which manages transactions (as you describe). At certain points, this caller invokes other, low-level functions, passing them a reference to the db connection. These functions may choose to execute commands on that connection, creating those commands using CreateCommand(). These commands are transaction agnoistic--they don't modify the transaction state (no begins, rollbacks or commits) and don't care whether they are in a transaction. So far, this sounds good, and maintains separation of responsivities (caller handles transactions, low-level methods handle details in a transaction-agnostic way).

The catch is that it doesn't work at the technical level with managed transactions. :-(

If the caller uses managed transactions, command created by those other functions using CreateCommand() will error because they haven't been associated with the open transaction. Either those low-level functions need to be passed a reference to the managed transaction (so that those functions, which are supposed to be transaction-agnostic can set the appropriate transaction property on each command they create) or the caller needs to use T-SQL to manage transactions. Neither option is preferable.

Something like the proposal in #28 would resolve this.

bgribaudo on 3 Jun 2019

As explained in https://github.com/dotnet/SqlClient/issues/84#issuecomment-298121435 by @geleems this is expected behavior if SqlClient is not managing transactions in pooled connection scenario.

Closing this issue as not a bug in driver.