I am having an issue when trying to open a connection to our SQL Server 2016 SP1 CU7 server from the latest .NET Core 3 preview 9 (not sure if it happens with previous previews). The code just hangs when calling Open (same result with OpenAsync) and never continues. I tried both the Microsoft.Data.SqlClient and System.Data.SqlClient NuGet packages.
The same code works fine when executed on my Windows laptop outside Docker. If I switch to .NET Core 2.2 it works fine inside Docker (changing the images as well).
I have created a small repo with a repro: https://github.com/adrian-lopez-softtek/NetCore3SqlOpenIssue/
I tried both images (the commented ones in the Dockerfile) with the same result.
Any help would be appreciated, as I'm creating a new project with version 3 and don't want to switch to 2.2 at this point, but if I'm not able to make it work I will probably have to.
Thanks!
I did one more test connecting to an Azure SQL database, and that one works fine. Our on-premise server version is 13.0.4466.4 vs. 12.0.2000.8 in the cloud, in case this helps.
I managed to get it working using the preview 9 Ubuntu images (3.0.100-preview9-disco & 3.0.0-preview9-disco). Not sure what happens with the Debian images, but they didn't work for me.
-disco resolved this for us with SQL Server 2014 and Linux containers - You are a lifesaver @adrian-lopez-softtek !
Just tested this with RC1 and the new Debian Docker images, and the issue is still there.
I am having the same issue: the program connects just fine from my laptop but hangs when running inside the container.
In a tcpdump I can see that there is a TDS7 pre-login request and response being sent, but the connection hangs before the credentials are checked. (Tested by sending bad credentials to the server with the same result as the correct credentials.)
I can confirm that the Ubuntu image does connect.
There are a few issues out there about hangs, but this one looks the freshest. We are seeing a deadlock between SNIMarsHandle and SNIMarsConnection. When we disable multiple active result sets (MARS) in the connection string, the issue is gone; a connection-string sketch follows the lock list below.
- SNIMarsHandle HandleAck -- lock(this)
- SNIMarsConnection HandleReceiveComplete -- lock(this)
- SNIMarsConnection SendAsync -- lock(this)
- SNIMarsHandle InternalSendAsync -- lock(this)
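For anyone wanting to try the same workaround, here is a minimal sketch of building a connection string with MARS turned off (server, database and credentials are placeholders, not values from this thread):

```csharp
using Microsoft.Data.SqlClient;

class MarsWorkaroundSample
{
    static void Main()
    {
        // Sketch only: disable MARS to avoid the SNIMarsHandle/SNIMarsConnection
        // deadlock described above. All connection details are placeholders.
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = "your-sql-server.example.com",
            InitialCatalog = "YourDatabase",
            UserID = "YourUser",
            Password = "YourPassword",
            MultipleActiveResultSets = false
        };

        using (var connection = new SqlConnection(builder.ConnectionString))
        {
            connection.Open(); // the hang no longer reproduced for us with MARS disabled
        }
    }
}
```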
Hi @aidanjryan
As we have more MARS-related issues reported, and users have reported regressions that occurred in System.Data.SqlClient, could you test this case with System.Data.SqlClient 4.6.1 and confirm whether that was a working version? We're trying to get to the root cause of the problems.
@cheenamalhotra we were using System.Data.SqlClient with the same connection string, same client code, with no issues until changing to .NET Core 3.0 preview 7. I assume that was System.Data.SqlClient 4.6.1.
Hello - we ran into exactly the same issue with both SqlClient packages (System.Data + Microsoft.Data). When we used the original Microsoft base images for .NET Core 3.0 we had this issue. When we changed to Ubuntu 18.04 (3.0-bionic) everything worked fine - thanks to @adrian-lopez-softtek for this hint!
I am not sure if it's the same issue. In my case, the first time after I deployed a simple .NET Core 3.0 web API to a Linux image, SqlConnection.Open failed or took around 100 seconds just to open the connection. After that the database connection worked well and very fast, but after around 5 or 10 minutes SqlConnection.Open again took 100 seconds every time unless I redeployed the web API.
I tried SQL Server 2008 and 2016, .NET Core 2.1 and 3.0, System.Data.SqlClient and Microsoft.Data.SqlClient, and the -bionic and -disco images, all with the same issue.
I'd appreciate it if anyone has some hints.
Thanks.
@gfzhang8 Do you have MultipleActiveResultSets=True in your connection string? Disabling this resolved the hang for us.
@aidanjryan Thanks, but MultipleActiveResultSets=True is not in my connection string.
Additionally, I can reproduce the issue every time after a redeploy: the first attempt fails or takes 100 seconds, then it works well and the connection opens fast; after 5 or 10 minutes, Open fails or takes 100 seconds again. The strangest part to me is why the hang is exactly 100 seconds every time, even when it fails.
BTW, when the connection open fails, the exception is something like 'cannot write to transport...'; I will attach the exact exception stack later.
And I confirmed that the same code deployed on a Windows server has no issue connecting to the same database with the same connection string.
This is the exception stack when SqlConnection.Open failed:
Connection id "0HLRISNCFDBAV", Request id "0HLRISNCFDBAV:00000001": An unhandled exception was thrown by the application.
Microsoft.Data.SqlClient.SqlException (0x80131904): A connection was successfully established with the server, but then an error occurred during the login process. (provider: TCP Provider, error: 35 - An internal exception was caught)
---> System.IO.IOException: The write operation failed, see inner exception.
---> System.AggregateException: One or more errors occurred. (Unable to write data to the transport connection: Connection reset by peer.)
---> System.IO.IOException: Unable to write data to the transport connection: Connection reset by peer.
---> System.Net.Sockets.SocketException (104): Connection reset by peer
at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
--- End of inner exception stack trace ---
at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
at Microsoft.Data.SqlClient.SNI.SslOverTdsStream.WriteInternal(Byte[] buffer, Int32 offset, Int32 count, CancellationToken token, Boolean async)
--- End of inner exception stack trace ---
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Task.Wait()
at Microsoft.Data.SqlClient.SNI.SslOverTdsStream.Write(Byte[] buffer, Int32 offset, Int32 count)
at System.Net.Security.SslStream.WriteSingleChunk[TWriteAdapter]
--- End of inner exception stack trace ---
at System.Net.Security.SslStream.WriteAsyncInternal[TWriteAdapter]
ClientConnectionId:9f80f330-79b5-4a84-b486-e97275e14eff
Microsoft.AspNetCore.Server.Kestrel[13]
I'm running into the same issue when upgrading our ASP.NET Core 2.2 app to ASP.NET Core 3.0. As others have mentioned, it works fine locally running on Windows 10 Enterprise, but it fails when it gets deployed to a Linux container using the .NET image mcr.microsoft.com/dotnet/core/aspnet:3.0.
It failed with both System.Data.SqlClient and the new Microsoft.Data.SqlClient 1.0.19269.1. I'm also not using MARS. Here is what the connection string looks like:
SERVER=some.domain.com,3655;DATABASE=SomeDatabase;UID=SomeUsername;PWD=SomePassword;PACKET SIZE=4096
The DB code is very simple:
private const string GetAllQuery =
    "SELECT [Value] " +
    "FROM [dbo].[TB_DataProtection]";

public IReadOnlyCollection<XElement> GetAllElements()
{
    _logger.LogInformation("Getting all elements");
    string conStr = _connectionStringFactory(DatabaseName);
    var elements = new List<XElement>();
    using (var con = new SqlConnection(conStr))
    {
        con.Open(); // this is the line that hangs in the Linux container (see below)
        using (var cmd = new SqlCommand(GetAllQuery, con))
        {
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                while (reader.Read())
                {
                    byte[] data = (byte[])reader["Value"];
                    string dataValue = System.Text.Encoding.UTF8.GetString(data);
                    var element = XElement.Parse(dataValue);
                    elements.Add(element);
                }
            }
        }
    }
    return elements;
}
After adding log statements after each line, I was able to identify that the problem was in the con.Open() line.
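Not a fix, but while chasing this it can help to put an upper bound on the open call so the hang surfaces as an exception instead of blocking forever. A rough sketch, assuming Microsoft.Data.SqlClient (the helper name and timeout value are arbitrary, and cancellation may not interrupt the driver depending on where it is blocked, though the awaited task should typically complete as canceled):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Data.SqlClient;

static class ConnectionHelper
{
    // Diagnostic sketch only: open the connection with a cancellation token so a
    // hung Open shows up as a cancellation/exception instead of an indefinite wait.
    public static async Task<SqlConnection> OpenWithTimeoutAsync(string connectionString, TimeSpan timeout)
    {
        var connection = new SqlConnection(connectionString);
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            await connection.OpenAsync(cts.Token);
            return connection;
        }
        catch
        {
            connection.Dispose();
            throw;
        }
    }
}
```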
@adrian-lopez-softtek
I tried your sample and I'm able to connect fine with both images:
FROM mcr.microsoft.com/dotnet/core/runtime:3.0-buster-slim AS base
FROM mcr.microsoft.com/dotnet/core/aspnet:3.0.0-preview9-buster-slim AS base
FROM mcr.microsoft.com/dotnet/core/sdk:3.0-buster AS build
FROM mcr.microsoft.com/dotnet/core/sdk:3.0.100-preview9-buster AS build
I tested "Microsoft.Data.SqlClient" Version="1.1.0" with netcoreapp3.0 and connected to:
FYI, I used the IP addresses of the SQL Servers on the network in the connection string, as DNS resolution may not work properly inside Docker containers.
Please confirm whether you still have this issue.
I tried our setup again and investigated different docker images:
- with netcore 2.1 the SQL client connects without problems to our server
- with netcore 3.0:
  - mcr.microsoft.com/dotnet/core/runtime-deps:3.0 -- the SQL client hangs in a call to OpenAsync and never returns
  - mcr.microsoft.com/dotnet/core/runtime-deps:3.0-buster-slim -- the SQL client hangs in a call to OpenAsync and never returns
  - mcr.microsoft.com/dotnet/core/runtime-deps:3.0-bionic -- everything fine!

Like @iinuwa posted before, you can see in Wireshark that in the bad case the server closes the socket connection after the TDS7 pre-login message, which contains the TLS "Client Hello".
I decoded the TLS payload of the good and bad pre-login messages: in the "bad" case the versions offered by the client are TLS 1.3 and 1.2; in the "good" case versions 1.0, 1.1, 1.2 and 1.3 are offered.
In Debian 10, TLS 1.0 and 1.1 support was deactivated, so the behaviour of the base images used is OK. The working bionic image with Ubuntu 18.04 still supports these TLS versions.
I think the SQL Server closes the connection because it only supports TLS 1.1, and the client's request signals that this version is not supported by the client.
So - I guess there are two problems now:
1. It seems our SQL Server needs an update to support at least TLS 1.2 - I have to contact our IT about this.
2. The hang in OpenAsync of the SQL client looks like a bug to me, because the server closes the socket connection - I would expect an exception in this case!
Hope this helps a little bit.
@clane2812
Thanks for the info. There's another thread for a similar issue where the recommended solution is to enable TLS 1.2 on SQL Server: https://github.com/dotnet/SqlClient/issues/222#issuecomment-537019520
Docker containers may be dropping support for TLS 1.0 and 1.1, as these protocols will be marked deprecated soon. Switching the Docker image to _bionic_ does not guarantee that the TLS protocol in use is 1.2, and if that image removes support for the older protocols too, this issue will occur again.
TLS 1.2 as a minimum will be mandatory soon, and Microsoft promotes using the TLS 1.2 protocol with SQL Server. Microsoft's article on upgrading the OS and SQL Server to TLS 1.2 can be found here.
Hey @cheenamalhotra
First of all, thanks for looking into this.
I think we have two problems here, as @clane2812 mentioned. On one side, we have the problem with SQL Servers that don't use TLS 1.2. I have already asked the team that handles DBs in our company to enable 1.2 because, as you mention in your comment, disco works for now, but everything seems to indicate that the deprecated versions are going to be disabled in all distros.
On the other side, we have the behavior of SqlClient just hanging without throwing any exception. I don't know the internals here, but if this could be improved by throwing an exception that clearly indicates what the error is, it would be super helpful for people who are not really aware of the TLS version used by their database (probably because it's handled by a different team), and it would save them time investigating before they realize that something that was working yesterday is not working today because settings in the Docker image have changed.
Do you think it would be possible to throw an exception when this happens on later versions of the client?
Thank you!!
My issue was also due to a TLS version mismatch between the Debian image and SQL Server. The new Debian image updated the OpenSSL configuration to make TLS 1.2 the minimum allowed version.
In my case, I was able to continue using the Debian image after adding the following command to my Dockerfile:
RUN sed -i 's/MinProtocol = TLSv1.2/MinProtocol = TLSv1/' /etc/ssl/openssl.cnf \
&& sed -i 's/CipherString = DEFAULT@SECLEVEL=2/CipherString = DEFAULT@SECLEVEL=1/' /etc/ssl/openssl.cnf
This brings the minimum TLS version back to TLS 1.0. Not ideal, but upgrading SQL Server is not an option for us right now.
I agree that the error message needs to be improved.
Thanks for the discussion. It helped me a lot find the problem.
I checked my SQL Server; it only supports TLS 1.2. If I use the original Debian image, which has MinProtocol=TLS1.2, SqlConnection.Open immediately gives me "SSL Handshake failed with OpenSSL error" with detail "ssl_choose_client_version: unsupported protocol".
Then if I change to MinProtocol=TLS1.0, it hangs for 100 seconds or fails with another error, which I posted in the comments above.
I'm confused why I get the "unsupported protocol" error: my SQL Server only supports TLS 1.2, and MinProtocol=TLS1.2 should be the right setting...
Can confirm this issue is happening in our environment on 3.1.
Also, a switch to the bionic image fixed the issue for us.
I can confirm that the problem exists. @epignosisx's solution worked for me!
I just want to mention that the correct solution to this issue is to ensure the target SQL Server supports the TLS 1.2 protocol by updating it accordingly. This Microsoft article can be used to figure out whether the target SQL Server supports TLS 1.2 or not.
The above comment by @epignosisx is merely a workaround to bring back TLS 1.0 support on the client machine, which is not Microsoft's recommendation. Microsoft has declared the TLS 1.0 and TLS 1.1 protocols insecure with known vulnerabilities, and all customers should move towards the TLS 1.2 protocol.
More information on how to Enable TLS 1.2 on Servers
I encountered this issue on SQL Server 2016, which according to that article supports TLS 1.2 out of the box. Did I miss a step?
SQL Server 2016 still supports older TLS versions and it is possible TLS 1.2 is disabled.
You can confirm which TLS version is in use by capturing network packets with Wireshark when you create the connection and looking into the pre-login handshake packets. You will find "ClientHello" and "ServerHello" packets that also contain information about the TLS version in use.
e.g.
Hey @cheenamalhotra, thanks for the links. Any plans on giving a proper error/exception when the server doesn't support TLS 1.2? The current behavior is a bit misleading and requires some digging to find out what is really happening.
Thanks again! 😄
@adrian-lopez-softtek
Yes, we are looking into providing a proper error message if possible.
Looks like you have to add the SQL Server certificate to the trusted store in order to utilize TLS 1.2. Copy the root certificate to /usr/local/share/ca-certificates in the container and run update-ca-certificates.
I faced a similar issue in my work setup connecting to a SQL Server 2014 (12.0.6108.1). In my case the server had TLS 1.2 enabled and working, which I was able to verify through Network Monitor. But the problem was due to the SECLEVEL=2 setting in the Debian 10 openssl.cnf, as mentioned by @epignosisx:
[system_default_sect]
MinProtocol = TLSv1.2
CipherString = DEFAULT@SECLEVEL=2
I changed only SECLEVEL to 1 and was able to connect from the Debian images. Digging deeper into SECLEVEL, it seems that level 2 is stricter about cipher suites and signature algorithms (https://www.openssl.org/docs/man1.1.0/man1/ciphers.html). In particular it leaves out SHA1, which is the signature algorithm used in the self-signed certificates SQL Server (until 2016) uses for connection login data encryption (ref). That is probably why @cheenamalhotra's tests with SQL Server 2016 & 2017 worked.
And, as expected, changing the certificate on my server to use SHA256 did the trick. (Used this as a reference.)
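If you have the server certificate exported to a file, a quick way to check whether it is still SHA1-signed is to read its signature algorithm. A small sketch (the file path is a placeholder):

```csharp
using System;
using System.Security.Cryptography.X509Certificates;

class CertCheck
{
    static void Main()
    {
        // "server.cer" is a placeholder for the exported SQL Server certificate.
        var cert = new X509Certificate2("server.cer");
        Console.WriteLine($"Subject:             {cert.Subject}");
        // SECLEVEL=2 rejects SHA1 signatures, so look for sha1RSA vs sha256RSA here.
        Console.WriteLine($"Signature algorithm: {cert.SignatureAlgorithm.FriendlyName}");
        Console.WriteLine($"Valid until:         {cert.NotAfter:u}");
    }
}
```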
> I tried our setup again and investigated different docker images:
> - with netcore 2.1 the SQL client connects without problems to our server
> - with netcore 3.0:
>   - mcr.microsoft.com/dotnet/core/runtime-deps:3.0 -- the SQL client hangs in a call to OpenAsync and never returns
>   - mcr.microsoft.com/dotnet/core/runtime-deps:3.0-buster-slim -- the SQL client hangs in a call to OpenAsync and never returns
>   - mcr.microsoft.com/dotnet/core/runtime-deps:3.0-bionic -- everything fine!
>
> Like @iinuwa posted before, you can see in Wireshark that in the bad case the server closes the socket connection after the TDS7 pre-login message, which contains the TLS "Client Hello". I decoded the TLS payload of the good and bad pre-login messages: in the "bad" case the versions offered are TLS 1.3 and 1.2; in the "good" case versions 1.0, 1.1, 1.2 and 1.3 are offered. In Debian 10, TLS 1.0 and 1.1 support was deactivated, so the behaviour of the base images used is OK. The working bionic image with Ubuntu 18.04 still supports these TLS versions.
>
> I think the SQL Server closes the connection because it only supports TLS 1.1, and the client's request signals that this version is not supported by the client.
>
> So - I guess there are two problems now:
> 1. It seems our SQL Server needs an update to support at least TLS 1.2 - I have to contact our IT about this.
> 2. The hang in OpenAsync of the SQL client looks like a bug to me, because the server closes the socket connection - I would expect an exception in this case!
>
> Hope this helps a little bit.
This is the correct response. For the problem we found with OpenAsync using .NET Core 3.1 (the latest version as of now, January 2020), we just need to pull the Docker image this way: FROM mcr.microsoft.com/dotnet/core/aspnet:3.1-bionic
> SQL Server 2016 still supports older TLS versions and it is possible TLS 1.2 is disabled.
> You can confirm which TLS version is in use by capturing network packets with Wireshark when you create the connection and looking into the pre-login handshake packets. You will find "ClientHello" and "ServerHello" packets that also contain information about the TLS version in use.
> e.g.
@cheenamalhotra
This is my ClientHello when connecting from the Linux image:
It seems like it's using TLS 1.0? But I do have
MinProtocol = TLSv1.2
CipherString = DEFAULT@SECLEVEL=2
in the openssl.cnf file.
And when I connect from the Windows image, the ClientHello is TLS 1.2:
Any idea?
Thanks a lot!
> Looks like you have to add SQL Server certificate to trusted in order to utilize TLS1.2. Copy root certificate to /usr/local/share/ca-certificates in container and run update-ca-certificates
What's the certificate and where can I find it?
Thanks.
Same problem here after migrating from 2.2 to 3.1 LTS.
Database : SQL Server 2016 on Windows Server (TLS 1.2 enabled)
Asp.net core app (2.2) on debian : works.
Same Asp.net core app (3.1 LTS) on debian : doesn't work.
Tried both System.Data.SqlClient and Microsoft.Data.SqlClient.
It worked using this Dockerfile command:
RUN sed -i 's/DEFAULT@SECLEVEL=2/DEFAULT@SECLEVEL=1/g' /etc/ssl/openssl.cnf
If you have set Connection Timeout = 0 in your connection string, please try changing it to 30.
Thanks
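For anyone unsure where that is set, a minimal sketch of configuring the connect timeout explicitly via the connection string builder (all connection details are placeholders):

```csharp
using Microsoft.Data.SqlClient;

class TimeoutSample
{
    static void Main()
    {
        var builder = new SqlConnectionStringBuilder(
            "Server=your-server.example.com;Database=YourDatabase;User ID=YourUser;Password=YourPassword")
        {
            // 0 means wait indefinitely; 30 seconds makes failures surface instead of hanging.
            ConnectTimeout = 30
        };

        using (var connection = new SqlConnection(builder.ConnectionString))
        {
            connection.Open();
        }
    }
}
```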
I have this problem.
I'm using an Azure Web App Container on Linux, trying to connect to an Azure SQL database.
The error is intermittent, but very frequent:
2020-04-11T19:37:40.418839620Z Microsoft.Data.SqlClient.SqlException (0x80131904): A connection was successfully established with the server, but then an error occurred during the login process. (provider: TCP Provider, error: 35 - An internal exception was caught)
2020-04-11T19:37:40.419144120Z ---> System.IO.IOException: The write operation failed, see inner exception.
2020-04-11T19:37:40.419164520Z ---> System.AggregateException: One or more errors occurred. (Unable to write data to the transport connection: Connection reset by peer.)
2020-04-11T19:37:40.419611020Z ---> System.IO.IOException: Unable to write data to the transport connection: Connection reset by peer.
2020-04-11T19:37:40.419622520Z ---> System.Net.Sockets.SocketException (104): Connection reset by peer
2020-04-11T19:37:40.419634920Z at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
2020-04-11T19:37:40.419639720Z --- End of inner exception stack trace ---
2020-04-11T19:37:40.420576920Z at System.Net.Sockets.NetworkStream.Write(Byte[] buffer, Int32 offset, Int32 size)
2020-04-11T19:37:40.420588220Z at Microsoft.Data.SqlClient.SNI.SslOverTdsStream.WriteInternal(Byte[] buffer, Int32 offset, Int32 count, CancellationToken token, Boolean async)
2020-04-11T19:37:40.421101420Z --- End of inner exception stack trace ---
2020-04-11T19:37:40.421112620Z at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
2020-04-11T19:37:40.421125920Z at System.Threading.Tasks.Task.Wait()
2020-04-11T19:37:40.422007020Z at Microsoft.Data.SqlClient.SNI.SslOverTdsStream.Write(Byte[] buffer, Int32 offset, Int32 count)
2020-04-11T19:37:40.422020520Z at System.Net.Security.SslStream.WriteSingleChunk[TWriteAdapter](TWriteAdapter writeAdapter, ReadOnlyMemory`1 buffer)
2020-04-11T19:37:40.422683520Z at System.Net.Security.SslStream.WriteAsyncInternal[TWriteAdapter]
2020-04-11T19:37:40.423597421Z at System.Net.Security.SslStream.Write(Byte[] buffer, Int32 offset, Int32 count)
2020-04-11T19:37:40.423609421Z at Microsoft.Data.SqlClient.SNI.SNIPacket.WriteToStream(Stream stream)
2020-04-11T19:37:40.423622021Z at Microsoft.Data.SqlClient.SNI.SNITCPHandle.Send(SNIPacket packet)
2020-04-11T19:37:40.424319421Z at Microsoft.Data.ProviderBase.DbConnectionPool.CheckPoolBlockingPeriod(Exception e)
2020-04-11T19:37:40.424331221Z at Microsoft.Data.ProviderBase.DbConnectionPool.CreateObject(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
2020-04-11T19:37:40.424841321Z at Microsoft.Data.ProviderBase.DbConnectionPool.UserCreateRequest(DbConnection owningObject, DbConnectionOptions userOptions, DbConnectionInternal oldConnection)
2020-04-11T19:37:40.425237321Z at Microsoft.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)
2020-04-11T19:37:40.425578921Z at Microsoft.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal& connection)
2020-04-11T19:37:40.425899721Z at Microsoft.Data.ProviderBase.DbConnectionFactory.TryGetConnection(DbConnection owningConnection, TaskCompletionSource`1 retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionInternal& connection)
2020-04-11T19:37:40.426233821Z at Microsoft.Data.ProviderBase.DbConnectionInternal.TryOpenConnectionInternal(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
2020-04-11T19:37:40.426245321Z at Microsoft.Data.ProviderBase.DbConnectionClosed.TryOpenConnection(DbConnection outerConnection, DbConnectionFactory connectionFactory, TaskCompletionSource`1 retry, DbConnectionOptions userOptions)
2020-04-11T19:37:40.426756421Z at Microsoft.Data.SqlClient.SqlConnection.TryOpen(TaskCompletionSource`1 retry)
2020-04-11T19:37:40.426780421Z at Microsoft.Data.SqlClient.SqlConnection.Open()
2020-04-11T19:37:40.426785721Z at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenDbConnection(Boolean errorsExpected)
2020-04-11T19:37:40.427572121Z at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.Open(Boolean errorsExpected)
2020-04-11T19:37:40.427584421Z at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReader(RelationalCommandParameterObject parameterObject)
2020-04-11T19:37:40.428292521Z at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable`1.Enumerator.InitializeReader(DbContext _, Boolean result)
2020-04-11T19:37:40.428726021Z at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.Execute[TState,TResult]
2020-04-11T19:37:40.428749821Z at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable`1.Enumerator.MoveNext()
2020-04-11T19:37:40.428755521Z ClientConnectionId:2389060f-e541-494a-a770-bc9501877192
Hi everyone, since #577 fixes the hang issue and will be released with Microsoft.Data.SqlClient v2.0.0, we will close the issue. This fix will also be backported to System.Data.SqlClient soon.
The recommended solution for anyone facing the "End of Stream reached" exception in the future is to verify that the target SQL Server supports TLS 1.2+ and that its server certificates are signed with SHA-256 or stronger.
There are workarounds to switch back to a lower TLS version if needed, as discussed above, but starting with the next release (v2.0), applications will also receive a warning, as implemented in #591, if a lower, insecure TLS version was negotiated with the server, since these versions are not recommended for client applications. This includes raising a warning for the TLS 1.0 and TLS 1.1 protocols.
Good to know this is being fixed. I can confirm it's still an issue in aspnet:3.1.5-Buster and it's working fine in aspnet:3.1-Bionic. Thanks!
> I can confirm it's still an issue in aspnet:3.1.5-Buster
@BackTrak Have you tried with Microsoft.Data.SqlClient v2.0.0?