Hello,
I had a situation where I specified SslProtocols.Tls and tried to talk to a server with Tls (aka Tlsv1), but the server was configured to not speak that protocol.
Understandably, the handshake failed, but the exception stack trace I got was pretty opaque and it took me a while to debug. This is on Linux. Not sure if there is more information that we can use to improve the error message, but regardless, filing an issue at least so that if someone else hits the same thing they know that it is an error in protocol negotiation. Setting SslProtocols to SslProtocols.None (aka "use system defaults") fixed it for me.
Detailed trace:
Unhandled Exception: System.AggregateException: One or more errors occurred. (WebSocket connection failure.) ---> System.Net.WebSockets.WebSocketException: WebSocket connection failure. ---> System.Security.Authentication.AuthenticationException: A call to SSPI failed, see inner exception. ---> Interop+OpenSsl+SslException: SSL Handshake failed with OpenSSL error - SSL_ERROR_SSL. ---> System.Security.Cryptography.CryptographicException: Error occurred during a cryptographic operation.
--- End of inner exception stack trace ---
at Interop.OpenSsl.DoSslHandshake(SafeSslHandle context, Byte[] recvBuf, Int32 recvOffset, Int32 recvCount, Byte[]& sendBuf, Int32& sendCount)
at System.Net.Security.SslStreamPal.HandshakeInternal(SafeFreeCredentials credential, SafeDeleteContext& context, SecurityBuffer inputBuffer, SecurityBuffer outputBuffer, SslAuthenticationOptions sslAuthenticationOptions)
--- End of inner exception stack trace ---
at System.Net.Security.SslState.StartSendAuthResetSignal(ProtocolToken message, AsyncProtocolRequest asyncRequest, ExceptionDispatchInfo exception)
at System.Net.Security.SslState.CheckCompletionBeforeNextReceive(ProtocolToken message, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.StartReadFrame(Byte[] buffer, Int32 readBytes, AsyncProtocolRequest asyncRequest)
at System.Net.Security.SslState.PartialFrameCallback(AsyncProtocolRequest asyncRequest)
--- End of stack trace from previous location where exception was thrown ---
at System.Net.Security.SslState.InternalEndProcessAuthentication(LazyAsyncResult lazyResult)
at System.Net.Security.SslState.EndProcessAuthentication(IAsyncResult result)
at System.Net.Security.SslStream.EndAuthenticateAsClient(IAsyncResult asyncResult)
at System.Net.Security.SslStream.<>c.<AuthenticateAsClientAsync>b__43_2(IAsyncResult iar)
at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
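For anyone hitting the same trace, here is a minimal sketch of the workaround described above, together with a helper that flattens the nested exceptions so the innermost message (where the OpenSSL detail would appear, when it is propagated) is easy to spot. `HandshakeDiagnostics`, `DescribeChain`, and `ConnectWithSystemDefaultsAsync` are hypothetical names made up for illustration; the host and port are whatever your own server uses.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Security;
using System.Net.Sockets;
using System.Security.Authentication;
using System.Threading.Tasks;

static class HandshakeDiagnostics
{
    // Flattens an exception and its InnerException chain, outermost first,
    // into "TypeName: Message" lines.
    public static List<string> DescribeChain(Exception ex)
    {
        var lines = new List<string>();
        for (Exception e = ex; e != null; e = e.InnerException)
        {
            lines.Add(e.GetType().FullName + ": " + e.Message);
        }
        return lines;
    }

    // Hypothetical helper: connects and lets the platform choose the TLS version.
    public static async Task ConnectWithSystemDefaultsAsync(string host, int port)
    {
        using (var client = new TcpClient())
        {
            await client.ConnectAsync(host, port);
            using (var ssl = new SslStream(client.GetStream(), leaveInnerStreamOpen: false))
            {
                try
                {
                    // SslProtocols.None means "use system defaults" rather than pinning a version.
                    await ssl.AuthenticateAsClientAsync(host, null, SslProtocols.None, checkCertificateRevocation: false);
                    Console.WriteLine("Negotiated: " + ssl.SslProtocol);
                }
                catch (AuthenticationException ex)
                {
                    // Print every level of the chain before rethrowing.
                    foreach (string line in DescribeChain(ex))
                    {
                        Console.WriteLine(line);
                    }
                    throw;
                }
            }
        }
    }
}
```

Printing the chain does not add information that the runtime failed to capture, but it does make it obvious at a glance whether the innermost exception carries a specific OpenSSL message or only the generic one.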
I believe we are just throwing the error we get from libcurl and there is not much more we can do with that :(. I have seen similar problems when people use certificates with TLS-version-incompatible crypto algorithms. Not a great troubleshooting experience either.
cc @wfurt @stephentoub @geoffkizer if they have any ideas.
Closing as there is likely not any action on our side. We can reopen if we find something we could do.
@karelz, this isn't related to libcurl; as the call stack shows, it's coming from our interactions with OpenSSL.
@brendandburns, what version are you using? If not 2.1, can you try with 2.1?
In general we do try to propagate error codes/messages when we can:
https://github.com/dotnet/corefx/blob/525ba72827cbadce4e70e8d16e843b6cc6e66deb/src/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.ERR.cs#L89
To Karel's point, it's possible you're hitting a situation where we can't:
https://github.com/dotnet/corefx/blob/525ba72827cbadce4e70e8d16e843b6cc6e66deb/src/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.ERR.cs#L73-L76
but it'd be good to determine if there's something better that can be done, e.g. maybe there's a place we're not handling this correctly.
Note, too, that the code I referenced above has been tweaked for 2.1, hence my suggestion to try that.
This was in 2.1 thanks
--brendan
@brendandburns, ok, thanks. Then next question, do you see a similarly vague error in 2.0, if you could possibly try it there? I just want to confirm we weren't previously returning a better error and we somehow broke it. One of the changes made in 2.1 involves how the OpenSSL error queue is cleared out, so it's possible we've lost something. cc: @Drawaes
If it's confirmed to be a behavioral change, I will take a look for sure. If it's the same as before, then I am not sure it's worth rushing in a fix before 2.1?
@karelz I just got stung by the exact same behavior; do you think it's possible to re-open this issue?
This is a regression in 2.1 from 2.0.
Simple test:
```C#
using System;
using System.Net.Security;
using System.Net.Sockets;
using System.Security.Authentication;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        using (var s = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
        {
            await s.ConnectAsync("www.ssllabs.com", 10301); // endpoint supports only TLS 1.0
            using (var ns = new NetworkStream(s, ownsSocket: false))
            using (var ssl = new SslStream(ns, true, delegate { return true; }))
            {
                await ssl.AuthenticateAsClientAsync("www.ssllabs.com", null, SslProtocols.Tls11, false);
                Console.WriteLine("Connected");
            }
        }
    }
}
```
On Ubuntu 17.10 on .NET Core 2.0 I get this:
```
Unhandled Exception: System.Security.Authentication.AuthenticationException: A call to SSPI failed, see inner exception. ---> Interop+OpenSsl+SslException: SSL Handshake failed with OpenSSL error - SSL_ERROR_SSL. ---> Interop+Crypto+OpenSslCryptographicException: error:14077102:SSL routines:SSL23_GET_SERVER_HELLO:unsupported protocol
   --- End of inner exception stack trace ---
   at Interop.OpenSsl.DoSslHandshake(SafeSslHandle context, Byte[] recvBuf, Int32 recvOffset, Int32 recvCount, Byte[]& sendBuf, Int32& sendCount)
   at System.Net.Security.SslStreamPal.HandshakeInternal(SafeFreeCredentials credential, SafeDeleteContext& context, SecurityBuffer inputBuffer, SecurityBuffer outputBuffer, Boolean isServer, Boolean remoteCertRequired)
...
```
whereas just changing to use .NET Core 2.1 Preview 1 produces this:
```
Unhandled Exception: System.Security.Authentication.AuthenticationException: A call to SSPI failed, see inner exception. ---> Interop+OpenSsl+SslException: SSL Handshake failed with OpenSSL error - SSL_ERROR_SSL. ---> System.Security.Cryptography.CryptographicException: Error occurred during a cryptographic operation.
   --- End of inner exception stack trace ---
   at Interop.OpenSsl.DoSslHandshake(SafeSslHandle context, Byte[] recvBuf, Int32 recvOffset, Int32 recvCount, Byte[]& sendBuf, Int32& sendCount)
   at System.Net.Security.SslStreamPal.HandshakeInternal(SafeFreeCredentials credential, SafeDeleteContext& context, SecurityBuffer inputBuffer, SecurityBuffer outputBuffer, SslAuthenticationOptions sslAuthenticationOptions)
```
The key difference is:
```
Interop+Crypto+OpenSslCryptographicException: error:14077102:SSL routines:SSL23_GET_SERVER_HELLO:unsupported protocol
```
vs
```
System.Security.Cryptography.CryptographicException: Error occurred during a cryptographic operation.
```
cc: @Drawaes, @bartonjs
Late reopening - adding Post-ZBB.
I should be able to take a look tonight
@Drawaes, any luck?
@rmkerr can you please take a look?
@Drawaes do you have any thoughts on what might be causing this? I've gone through the obvious possibilities and have not found anything. I can see that the exception is being thrown here:
https://github.com/dotnet/corefx/blob/bb5c46859048bfcefca281d7c303540c1217da41/src/Common/src/Interop/Unix/System.Security.Cryptography.Native/Interop.OpenSsl.cs#L187-L197
But this code hasn't been modified recently, and neither has the implementation of GetSslError. SslDoHandshake is an external call, so I wouldn't expect it to change this. Is there other code that might be affecting the errors we get from OpenSsl? Excuse me if these are simple questions, this is the first time I've taken a look at this code :)
@pjanotti, does this look similar to the SSL issues you have been looking into?
Yes, this is the part I was mentioning: we need to grab the errors from the queue and actually display them. The code uses SSL_get_error to retrieve the error, but that only peeks at the queue and always returns SSL_ERROR_SSL or SSL_ERROR_SYSCALL if there is any error on the queue. If there are no errors on the queue, it can return BIO-related codes such as SSL_ERROR_WANT_READ, as in the snippet above.
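As a rough illustration of the distinction drawn above, the result of SSL_get_error falls into a few groups: codes that just mean "do more I/O and retry", a clean close, and the two fatal codes whose details (if any) live on the error queue. The enum values below match the `SSL_ERROR_*` constants in OpenSSL's `ssl.h`; `HandshakeAction` and `SslErrorTriage` are hypothetical names for this sketch and are not the corefx implementation.

```csharp
using System;

// Numeric values match OpenSSL's SSL_ERROR_* constants.
enum SslErrorCode
{
    SSL_ERROR_NONE = 0,
    SSL_ERROR_SSL = 1,
    SSL_ERROR_WANT_READ = 2,
    SSL_ERROR_WANT_WRITE = 3,
    SSL_ERROR_SYSCALL = 5,
    SSL_ERROR_ZERO_RETURN = 6,
}

// Hypothetical classification used only for this sketch.
enum HandshakeAction
{
    Done,
    NeedMoreIo,
    PeerClosed,
    Fatal,
}

static class SslErrorTriage
{
    public static HandshakeAction Classify(SslErrorCode code)
    {
        switch (code)
        {
            case SslErrorCode.SSL_ERROR_NONE:
                return HandshakeAction.Done;
            case SslErrorCode.SSL_ERROR_WANT_READ:
            case SslErrorCode.SSL_ERROR_WANT_WRITE:
                // Not a failure: the BIO needs more data moved before retrying.
                return HandshakeAction.NeedMoreIo;
            case SslErrorCode.SSL_ERROR_ZERO_RETURN:
                return HandshakeAction.PeerClosed;
            default:
                // SSL_ERROR_SSL / SSL_ERROR_SYSCALL: the specific reason, if any,
                // has to be read from the error queue separately.
                return HandshakeAction.Fatal;
        }
    }
}
```

The point of the sketch is that the code alone never says *why* a fatal failure happened; that text only exists on the queue, which is why the order of reading and clearing matters.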
OpenSSL is/should-be pushing SSLerr(SSL_F_SSL23_GET_SERVER_HELLO, SSL_R_UNSUPPORTED_PROTOCOL); in this case (server responded with a protocol version not allowed by the client mask).
That suggests that the error queue is either being cleared between the push and the read, or that the push and the read happened on different native threads.
The error is being cleared when Ssl.SslGetError is called: it contains an ERR_clear_error call since dotnet/corefx#25646 (see CryptoNative_SslGetError).
CreateOpenSslCryptographicException() relies, via CryptoNative_ErrGetErrorAlloc, on the error still being on the queue to properly report it.
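The ordering problem described above can be shown with a toy model. This is only a simulation: `ToyErrorQueue` stands in for OpenSSL's per-thread error queue, and the two methods are hypothetical stand-ins for "classify the failure, then build the exception" in each ordering. None of it is the actual shim or interop code.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for OpenSSL's per-thread error queue; not the real API.
sealed class ToyErrorQueue
{
    private readonly Queue<string> _errors = new Queue<string>();

    public bool HasError => _errors.Count > 0;

    public void Push(string error) => _errors.Enqueue(error);

    public void Clear() => _errors.Clear();

    // Removes and returns the oldest error, or null when the queue is empty.
    public string Pop() => _errors.Count > 0 ? _errors.Dequeue() : null;
}

static class ErrorQueueDemo
{
    public const string Generic = "Error occurred during a cryptographic operation.";

    // Models the regression: classifying the failure also clears the queue,
    // so the later attempt to read the detail finds nothing.
    public static string ReportAfterClearingGetError(ToyErrorQueue queue)
    {
        bool failed = queue.HasError; // stand-in for getting SSL_ERROR_SSL back
        queue.Clear();                // the clear happens inside the "get error" step
        if (!failed)
        {
            return null;
        }
        return queue.Pop() ?? Generic; // queue is already empty: generic message
    }

    // Models the earlier ordering: classify, read the detail, then clear.
    public static string ReportThenClear(ToyErrorQueue queue)
    {
        bool failed = queue.HasError;
        string detail = queue.Pop();
        queue.Clear();
        if (!failed)
        {
            return null;
        }
        return detail ?? Generic;
    }
}
```

With one `unsupported protocol` entry pushed, the first method falls back to the generic message and the second returns the specific one, which mirrors the 2.1 versus 2.0 difference reported earlier in the thread.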
Okay, so the issue is that SslGetError should actually use the error right there, rather than relying on a later function to read it from the queue.
I can fix it but won't be back in mobile/wifi until tomorrow night. I wonder if the connection error we are seeing with the http client will be better diagnosed with this fixed as well.
@Drawaes we also have another bug (#25676) related to the same PR, but that one requires a bit of code inspection. The issue is that at least one call does not check for, and clear, errors it added to the queue. Later, when we try to decrypt and call Ssl.SslGetError, it reports SSL_ERROR_SSL because of that earlier error, when it should have proceeded with one of the cases of https://github.com/dotnet/corefx/blob/c533892f2e57940ec9e66616288bf340b75a9217/src/System.Net.Security/src/System/Net/Security/SslStreamPal.Unix.cs#L208-L219
The idea of cleaning up the errors only when they happen is a good one and improves the performance quite a lot, but we need to scrub the code to see if there are other places in which we can leave errors in the queue. To be fair I am aware of only one such location and I haven't looked yet at the scope of the overall thing.
What is the connection error that you are referring to?
Hm. I noticed the "while the queue seems to have more than one thing keep popping" part, but missed the "read one more". (I wonder if that was also true when I reviewed the change back then) I'd keep the cost of the two P/invokes over changing the signature on the shim methods; but perhaps I slide back and forth across commits more than the average bear.
I would also be for keeping two P/Invokes. At this point it is an error condition anyway, and the cost of the P/Invoke will likely be dwarfed by the exception throw.
@Drawaes and @rmkerr I'm going to pick this up because I need it for dotnet/corefx#28862
PR dotnet/corefx#28862 has a fix for this. It is basically a return to what it was before, i.e. the managed side is in charge of cleaning up the queue when an SSL_ERROR_SSL is hit. That is implicit, so we may want to revisit it later and consider something like capturing the whole queue and not only the last error. Anyway, as of now the PR is not ready to be merged due to test failures in the other issue being addressed in the same PR.
@pjanotti can you split off the known fixes from the testing part in the PR?
As we discussed - it would be great to flow known targeted fixes in (assuming they do not regress anything else).
It is hard to keep track of a PR that also does a bunch of additional testing and experiments. Splitting it up will help us improve things in isolation from each other.