Aws-sdk-net: S3 PutObjectAsync fails when using the Sockets Http Handler

Created on 26 Sep 2018 · 25 comments · Source: aws/aws-sdk-net


Put object requests that worked fine on .NET Core 2.0.3 fail after updating to .NET Core 2.1.4. The only workaround we have found is setting the environment variable DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER to 0. Get and List requests seem to be unaffected.
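
For reference, the same opt-out can also be expressed in code rather than through the environment. This is a minimal sketch, assuming .NET Core 2.1, where the runtime reads the `System.Net.Http.UseSocketsHttpHandler` AppContext switch. The `Program` class is only illustrative, and the switch has to be set before the first `HttpClient` or AWS client is created.

```C#
using System;

public static class Program
{
    public static void Main()
    {
        // Assumption: running on .NET Core 2.1. Setting this switch to false makes
        // HttpClientHandler fall back to the pre-2.1 platform handler, which is the
        // same effect as DOTNET_SYSTEM_NET_HTTP_USESOCKETSHTTPHANDLER=0.
        // It must run before any HttpClient (or AWS SDK client) is constructed.
        AppContext.SetSwitch("System.Net.Http.UseSocketsHttpHandler", false);

        Console.WriteLine("SocketsHttpHandler disabled for this process.");

        // ... create the IAmazonS3 client and issue requests after this point ...
    }
}
```

Like the environment variable, this affects every HTTP call in the process, not only the S3 put.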

Expected Behavior



The request should succeed just like it did prior to .NET Core 2.1.

Current Behavior






An exception is thrown:

System.Net.Http.HttpRequestException: The server returned an invalid or unrecognized response.
   at System.Net.Http.HttpConnection.ThrowInvalidHttpResponse()
   at System.Net.Http.HttpConnection.SendAsyncCore(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
   at System.Net.Http.HttpClient.FinishSendAsyncUnbuffered(Task`1 sendTask, HttpRequestMessage request, CancellationTokenSource cts, Boolean disposeCts)
   at Amazon.Runtime.HttpWebRequestMessage.GetResponseAsync(CancellationToken cancellationToken) in E:\JenkinsWorkspaces\v3-trebuchet-release\AWSDotNetPublic\sdk\src\Core\Amazon.Runtime\Pipeline\HttpHandler\_mobile\HttpRequestMessageFactory.cs:line 428
   at Amazon.Runtime.Internal.HttpHandler`1.InvokeAsync[T](IExecutionContext executionContext) in E:\JenkinsWorkspaces\v3-trebuchet-release\AWSDotNetPublic\sdk\src\Core\Amazon.Runtime\Pipeline\HttpHandler\HttpHandler.cs:line 175
   at Amazon.Runtime.Internal.RedirectHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Unmarshaller.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.S3.Internal.AmazonS3ResponseHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CredentialsRetriever.InvokeAsync[T](IExecutionContext executionContext) in E:\JenkinsWorkspaces\v3-trebuchet-release\AWSDotNetPublic\sdk\src\Core\Amazon.Runtime\Pipeline\Handlers\CredentialsRetriever.cs:line 98
   at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RetryHandler.InvokeAsync[T](IExecutionContext executionContext) in E:\JenkinsWorkspaces\v3-trebuchet-release\AWSDotNetPublic\sdk\src\Core\Amazon.Runtime\Pipeline\RetryHandler\RetryHandler.cs:line 137
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.S3.Internal.AmazonS3ExceptionHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorCallbackHandler.InvokeAsync[T](IExecutionContext executionContext)
   at Amazon.Runtime.Internal.MetricsHandler.InvokeAsync[T](IExecutionContext executionContext)

Possible Solution


Steps to Reproduce (for bugs)





The code below fails:

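```C#
// Requires: using Amazon.S3; using Amazon.S3.Model;
// memoryStream is a MemoryStream holding the payload, created elsewhere.
// How the client is constructed was not shown; the default constructor is an assumption.
using (IAmazonS3 client = new AmazonS3Client())
{
    PutObjectRequest req = new PutObjectRequest
    {
        BucketName = "name",
        Key = "key",
        InputStream = memoryStream,
        CannedACL = S3CannedACL.Private,
        ServerSideEncryptionMethod = ServerSideEncryptionMethod.AWSKMS,
        ServerSideEncryptionKeyManagementServiceKeyId = "kmsKeyId"
    };

    // This call throws the HttpRequestException shown above.
    PutObjectResponse res = await client.PutObjectAsync(req).ConfigureAwait(false);
}
```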

Context



Rolling back to the previous HttpClientHandler implementation is not desirable in a lot of cases. There are some features that are lost by doing this in certain instances.

Your Environment

  • AWSSDK.Core version used: 3.3.25.3 (and AWSSDK.S3 3.3.24)
  • Operating System and version: Windows 10 and 7
  • Visual Studio version: 2017 15.7.5
  • Targeted .NET platform: .net core 2.1.4

.NET Core Info

  • .NET Core version used for development: 2.1.401
  • Output of dotnet --info:

.NET Core SDK (reflecting any global.json):
Version: 2.1.401
Commit: 91b1c13032

Runtime Environment:
OS Name: Windows
OS Version: 10.0.15063
OS Platform: Windows
RID: win10-x64
Base Path: C:\Program Files\dotnet\sdk\2.1.401\

Host (useful for support):
Version: 2.1.4
Commit: 85255dde3e

.NET Core SDKs installed:
1.0.0 [C:\Program Files\dotnet\sdk]
2.0.0 [C:\Program Files\dotnet\sdk]
2.1.2 [C:\Program Files\dotnet\sdk]
2.1.4 [C:\Program Files\dotnet\sdk]
2.1.202 [C:\Program Files\dotnet\sdk]
2.1.300 [C:\Program Files\dotnet\sdk]
2.1.401 [C:\Program Files\dotnet\sdk]

.NET Core runtimes installed:
Microsoft.AspNetCore.All 2.1.0 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
Microsoft.AspNetCore.All 2.1.3 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.All]
Microsoft.AspNetCore.App 2.1.0 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.AspNetCore.App 2.1.3 [C:\Program Files\dotnet\shared\Microsoft.AspNetCore.App]
Microsoft.NETCore.App 1.0.4 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 1.1.1 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.0.0 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.0.3 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.0.5 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.0.9 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.1.0 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.1.3 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]
Microsoft.NETCore.App 2.1.4 [C:\Program Files\dotnet\shared\Microsoft.NETCore.App]

To install additional .NET Core runtimes or SDKs:
https://aka.ms/dotnet-download


All 25 comments

Hi @pnquest,

I have not been able to reproduce your issue. I'm using 2.1.401 .NET SDK and 2.1.4 runtime, but I am running on a Mac. I could try later on Windows.

In the meantime, can you please do a couple of things?

For starters, can you please provide us with the request id of a failing request and the content of the response?

Then please verify that you can get this to work correctly on .NET Core 2.0.3 with the same parameters (i.e. bucket, key, KMS key). If it doesn't work, it might be an issue with those parameters, such as a name that is not ASCII.

@klaytaybai I can switch the target framework from netcoreapp2.1 to netcoreapp2.0 and back again without changing anything else, and it fails for netcoreapp2.1 every time and succeeds with netcoreapp2.0. We have seen this in 3 different code bases on 3 different machines, 2 of which should be close to identical, while the third is running Windows 7.

As to the request ID and content, since the method throws an exception, I am not getting an object back. Is there some other way to get the information you are looking for?

Try using Fiddler or Wireshark to capture your network traffic. They should be able to record a response before the exception is thrown. The hope is that you are receiving a response back that is at least partially readable but probably malformed.
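
Alongside a packet capture, a small wrapper around the call can show which kind of failure occurred. The following is a rough sketch, not part of the SDK: `PutDiagnostics` and `PutWithDiagnosticsAsync` are hypothetical names. It assumes that when S3 returns a parsable error, the SDK raises `AmazonS3Exception` (which carries `RequestId` and `AmazonId2`), whereas a transport-level failure like the one in this report surfaces as `HttpRequestException` with no request ID available.

```C#
using System;
using System.Net.Http;
using System.Threading.Tasks;
using Amazon.S3;
using Amazon.S3.Model;

public static class PutDiagnostics
{
    // Hypothetical helper: logs whatever identifying information is available, then rethrows.
    public static async Task<PutObjectResponse> PutWithDiagnosticsAsync(
        IAmazonS3 client, PutObjectRequest request)
    {
        try
        {
            return await client.PutObjectAsync(request).ConfigureAwait(false);
        }
        catch (AmazonS3Exception ex)
        {
            // S3 answered with an error response, so the service-assigned IDs exist.
            Console.Error.WriteLine(
                $"S3 error {ex.ErrorCode}: RequestId={ex.RequestId}, AmazonId2={ex.AmazonId2}");
            throw;
        }
        catch (HttpRequestException ex)
        {
            // No usable response was parsed, so there is no request ID to report.
            Console.Error.WriteLine(
                $"Transport failure: {ex.Message}; inner: {ex.InnerException?.Message}");
            throw;
        }
    }
}
```

If only the second branch ever fires, the response never reached the SDK in a readable form, and a network capture is the remaining way to see what came back.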

@klaytaybai I have not had a chance to get you a request ID or content yet, but we were able to confirm that we cannot reproduce this issue when running on Linux, so it seems a pretty safe bet that this issue is localized to Windows.

I will reply again with a request ID once I have one (probably tomorrow).

Hi @pnquest, thank you for checking in.

I tried to reproduce the issue on Windows, but I still didn't manage to reproduce it. I suspect that we would be getting more complaints or +1s if this was an easily reproducible issue.

Let me know if you manage to capture a failing response.

Hi, just thought I would pop in and throw a wrinkle into the mix - I am on a mac, and am seeing the same issue with my .net core 2.1.401 app. Specifying the environment variable from above fixes it for me, too. What kind of traces do you need for diagnosis, again?

@dchw, they asked for a request id for a failed request. I have not been able to get to providing that yet, although it is on my list of things to do. If you are able to do so, that would actually be much appreciated!

Interesting. Using Wireshark, I don't see a request id header coming back. I see an HTTP 100 continue (no headers), and then a buncha TCP traffic for sending up the binary data, and then... nothing? I am applying a filter based on traffic using the server that gave me the 100 Continue message.

Even more interesting, I haven't changed anything in my code that uploads data to s3, it seems to have "just started" (I know, I know) this morning as I arrived to work. Only difference is that it worked from home on Sunday...

@dchw Could this be proxy-related somehow then? The only thing that I find odd about that idea is that I am making multiple list and get calls prior to my put and all of them go through no problem.

@pnquest It's gotta be. I just tethered to a coworker's iPhone and tested the exact same request. It worked. Connected back to the corporate network: it failed.

Next problem: How the heck do I write this helpdesk ticket?

@dchw I still have to wonder. The fact that this works with the netcoreapp2.0 version of the ClientHandler makes me think that something is still wrong, maybe not with the S3 SDK but with the client handler instead. I am equally uncertain how to progress things from a helpdesk perspective.

@pnquest Sat down with my local IT wizard and he and I were able to confirm it is the firewall... and it was even breaking this when empty policies were applied.

No policy at all made it work!

They are still digging into solutions here, I will let you know what we did to fix this once we get it fixed.

@pnquest and @dchw, thank you both for working hard on this. Please let us know if you think you need more help from the AWS side.

@dchw Do you think it would be worth opening an issue on the dotnet core repo as well, or did your IT guy seem to think it may be a configuration problem?

@pnquest He seemed to think it was something to do with proxy vs flow mode on the firewall, we may be playing with it later today

@pnquest More information: Using the same hardware with policies in place, but in _flow_ mode rather than _proxy_ mode, the call worked properly.

I did some more digging and it appears that these are the changes we are seeing break the code from 2.0 -> 2.1: https://github.com/dotnet/corefx/issues/28353

@dchw It is good to know that firewall change works. That environment variable that "solves" the issue essentially forces use of the .net core 2.0 HttpClient implementation so this 100% makes sense. I think it would probably be prudent to open an issue with the corefx repo if one does not already exist. Do you and @klaytaybai agree that makes the most sense at this point?

Actually I think dotnet/corefx#31423 may be the same issue. Do you agree?

@pnquest Similar issue as far as I can tell, but at least in the packet capture I did, the sequence was more of a request -> 100 Continue -> send binary data -> no response from server. Still worth poking in and seeing if it's related.

@pnquest, @dchw It does seem closely enough related that just adding a comment with your own experience and call stack is fine. If you do make a comment, maybe ask whether your situation warrants a new issue.

I'm not sure there is much work for AWS to do on this. Are you both okay with us closing out this issue? We can reopen at any time if you think that we should make any changes.

@klaytaybai Fine by me.

Thanks for dealing with us hashing it out in this issue even if only slightly related!

I agree that it is likely that the fix will be solely with corefx. Thank you for letting us work through that, @klaytaybai, and thank you as well @dchw!

@dchw I just found another corefx issue that is even closer to our own experience. It seems to suggest that this problem may be fixed in the recently released 2.1.5 version of the runtime. I am out of the office and won't be able to test till Monday, but just thought I would give you a heads up in case you are able to get to it before I am.

I was just able to test, and I am still seeing the issue when using the 2.1.5 runtime.

A simple fix for this problem: create a new class

```C#
private class PutObjectRequestNoContinue : PutObjectRequest
{
    protected override bool Expect100Continue => false;
}
```

and use it instead of PutObjectRequest.
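
To illustrate how that subclass might be used, here is a minimal sketch. The `Uploader` class and `UploadAsync` method are hypothetical names, and the bucket and key are the placeholders from the report. It assumes an SDK version in which `PutObjectRequest` exposes `Expect100Continue` as an overridable protected property, as the snippet above implies. Turning the flag off should drop the `Expect: 100-continue` handshake seen in the packet capture earlier in the thread, though whether that avoids the firewall problem in a given network is untested here.

```C#
using System.IO;
using System.Threading.Tasks;
using Amazon.S3;
using Amazon.S3.Model;

// Same override as above, declared public so it can live at namespace level.
public class PutObjectRequestNoContinue : PutObjectRequest
{
    protected override bool Expect100Continue => false;
}

public static class Uploader
{
    // Hypothetical helper: uploads a stream without the 100-continue handshake.
    public static Task<PutObjectResponse> UploadAsync(IAmazonS3 client, Stream content)
    {
        var request = new PutObjectRequestNoContinue
        {
            BucketName = "name",   // placeholder from the original report
            Key = "key",           // placeholder from the original report
            InputStream = content
        };

        // PutObjectAsync accepts any PutObjectRequest, including this subclass.
        return client.PutObjectAsync(request);
    }
}
```

Because the subclass only changes one flag, other properties such as `CannedACL` or the server-side encryption settings can be set on it exactly as on a plain `PutObjectRequest`.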
