Aws-sdk-java: SDK repeatedly complaining "Not all bytes were read from the S3ObjectInputStream"

Created on 27 Jun 2017 · 34Comments · Source: aws/aws-sdk-java

With a recent upgrade to the 1.11.134 SDK, tests seeking around a large CSV file is triggering a large set of repeated warnings about closing the stream early.

2017-06-27 15:47:05,121 [ScalaTest-main-running-S3ACSVReadSuite] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
2017-06-27 15:47:06,730 [ScalaTest-main-running-S3ACSVReadSuite] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.

full log

I've seen other people complaining about this, with the issue being closed as "user gets to fix their code", or similar. However, here's why I believe the system is overreacting: it is making invalid assumptions about the remaining amount of data & cost of read vs abort & reconnect, and it is also failing to note that it's previously warned of this.

Hadoop supports reading in files of tens to hundreds of GB, with the client code assuming its a Posix input stream where seek() is inefficient, but less inefficient than reading.

The first S3A release did call close(), rather than abort(), leading to HADOOP-11570: seeking being pathologically bad on a long input stream. What we now do, HADOOP-13047, is provide a tunable threshold for forward seeks where we skip bytes rather than seek. The default is 64K, for long-haul links a value of 512K works better. But above 512K, even over a long-haul connect, it is better to set up a new HTTPS connection than try and reuse an existing HTTP/1.1 connection. Which is what we do.

only now, every time it happens a message appears in the log, "This is likely an error". It's not, its exactly what we want to do based on our benchmarking of IO performance. We do have a faster IO mechanism when users explicitly want random access, but as thats pathological on non-seeking file reads, it's not on by default.

I'm covering this in HADOOP-14596; I think we'll end up configuring log4j so that, even in production clusters, warning messages from S3AbortableInputStream are not logged. This is a bit dangerous.

Here are some ways which our logs could be improved without having to be so drastic.

Look at the amount of remaining data before warning of suboptimal performance. If it's more than some threshold (64K? 128K?) then don't complain. Instead consider that the caller has made the optimal choice.
Remember that it's been warned once & not bother repeating it on every single seek() call.

thanks.

guidance

Source

steveloughran

👍2

Most helpful comment

Any update? still encountered this issue.

innobead on 2 May 2018

👍11

All 34 comments

related to #1111, #1203. Also surfacing in IMPALA-5387.

steveloughran on 27 Jun 2017

Is it possible to explicitly call abort if it's the optimal choice for the circumstances? If you call abort then the warning message is not logged, and for cases where you do want to reuse the connection you can just drain the stream.

BTW close when there are bytes left in the stream does abort the underlying connection. This was a change we made in 1.11.x.

shorea on 27 Jun 2017

OK, so that's the big change...it used to stream through but now it aborts. That's a major change. We can fix that. Will a skip(EOF-pos) do it?

steveloughran on 27 Jun 2017

@steveloughran Yes, that should be sufficient to drain the stream.

dagnir on 27 Jun 2017

OK, We'll do that, plus also

add count of bytes remaining in final stream close to the instrumentation we track per stream (and coalesce into the whole FS instance after)
add in our troubleshooting what to do if you see the log message.
It'd still be good if it only got logged once though.

steveloughran on 27 Jun 2017

Current status: if we drain the stream a lot of the messages go away, but we still seem to get told off on an abort() call, even though that's an explicit request to break the connection

testRandomIONormalPolicy: Random IO with policy "normal"

2017-06-28 15:24:21,013 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:reopen(148)) - reopen(s3a://landsat-pds/scene_list.gz) for read from new offset range[2097152-45603307], length=131072, streamPosition=0, nextReadPosition=2097152
2017-06-28 15:24:21,895 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(456)) - Closing stream seekInStream(): soft
2017-06-28 15:24:21,896 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(486)) - Aborting stream
2017-06-28 15:24:21,896 [JUnit-testRandomIONormalPolicy] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
2017-06-28 15:24:21,897 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(490)) - Stream s3a://landsat-pds/scene_list.gz aborted: seekInStream(); remaining=43375083 streamPos=2228224, nextReadPos=131072, request range 2097152-45603307 length=45603307
2017-06-28 15:24:21,897 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:reopen(148)) - reopen(s3a://landsat-pds/scene_list.gz) for read from new offset range[131072-45603307], length=131072, streamPosition=131072, nextReadPosition=131072
2017-06-28 15:24:23,227 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(456)) - Closing stream seekInStream(): soft
2017-06-28 15:24:23,227 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(486)) - Aborting stream
2017-06-28 15:24:23,228 [JUnit-testRandomIONormalPolicy] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
2017-06-28 15:24:23,229 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(490)) - Stream s3a://landsat-pds/scene_list.gz aborted: seekInStream(); remaining=45341163 streamPos=262144, nextReadPos=5242880, request range 131072-45603307 length=45603307
2017-06-28 15:24:23,229 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:reopen(148)) - reopen(s3a://landsat-pds/scene_list.gz) for read from new offset range[5242880-45603307], length=65536, streamPosition=5242880, nextReadPosition=5242880
2017-06-28 15:24:24,226 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(456)) - Closing stream seekInStream(): soft
2017-06-28 15:24:24,228 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(486)) - Aborting stream
2017-06-28 15:24:24,228 [JUnit-testRandomIONormalPolicy] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
2017-06-28 15:24:24,229 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(490)) - Stream s3a://landsat-pds/scene_list.gz aborted: seekInStream(); remaining=40294891 streamPos=5308416, nextReadPos=1048576, request range 5242880-45603307 length=45603307
2017-06-28 15:24:24,229 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:reopen(148)) - reopen(s3a://landsat-pds/scene_list.gz) for read from new offset range[1048576-45603307], length=1048576, streamPosition=1048576, nextReadPosition=1048576
2017-06-28 15:24:26,310 [JUnit-testRandomIONormalPolicy] INFO  contract.ContractTestUtils (ContractTestUtils.java:end(1511)) - Duration of Time to execute 4 reads of total size 1376256 bytes: 5,293,705,537 nS
2017-06-28 15:24:26,311 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(456)) - Closing stream close() operation: soft
2017-06-28 15:24:26,311 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(486)) - Aborting stream
2017-06-28 15:24:26,311 [JUnit-testRandomIONormalPolicy] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
2017-06-28 15:24:26,311 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(490)) - Stream s3a://landsat-pds/scene_list.gz aborted: close() operation; remaining=43506155 streamPos=2097152, nextReadPos=0, request range 1048576-45603307 length=45603307
2017-06-28 15:24:26,312 [JUnit-testRandomIONormalPolicy] INFO  scale.ITestS3AInputStreamPerformance (ITestS3AInputStreamPerformance.java:logTimePerIOP(203)) - Time per byte read: 3,846 nS
2017-06-28 15:24:26,312 [JUnit-testRandomIONormalPolicy] INFO  scale.ITestS3AInputStreamPerformance (ITestS3AInputStreamPerformance.java:executeRandomIO(464)) - Effective bandwidth 0.247936 MB/S
2017-06-28 15:24:26,313 [JUnit-testRandomIONormalPolicy] INFO  scale.ITestS3AInputStreamPerformance (ITestS3AInputStreamPerformance.java:logStreamStatistics(365)) - Stream Statistics
StreamStatistics{OpenOperations=4, CloseOperations=4, Closed=0, Aborted=4, SeekOperations=2, ReadExceptions=0, ForwardSeekOperations=0, BackwardSeekOperations=2, BytesSkippedOnSeek=0, BytesBackwardsOnSeek=6356992, BytesRead=1376256, BytesRead excluding skipped=1376256, ReadOperations=159, ReadFullyOperations=4, ReadsIncomplete=155, BytesReadInClose=0, BytesDiscardedInAbort=172517292}
2017-06-28 15:24:26,313 [JUnit-testRandomIONormalPolicy] INFO  contract.AbstractFSContractTestBase (AbstractFSContractTestBase.java:describe(255)) - cleanup
2017-06-28 15:24:26,545 [JUnit-testRandomIONormalPolicy] INFO  contract.AbstractFSContractTestBase (AbstractFSContractTestBase.java:describe(255)) - closing file system

steveloughran on 28 Jun 2017

Here's what I suspect is happening. We've invoked abort(), but a stream close() is still being called (somewhere) and that's triggering the warning, even though the stream is now aborted & shouldn't be considered live.

I think maybe you should see if the operations in 'abort()` are complete (set the http channel to null? the input stream? & make sure its in sync with the expectations of the warning code in close().

There's nothing else we can do in our code. I'm draining the stream on a close when the remaining bytes is in our configured skip range.

(there's one flaw in my hypothesis: after we call 'abort(), we *dont* callclose()`.

steveloughran on 28 Jun 2017

It's happening in your abort call

    @Override
    public void abort() {
        if (httpRequest != null) {
            httpRequest.abort();
        }
        IOUtils.closeQuietly(in, null);    // **HERE**
    }

think you should catch the special case "this is an abort so don't worry about it" & not log.

steveloughran on 28 Jun 2017

Ah I see, yeah we can fix that. Thanks for reporting.

shorea on 28 Jun 2017

👍1

This was fixed in 1.11.163. Going to go ahead and close this.

dagnir on 20 Jul 2017

👍3

I'm still getting this error in 1.11.182. I'm using AWS-SDK as the mechanism to access S3 within Apache Spark. Has anyone else seen this issue? Spark is pulling down a 640Kb test file and throwing up this error.

17/08/25 23:11:32 WARN S3AbortableInputStream: Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.

dado0583 on 26 Aug 2017

you will see this at the tail of reads in S3A if the version of hadoop you are using doesn't have HADOOP-14596 in to drain the stream at the end of any non-aborted cloeStream() call.

Which version of the Hadoop JARs are on your CP? There have been no ASF releases with this patch (indeed, no ASF releases built & tested against 1.11.x, AFAIK)

steveloughran on 26 Aug 2017

I'm still seeing this issue, I call s3Object.getObjectContent.abort(), but I get the warning on "com.amazonaws" % "aws-java-sdk-s3" % "1.11.200". It appears to be coming exactly from the place @steveloughran pointed out above.

mirosval on 22 Sep 2017

👍1

Is the call to abort throwing an exception. Looks like we set the flag after we try to abort so if it fails we may still log the warning.

shorea on 22 Sep 2017

If I apply the HADOOP-14890 patch to update the AWS SDK for Hadoop, the message goes away from hadoop-3. That means the abort() path is silent; we're making sure we drain the input stream when we want to stop reading before the end of the stream, but not actually force an abort. I'm happy...just need to get the SDK update in.

steveloughran on 22 Sep 2017

@shorea: The error might be here:

https://github.com/aws/aws-sdk-java/blob/293156398cfc79b2beb29b98e84147942f165d4c/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/model/S3ObjectInputStream.java#L94

The comment says "The default abort() implementation calls abort on the wrapped stream if it's an SdkFilterInputStream; otherwise we'll need to close the stream." but the code doesn't actually do that -- it calls close() if the wrapped stream is not a SdkFilterInputStream, and does nothing if the wrapped stream is a SdkFilterInputStream.

cc: @dhruba

tudor on 25 Sep 2017

The S3ObjectInputStream class extends SdkFilterInputStream and the abort() in S3ObjectInputStream performs super.abort() which is the default abort() implementation that aborts the wrapped stream as described.

tnorthrup on 23 Oct 2017

I'm in 1.11.221, and I'm seeing the IOUtils.closeQuietly at the end of S3ObjectInputStream.abort printing this warning. The "aborted" flag is just not getting set correctly, it's hard to track what the intention was to get it set though.

davidvandebunte on 2 Nov 2017

@davidvandebunte would you happen to have a reproducible code sample for this problem?

shorea on 2 Nov 2017

@shorea Try HelloWorld.java.txt

davidvandebunte on 3 Nov 2017

Also experiencing this with version 1.11.215:

// try-with-resources auto-close adds a 2nd WARN to log
try (final S3ObjectInputStream is = amazonS3.getObject(new GetObjectRequest(bucketName,
                    keyName)).getObjectContent()){
            myService.process(is); // read most of the input stream, but not all -it's a ZIP archive and I read 1 file inside the ZIP archive (which is ~99% of the file in my case)
            is.abort(); // adds a 3rd WARN to the log
        }

So by closing the input stream I get a 2nd warning to the log. And then by explicitly calling abort I get a 3rd WARN message to the log.

Workaround will be to consume rest of input stream using another library but there ought to be a way to suppress this warning. Also it would very helpful if the WARN message included a stack trace or some other meta data as it took a while to figure out what code was generating this warn message.

jhovell on 7 Dec 2017

@jhovell if you look @ what we do in Hadoop's S3A connector, if we aren't that far from the EOF, we just skip() to the end. Further away (which means a few hundred KB) and we call the abort, accept that the error text can get printed & not worry about it. I think we may have some coverage in our troubleshooting docs...if not something should go in.

FWIW, libraries trying to be "helpful" by downgrading close() to read-to-EOF, and abort to close() in order to improve HTTP connection pooling are a recurrent problem with Hadoop object store connectors. When you have a 16GB file and want to stop reading 32 MB in, abort() is absolutely the correct strategy, and we know exactly what we are doing, at least until HTTP/2 becomes the transport

steveloughran on 7 Dec 2017

@all has there been any progress on this ticket yet?

okulkarni-chwy on 27 Feb 2018

Any update? still encountered this issue.

innobead on 2 May 2018

👍11

Fixed for me in hadoop-aws 2.9.0 and later.

tekumara on 5 Jun 2018

yes, fixed on 2.9 and 3.0 with the AWS update of HADOOP-14890

steveloughran on 5 Jun 2018

I'm getting this warning when aborting on 1.11.355. I think it's a regression or incomplete fix. I filed a new bug: https://github.com/aws/aws-sdk-java/issues/1657

atorstling on 28 Jun 2018

Still happening on 1.11.368

metabrain on 19 Jul 2018

@metabrain we have a small fix per https://github.com/aws/aws-sdk-java/issues/1657. This should be released today.

shorea on 19 Jul 2018

Hi,

I still got the following message:

18/07/31 18:27:09 WARN S3AbortableInputStream: Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
18/07/31 18:27:14 WARN S3AbortableInputStream: Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.

I only checked for a sample, but the tasks producing the logs containing those messages were also the ones which only complete after an unreasonable amount of time.

PowerToThePeople111 on 31 Jul 2018

👍2

1.11.435 happens too

echokk11 on 3 Dec 2018

For people like @PowerToThePeople111 who are seeing this w/ S3A, download cloudstore, run storediag with the -j option to see what JARs are actually on your CP. We shouldn't see this on recent Hadoop releases, e.g 3.1+.

steveloughran on 3 Dec 2018

Experiencing the same in 1.11.595

WARN S3AbortableInputStream:178 - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.

naveen09 on 7 Oct 2019

@naveen09 -you need to explicitly drain the bytes in the stream, call abort() or do a ranged get. It should not be showing this message on an explicit call to abort()

steveloughran on 14 Oct 2019

Was this page helpful?

0 / 5 - 0 ratings