Aws-sdk-java: SDK repeatedly complaining "Not all bytes were read from the S3ObjectInputStream"

Created on 27 Jun 2017  路  34Comments  路  Source: aws/aws-sdk-java

With a recent upgrade to the 1.11.134 SDK, tests seeking around a large CSV file is triggering a large set of repeated warnings about closing the stream early.

2017-06-27 15:47:05,121 [ScalaTest-main-running-S3ACSVReadSuite] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
2017-06-27 15:47:06,730 [ScalaTest-main-running-S3ACSVReadSuite] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.

full log

I've seen other people complaining about this, with the issue being closed as "user gets to fix their code", or similar. However, here's why I believe the system is overreacting: it is making invalid assumptions about the remaining amount of data & cost of read vs abort & reconnect, and it is also failing to note that it's previously warned of this.

Hadoop supports reading in files of tens to hundreds of GB, with the client code assuming its a Posix input stream where seek() is inefficient, but less inefficient than reading.

The first S3A release did call close(), rather than abort(), leading to HADOOP-11570: seeking being pathologically bad on a long input stream. What we now do, HADOOP-13047, is provide a tunable threshold for forward seeks where we skip bytes rather than seek. The default is 64K, for long-haul links a value of 512K works better. But above 512K, even over a long-haul connect, it is better to set up a new HTTPS connection than try and reuse an existing HTTP/1.1 connection. Which is what we do.

only now, every time it happens a message appears in the log, "This is likely an error". It's not, its exactly what we want to do based on our benchmarking of IO performance. We do have a faster IO mechanism when users explicitly want random access, but as thats pathological on non-seeking file reads, it's not on by default.

I'm covering this in HADOOP-14596; I think we'll end up configuring log4j so that, even in production clusters, warning messages from S3AbortableInputStream are not logged. This is a bit dangerous.

Here are some ways which our logs could be improved without having to be so drastic.

  1. Look at the amount of remaining data before warning of suboptimal performance. If it's more than some threshold (64K? 128K?) then don't complain. Instead consider that the caller has made the optimal choice.
  2. Remember that it's been warned once & not bother repeating it on every single seek() call.

thanks.

guidance

Most helpful comment

Any update? still encountered this issue.

All 34 comments

related to #1111, #1203. Also surfacing in IMPALA-5387.

Is it possible to explicitly call abort if it's the optimal choice for the circumstances? If you call abort then the warning message is not logged, and for cases where you do want to reuse the connection you can just drain the stream.

BTW close when there are bytes left in the stream does abort the underlying connection. This was a change we made in 1.11.x.

OK, so that's the big change...it used to stream through but now it aborts. That's a major change. We can fix that. Will a skip(EOF-pos) do it?

@steveloughran Yes, that should be sufficient to drain the stream.

OK, We'll do that, plus also

  • add count of bytes remaining in final stream close to the instrumentation we track per stream (and coalesce into the whole FS instance after)
  • add in our troubleshooting what to do if you see the log message.
    It'd still be good if it only got logged once though.

Current status: if we drain the stream a lot of the messages go away, but we still seem to get told off on an abort() call, even though that's an explicit request to break the connection

testRandomIONormalPolicy: Random IO with policy "normal"

2017-06-28 15:24:21,013 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:reopen(148)) - reopen(s3a://landsat-pds/scene_list.gz) for read from new offset range[2097152-45603307], length=131072, streamPosition=0, nextReadPosition=2097152
2017-06-28 15:24:21,895 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(456)) - Closing stream seekInStream(): soft
2017-06-28 15:24:21,896 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(486)) - Aborting stream
2017-06-28 15:24:21,896 [JUnit-testRandomIONormalPolicy] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
2017-06-28 15:24:21,897 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(490)) - Stream s3a://landsat-pds/scene_list.gz aborted: seekInStream(); remaining=43375083 streamPos=2228224, nextReadPos=131072, request range 2097152-45603307 length=45603307
2017-06-28 15:24:21,897 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:reopen(148)) - reopen(s3a://landsat-pds/scene_list.gz) for read from new offset range[131072-45603307], length=131072, streamPosition=131072, nextReadPosition=131072
2017-06-28 15:24:23,227 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(456)) - Closing stream seekInStream(): soft
2017-06-28 15:24:23,227 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(486)) - Aborting stream
2017-06-28 15:24:23,228 [JUnit-testRandomIONormalPolicy] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
2017-06-28 15:24:23,229 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(490)) - Stream s3a://landsat-pds/scene_list.gz aborted: seekInStream(); remaining=45341163 streamPos=262144, nextReadPos=5242880, request range 131072-45603307 length=45603307
2017-06-28 15:24:23,229 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:reopen(148)) - reopen(s3a://landsat-pds/scene_list.gz) for read from new offset range[5242880-45603307], length=65536, streamPosition=5242880, nextReadPosition=5242880
2017-06-28 15:24:24,226 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(456)) - Closing stream seekInStream(): soft
2017-06-28 15:24:24,228 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(486)) - Aborting stream
2017-06-28 15:24:24,228 [JUnit-testRandomIONormalPolicy] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
2017-06-28 15:24:24,229 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(490)) - Stream s3a://landsat-pds/scene_list.gz aborted: seekInStream(); remaining=40294891 streamPos=5308416, nextReadPos=1048576, request range 5242880-45603307 length=45603307
2017-06-28 15:24:24,229 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:reopen(148)) - reopen(s3a://landsat-pds/scene_list.gz) for read from new offset range[1048576-45603307], length=1048576, streamPosition=1048576, nextReadPosition=1048576
2017-06-28 15:24:26,310 [JUnit-testRandomIONormalPolicy] INFO  contract.ContractTestUtils (ContractTestUtils.java:end(1511)) - Duration of Time to execute 4 reads of total size 1376256 bytes: 5,293,705,537 nS
2017-06-28 15:24:26,311 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(456)) - Closing stream close() operation: soft
2017-06-28 15:24:26,311 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(486)) - Aborting stream
2017-06-28 15:24:26,311 [JUnit-testRandomIONormalPolicy] WARN  internal.S3AbortableInputStream (S3AbortableInputStream.java:close(163)) - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
2017-06-28 15:24:26,311 [JUnit-testRandomIONormalPolicy] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(490)) - Stream s3a://landsat-pds/scene_list.gz aborted: close() operation; remaining=43506155 streamPos=2097152, nextReadPos=0, request range 1048576-45603307 length=45603307
2017-06-28 15:24:26,312 [JUnit-testRandomIONormalPolicy] INFO  scale.ITestS3AInputStreamPerformance (ITestS3AInputStreamPerformance.java:logTimePerIOP(203)) - Time per byte read: 3,846 nS
2017-06-28 15:24:26,312 [JUnit-testRandomIONormalPolicy] INFO  scale.ITestS3AInputStreamPerformance (ITestS3AInputStreamPerformance.java:executeRandomIO(464)) - Effective bandwidth 0.247936 MB/S
2017-06-28 15:24:26,313 [JUnit-testRandomIONormalPolicy] INFO  scale.ITestS3AInputStreamPerformance (ITestS3AInputStreamPerformance.java:logStreamStatistics(365)) - Stream Statistics
StreamStatistics{OpenOperations=4, CloseOperations=4, Closed=0, Aborted=4, SeekOperations=2, ReadExceptions=0, ForwardSeekOperations=0, BackwardSeekOperations=2, BytesSkippedOnSeek=0, BytesBackwardsOnSeek=6356992, BytesRead=1376256, BytesRead excluding skipped=1376256, ReadOperations=159, ReadFullyOperations=4, ReadsIncomplete=155, BytesReadInClose=0, BytesDiscardedInAbort=172517292}
2017-06-28 15:24:26,313 [JUnit-testRandomIONormalPolicy] INFO  contract.AbstractFSContractTestBase (AbstractFSContractTestBase.java:describe(255)) - cleanup
2017-06-28 15:24:26,545 [JUnit-testRandomIONormalPolicy] INFO  contract.AbstractFSContractTestBase (AbstractFSContractTestBase.java:describe(255)) - closing file system

Here's what I suspect is happening. We've invoked abort(), but a stream close() is still being called (somewhere) and that's triggering the warning, even though the stream is now aborted & shouldn't be considered live.

I think maybe you should see if the operations in 'abort()` are complete (set the http channel to null? the input stream? & make sure its in sync with the expectations of the warning code in close().

There's nothing else we can do in our code. I'm draining the stream on a close when the remaining bytes is in our configured skip range.

(there's one flaw in my hypothesis: after we call 'abort(), we *dont* callclose()`.

It's happening in your abort call

    @Override
    public void abort() {
        if (httpRequest != null) {
            httpRequest.abort();
        }
        IOUtils.closeQuietly(in, null);    // **HERE**
    }

think you should catch the special case "this is an abort so don't worry about it" & not log.

Ah I see, yeah we can fix that. Thanks for reporting.

This was fixed in 1.11.163. Going to go ahead and close this.

I'm still getting this error in 1.11.182. I'm using AWS-SDK as the mechanism to access S3 within Apache Spark. Has anyone else seen this issue? Spark is pulling down a 640Kb test file and throwing up this error.

17/08/25 23:11:32 WARN S3AbortableInputStream: Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.

you will see this at the tail of reads in S3A if the version of hadoop you are using doesn't have HADOOP-14596 in to drain the stream at the end of any non-aborted cloeStream() call.

Which version of the Hadoop JARs are on your CP? There have been no ASF releases with this patch (indeed, no ASF releases built & tested against 1.11.x, AFAIK)

I'm still seeing this issue, I call s3Object.getObjectContent.abort(), but I get the warning on "com.amazonaws" % "aws-java-sdk-s3" % "1.11.200". It appears to be coming exactly from the place @steveloughran pointed out above.

Is the call to abort throwing an exception. Looks like we set the flag after we try to abort so if it fails we may still log the warning.

If I apply the HADOOP-14890 patch to update the AWS SDK for Hadoop, the message goes away from hadoop-3. That means the abort() path is silent; we're making sure we drain the input stream when we want to stop reading before the end of the stream, but not actually force an abort. I'm happy...just need to get the SDK update in.

@shorea: The error might be here:

https://github.com/aws/aws-sdk-java/blob/293156398cfc79b2beb29b98e84147942f165d4c/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/model/S3ObjectInputStream.java#L94

The comment says "The default abort() implementation calls abort on the wrapped stream if it's an SdkFilterInputStream; otherwise we'll need to close the stream." but the code doesn't actually do that -- it calls close() if the wrapped stream is not a SdkFilterInputStream, and does nothing if the wrapped stream is a SdkFilterInputStream.

cc: @dhruba

The S3ObjectInputStream class extends SdkFilterInputStream and the abort() in S3ObjectInputStream performs super.abort() which is the default abort() implementation that aborts the wrapped stream as described.

I'm in 1.11.221, and I'm seeing the IOUtils.closeQuietly at the end of S3ObjectInputStream.abort printing this warning. The "aborted" flag is just not getting set correctly, it's hard to track what the intention was to get it set though.

@davidvandebunte would you happen to have a reproducible code sample for this problem?

@shorea Try HelloWorld.java.txt

Also experiencing this with version 1.11.215:

// try-with-resources auto-close adds a 2nd WARN to log
try (final S3ObjectInputStream is = amazonS3.getObject(new GetObjectRequest(bucketName,
                    keyName)).getObjectContent()){
            myService.process(is); // read most of the input stream, but not all -it's a ZIP archive and I read 1 file inside the ZIP archive (which is ~99% of the file in my case)
            is.abort(); // adds a 3rd WARN to the log
        }

So by closing the input stream I get a 2nd warning to the log. And then by explicitly calling abort I get a 3rd WARN message to the log.

Workaround will be to consume rest of input stream using another library but there ought to be a way to suppress this warning. Also it would very helpful if the WARN message included a stack trace or some other meta data as it took a while to figure out what code was generating this warn message.

@jhovell if you look @ what we do in Hadoop's S3A connector, if we aren't that far from the EOF, we just skip() to the end. Further away (which means a few hundred KB) and we call the abort, accept that the error text can get printed & not worry about it. I think we may have some coverage in our troubleshooting docs...if not something should go in.

FWIW, libraries trying to be "helpful" by downgrading close() to read-to-EOF, and abort to close() in order to improve HTTP connection pooling are a recurrent problem with Hadoop object store connectors. When you have a 16GB file and want to stop reading 32 MB in, abort() is absolutely the correct strategy, and we know exactly what we are doing, at least until HTTP/2 becomes the transport

@all has there been any progress on this ticket yet?

Any update? still encountered this issue.

Fixed for me in hadoop-aws 2.9.0 and later.

yes, fixed on 2.9 and 3.0 with the AWS update of HADOOP-14890

I'm getting this warning when aborting on 1.11.355. I think it's a regression or incomplete fix. I filed a new bug: https://github.com/aws/aws-sdk-java/issues/1657

Still happening on 1.11.368

@metabrain we have a small fix per https://github.com/aws/aws-sdk-java/issues/1657. This should be released today.

Hi,

I still got the following message:

18/07/31 18:27:09 WARN S3AbortableInputStream: Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.
18/07/31 18:27:14 WARN S3AbortableInputStream: Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.

I only checked for a sample, but the tasks producing the logs containing those messages were also the ones which only complete after an unreasonable amount of time.

1.11.435 happens too

For people like @PowerToThePeople111 who are seeing this w/ S3A, download cloudstore, run storediag with the -j option to see what JARs are actually on your CP. We shouldn't see this on recent Hadoop releases, e.g 3.1+.

Experiencing the same in 1.11.595

WARN S3AbortableInputStream:178 - Not all bytes were read from the S3ObjectInputStream, aborting HTTP connection. This is likely an error and may result in sub-optimal behavior. Request only the bytes you need via a ranged GET or drain the input stream after use.

@naveen09 -you need to explicitly drain the bytes in the stream, call abort() or do a ranged get. It should not be showing this message on an explicit call to abort()

Was this page helpful?
0 / 5 - 0 ratings