Aws-sdk-java: Gzip compression doesn't work with DynamoDB (crc32 errors)

Created on 5 Oct 2015 · 49 Comments · Source: aws/aws-sdk-java

When gzip is enabled, I sometimes get an exception:

Caused by: com.amazonaws.internal.CRC32MismatchException: Client calculated crc32 checksum didn't match that calculated by server side
    at com.amazonaws.http.JsonResponseHandler.handle(JsonResponseHandler.java:112)
    at com.amazonaws.http.JsonResponseHandler.handle(JsonResponseHandler.java:42)
    at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:1072)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:746)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:489)

The Android SDK fixed that bug. See:
https://github.com/aws/aws-sdk-android/pull/40
and the actual fix:
https://github.com/aws/aws-sdk-android/commit/b03c51e6fd413b885de513443600671ecd2cce3d

Please also fix it in the Java SDK.

service-api

All 49 comments

Hi @aws team,

We are also impacted by this bug. Please fix this :+1:

I've also noticed this while trying to use it with Kinesis streaming. Since the easiest way to set up the KCL is with a common client for Kinesis, DynamoDB and CloudWatch, it means that you need to disable Gzip compression for all three of those, which is not ideal.

2015-12-22_10:43:25.44544 Dec 22, 2015 10:43:25 AM com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardSyncTask call
2015-12-22_10:43:25.44546 SEVERE: Caught exception while sync'ing Kinesis shards and leases
2015-12-22_10:43:25.44547 com.amazonaws.services.kinesis.leases.exceptions.DependencyException: com.amazonaws.AmazonClientException: Unable to execute HTTP request: Client calculated crc32 checksum didn't match that calculated by server side
2015-12-22_10:43:25.44548   at com.amazonaws.services.kinesis.leases.impl.LeaseManager.list(LeaseManager.java:268)
2015-12-22_10:43:25.44549   at com.amazonaws.services.kinesis.leases.impl.LeaseManager.listLeases(LeaseManager.java:203)
2015-12-22_10:43:25.44549   at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardSyncer.syncShardLeases(ShardSyncer.java:119)
2015-12-22_10:43:25.44550   at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardSyncer.checkAndCreateLeasesForNewShards(ShardSyncer.java:88)
2015-12-22_10:43:25.44551   at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShardSyncTask.call(ShardSyncTask.java:68)
2015-12-22_10:43:25.44551   at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:49)
2015-12-22_10:43:25.44553   at com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker.initialize(Worker.java:383)
2015-12-22_10:43:25.44553   at com.amazonaws.services.kinesis.clientlibrary.lib.worker.Worker.run(Worker.java:318)
2015-12-22_10:43:25.44554   at com.X.Y.Z.main(Z.java:137)
2015-12-22_10:43:25.44555 Caused by: com.amazonaws.AmazonClientException: Unable to execute HTTP request: Client calculated crc32 checksum didn't match that calculated by server side
2015-12-22_10:43:25.44557   at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:587)
2015-12-22_10:43:25.44568   at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:362)
2015-12-22_10:43:25.44569   at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:328)
2015-12-22_10:43:25.44570   at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:307)
2015-12-22_10:43:25.44570   at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:1805)
2015-12-22_10:43:25.44571   at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.scan(AmazonDynamoDBClient.java:1525)
2015-12-22_10:43:25.44572   at com.amazonaws.services.kinesis.leases.impl.LeaseManager.list(LeaseManager.java:227)
2015-12-22_10:43:25.44573   ... 8 more
2015-12-22_10:43:25.44575 Caused by: com.amazonaws.internal.CRC32MismatchException: Client calculated crc32 checksum didn't match that calculated by server side
2015-12-22_10:43:25.44576   at com.amazonaws.http.JsonResponseHandler.handle(JsonResponseHandler.java:113)
2015-12-22_10:43:25.44576   at com.amazonaws.http.JsonResponseHandler.handle(JsonResponseHandler.java:42)
2015-12-22_10:43:25.44577   at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:1142)
2015-12-22_10:43:25.44589   at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:853)
2015-12-22_10:43:25.44592   at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:576)
2015-12-22_10:43:25.44594   ... 14 more

It looks like a bug in the Java SDK; we will fix that. Thanks for the report.

Hello @jakozaur, can you give us a specific test case? I am working on fixing this.

@zhangzhx I'm hitting it every time I enable Gzip compression and then try to use the DynamoDB API.

I just created a DynamoDB client:

new AmazonDynamoDBClient(credentials, new ClientConfiguration().withGzip(true))

and then every network call fails.

https://github.com/aws/aws-sdk-js/issues/405 is probably relevant to your interests. DynamoDB is returning a checksum of the compressed bytes, which many HTTP clients transparently decompress for you making the checksum "wrong". Ideally they would switch to returning a checksum over the uncompressed bytes, but until that happens the only real option is to disable checksum validation when you might get a gzipped response. :(

Oh bother, didn't see the linked Android SDK issue. It's unfortunate that they fixed this client-side and are now depending on the checksum matching the compressed bytes, as that pretty much closes the door on a fix coming from the server-side. If you have a reasonable way of getting the compressed bytes out of the Apache HttpClient, you may as well follow suit and verify the checksum on that.
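The mismatch described above can be reproduced without any AWS code. The following is a minimal sketch using only `java.util.zip`: it computes one CRC32 over a gzip-compressed payload (standing in for what the server is said to checksum) and another over the same payload after decompression (what a client sees when its HTTP library decompresses transparently). The class name, helper names, and sample JSON body are made up for illustration and do not come from the SDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustration only: none of these names come from the AWS SDK.
class Crc32GzipMismatchDemo {

    // CRC32 of a byte array, returned as an unsigned 32-bit value in a long.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // Gzip-compresses the given bytes in memory.
    static byte[] gzip(byte[] plain) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(plain);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decompresses gzip bytes in memory.
    static byte[] gunzip(byte[] compressed) {
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical response body.
        byte[] body = "{\"Item\":{\"id\":{\"S\":\"example\"}}}".getBytes(StandardCharsets.UTF_8);
        byte[] wire = gzip(body);               // bytes as they would travel over the network
        long serverCrc = crc32(wire);           // checksum over the compressed bytes
        long clientCrc = crc32(gunzip(wire));   // checksum after transparent decompression
        System.out.println("crc32 over compressed bytes:   " + serverCrc);
        System.out.println("crc32 over decompressed bytes: " + clientCrc);
        System.out.println("equal: " + (serverCrc == clientCrc));
    }
}
```

The two values are checksums of different byte sequences, so a client that validates against the decompressed body will report a mismatch even though nothing was corrupted in transit.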

Thanks @fernomac for the explanation.

I will reach out to the DynamoDB server team and figure out a safe way to fix this without breaking Android as well. Maybe having them send both CRCs, one for the uncompressed payload and one for the compressed payload, would be an option.

@zhangzhx, could you please let us know if there is any update? Thank you in advance.

Also hitting this. Any status on a fix?

Sorry for the long wait. We asked the service team, and they are not planning to change the checksum to cover the uncompressed data. We will soon migrate our underlying Apache HttpClient from 4.3.6 to the newer 4.5. Once that is done, we will fix this issue in the new HttpClient by disabling automatic decompression of response content for DynamoDB. I will let you know once that happens.
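Once automatic decompression is switched off, the client has to do two things itself: checksum the bytes exactly as they arrive, and decompress them for the JSON parser. A rough sketch of that layering, using only `java.util.zip`, is below. It is not the SDK's implementation, and all class and method names are hypothetical. (In Apache HttpClient, automatic decompression is normally turned off with `HttpClientBuilder.disableContentCompression()`; the thread does not say whether the SDK uses that call.)

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical names throughout; this is not code from the AWS SDK.
class WireChecksumDemo {

    // Decompressed body plus the CRC32 of the bytes as they arrived.
    static final class Result {
        final byte[] body;
        final long wireCrc;

        Result(byte[] body, long wireCrc) {
            this.body = body;
            this.wireCrc = wireCrc;
        }
    }

    static Result readGzipped(InputStream wire) {
        // The CRC stream sits directly on the wire bytes, below decompression.
        CheckedInputStream checked = new CheckedInputStream(wire, new CRC32());
        try (GZIPInputStream gz = new GZIPInputStream(checked)) {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                body.write(buf, 0, n);
            }
            // Drain anything the gzip reader left unread so the CRC covers the whole payload.
            while (checked.read(buf) != -1) {
                // discard
            }
            return new Result(body.toByteArray(), checked.getChecksum().getValue());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Test helper: gzip-compresses bytes in memory.
    static byte[] gzip(byte[] plain) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
                gz.write(plain);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Test helper: CRC32 of a whole byte array.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] plain = "{\"Count\":0,\"Items\":[]}".getBytes(StandardCharsets.UTF_8);
        byte[] wire = gzip(plain);
        Result r = readGzipped(new ByteArrayInputStream(wire));
        System.out.println("body: " + new String(r.body, StandardCharsets.UTF_8));
        System.out.println("wire checksum matches: " + (r.wireCrc == crc32(wire)));
    }
}
```

The ordering of the wrappers is the point of the sketch: the checksum stream must be the innermost wrapper, so that it sees the compressed bytes rather than the inflated ones.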

+1 :)

We have fixed this bug in our latest release, 1.11.20. Please try the new version and let us know if you have further questions. Sorry for the long wait for the bug fix.

I think you may actually have introduced a bigger bug. Out of the blue, once we updated to 1.11.20, we started seeing this issue (when previously we were not).

Hi @theRealDrumBum, can you give me a detailed description of the issue you are seeing?

It's basically the same issue reported above, except that it was introduced with 1.11.20.

stack-trace.txt

Can you give me some sample code to reproduce this bug? Thanks.

Any details about how you've configured the client would be very helpful as well.

I can help out, we are having the same issue. Version 1.11.21 in a spring-boot 1.4.rc1 project.

In a Spring app:

@Configuration
public class AwsConfig {

    @Bean
    public AWSCredentialsProvider awsCredentialsProvider() {
        return new DefaultAWSCredentialsProviderChain();
    }

    @Bean
    public AmazonDynamoDBAsyncClient client(final AWSCredentialsProvider credentialsProvider,
                                            final DynamoDBConfig dynamoDBConfig) {
        final ClientConfiguration configuration = new ClientConfiguration();
        configuration.setRequestTimeout(dynamoDBConfig.getTimeoutInMillis());
        configuration.setClientExecutionTimeout(dynamoDBConfig.getTimeoutInMillis());
        final AmazonDynamoDBAsyncClient client = new AmazonDynamoDBAsyncClient(credentialsProvider, configuration);
        client.setRegion(Region.getRegion(Regions.fromName(dynamoDBConfig.getRegion())));
        return client;
    }
}

And somewhere else, the exception mentioned above is thrown when executing:

GetItemResult getItemRes = client.getItem(
        new GetItemRequest(settingsTable,
                ImmutableMap.of(SETTINGS_ID_KEY, new AttributeValue().withS(SETTINGS_NEWER_URL_ID))));

Exception:

com.amazonaws.AmazonClientException: Unable to execute HTTP request: Client calculated crc32 checksum didn't match that calculated by server side
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:735)
    at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:475)
    at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:437)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:386)
    at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:2074)
    at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:2044)
    at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.getItem(AmazonDynamoDBClient.java:1392)
Caused by: com.amazonaws.internal.CRC32MismatchException: Client calculated crc32 checksum didn't match that calculated by server side
    at com.amazonaws.http.JsonResponseHandler.handle(JsonResponseHandler.java:125)
    at com.amazonaws.http.JsonResponseHandler.handle(JsonResponseHandler.java:46)
    at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:1274)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:913)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:723)

Which versions of the aws-java-sdk-dynamodb and aws-java-sdk-core are being used at runtime?

We have com.amazonaws:aws-java-sdk-dynamodb and com.amazonaws:aws-java-sdk-sqs on the dependency list. Those are on the classpath at runtime:

Still the same with 1.11.22, but downgrading to 1.11.19 avoids this problem. Anything new on this topic?

I have not been able to reproduce this. Would it be possible to provide some wire log samples? Keep in mind logs may contain sensitive data.

http://docs.aws.amazon.com/java-sdk/latest/developer-guide/java-dg-logging.html

Hey @thisismana, do you see this error without Gzip configured? I couldn't reproduce this issue either. Please provide some wire logs so we can investigate further. Thanks.

Sorry for the late answer, but the problem seems to be resolved now, even with the versions mentioned above. I guess it wasn't the Java SDK's fault.

I retested it with 1.11.21 and 1.11.22, as well as the new 1.11.23.

Close the issue?

Hmm, interesting. Well, if you see any further issues with it, please let us know.

It's still an issue.

Can you give us more information such as sample code that can reproduce this issue, error log or exception trace? Thanks.

com.amazonaws.AmazonClientException: Unable to execute HTTP request: Client calculated crc32 checksum didn't match that calculated by server side
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:732)
    at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:475)
    at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:437)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:386)
    at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:2078)
    at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:2048)
    at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.query(AmazonDynamoDBClient.java:1690)
    at com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMapper.query(DynamoDBMapper.java:1649)
    at com.nordstrom.amp.core.dao.mappers.HashKeyMapper.query(HashKeyMapper.java:145)
    at com.nordstrom.amp.core.dao.mappers.WebReadyAssetsMapper.queryByAssetId(WebReadyAssetsMapper.java:193)
    at com.nordstrom.amp.core.dao.mappers.WebReadyAssetsMapper.getItemByAssetId(WebReadyAssetsMapper.java:118)
    at com.nordstrom.amp.core.dao.mappers.WebReadyAssetsMapper.getItemByAssetId(WebReadyAssetsMapper.java:105)
    at com.nordstrom.amp.core.repositories.WebReadyAssetRepository.getWebReadyAsset(WebReadyAssetRepository.java:53)
    at com.nordstrom.amp.migration.Map.map(Map.java:47)
    at com.nordstrom.amp.migration.Map.map(Map.java:21)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Caused by: com.amazonaws.internal.CRC32MismatchException: Client calculated crc32 checksum didn't match that calculated by server side
    at com.amazonaws.http.JsonResponseHandler.handle(JsonResponseHandler.java:125)
    at com.amazonaws.http.JsonResponseHandler.handle(JsonResponseHandler.java:46)
    at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:1277)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:916)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:723)
    ... 18 more

Do you see this for every request?

Here's the code. I've stripped out as much as I can.
Yes, every single time.

java -jar build\libs\word-count-emr-1.0-adee88b.jar
WordCountEMRTest.zip

I've stripped out even more of our code.
WordCountEMRTest-2.zip

1.11.19 works
1.11.20 fails with the crc32 error

Thanks for providing the sample code. I see that you intended to perform a query operation using the DynamoDB Mapper, but I couldn't reproduce this issue. Could you turn on log4j to get the wire log for us to investigate? Also, did you turn on Gzip in the ClientConfiguration settings?

Here is how to turn on log4j. Keep in mind logs may contain sensitive data such as credentials.
http://docs.aws.amazon.com/java-sdk/latest/developer-guide/java-dg-logging.html

Gzip doesn't affect the crc error.

I'll try the wire log...

public ClientConfiguration getClientConfiguration() {
    if (clientConfiguration == null) {
        clientConfiguration = new ClientConfiguration();
        clientConfiguration.setConnectionTimeout(clientConnectionTimeout);
        clientConfiguration.setClientExecutionTimeout(clientExecutionTimeout);
        clientConfiguration.setUseGzip(true); // true or false doesn't affect the crc error
        logger.info("getClientConfiguration");

        if (!StringUtils.isEmpty(proxyHost) && !StringUtils.isEmpty(proxyPort)) {
            clientConfiguration.setProxyHost(proxyHost);
            clientConfiguration.setProxyPort(proxyPort);
        }
    }
    return clientConfiguration;
}

@zhangzhx can you email me at dan.vallejo / nordstrom.com

Able to reproduce with the sample you've provided. Digging into it now.

WOOHOO

For some reason it's not wrapping the content in a CRC-calculating input stream, so the client-side CRC is always 0. Still investigating.

Ah, okay, got it. It's an issue with the request timeout feature not playing nicely with how we now calculate the checksum. For the request timeout, we wrap the entity in a BufferedHttpEntity to ensure we read the entire response before leaving AmazonHttpClient (to strictly enforce the timeout over the whole receipt of the response), and this is messing up the checksum calculation. Looking for a fix now.
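The thread does not show the SDK internals, so the following is only a simplified model of how buffering a body ahead of time can leave a CRC stream with nothing to read, which would explain a client-side checksum of 0. It uses plain `java.io` and `java.util.zip` streams in place of HttpClient entities, and every name in it is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

// Simplified model with made-up names; it does not reproduce the SDK's classes.
class BufferedBodyCrcDemo {

    // Reads a stream to the end and returns its bytes.
    static byte[] readAll(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Problematic ordering: the body is buffered first, and the response handler
    // then reads the buffered copy. The CRC stream is attached to the original,
    // already drained stream, so no bytes ever pass through it.
    static long crcWhenBufferedFirst(byte[] wire) {
        InputStream original = new ByteArrayInputStream(wire);
        byte[] buffered = readAll(original);
        CheckedInputStream crcStream = new CheckedInputStream(original, new CRC32());
        readAll(new ByteArrayInputStream(buffered)); // handler consumes the copy
        return crcStream.getChecksum().getValue();
    }

    // Working ordering: the CRC stream wraps the original stream before anything
    // buffers it, so the buffering step itself feeds the checksum.
    static long crcWhenWrappedFirst(byte[] wire) {
        CheckedInputStream crcStream =
                new CheckedInputStream(new ByteArrayInputStream(wire), new CRC32());
        byte[] buffered = readAll(crcStream);
        readAll(new ByteArrayInputStream(buffered)); // handler consumes the copy
        return crcStream.getChecksum().getValue();
    }

    public static void main(String[] args) {
        byte[] wire = "example response bytes".getBytes(StandardCharsets.UTF_8);
        System.out.println("buffered first: " + crcWhenBufferedFirst(wire));
        System.out.println("wrapped first:  " + crcWhenWrappedFirst(wire));
    }
}
```

In this model, a checksum that is always 0 means the checksum stream was bypassed, not that the data was corrupted, which is consistent with the error appearing on every request regardless of the Gzip setting.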

Okay, I've got a fix in mind and I'll make sure it goes out in tomorrow's release. If this is blocking you, temporarily disabling the request timeout and client execution timeout features will work around the issue.

@danvallejo Thanks for reporting this and providing the sample code, it was enormously helpful tracking this issue down. The fix has been staged and will be released tomorrow as part of 1.11.28.

Sweet! I think Dan deserves an XL DynamoDB shirt. We can meet you in the lobby of Blackfoot!

-Matt

On Wed, Aug 17, 2016 at 5:20 PM Andrew Shore [email protected]
wrote:

@danvallejo https://github.com/danvallejo Thanks for reporting this and
providing the sample code, it was enormously helpful tracking this issue
down. The fix has been staged and will be released tomorrow as part of
1.11.28.

WOOOOOOOOHOOOOOOOOOOOOO

Not sure how frequently that's synced with Maven Central. You should be able to pull it down from central though.

http://search.maven.org/#artifactdetails|com.amazonaws|aws-java-sdk-core|1.11.28|jar

What public maven url do you use?

buildscript {
    ext {
        AWSSDK = '1.11.28'
    }
    repositories {
        maven { url "https://mvnrepo.nordstrom.net/nexus/content/groups/public/" }
    }
}

Maven Central is http://repo1.maven.org/maven2/, I believe.

Fix VERIFIED. Thanks.
