Aws-sdk-java-v2: First DynamoDB request is very slow

Created on 11 Jul 2019  Â·  28Comments  Â·  Source: aws/aws-sdk-java-v2

Expected Behavior

All DynamoDB queries should take in the low 10s of milliseconds.

Current Behavior

DynamoDB queries are generally in the 10s of milliseconds however the first time a Dynamo function is executed I am seeing response times around 2s, subsequent queries then fall back to the expected levels.

Possible Solution

Unsure, I haven't been able to identify anything from stepping through the code

Steps to Reproduce (for bugs)

I have written the following code to replicate the creation of the DynamoDbClient and the execution of four put requests. It assumes you have a table named "item" in your AWS account.

The results are below.

import java.util.UUID;

import org.junit.jupiter.api.Test;

import com.google.common.collect.ImmutableMap;
import com.ixxus.cnc.utils.SimpleTimer;

import software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider;
import software.amazon.awssdk.http.SdkHttpClient;
import software.amazon.awssdk.http.urlconnection.UrlConnectionHttpClient;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;
import software.amazon.awssdk.services.dynamodb.DynamoDbClientBuilder;
import software.amazon.awssdk.services.dynamodb.model.AttributeValue;
import software.amazon.awssdk.services.dynamodb.model.PutItemRequest;

class DynamoClientCT {

    private static final SimpleTimer TIMER = new SimpleTimer();

    private static final String CONTEXT_ALL = "CONTEXT_ALL";
    private static final String CONTEXT_HTTP_CLIENT_BUILD = "CONTEXT_HTTP_CLIENT_BUILD";
    private static final String CONTEXT_CREDENTIALS_PROVIDER_BUILD = "CONTEXT_CREDENTIALS_PROVIDER_BUILD";
    private static final String CONTEXT_DYNAMO_CLIENT_BUILD = "CONTEXT_DYNAMO_CLIENT_BUILD";
    private static final String CONTEXT_DYNAMO_CLIENT_EXECUTE = "CONTEXT_DYNAMO_CLIENT_EXECUTE_";

    @Test
    void test() {
        TIMER.start(CONTEXT_ALL);

        createClientAndExecuteRequest();

        TIMER.stop(CONTEXT_ALL);

        TIMER.logAllSummaries("DynamoClientCT");
    }

    private void createClientAndExecuteRequest() {
        TIMER.start(CONTEXT_HTTP_CLIENT_BUILD);
        final SdkHttpClient httpClient = UrlConnectionHttpClient.builder().build();
        TIMER.stop(CONTEXT_HTTP_CLIENT_BUILD);

        TIMER.start(CONTEXT_CREDENTIALS_PROVIDER_BUILD);
        final DefaultCredentialsProvider credentialsProvider = DefaultCredentialsProvider.create();
        TIMER.stop(CONTEXT_CREDENTIALS_PROVIDER_BUILD);

        TIMER.start(CONTEXT_DYNAMO_CLIENT_BUILD);
        final DynamoDbClientBuilder builder = DynamoDbClient.builder();
        builder.httpClient(httpClient);
        builder.region(Region.EU_WEST_1);
        builder.credentialsProvider(credentialsProvider);
        final DynamoDbClient client = builder.build();
        TIMER.stop(CONTEXT_DYNAMO_CLIENT_BUILD);

        putItem(client, 1);
        putItem(client, 2);
        putItem(client, 3);
        putItem(client, 4);
    }

    private void putItem(final DynamoDbClient client, final int count) {
        TIMER.start(CONTEXT_DYNAMO_CLIENT_EXECUTE + count);
        final PutItemRequest request = PutItemRequest.builder()
                .tableName("item")
                .item(ImmutableMap.of("id", AttributeValue.builder().s(UUID.randomUUID().toString()).build()))
                .build();

        client.putItem(request);
        TIMER.stop(CONTEXT_DYNAMO_CLIENT_EXECUTE + count);
    }
}

The following is the code for the SimpleTimer:

import java.util.ArrayList;
import java.util.DoubleSummaryStatistics;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SimpleTimer {

    private static final Logger LOGGER = LoggerFactory.getLogger(SimpleTimer.class);

    Map<String, List<Long>> times = new TreeMap<String, List<Long>>();

    Map<String, Long> currentTime = new HashMap<>();

    public void start(final String context) {
        currentTime.put(context, System.currentTimeMillis());
    }

    public void stop(final String context) {
        if (!currentTime.containsKey(context)) {
            throw new IllegalStateException("Start has not been called for context: " + context);
        }
        times.computeIfAbsent(context, s -> new ArrayList<Long>())
                .add(System.currentTimeMillis() - currentTime.get(context));
    }

    public DoubleSummaryStatistics getSummaryStatistics(final String context) {
        return times.getOrDefault(context, new ArrayList<Long>())
                .stream()
                .mapToDouble(a -> a)
                .summaryStatistics();
    }

    public void logAllSummaries(final String prefix) {
        LOGGER.debug("logAllSummaries: {}");
        times.keySet().forEach(ctx -> LOGGER.info("{} - {} - {}", prefix, ctx, getSummaryStatistics(ctx)));
    }
}

Results:

2019-07-11 15:14:17.746  INFO  SimpleTimer:49 - DynamoClientCT - CONTEXT_ALL - DoubleSummaryStatistics{count=1, sum=1633.000000, min=1633.000000, average=1633.000000, max=1633.000000}
2019-07-11 15:14:17.747  INFO  SimpleTimer:49 - DynamoClientCT - CONTEXT_CREDENTIALS_PROVIDER_BUILD - DoubleSummaryStatistics{count=1, sum=4.000000, min=4.000000, average=4.000000, max=4.000000}
2019-07-11 15:14:17.748  INFO  SimpleTimer:49 - DynamoClientCT - CONTEXT_DYNAMO_CLIENT_BUILD - DoubleSummaryStatistics{count=1, sum=594.000000, min=594.000000, average=594.000000, max=594.000000}
2019-07-11 15:14:17.748  INFO  SimpleTimer:49 - DynamoClientCT - CONTEXT_DYNAMO_CLIENT_EXECUTE_1 - DoubleSummaryStatistics{count=1, sum=884.000000, min=884.000000, average=884.000000, max=884.000000}
2019-07-11 15:14:17.748  INFO  SimpleTimer:49 - DynamoClientCT - CONTEXT_DYNAMO_CLIENT_EXECUTE_2 - DoubleSummaryStatistics{count=1, sum=29.000000, min=29.000000, average=29.000000, max=29.000000}
2019-07-11 15:14:17.749  INFO  SimpleTimer:49 - DynamoClientCT - CONTEXT_DYNAMO_CLIENT_EXECUTE_3 - DoubleSummaryStatistics{count=1, sum=70.000000, min=70.000000, average=70.000000, max=70.000000}
2019-07-11 15:14:17.749  INFO  SimpleTimer:49 - DynamoClientCT - CONTEXT_DYNAMO_CLIENT_EXECUTE_4 - DoubleSummaryStatistics{count=1, sum=29.000000, min=29.000000, average=29.000000, max=29.000000}
2019-07-11 15:14:17.750  INFO  SimpleTimer:49 - DynamoClientCT - CONTEXT_HTTP_CLIENT_BUILD - DoubleSummaryStatistics{count=1, sum=23.000000, min=23.000000, average=23.000000, max=23.000000}

As you can see the first put takes 884ms then the three subsequent ones take 29, 70, and 29ms making the first request at least an order of magnitude slower. Add in the 600ms for the DynamoDbClient creation and that's quite a long time.

Context

This code is used in a Lambda function serving a REST API which is one of the canonical use cases for Lambdas. The expectation is that the function should be quick to respond and as shown by the subsequent requests, this is a reasonable expectation. The results above are actually far better than we are seeing when deployed to Lambda - our first Dynamo execution is consistently around 2s +/-0.2s. We are not using a VPC and the Lambdas have 1024MB memory.

Your Environment

The above code was run on my machine to get the provided results but we are deploying virtually identical code to an AWS Lambda with 1024MB memory.

  • AWS Java SDK version used: 2.5.48
  • JDK version used: 1.8
  • Operating System and version: N/A - using AWS Lambda

I have some questions:

  1. Are there other settings or configuration we could be providing to reduce the time to build the client and execute the first request?
  2. What causes the first request to be so much slower than subsequent requests?
guidance

Most helpful comment

Hi @ocind. I'm quite surprised more people haven't encountered the problem, it fundamentally undermines one of the main reference architectures for AWS Lambda IMO.

The only way we have found to reduce the time is by warming the Lambdas. However the normal Lambda warming (for example, using, serverless-plugin-warmup) isn't enough; you need to make a Dynamo request so that the connection is established to do the SSL negotiation, setup connection pooling, etc. We use DynamoDbClient#describeEndpoints for this as it is a harmless operation. Note that you'll need to do something similar for this for every AWS service you use from Lambda. For example we also use Cognito and call CognitoIdentityProviderClient#listUserPools as part of our warming process.

All 28 comments

Hi @stuartleylandcole The initial request being slow is because the HTTP client must open a new socket and perform an SSL handshake which are slow and costly operations.

For your first question, it looks like you already use the URLConnectionHTTPClient which has the quickest startup time of the 3 existing HTTP clients we have.

You can also have a look at this comment https://github.com/aws/aws-sdk-java-v2/issues/6#issuecomment-422126377 but it looks like you more or less follow the guidelines already.

Hi @dagnir, thank you for your replies! I've been doing some further investigation and wanted to report back my findings and ask for some further assistance.

With a 1024MB Lambda the first DynamoDB request reliably takes 2s from executing the request to receiving a response. Note that this does not include time to build DynamoDbClient nor any of its dependencies (SdkHttpClient, etc). Building DynamoDbClient itself takes 400ms. Here is a log showing the client build and request. Note that I have used the ApacheHttpClient instead of UrlConnectionClient here so I could enable trace logging - using this client has not affected the times taken to make the DynamoDB request, they are inline with my original post, although I concede it probably takes longer to initialise so I will revert back when deploying for real.

2019-07-15 11:04:41.030 DEBUG CommonModule:26 - Starting to build DynamoDB client
2019-07-15 11:04:41.473 DEBUG CommonModule:28 - Finished building DynamoDB client
...
2019-07-15 11:04:42.172 4762e9b6-bc2a-4bfb-9d62-3332c459314a DEBUG ItemFetcher:50 - Retrieving item with key {id=AttributeValue(S=5fb7d27c-0447-411c-9c73-bbbbcb816cfe, SS=software.amazon.awssdk.core.util.DefaultSdkAutoConstructList@2cbb3d47, NS=software.amazon.awssdk.core.util.DefaultSdkAutoConstructList@2cbb3d47, BS=software.amazon.awssdk.core.util.DefaultSdkAutoConstructList@2cbb3d47, M=software
2019-07-15 11:04:42.318 4762e9b6-bc2a-4bfb-9d62-3332c459314a DEBUG AwsCredentialsProviderChain:95 - Unable to load credentials from SystemPropertyCredentialsProvider(): Unable to load credentials from system settings. Access key must be specified either via environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId). software.amazon.awssdk.core.exception.SdkClientException: Unabl
2019-07-15 11:04:42.333 4762e9b6-bc2a-4bfb-9d62-3332c459314a DEBUG AwsCredentialsProviderChain:84 - Loading credentials from EnvironmentVariableCredentialsProvider()
2019-07-15 11:04:42.351 4762e9b6-bc2a-4bfb-9d62-3332c459314a DEBUG ExecutionInterceptorChain:84 - Creating an interceptor chain that will apply interceptors in the following order: []
2019-07-15 11:04:42.556 4762e9b6-bc2a-4bfb-9d62-3332c459314a DEBUG request:84 - Sending Request: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=https, host=dynamodb.eu-west-1.amazonaws.com, encodedPath=/, headers=[amz-sdk-invocation-id, Content-Length, Content-Type, User-Agent, X-Amz-Target], queryParameters=[])
2019-07-15 11:04:42.613 4762e9b6-bc2a-4bfb-9d62-3332c459314a TRACE Aws4Signer:126 - AWS4 Canonical Request: POST / amz-sdk-invocation-id:80f99976-67fc-b5d6-37fc-cacdb75683cb amz-sdk-retry:0/0/500 content-length:91 content-type:application/x-amz-json-1.0 host:dynamodb.eu-west-1.amazonaws.com x-amz-date:20190715T110442Z x-amz-security-token:AgoJb3JpZ2luX2VjEKv//////////wEaCWV1LXdlc3QtMSJHMEUCIQCvkA
2019-07-15 11:04:42.631 4762e9b6-bc2a-4bfb-9d62-3332c459314a DEBUG Aws4Signer:84 - AWS4 String to sign: AWS4-HMAC-SHA256 20190715T110442Z 20190715/eu-west-1/dynamodb/aws4_request 905a861c122219a97b2b584d0c7e9e1e6e1257fd99460c5f014ee99b2df67b8c
2019-07-15 11:04:42.634 4762e9b6-bc2a-4bfb-9d62-3332c459314a TRACE Aws4Signer:126 - Generating a new signing key as the signing key not available in the cache for the date: 1563148800000
2019-07-15 11:04:42.992 4762e9b6-bc2a-4bfb-9d62-3332c459314a TRACE SdkTlsSocketFactory:126 - Connecting to dynamodb.eu-west-1.amazonaws.com/52.119.244.156:443
2019-07-15 11:04:43.194 4762e9b6-bc2a-4bfb-9d62-3332c459314a DEBUG SdkTlsSocketFactory:84 - socket.getSupportedProtocols(): [SSLv2Hello, SSLv3, TLSv1, TLSv1.1, TLSv1.2], socket.getEnabledProtocols(): [TLSv1, TLSv1.1, TLSv1.2]
2019-07-15 11:04:43.211 4762e9b6-bc2a-4bfb-9d62-3332c459314a DEBUG SdkTlsSocketFactory:84 - TLS protocol enabled for SSL handshake: [TLSv1.2, TLSv1.1, TLSv1]
2019-07-15 11:04:43.950 4762e9b6-bc2a-4bfb-9d62-3332c459314a DEBUG SdkSslSocket:84 - created: dynamodb.eu-west-1.amazonaws.com/52.119.244.156:443
2019-07-15 11:04:44.094 4762e9b6-bc2a-4bfb-9d62-3332c459314a DEBUG request:84 - Received successful response: 200

I have also run the same code on a 2048MB Lambda to compare times and they are roughly half the time for the 1024MB Lambda, except for building DynamoDbClient - that still takes 400ms. Here is the log:

2019-07-15 11:55:19.645 DEBUG CommonModule:26 - Starting to build DynamoDB client
2019-07-15 11:55:20.063 DEBUG CommonModule:28 - Finished building DynamoDB client
...
2019-07-15 11:55:20.577 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba DEBUG ItemFetcher:50 - Retrieving item with key {id=AttributeValue(S=e97498c4-4bbf-42d7-97c9-86cbe98575e3, SS=software.amazon.awssdk.core.util.DefaultSdkAutoConstructList@2cbb3d47, NS=software.amazon.awssdk.core.util.DefaultSdkAutoConstructList@2cbb3d47, BS=software.amazon.awssdk.core.util.DefaultSdkAutoConstructList@2cbb3d47, M=software
2019-07-15 11:55:20.663 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba DEBUG AwsCredentialsProviderChain:95 - Unable to load credentials from SystemPropertyCredentialsProvider(): Unable to load credentials from system settings. Access key must be specified either via environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId). software.amazon.awssdk.core.exception.SdkClientException: Unabl
2019-07-15 11:55:20.669 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba DEBUG AwsCredentialsProviderChain:84 - Loading credentials from EnvironmentVariableCredentialsProvider()
2019-07-15 11:55:20.674 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba DEBUG ExecutionInterceptorChain:84 - Creating an interceptor chain that will apply interceptors in the following order: []
2019-07-15 11:55:20.763 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba DEBUG request:84 - Sending Request: DefaultSdkHttpFullRequest(httpMethod=POST, protocol=https, host=dynamodb.eu-west-1.amazonaws.com, encodedPath=/, headers=[amz-sdk-invocation-id, Content-Length, Content-Type, User-Agent, X-Amz-Target], queryParameters=[])
2019-07-15 11:55:20.791 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba TRACE Aws4Signer:126 - AWS4 Canonical Request: POST / amz-sdk-invocation-id:3772b0c5-757d-ce7f-e32f-77c5df238715 amz-sdk-retry:0/0/500 content-length:91 content-type:application/x-amz-json-1.0 host:dynamodb.eu-west-1.amazonaws.com x-amz-date:20190715T115520Z x-amz-security-token:AgoJb3JpZ2luX2VjEKv//////////wEaCWV1LXdlc3QtMSJHMEUCIQCvkA
2019-07-15 11:55:20.795 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba DEBUG Aws4Signer:84 - AWS4 String to sign: AWS4-HMAC-SHA256 20190715T115520Z 20190715/eu-west-1/dynamodb/aws4_request d0243cc15e289b4d98de1e7c3ffcd503b7a23e226957337e0d6145e2d291bc06
2019-07-15 11:55:20.798 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba TRACE Aws4Signer:126 - Generating a new signing key as the signing key not available in the cache for the date: 1563148800000
2019-07-15 11:55:20.964 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba TRACE SdkTlsSocketFactory:126 - Connecting to dynamodb.eu-west-1.amazonaws.com/52.94.25.56:443
2019-07-15 11:55:21.037 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba DEBUG SdkTlsSocketFactory:84 - socket.getSupportedProtocols(): [SSLv2Hello, SSLv3, TLSv1, TLSv1.1, TLSv1.2], socket.getEnabledProtocols(): [TLSv1, TLSv1.1, TLSv1.2]
2019-07-15 11:55:21.039 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba DEBUG SdkTlsSocketFactory:84 - TLS protocol enabled for SSL handshake: [TLSv1.2, TLSv1.1, TLSv1]
2019-07-15 11:55:21.394 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba DEBUG SdkSslSocket:84 - created: dynamodb.eu-west-1.amazonaws.com/52.94.25.56:443
2019-07-15 11:55:21.433 e3cf38b2-24d5-4d2b-bbef-1a81cf0046ba DEBUG request:84 - Received successful response: 200

A quick summary (all times in ms):

| Step | 1024MB | 2048MB |
| --- | ------------- | ------------- |
| DynamoDbClient build | 440 | 420 |
| Unknown | 150 | 86 |
| Creating interceptor chain | 200 | 90 |
| Generating new signing key | 360 | 170 |
| Establishing SSL connection | 960 | 430 |

Even with the timings on the 2048MB sized Lambda this makes it quite difficult to run a REST API using Lambdas as a cold start will incur both the normal Lambda cold start time (1-2s) plus the DynamoDB overhead (1s). Equally these Lambdas do not require anything like 2GB of memory, they are using less than 200MB but I realise that CPU scales with memory which is why I performed this test - ideally I would be using 512MB or less.

Some questions to finish with:

  1. Having seen the trace logging and timing is there anything you can recommend that would speed up the first request execution?
  2. What is happening during the DynamoDbClient creation that takes 400ms? Why would this be so consistent across the two different Lambdas when other timings are roughly proportional
  3. What is happening at the 'Unknown' step in the table above - this is between the following two lines of logging, the first of which is my code calling DynamoDbClient#getItem:
    2019-07-15 11:04:42.172 4762e9b6-bc2a-4bfb-9d62-3332c459314a DEBUG ItemFetcher:50 - Retrieving item with key {id=AttributeValue(S=5fb7d27c-0447-411c-9c73-bbbbcb816cfe, SS=software.amazon.awssdk.core.util.DefaultSdkAutoConstructList@2cbb3d47, NS=software.amazon.awssdk.core.util.DefaultSdkAutoConstructList@2cbb3d47, BS=software.amazon.awssdk.core.util.DefaultSdkAutoConstructList@2cbb3d47, M=software 2019-07-15 11:04:42.318 4762e9b6-bc2a-4bfb-9d62-3332c459314a DEBUG AwsCredentialsProviderChain:95 - Unable to load credentials from SystemPropertyCredentialsProvider(): Unable to load credentials from system settings. Access key must be specified either via environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId). software.amazon.awssdk.core.exception.SdkClientException: Unabl

Thank you for any help you can provide. Hopefully we are able to reduce these times as at the moment I'm not sure that Lambdas providing a REST API in front of DynamoDB is a viable architecture.

Hi @stuartleylandcole, the constant of ~400ms for instantiating DynamoDb is mostly due to classloading and static initialization. Most of these operations are I/O bound which is probably why it doesn't really scale with larger Lambdas. We can look into removing some unnecessary/heavy static initialization which may shave off another 100 or so ms.

The 'Unknown' step in the logs appears the the credentials loading mechanism. This is in line with what's happening since this is happening after a call to DDB is made, wherer the SDK must then resolve the credentials to use for signing. You can reduce this by providing a specific credentials provider, rather that DefaultCredentialsProvider, which delegates to a set of other providers. For example, since you're in Lambda, you can use the EnvironmentVariableCredentialsProvider which will pull credentials from the AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_SESSION_TOKEN which are exposed to your Lambda.

Hi @stuartleylandcole, has @dagnir answered your questions? Do you have any other question?

Hi @debora-ito @dagnir. Thank you for your reply and apologies for not responding sooner.

I think my question has been answered, at least in relation to the SDK - I have others concerns about the viability of Lambdas for a REST API backed by DynamoDB given the timings shown above but I don't think this is the place to raise them. Do you happen to know the correct place to raise this?

Many thanks for your help! I'm happy for this issue to be closed now.

I would suggest reaching out to the AWS Support or directly to service team via AWS Forums.

hi @stuartleylandcole , thank you so much for a very detailed explanation and benchmark. I am actually having the same issue, and your investigation results above is really helpful. I would like to know if you have found some other way to shave off some more time in regards to the first request to DynamoDb. Have you gained any updates from Support or other forums?

I think it is unacceptable that a 1024MB lambda takes 2s to execute first DynamoDb request. This is actually worse than AWS Lambda cold start itself :(

Hi @ocind. I'm quite surprised more people haven't encountered the problem, it fundamentally undermines one of the main reference architectures for AWS Lambda IMO.

The only way we have found to reduce the time is by warming the Lambdas. However the normal Lambda warming (for example, using, serverless-plugin-warmup) isn't enough; you need to make a Dynamo request so that the connection is established to do the SSL negotiation, setup connection pooling, etc. We use DynamoDbClient#describeEndpoints for this as it is a harmless operation. Note that you'll need to do something similar for this for every AWS service you use from Lambda. For example we also use Cognito and call CognitoIdentityProviderClient#listUserPools as part of our warming process.

hi @stuartleylandcole I think my use case is very similar to yours. We also use Dynamo and Cognito. On lambda warmup calls, currently we also initialize all the sdk clients and try to make a dummy request for each one.

In one of our Lambda functions, we actually have 2 Dynamo tables to read values from. I did some experiment couple months ago, and we found that while writing to different tables, first request to each table would involve setting up SSL handshake again. Currently we warm the connections by performing describe-table request for each table.

We use DynamoDbClient#describeEndpoints for this as it is a harmless operation

Hmmm, I didn't know that describeEndpoints works in this case, I will check it out.

Thank you for the insights

We have the same issue.
Java 8 & DynamoDB

Containers:
512 Mb takes 9 seconds for the first request (cold start is just 400 ms), next request 80 ms.
1024 Mb takes 10 seconds, next requests about 20 ms
2048 Mb takes 2 seconds, next requests about 12 ms

Logs say that the lambda uses not more 200 Mb of RAM.
I think the issue concerns CPU resources. The quota for CPU depends on RAM size, more RAM more CPU cycles allowed for the container.

I suppose the DynamoDB client (from AWS SDK) is not efficient and should be improved with refactoring.

I am seeing a similar situation with S3 clients inside a Lambda. The first creation of a client and first request are significantly slower than following requests (about 10 seconds vs. 200 ms) with 512M Lambda.

Just an assumption:
maybe it happens when DynamoDBMapper is used, because it needs time and CPU to scan java classes and create corresponding mapper object in the memory. It seems it is very CPU intensive task, so it works more or less better with big memory lambda instances, where CPU quota is higher.

I also have encountered this issue - I tried a few things but found that increasing the memory size was the only thing that seemed to make an impact. My findings:

Lambda memory size (mb) | First DynamoDB query duration (seconds) | Total request time (seconds)
-- | -- | --
128 (min) | Timeout (>30) | Timeout (>30)
256 (initial) | 9.0 | 12.9
512 | 4.2 | 7.3
832 | 2.3 | 5.0
1024 | 2.0 | 4.5
1280 | 1.7 | 4.2
1600 | 1.2 | 3.6
1856 | 1.1 | 3.3
2048 | 1.0 | 3.2
2560 | 0.86 | 2.8
3008 (max) | 0.78 | 2.6

image

We are using the DynamoDBMapper, but I tried switching over to using QuerySpec instead and manually mapping the result to a POJO and it didn't make a noticeable performance difference. I'd love to know if anyone has found a more consistent solution, but this at least helped make the times more reasonable (we were at 256 MB to start and it was unusable on cold starts).

Hi all, we have released a maven archetype archetype-lambda that helps customers bootstrap a lambda function faster with SDK recommended practices to achieve minimal SDK startup time.

mvn archetype:generate \
  -DarchetypeGroupId=software.amazon.awssdk \
  -DarchetypeArtifactId=archetype-lambda \
  -DarchetypeVersion=2.11.4\

See https://github.com/aws/aws-sdk-java-v2/blob/master/archetypes/archetype-lambda/README.md#maven-archetype-for-lambda-function-using-aws-sdk-for-java-2x for more info

Please try it out and let us know if you have any feedback! :)

@zoewangg many thanks for this! Could you highlight the key parts of the archetype in relation to this issue (i.e. Initialisation of the dynamo client)? I've taken a look at the code and the most obvious part is initialising the client in the Lambda handler constructor - are there other aspects to it? Thanks again!

@stuartleylandcole Sure! Below are the patterns we use in the archetype to reduce SDK startup time:

  • Initialize the SDK client outside of the lambda function.
  • client configuration

    • default to use UrlConnectionHttpClient to reduce jar size

    • specify the region

    • specify credentialsProvider to be EnvironmentVariableCredentialsProvider

  • pom.xml

    • exclude netty-nio-client and apache-client dependencies from service client dependency (they are included by default) to reduce jar size.

https://github.com/aws/aws-sdk-java-v2/blob/c72d2b554de92258b9cfced4f8f54f3ba714cf92/archetypes/archetype-lambda/src/test/resources/projects/urlhttpclient/reference/pom.xml#L37-L46


Update: please check out https://aws.amazon.com/blogs/developer/tuning-the-aws-java-sdk-2-x-to-reduce-startup-time/ for best practices to reduce SDK startup time

@ocind @stuartleylandcole . I'm thinking about the following setup to eliminate the problem:

  • initialise DynamoDB connection in static block and make dummy request in static block
  • use provisioned concurrency
  • cache the connection in a field

I'm sure this eliminates the overhead just after the environment has been provisioned, cause I've seen that.
I do not know what will happen if you provision the environment, and the first request only comes in an hour later. I suppose the connection might have been closed by the client or the backend dynamodb service. That means the overhead would be there again.
That is if the init code is only run once for provisioned concurrency.
I still have to test that last part to see if it actually behaves like that.

What do you think?

@Nxtra Yeah, you can consider pre-warmuping the client by making a simple GET api after you create the client to establish a connection in advance. By default, the SDK closes idle connections after 60 seconds. You can increase it, but keep in mind that idle connections can also be closed on the service side after certain amount of time, which cannot be configured on the SDK side.

https://github.com/aws/aws-sdk-java-v2/blob/c72d2b554de92258b9cfced4f8f54f3ba714cf92/http-client-spi/src/main/java/software/amazon/awssdk/http/SdkHttpConfigurationOption.java#L118

I'm not super familiar with provisioned concurrency feature of Lambda function, and I'd recommend reaching out to Lambda team on AWS forums if you have any questions regarding it.

For me, it looks like there is no any problems with dynamodb connections
and the time to establish it (honestly I did not measure it).

I think the issue is in something different, because the first request time
is very very varies on the Lambda Memory settings and as consequeuence the
CPU quota Lambda has.
If you have a big Lambda instance like 3 Gb, there is no any serious delay
for the first request. Right ? Actually, any HTTP connection (which
DynamoDB uses) must not require too much CPU resources, in case if you do
not launch a super duper connection pool for the client.

Yes, using provisioned concurrency feature may be a kind of workaround, but
it is not a solution, you can not keep many provisioned lambda instances
because it costs money (I think using the 3 Gb Lambda will be more
cheaper) .
Also, in case of spike of requests you will get the issue again, when you
need more lambda instances than you have provisioned.

I still think that there is an issue in DynamoDB client library for java,
maybe it is not optimized properly or has a problem with a dependency libs
which DynamoDB client uses.

On Tue, Mar 31, 2020 at 1:44 AM Zoe Wang notifications@github.com wrote:

@Nxtra https://github.com/Nxtra Yeah, you can consider pre-warmuping
the client by making a simple GET api after you create the client to
establish a connection in advance. By default, the SDK closes idle
connections after 60 seconds. You can increase it, but keep in mind that
idle connections can also be closed on the service side after certain
amount of time, which cannot be configured on the SDK side.

https://github.com/aws/aws-sdk-java-v2/blob/c72d2b554de92258b9cfced4f8f54f3ba714cf92/http-client-spi/src/main/java/software/amazon/awssdk/http/SdkHttpConfigurationOption.java#L118

I'm not super familiar with provisioned concurrency feature of Lambda
function, and I'd recommend reaching out to Lambda team on AWS forums if
you have any questions regarding it.

—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
https://github.com/aws/aws-sdk-java-v2/issues/1340#issuecomment-606290217,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AGYOKFGIGJEOD4Z67PYEX5LRKEODTANCNFSM4IBIJFLA
.

--
Best regards,
Sergei Biletnikov

Hi @biletnikov, from your previous comment, it seems you are using DynamoDBMapper from AWS Java SDK 1.11.x. We are aware of the slow startup issue in 1.11.x, and unfortunately there is not much we can do to improve it while still maintaining backwards compatibility.

We are actively working on the implementation of Dynamodb mapper 2.x, and it's currently in preview. https://github.com/aws/aws-sdk-java-v2/tree/master/services-custom/dynamodb-enhanced

Hi Zoe,

right, I am using DynamoDBMapper from AWS Java SDK 1.11.x
Ok, good that you confirm the startup issue in this lib.

Looking forward for Dynamodb mapper 2.x release, so we could use it for
production projects.

Could you roughly estimate when it will be ready? in 1 month, 3 months, 6
months ?

Thanks.

On Tue, Mar 31, 2020 at 8:37 PM Zoe Wang notifications@github.com wrote:

Hi @biletnikov https://github.com/biletnikov, from your previous
comment, it seems you are using DynamoDBMapper from AWS Java SDK 1.11.x. We
are aware of the slow startup issue in 1.11.x, and unfortunately there is
not much we can do to improve it while still maintaining backwards
compatibility.

We are actively working on the implementation of Dynamodb mapper 2.x, and
it's currently in preview.
https://github.com/aws/aws-sdk-java-v2/tree/master/services-custom/dynamodb-enhanced

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/aws/aws-sdk-java-v2/issues/1340#issuecomment-606770863,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AGYOKFBROXRYBBHOLBQLP5TRKIS37ANCNFSM4IBIJFLA
.

--
Best regards,
Sergei Biletnikov

Hi @biletnikov, from your previous comment, it seems you are using DynamoDBMapper from AWS Java SDK 1.11.x. We are aware of the slow startup issue in 1.11.x, and unfortunately there is not much we can do to improve it while still maintaining backwards compatibility.

We are actively working on the implementation of Dynamodb mapper 2.x, and it's currently in preview. https://github.com/aws/aws-sdk-java-v2/tree/master/services-custom/dynamodb-enhanced

Great to hear that!
Can you give some insight in what is causing that behavior in 1.11.x ?

Hi @biletnikov, we don't have a public timeline at the moment, but DynamoDbMapper v2 is definitely one of our top priorities right now. The work is tracked in #35

Hi @Nxtra Sure! One of the main reasons is that 1.11.x SDK uses ApacheHttpClient under the hood and initializing it can be expensive. Checkout this comment https://github.com/aws/aws-sdk-java/issues/1774#issuecomment-426431217 for more info. Let me know if you have further questions!

@zoewangg Hi, we've used enhanced since June 2020, and current using version is one of Mar, 2021, we noticed that slow start up is still existing and hard to optimized event if I did a pre dummy request for one of our DynamoDbTalbe<*>.
Not sure if your team is still profiling the initial start up of new sdk, this really matters for some of the time sensitive request.

Hi @talentprince thank you for your feedback. Are you using static table schema? Can you share how you create the DynamodbEnhancedClient? What is your memory setting?

On a side note, if startup latency is critical to your application, you might consider GraalVM native image, which has faster startup time compared with JVM. https://aws.amazon.com/blogs/developer/graalvm-native-image-support-in-the-aws-sdk-for-java-2-x/

Hi, @zoewangg , we are not using lambda but springboot+eks, memory is 600M, DynamodbEnhancedClient is a spring bean, and table schema is initialized as fields of our storage client (another bean).
Due to our architecture, it's hard to ship to native image...

Was this page helpful?
0 / 5 - 0 ratings

Related issues

SamBumgardner picture SamBumgardner  Â·  4Comments

tigertoes picture tigertoes  Â·  6Comments

EthanStandel picture EthanStandel  Â·  3Comments

hmatland picture hmatland  Â·  3Comments

millems picture millems  Â·  6Comments