Aws-sdk-js: Latency increased after updating to 2.600.0

Created on 9 Jan 2020 · 10 Comments · Source: aws/aws-sdk-js

After updating the module to version 2.600.0 we noticed an increase in API latency from 100 ms to 4-5 seconds (95th percentile). After rolling back the update, the latency returned to normal.
We are only using the SDK for S3 requests.

wjsc

👍4

Most helpful comment

Any update on this? Facing the same exact issue in our EKS clusters.

We were able to download the AWS CLI (v1) and run it directly from our cluster's container. It showed a single request to the metadata endpoint, and it seems it fell back after that. So the CLI added a constant 1 second of latency to every request. My guess is that aws-sdk-js is somehow hitting 3 timeouts (confirmed by the comment in the linked issue).

Additionally, we set our instances to require the new token-based authentication flow for EC2 metadata, like so:

aws ec2 modify-instance-metadata-options --profile default --http-endpoint enabled --http-tokens required --instance-id i-2834fn

The aws-sdk-js still takes approximately 4 seconds to time out, but it then throws a credentials error instead. This confirms what has already been said in other comments and the linked issue; I'm adding more context and detail in case it helps.

phillycheeze on 22 Apr 2020

👍3
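To check whether the extra seconds are spent resolving credentials (as the metadata timeouts described in this comment suggest) rather than in the S3 or SSM call itself, credential resolution can be timed on its own from inside the affected container. The following is only a minimal sketch: `timeAsync` and `fromCallback` are hypothetical helper names, and the commented usage assumes aws-sdk v2 is installed and relies on its `AWS.config.getCredentials(callback)` method.

```javascript
// Times a promise-returning function and reports elapsed milliseconds,
// whether the promise resolved or rejected.
async function timeAsync(label, fn) {
  const start = process.hrtime.bigint();
  let error = null;
  try {
    await fn();
  } catch (err) {
    error = err; // keep the error so a slow failure is still reported
  }
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { label, ms, error };
}

// Wraps a Node-style callback API in a promise (hypothetical helper).
function fromCallback(register) {
  return new Promise((resolve, reject) => {
    register((err, value) => (err ? reject(err) : resolve(value)));
  });
}

// Usage inside the pod (assumes aws-sdk v2 is installed):
//
//   const AWS = require('aws-sdk');
//   timeAsync('credentials', () => fromCallback((cb) => AWS.config.getCredentials(cb)))
//     .then((r) => console.log(r.label, r.ms.toFixed(0) + ' ms', r.error ? 'failed' : 'ok'));
```

If this step alone accounts for roughly 4 seconds, the delay is in the credential lookup and not in the service request.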

All 10 comments

Hey @wjsc, can you please provide reproduction steps and describe how you collected these metrics?

ajredniwja on 10 Jan 2020

This is a real problem, but it started before v2.600.0. We have a project that was using v2.586.0 and found that SSM and S3 calls (and probably calls to all AWS services) took 4+ seconds to execute. Here are the log statements generated by the AWS SDK v2.586.0:

[AWS ssm 200 4.583s 0 retries] getParameter({
  Name: '/services/parameter1',
  WithDecryption: true
})
[AWS ssm 200 4.388s 0 retries] getParameter({
  Name: '/services/parameter2',
  WithDecryption: true
})
[AWS ssm 200 4.551s 0 retries] getParameter({
  Name: '/services/parameter3',
  WithDecryption: true
})
[AWS s3 200 4.604s 0 retries] getObject({
  Bucket: 'our-bucket',
  Key: 'our-key'
})

After downgrading to v2.507.0 (the same version as one of our other projects), we get these times. The only change we made was the AWS SDK version.

[AWS ssm 200 0.067s 0 retries] getParameter({
  Name: '/services/parameter1',
  WithDecryption: true
})
[AWS ssm 200 0.077s 0 retries] getParameter({
  Name: '/services/parameter2',
  WithDecryption: true
})
[AWS ssm 200 0.035s 0 retries] getParameter({
  Name: '/services/parameter3',
  WithDecryption: true
})
[AWS s3 200 0.225s 0 retries] getObject({
  Bucket: 'our-bucket',
  Key: 'our-key'
})

We're running in Kubernetes pods on EC2. This is a big issue, and at some point we'll want to move to newer versions of the SDK.

jeksmith on 17 Jan 2020

👍2
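Log lines in the format shown in this comment are what the v2 SDK writes when a logger is configured (for example `AWS.config.logger = console`). To compare SDK versions on more than a handful of requests, the headers of those lines can be parsed and averaged per service. This is a rough sketch, not part of the SDK: `parseSdkLog` and `meanSecondsByService` are hypothetical names, and the regular expression assumes the exact header layout seen in the logs above.

```javascript
// Matches the header of an SDK log line such as:
//   [AWS ssm 200 4.583s 0 retries] getParameter({
const LOG_HEADER = /^\[AWS (\S+) (\d{3}) ([\d.]+)s (\d+) retries\] (\w+)\(/;

// Extracts one entry per request from captured log text.
function parseSdkLog(text) {
  const entries = [];
  for (const line of text.split('\n')) {
    const m = LOG_HEADER.exec(line.trim());
    if (m) {
      entries.push({
        service: m[1],
        status: Number(m[2]),
        seconds: Number(m[3]),
        retries: Number(m[4]),
        operation: m[5],
      });
    }
  }
  return entries;
}

// Averages request duration (in seconds) per service.
function meanSecondsByService(entries) {
  const totals = new Map();
  for (const e of entries) {
    const t = totals.get(e.service) || { sum: 0, count: 0 };
    t.sum += e.seconds;
    t.count += 1;
    totals.set(e.service, t);
  }
  const result = {};
  for (const [service, t] of totals) {
    result[service] = t.sum / t.count;
  }
  return result;
}

// Example with two lines copied from the v2.586.0 log above.
const sampleLog = [
  "[AWS ssm 200 4.583s 0 retries] getParameter({",
  "  Name: '/services/parameter1',",
  "  WithDecryption: true",
  "})",
  "[AWS s3 200 4.604s 0 retries] getObject({",
  "  Bucket: 'our-bucket',",
  "  Key: 'our-key'",
  "})",
].join('\n');
console.log(meanSecondsByService(parseSdkLog(sampleLog)));
```

Running the same summary on logs captured before and after a version change makes the regression easier to quantify than eyeballing individual lines.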

@ajredniwja I tried to reproduce this bug in an isolated codebase, but I couldn't. It seems to work fine locally.

We're running Kubernetes on EKS.

In our case we rolled back aws-sdk to v2.562.0.

wjsc on 17 Jan 2020

Is this one the same as https://github.com/aws/aws-sdk-js/issues/3024 ?

lukiano on 23 Jan 2020

@lukiano Sure looks like it could be.

jeksmith on 24 Jan 2020

This problem won't be because of the SDK. I have reached out to the EKS service team, since they recently made some changes to how they evaluate credentials. I will update once I hear back from them.

ajredniwja on 17 Mar 2020

❤1

@wjsc and @jeksmith are you using IAM roles for service accounts?

nckturner on 23 Mar 2020

I'm adding the workarounds we found for anyone else suffering from this.

Set the metadata timeout much lower:

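const AWS = require('aws-sdk');

// Fail fast if the instance metadata service does not respond.
AWS.config.credentials = new AWS.EC2MetadataCredentials({
  httpOptions: { timeout: 500 }, // milliseconds: 1/2 second or whatever you want
  maxRetries: 1
});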

Downgrade to aws-sdk-js version 2.574.0, which predates the commit that introduced the new signed IMDSv2 metadata flow. (ref: https://github.com/aws/aws-sdk-js/commit/4ed556871d347b07e95da4ec5dd0ffe1bcafe87f)
Use an alternative credentials source so that credentials are resolved before the SDK attempts to hit the metadata API (hardcoded, environment variables, config files, etc.). The metadata credential source is the last source checked by default, so using any other source will temporarily solve this problem.

phillycheeze on 22 Apr 2020

👍1
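The third workaround can be combined with the shorter metadata timeout by building an explicit provider chain, so that cheaper sources are tried first and the metadata lookup fails quickly if it is reached at all. This is a sketch under assumptions, not a recommended configuration: `buildCredentialChain` is a hypothetical helper, the SDK module is passed in as a parameter, and the choice and order of providers (environment variables, shared config file, then EC2 metadata) is only one possibility.

```javascript
// Builds a provider chain that checks environment variables and the shared
// credentials file before falling back to EC2 instance metadata with a
// short timeout. `AWS` is the aws-sdk v2 module, passed in by the caller.
function buildCredentialChain(AWS, metadataTimeoutMs) {
  return new AWS.CredentialProviderChain([
    // AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables
    () => new AWS.EnvironmentCredentials('AWS'),
    // shared credentials file
    () => new AWS.SharedIniFileCredentials(),
    // instance metadata last, with the reduced timeout from the first workaround
    () => new AWS.EC2MetadataCredentials({
      httpOptions: { timeout: metadataTimeoutMs },
      maxRetries: 1,
    }),
  ]);
}

// Usage (assumes aws-sdk v2 is installed):
//
//   const AWS = require('aws-sdk');
//   buildCredentialChain(AWS, 500).resolve((err, credentials) => {
//     if (err) throw err;
//     AWS.config.credentials = credentials;
//   });
```

Resolving the chain once at startup and assigning the result to `AWS.config.credentials` keeps individual requests from paying the lookup cost.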

+1, also seeing 4-5 second latency for SSM getParameter calls from ECS. Any updates on this?