Aws-sdk-java: Support for EKS 'IAM for service accounts' not default [1.11.623, 1.11.653]

Created on 30 Oct 2019 · 28 Comments · Source: aws/aws-sdk-java

Description

Despite the documentation saying that a minimum version of 1.11.623 should allow IAM service accounts to work, my pods run with the IAM role of the _node_ and not the role linked to the IAM ServiceAccount I am running as.

Another user describes the same problem with 1.11.653.

I have discovered a hack that fixes the issue:

    import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClientBuilder
    import com.amazonaws.services.securitytoken.model.GetCallerIdentityRequest

    // NOTE: For some unknown reason if this call isn't made then the following code
    // uses the K8s node's role and not the IAM ServiceAccount role.
    // TODO(Jonathon): Remove when the above weirdness is fixed.
    val client = AWSSecurityTokenServiceClientBuilder.standard().build()
    val request = new GetCallerIdentityRequest()
    val _ = client.getCallerIdentity(request)

The identity request above returns the correct role, and a subsequent call such as AmazonSNSAsyncClientBuilder.defaultClient() then gives me a client authenticated with the IAM role associated with the service account.

investigating

All 28 comments

Something like this works:

      .standard()
      .withCredentials(WebIdentityTokenCredentialsProvider.create)
      .build()

So it seems the problem is just that WebIdentityTokenCredentialsProvider isn't in the default provider chain.

Hi @thundergolfer
WebIdentityTokenCredentialsProvider is in the default credential provider chain in v1 though.

https://github.com/aws/aws-sdk-java/blob/5bafd0d6c4a979bc55fcf6709e93c9983bd2d1e6/aws-java-sdk-core/src/main/java/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.java#L45-L50

Is it possible that some other credentials provider in your chain was available and took precedence?

Hey @zoewangg thanks for the reply,

Yeah, you're right, it seems it landed in the default chain in .620. I'm 95% sure I had no code that was configuring any providers. I was just requesting the default clients for SNS and S3 and leaving it up to the default chain.

Given that someone else has seen this, I'm confident _something_ is going on here.

Do you have any advice on how I would best track this down? Perhaps simply enabling DEBUG level logs would show behaviour of the provider chain?

Yeah, it'd be great if you could enable debug logging and provide the logs here (Be sure to redact any sensitive info from the logs)

I am debugging a similar issue. AWS SDK fails to recognize AWS_WEB_IDENTITY_TOKEN_FILE environment variable and defaults to EC2ContainerCredentialsProviderWrapper.
SDK version used is : aws-sdk-java/1.11.671

Debug logs captured below,

2019-11-10T14:31:44,137 DEBUG [appenderator_merge_0] com.amazonaws.auth.AWSCredentialsProviderChain - Unable to load credentials from org.apache.druid.common.aws.ConfigDrivenAwsCredentialsConfigProvider@1de783b1: Unable to load AWS credentials from druid AWSCredentialsConfig
2019-11-10T14:31:44,138 DEBUG [appenderator_merge_0] com.amazonaws.auth.AWSCredentialsProviderChain - Unable to load credentials from org.apache.druid.common.aws.LazyFileSessionCredentialsProvider@6b9f21a6: cannot refresh AWS credentials
2019-11-10T14:31:44,138 DEBUG [appenderator_merge_0] com.amazonaws.auth.AWSCredentialsProviderChain - Unable to load credentials from EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
2019-11-10T14:31:44,139 DEBUG [appenderator_merge_0] com.amazonaws.auth.AWSCredentialsProviderChain - Unable to load credentials from SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey)
2019-11-10T14:31:44,139 DEBUG [appenderator_merge_0] com.amazonaws.auth.AWSCredentialsProviderChain - Unable to load credentials from com.amazonaws.auth.profile.ProfileCredentialsProvider@22343cb: profile file cannot be null
2019-11-10T14:31:44,150 DEBUG [appenderator_merge_0] com.amazonaws.auth.AWSCredentialsProviderChain - Loading credentials from com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper@44e06fc8

In those logs WebIdentity isn't present at all, and it seems you're on the latest version. It's skipping over it from ProfileCredentials to EC2ContainerCredentials...

What's that on the end here -> com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper@44e06fc8. Is it a commit hash?
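On the `@44e06fc8` question: that suffix is not a commit hash. It is what Java prints for any object whose class does not override `toString()`: the class name, an `@`, and the object's hash code in hexadecimal. A minimal sketch, where `ToStringDemo` and `Wrapper` are made-up names for illustration only:

```java
// Demonstrates the default Object.toString() format: <class name>@<hex hash code>.
final class ToStringDemo {
    // A class that, like the provider in the log line, does not override toString().
    static final class Wrapper { }

    // Rebuilds the string by hand using the same formula as Object.toString().
    static String describe(Object o) {
        return o.getClass().getName() + "@" + Integer.toHexString(o.hashCode());
    }

    public static void main(String[] args) {
        Wrapper w = new Wrapper();
        System.out.println(w);           // class name, '@', hex hash (the hash varies per run)
        System.out.println(describe(w)); // prints the same string as the line above
    }
}
```

So the value only identifies one object instance within one JVM run and says nothing about the SDK version.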

I think I'm experiencing the same issue or something similar. Yesterday I updated my application to use a service account and also updated the aws-java-sdk to v1.11.673. Adding my logs here in case they help.

The app fails to deploy to EKS and hits CrashLoopBackOff. In the logs I can't see WebIdentityTokenCredentialsProvider -

{"@timestamp":"2019-11-14T10:47:56.466+00:00","@version":"1","message":"Unable to load credentials from EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))","logger_name":"com.amazonaws.auth.AWSCredentialsProviderChain","thread_name":"main","level":"DEBUG","level_value":10000}
{"@timestamp":"2019-11-14T10:47:56.466+00:00","@version":"1","message":"Unable to load credentials from SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey)","logger_name":"com.amazonaws.auth.AWSCredentialsProviderChain","thread_name":"main","level":"DEBUG","level_value":10000}
{"@timestamp":"2019-11-14T10:47:56.466+00:00","@version":"1","message":"Unable to load credentials from com.amazonaws.auth.profile.ProfileCredentialsProvider@4287d447: profile file cannot be null","logger_name":"com.amazonaws.auth.AWSCredentialsProviderChain","thread_name":"main","level":"DEBUG","level_value":10000}
{"@timestamp":"2019-11-14T10:47:56.472+00:00","@version":"1","message":"Unable to load configuration from com.amazonaws.monitoring.EnvironmentVariableCsmConfigurationProvider@54e7391d: Unable to load Client Side Monitoring configurations from environment variables!","logger_name":"com.amazonaws.monitoring.CsmConfigurationProviderChain","thread_name":"main","level":"DEBUG","level_value":10000}
{"@timestamp":"2019-11-14T10:47:56.472+00:00","@version":"1","message":"Unable to load configuration from com.amazonaws.monitoring.SystemPropertyCsmConfigurationProvider@50b8ae8d: Unable to load Client Side Monitoring configurations from system properties variables!","logger_name":"com.amazonaws.monitoring.CsmConfigurationProviderChain","thread_name":"main","level":"DEBUG","level_value":10000}
{"@timestamp":"2019-11-14T10:47:56.472+00:00","@version":"1","message":"Unable to load configuration from com.amazonaws.monitoring.ProfileCsmConfigurationProvider@3c8bdd5b: Unable to load config file","logger_name":"com.amazonaws.monitoring.CsmConfigurationProviderChain","thread_name":"main","level":"DEBUG","level_value":10000}
{"@timestamp":"2019-11-14T10:47:56.479+00:00","@version":"1","message":"Unable to load credentials from EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))","logger_name":"com.amazonaws.auth.AWSCredentialsProviderChain","thread_name":"main","level":"DEBUG","level_value":10000}
{"@timestamp":"2019-11-14T10:47:56.479+00:00","@version":"1","message":"Unable to load credentials from SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey)","logger_name":"com.amazonaws.auth.AWSCredentialsProviderChain","thread_name":"main","level":"DEBUG","level_value":10000}
{"@timestamp":"2019-11-14T10:47:56.480+00:00","@version":"1","message":"Unable to load credentials from com.amazonaws.auth.profile.ProfileCredentialsProvider@4287d447: profile file cannot be null","logger_name":"com.amazonaws.auth.AWSCredentialsProviderChain","thread_name":"main","level":"DEBUG","level_value":10000}
{"@timestamp":"2019-11-14T10:47:56.480+00:00","@version":"1","message":"Unable to load credentials from EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))","logger_name":"com.amazonaws.auth.AWSCredentialsProviderChain","thread_name":"main","level":"DEBUG","level_value":10000}
{"@timestamp":"2019-11-14T10:47:56.480+00:00","@version":"1","message":"Unable to load credentials from SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey)","logger_name":"com.amazonaws.auth.AWSCredentialsProviderChain","thread_name":"main","level":"DEBUG","level_value":10000}
{"@timestamp":"2019-11-14T10:47:56.480+00:00","@version":"1","message":"Unable to load credentials from com.amazonaws.auth.profile.ProfileCredentialsProvider@4287d447: profile file cannot be null","logger_name":"com.amazonaws.auth.AWSCredentialsProviderChain","thread_name":"main","level":"DEBUG","level_value":10000}
... this repeats many times

I also get a StackOverflowError after the AWSCredentialsProviderChain messages have repeated many times. The stack trace does include a reference to WebIdentityTokenCredentialsProvider -

Exception in thread "main" java.lang.StackOverflowError
    at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
    at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:242)
    at java.io.File.exists(File.java:819)
    at com.amazonaws.profile.path.cred.CredentialsDefaultLocationProvider.getLocation(CredentialsDefaultLocationProvider.java:33)
    at com.amazonaws.profile.path.AwsProfileFileLocationProviderChain.getLocation(AwsProfileFileLocationProviderChain.java:41)
    at com.amazonaws.auth.profile.ProfilesConfigFile.getCredentialProfilesFile(ProfilesConfigFile.java:200)
    at com.amazonaws.auth.profile.ProfilesConfigFile.<init>(ProfilesConfigFile.java:100)
    at com.amazonaws.auth.profile.ProfileCredentialsProvider.getCredentials(ProfileCredentialsProvider.java:135)
    at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:117)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.getCredentialsFromContext(AmazonHttpClient.java:1225)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.runBeforeRequestHandlers(AmazonHttpClient.java:801)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:751)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512)
    at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.doInvoke(AWSSecurityTokenServiceClient.java:1368)
    at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.invoke(AWSSecurityTokenServiceClient.java:1335)
    at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.invoke(AWSSecurityTokenServiceClient.java:1324)
    at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.executeAssumeRole(AWSSecurityTokenServiceClient.java:491)
    at com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient.assumeRole(AWSSecurityTokenServiceClient.java:464)
    at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider.newSession(STSAssumeRoleSessionCredentialsProvider.java:321)
    at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider.access$000(STSAssumeRoleSessionCredentialsProvider.java:37)
    at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider$1.call(STSAssumeRoleSessionCredentialsProvider.java:76)
    at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider$1.call(STSAssumeRoleSessionCredentialsProvider.java:73)
    at com.amazonaws.auth.RefreshableTask.refreshValue(RefreshableTask.java:257)
    at com.amazonaws.auth.RefreshableTask.blockingRefresh(RefreshableTask.java:213)
    at com.amazonaws.auth.RefreshableTask.getValue(RefreshableTask.java:154)
    at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider.getCredentials(STSAssumeRoleSessionCredentialsProvider.java:299)
    at com.amazonaws.auth.STSAssumeRoleSessionCredentialsProvider.getCredentials(STSAssumeRoleSessionCredentialsProvider.java:36)
    at com.amazonaws.auth.profile.internal.securitytoken.STSProfileCredentialsServiceProvider.getCredentials(STSProfileCredentialsServiceProvider.java:71)
    at com.amazonaws.auth.WebIdentityTokenCredentialsProvider.getCredentials(WebIdentityTokenCredentialsProvider.java:72)

When we implemented IRSA for our Java apps, we found that as configured in the default chain instance profile creds outrank WebIdentity; whereas in the Ruby and Go SDKs WebIdentity took precedence. Since the kubelet auths with IAM there's no way to not have an instance profile, and we don't have the option of blocking the metadata API for pods until every last service in our cluster is using IRSA. Unless the Java SDK changes its precedence order to match the other SDKs we're going to need to provide our own chains in all of our Java apps, which is less than ideal.

I was experiencing a StackOverflowError because another library was pulling in an older version of aws-java-sdk-sts, which used to initialise a default credentials provider internally. Updating this resolved the StackOverflowError.

Once that issue was resolved, as mentioned above we had to provide a credential chain to give WebIdentityTokenCredentialsProvider precedence. It would be helpful if the java sdk's default credentials provider precedence order was changed to give the WebIdentityTokenCredentialsProvider higher precedence.
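To make the precedence point concrete, here is a simplified model of a "first provider that succeeds wins" chain. It is not SDK code: `FirstMatchChain`, `Source`, and the role names are hypothetical, and it only assumes that the chain stops at the first source that returns something, as the comments above describe.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Toy model of a credentials chain: sources are tried in order, first hit wins.
final class FirstMatchChain {
    // A named source that may or may not yield a role (hypothetical stand-in for a provider).
    static final class Source {
        final String name;
        final Supplier<Optional<String>> lookup;

        Source(String name, Supplier<Optional<String>> lookup) {
            this.name = name;
            this.lookup = lookup;
        }
    }

    // Returns the role from the first source in the list that yields one.
    static String resolve(List<Source> chain) {
        for (Source s : chain) {
            Optional<String> role = s.lookup.get();
            if (role.isPresent()) {
                return role.get();
            }
        }
        throw new IllegalStateException("no source produced credentials");
    }

    public static void main(String[] args) {
        // On an EKS node both sources are available, so only the order decides.
        Source instanceProfile = new Source("instance-profile", () -> Optional.of("node-role"));
        Source webIdentity = new Source("web-identity", () -> Optional.of("service-account-role"));

        System.out.println(resolve(Arrays.asList(instanceProfile, webIdentity))); // node-role
        System.out.println(resolve(Arrays.asList(webIdentity, instanceProfile))); // service-account-role
    }
}
```

With the real SDK, the equivalent workaround would presumably be a custom chain that lists WebIdentityTokenCredentialsProvider.create() ahead of the other providers and is passed to the client builder via withCredentials(...).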

I also encountered the same issue. I upgraded my aws-java-sdk-sts library to the same version as my S3 library (1.11.623), and that fixed it.

In our case all our SDK libraries are on 1.11.659, but the problem persists.

https://github.com/aws/aws-sdk-java-v2/pull/1583 will increase the priority of the web identity environment variables in 2.x (and a similar change will be coming out for 1.11.x) to be higher than the profile file properties, but this is the only issue we could reproduce on our end.

Please make sure you're using the latest SDK runtime/core as well as the latest client version.

We fixed this by adding the aws-java-sdk-sts library to our build; we're on 1.11.699. (Apparently, when it's missing, the default provider chain skips web identity.)
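If you want to confirm from inside the pod whether the STS module actually made it onto the runtime classpath, a small check like the following may help. It is only a sketch: `StsClasspathCheck` is a made-up name, and the class it looks for is the STS client class that appears in the stack trace earlier in this thread.

```java
// Checks whether a class can be found on the classpath without initialising it.
final class StsClasspathCheck {
    // Class name taken from the stack trace posted earlier in this thread.
    static final String STS_CLIENT =
            "com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient";

    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate and load the class, do not run static initialisers.
            Class.forName(className, false, StsClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("aws-java-sdk-sts present: " + isOnClasspath(STS_CLIENT));
    }
}
```

A `false` result here would be consistent with the behaviour described above, though it does not by itself prove that a missing STS module is the cause.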

The fix has been released as part of 1.11.704; please try the latest version.

https://github.com/aws/aws-sdk-java/blob/master/CHANGELOG.md#111704-2020-01-09

We added 1.11.704 last week-ish, and it looks to have fixed the issue. 🎉

@thundergolfer Thanks for reporting back! Closing the issue. Feel free to reopen if you have further questions.

So to clarify, we need to upgrade SDKs to ^1.11.704 _and_ also include aws-java-sdk-sts@^1.11.704? We found that without aws-java-sdk-sts pods will not use the web identity token credentials no matter what version of the SDK it is.

I tried upgrading the aws-java-sdk modules to 1.11.704, also including aws-java-sdk-sts, without code changes, and it didn't work:

sqs = AmazonSQSClientBuilder.defaultClient()

Changing my client initialization to this, it works:

val awsCredentialProvider = WebIdentityTokenCredentialsProvider
    .builder()
    .roleArn(System.getenv("AWS_ROLE_ARN"))
    .roleSessionName(System.getenv("AWS_ROLE_SESSION_NAME"))
    .webIdentityTokenFile(System.getenv("AWS_WEB_IDENTITY_TOKEN_FILE"))
    .build()
val sqs = AmazonSQSClientBuilder.standard()
    .withCredentials(awsCredentialProvider)
    .build()

If AmazonSQSClientBuilder.defaultClient() uses DefaultAWSCredentialsProviderChain, and according to https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/DefaultAWSCredentialsProviderChain.html WebIdentityTokenCredentialsProvider is already included in that chain with higher precedence, shouldn't this already work?

Detail: I also have an IAM policy attached to the EC2 worker nodes (for EKS cluster purposes).
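When comparing the explicit provider with the default client, it can help to first confirm that the three environment variables read in the snippet above are visible to the JVM and that the token file is readable. A rough sketch using only the JDK; `WebIdentityEnvCheck` is a hypothetical helper, and it deliberately reports only whether each value is present, never the value itself:

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Reports whether the web identity environment variables are set, without printing their values.
final class WebIdentityEnvCheck {
    // Variable names as used in the snippet above.
    static final String[] VARS = {
        "AWS_ROLE_ARN", "AWS_ROLE_SESSION_NAME", "AWS_WEB_IDENTITY_TOKEN_FILE"
    };

    // Returns a status string per variable for the given environment map.
    static Map<String, String> check(Map<String, String> env) {
        Map<String, String> report = new LinkedHashMap<>();
        for (String name : VARS) {
            String value = env.get(name);
            if (value == null || value.isEmpty()) {
                report.put(name, "not set");
            } else if (name.equals("AWS_WEB_IDENTITY_TOKEN_FILE")) {
                // For the token file, also check that the path can actually be read.
                boolean readable = Files.isReadable(Paths.get(value));
                report.put(name, readable ? "set, file readable" : "set, file NOT readable");
            } else {
                report.put(name, "set");
            }
        }
        return report;
    }

    public static void main(String[] args) {
        check(System.getenv()).forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```

If everything reports as set and readable but the default client still picks up the node role, the environment is probably not the problem and the dependency versions are the next thing to look at.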

@jhonatanmorais before investigating further, can you make sure there are no conflicting versions in your environment? Could you run mvn dependency:tree and post the relevant output?

@debora-ito:

+--- com.amazonaws:amazon-sqs-java-messaging-lib:1.0.8
|    \--- org.apache.geronimo.specs:geronimo-jms_1.1_spec:1.1.1
+--- com.amazonaws:aws-java-sdk-sqs:1.11.708
|    +--- com.amazonaws:aws-java-sdk-core:1.11.708
|    \--- com.amazonaws:jmespath-java:1.11.708
|         \--- com.fasterxml.jackson.core:jackson-databind:2.6.7.3 (*)
+--- com.amazonaws:aws-java-sdk-core:1.11.708 (*)
+--- com.amazonaws:aws-java-sdk-sts:1.11.708
|    +--- com.amazonaws:aws-java-sdk-core:1.11.708 (*)
|    \--- com.amazonaws:jmespath-java:1.11.708 (*)
+--- org.apache.logging.log4j:log4j-api:2.13.0
\--- org.apache.logging.log4j:log4j-core:2.13.0
     \--- org.apache.logging.log4j:log4j-api:2.13.0

@jhonatanmorais so I assume you have tried WebIdentityTokenCredentialsProvider.create() and it did not work either?

@bqnguyen94 I've tried only those two scenarios mentioned above.

@bqnguyen94, did you test using defaultClient() implementation?

We built AWSSecretsManager clients with standard() so that we can specify the region, then build(), so I assume it is the same as defaultClient()

I've just upgraded to 1.11.717 and everything is fine now!

@jhonatanmorais 1.11.717 worked without code changes?

@bqnguyen94 We built AWSSecretsManager clients with standard() so that we can specify the region, then build(), so I assume it is the same as defaultClient()

  def createS3Client(): AmazonS3 =
    AmazonS3ClientBuilder
      .standard()
      .withEndpointConfiguration(new EndpointConfiguration(appConfig.s3.endpoint, appConfig.s3.region))
      .withPathStyleAccessEnabled(true)
      .build()

We use standard(), which is the same as defaultClient() as you said, but the app still doesn't use the service account IAM role; it uses the IAM role of the node instead :/

@SMR39 We use standard(), which is the same as defaultClient() as you said, but the app still doesn't use the service account IAM role; it uses the IAM role of the node instead :/

Actually I had to also include aws-java-sdk-sts in the pom file for it to work. I'm still at 1.11.704 though.

@jhonatanmorais 1.11.717 worked without code changes?

Yes. Did you follow the last recommendation?
