Aws-sdk-net: AssumeRoleWithWebIdentityCredentials does not complete authorisation in EKS

Created on 8 Feb 2020 · 7 Comments · Source: aws/aws-sdk-net

Expected Behavior

When using AssumeRoleWithWebIdentityCredentials with a service account to receive messages from SQS in an EKS cluster, retrieving credentials hangs forever.

Steps to Reproduce

We have set up an EKS cluster and roles according to https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-technical-overview.html. However, we cannot authenticate using the .NET SDK.

Here is what I have set up in EKS:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-service-account
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::xxx:role/xxx
---
# Excerpt from the workload's pod template (other fields omitted)
template:
    spec:
      serviceAccountName: my-service-account

I am aware of the issue with running as non-root and the workaround using fsGroup. For the sake of experimentation, I have tried running as both root and non-root.

I can see in the pod that the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE have both been correctly set. I have also added the AWS_REGION environment variable to see if that made any difference.

AWS_REGION:  eu-west-2
AWS_ROLE_ARN: arn:aws:iam::xxx:role/xxx
AWS_WEB_IDENTITY_TOKEN_FILE:  path/to/token
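Before involving the SDK at all, the two injected variables and the token file can be checked from inside the pod. The following is a minimal sketch using only the .NET base class library; the class and method names (WebIdentityEnvCheck, FindProblems) are hypothetical and not part of the AWS SDK. Reading the file also exercises the non-root permission case mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class WebIdentityEnvCheck
{
    // Returns human-readable problems; an empty list means both variables
    // are set and the token file is readable and non-empty.
    // getEnv is injected so the check can be exercised without a real pod.
    public static List<string> FindProblems(Func<string, string> getEnv)
    {
        var problems = new List<string>();

        var roleArn = getEnv("AWS_ROLE_ARN");
        if (string.IsNullOrWhiteSpace(roleArn))
            problems.Add("AWS_ROLE_ARN is not set");

        var tokenFile = getEnv("AWS_WEB_IDENTITY_TOKEN_FILE");
        if (string.IsNullOrWhiteSpace(tokenFile))
        {
            problems.Add("AWS_WEB_IDENTITY_TOKEN_FILE is not set");
            return problems;
        }

        try
        {
            // Reading the file also surfaces permission errors, e.g. a
            // non-root user without a suitable fsGroup.
            if (File.ReadAllText(tokenFile).Trim().Length == 0)
                problems.Add($"Token file '{tokenFile}' is empty");
        }
        catch (Exception ex) when (ex is IOException || ex is UnauthorizedAccessException)
        {
            problems.Add($"Cannot read token file '{tokenFile}': {ex.Message}");
        }

        return problems;
    }
}
```

At startup this could be called as WebIdentityEnvCheck.FindProblems(Environment.GetEnvironmentVariable) and the resulting list logged.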

My application is a .NET Core 3.1 web API project. Here is how I have tried to connect to SQS.

(code simplified)

var amazonSQSConfig = new AmazonSQSConfig()
{
    ServiceURL = "https://sqs.eu-west-2.amazonaws.com/xxx",
    RegionEndpoint = RegionEndpoint.GetBySystemName("eu-west-2")
};

var amazonSQSClient = new AmazonSQSClient(amazonSQSConfig);

ReceiveMessageRequest receiveMessageRequest = new ReceiveMessageRequest("https://sqs.eu-west-2.amazonaws.com/xxx/my-queue")
{
     WaitTimeSeconds = 5
};

ReceiveMessageResponse receiveMessageResponse = await amazonSQSClient.ReceiveMessageAsync(receiveMessageRequest);

The line below hangs forever and no error is returned.

ReceiveMessageResponse receiveMessageResponse = await amazonSQSClient.ReceiveMessageAsync(receiveMessageRequest);

It is my understanding that if no credentials are passed to the AmazonSQSClient, the FallbackCredentialsFactory should automatically use AssumeRoleWithWebIdentityCredentials and read the token from the file.

I've also tried setting the credentials myself but the same issue occurs.

credentials = AssumeRoleWithWebIdentityCredentials.FromEnvironmentVariables();
amazonSQSClient = new AmazonSQSClient(credentials, amazonSQSConfig);

It appears to be similar to this closed issue, https://github.com/aws/aws-sdk-net/issues/1493.

Context

This issue is preventing us from using service accounts to authenticate our use of SQS queues. We use a microservice architecture in which services communicate predominantly through SQS queues, so this is a major blocker for us.

Your Environment

AWS EKS 1.14
AWS SQS

.Net Core 3.1 (SDK and Runtime)

AWSSDK.Core 3.3.104.22
AWSSDK.SecurityToken 3.3.104.27
AWSSDK.SQS 3.3.102.61

<Project Sdk="Microsoft.NET.Sdk.Web">

  <PropertyGroup>
    <TargetFramework>netcoreapp3.1</TargetFramework>
    <LangVersion>latest</LangVersion>
    <Version>1.0.0</Version>
  </PropertyGroup>

  <ItemGroup>
    <PackageReference Include="AWSSDK.Core" Version="3.3.104.22" />
    <PackageReference Include="AWSSDK.SecurityToken" Version="3.3.104.27" />
    <PackageReference Include="AWSSDK.SQS" Version="3.3.102.61" />
    <PackageReference Include="Microsoft.AspNetCore.Diagnostics.HealthChecks" Version="2.2.0" />
    <PackageReference Include="Microsoft.EntityFrameworkCore.Tools" Version="3.1.0" />
    <PackageReference Include="Microsoft.Extensions.Diagnostics.HealthChecks.EntityFrameworkCore" Version="3.1.1" />
    <PackageReference Include="Npgsql.EntityFrameworkCore.PostgreSQL" Version="3.1.0" />
    <PackageReference Include="Npgsql.EntityFrameworkCore.PostgreSQL.Design" Version="1.1.1" />
    <PackageReference Include="Newtonsoft.Json" Version="12.0.3" />
    <PackageReference Include="SonarAnalyzer.CSharp" Version="8.3.0.14607">
      <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
      <PrivateAssets>all</PrivateAssets>
    </PackageReference>
  </ItemGroup>

</Project>

All 7 comments

You should already have credentials by the time this call is made. Can you please set logging to validate that the credentials are null?

You may also want to verify that your cluster and SQS can connect if you are using a VPC. https://docs.aws.amazon.com/eks/latest/userguide/cluster-endpoint.html

Are you waiting for WaitTimeSeconds to expire? This could run for five seconds.

Also, ServiceURL and RegionEndpoint are mutually exclusive. Setting one will clear the other. If not set, the region will be determined from the ServiceURL.
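Since the two properties are mutually exclusive, a client configuration would set only one of them. The configuration fragment below is a sketch assuming the AWSSDK.SQS package from the issue is referenced; the SqsClientFactory class and its method names are hypothetical. Note that the queue URL (with account ID and queue name) belongs on the request, not in ServiceURL.

```csharp
using Amazon;
using Amazon.SQS;

public static class SqsClientFactory
{
    // Option 1: set only the region and let the SDK derive the endpoint.
    public static IAmazonSQS CreateForRegion()
    {
        var config = new AmazonSQSConfig
        {
            RegionEndpoint = RegionEndpoint.EUWest2 // eu-west-2
        };
        return new AmazonSQSClient(config);
    }

    // Option 2: set only the service endpoint; the region is then
    // determined from the URL.
    public static IAmazonSQS CreateForServiceUrl()
    {
        var config = new AmazonSQSConfig
        {
            ServiceURL = "https://sqs.eu-west-2.amazonaws.com"
        };
        return new AmazonSQSClient(config);
    }
}
```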

It retrieves the credentials ok, here is the output showing the value of the RefreshingAWSCredentials object.

"WebIdentityTokenFile":"/var/run/secrets/eks.amazonaws.com/serviceaccount/token","RoleArn":"arn:aws:iam::xxx:role/xxx","RoleSessionName":"e299227e-ba1b-4fc2-b2e9-3013637da1d5","PreemptExpiryTime":"00:05:00"

The cluster can connect to SQS without problems; this setup previously worked with BasicAWSCredentials.

The WaitTimeSeconds is how long the receive command should wait after initially polling for messages.

Thanks for the tip on ServiceURL vs RegionEndpoint, I didn't realise that. I'll tidy it up, but it doesn't work with either one set or with neither.

The issue seems to be that ReceiveMessageResponse receiveMessageResponse = await amazonSQSClient.ReceiveMessageAsync(receiveMessageRequest); just hangs.
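One way to turn a silent hang like this into a visible failure while debugging is to race the call against a timer. This is a minimal sketch using only the base class library; HangDiagnostics and WithTimeout are hypothetical names, and the 30-second value in the usage comment is an arbitrary assumption. It does not cancel the underlying call, and it cannot help if the SDK method blocks synchronously before returning its task.

```csharp
using System;
using System.Threading.Tasks;

public static class HangDiagnostics
{
    // Awaits the task, but throws TimeoutException if it has not completed
    // within the given time. The original task keeps running in the
    // background; this only makes the hang observable.
    public static async Task<T> WithTimeout<T>(Task<T> task, TimeSpan timeout)
    {
        var finished = await Task.WhenAny(task, Task.Delay(timeout));
        if (finished != task)
            throw new TimeoutException($"No result after {timeout.TotalSeconds} seconds");

        // Awaiting again returns the result or rethrows the call's own exception.
        return await task;
    }
}

// Usage with the names from the issue (requires AWSSDK.SQS):
// var response = await HangDiagnostics.WithTimeout(
//     amazonSQSClient.ReceiveMessageAsync(receiveMessageRequest),
//     TimeSpan.FromSeconds(30));
```

The timeout should be comfortably longer than WaitTimeSeconds, otherwise an ordinary long poll would be reported as a hang.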

Does credentials.GetCredentials() return for you?

It throws an exception.

Unhandled exception. Amazon.Runtime.AmazonClientException: Error calling AssumeRole for role arn:aws:iam::xxxx:role/xxx
 ---> Amazon.SecurityToken.AmazonSecurityTokenServiceException: Not authorized to perform sts:AssumeRoleWithWebIdentity ---> Amazon.Runtime.Internal.HttpErrorResponseException: Exception of type 'Amazon.Runtime.Internal.HttpErrorResponseException' was thrown.
   at Amazon.Runtime.HttpWebRequestMessage.GetResponse()
   at Amazon.Runtime.Internal.HttpHandler`1.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Unmarshaller.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeSync(IExecutionContext executionContext)
   --- End of inner exception stack trace ---
   at Amazon.Runtime.Internal.HttpErrorResponseExceptionHandler.HandleException(IExecutionContext executionContext, HttpErrorResponseException exception)
   at Amazon.Runtime.Internal.ExceptionHandler`1.Handle(IExecutionContext executionContext, Exception exception)
   at Amazon.Runtime.Internal.ErrorHandler.ProcessException(IExecutionContext executionContext, Exception exception)
   at Amazon.Runtime.Internal.ErrorHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Signer.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointDiscoveryHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CredentialsRetriever.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RetryHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.EndpointResolver.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.Marshaller.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.CallbackHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.ErrorCallbackHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.PipelineHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.MetricsHandler.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.Internal.RuntimePipeline.InvokeSync(IExecutionContext executionContext)
   at Amazon.Runtime.AmazonServiceClient.Invoke[TResponse](AmazonWebServiceRequest request, InvokeOptionsBase options)
   at Amazon.SecurityToken.AmazonSecurityTokenServiceClient.AssumeRoleWithWebIdentity(AssumeRoleWithWebIdentityRequest request)
   at Amazon.SecurityToken.AmazonSecurityTokenServiceClient.Amazon.Runtime.SharedInterfaces.ICoreAmazonSTS_WebIdentity.CredentialsFromAssumeRoleWithWebIdentityAuthentication(String webIdentityToken, String roleArn, String roleSessionName, AssumeRoleWithWebIdentityCredentialsOptions options)
   --- End of inner exception stack trace ---
   at Amazon.SecurityToken.AmazonSecurityTokenServiceClient.Amazon.Runtime.SharedInterfaces.ICoreAmazonSTS_WebIdentity.CredentialsFromAssumeRoleWithWebIdentityAuthentication(String webIdentityToken, String roleArn, String roleSessionName, AssumeRoleWithWebIdentityCredentialsOptions options)
   at Amazon.Runtime.AssumeRoleWithWebIdentityCredentials.GenerateNewCredentials()
   at Amazon.Runtime.RefreshingAWSCredentials.GetCredentials()
   at FFCDemoPaymentService.Messaging.SqsReceiver.SetCredentials() in /app/Messaging/SqsReceiver.cs:line 46
   at FFCDemoPaymentService.Messaging.SqsReceiver.StartPolling() in /app/Messaging/SqsReceiver.cs:line 31
   at FFCDemoPaymentService.Messaging.MessageService.StartPolling() in /app/Messaging/MessageService.cs:line 46
   at FFCDemoPaymentService.Messaging.MessageService.ExecuteAsync(CancellationToken stoppingToken) in /app/Messaging/MessageService.cs:line 37
   at Microsoft.Extensions.Hosting.BackgroundService.StartAsync(CancellationToken cancellationToken)
   at Microsoft.Extensions.Hosting.Internal.Host.StartAsync(CancellationToken cancellationToken)
   at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.RunAsync(IHost host, CancellationToken token)
   at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.RunAsync(IHost host, CancellationToken token)
   at Microsoft.Extensions.Hosting.HostingAbstractionsHostExtensions.Run(IHost host)
   at FFCDemoPaymentService.Program.Main(String[] args) in /app/Program.cs:line 14

Hmm. This call uses AnonymousAWSCredentials because the API doesn't require credentials, so I don't think it's an SDK problem. My guess is that you have either configured some kind of proxy that might be causing issues or your source for your OIDC token isn't allowing authentication. Where are you sourcing the token from? Some people get this error when using Cognito's tokens, and I could send you resources to help with that, such as this: https://docs.amazonaws.cn/en_us/cognito/latest/developerguide/role-trust-and-permissions.html.

If you have Premium Support, they may be able to help you more with these EKS and IAM specifics too.
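When STS answers "Not authorized to perform sts:AssumeRoleWithWebIdentity", it can help to look at what the token actually claims, since a role's trust policy for service accounts typically conditions on the token's sub and aud claims. The sketch below decodes the payload of a JWT using only the base class library and System.Text.Json (available in .NET Core 3.1). WebIdentityTokenInspector and DecodePayload are hypothetical names, and the signature is deliberately not verified, so this is for inspection only.

```csharp
using System;
using System.Text;
using System.Text.Json;

public static class WebIdentityTokenInspector
{
    // Decodes the payload (second segment) of a JWT WITHOUT verifying
    // the signature. Intended for debugging, not for authentication.
    public static JsonElement DecodePayload(string jwt)
    {
        var parts = jwt.Trim().Split('.');
        if (parts.Length != 3)
            throw new FormatException("Expected a JWT with three dot-separated segments");

        // Convert base64url to standard base64 and restore the padding.
        var base64 = parts[1].Replace('-', '+').Replace('_', '/');
        base64 = base64.PadRight(base64.Length + (4 - base64.Length % 4) % 4, '=');

        var json = Encoding.UTF8.GetString(Convert.FromBase64String(base64));
        using var doc = JsonDocument.Parse(json);
        return doc.RootElement.Clone(); // Clone so the element outlives the document
    }
}

// Usage inside the pod (the path comes from AWS_WEB_IDENTITY_TOKEN_FILE):
// var path = Environment.GetEnvironmentVariable("AWS_WEB_IDENTITY_TOKEN_FILE");
// var claims = WebIdentityTokenInspector.DecodePayload(System.IO.File.ReadAllText(path));
// Console.WriteLine(claims.GetProperty("iss"));
// Console.WriteLine(claims.GetProperty("sub"));
// Console.WriteLine(claims.GetProperty("aud"));
```

The printed issuer, subject, and audience can then be compared by eye with the OIDC provider and the conditions in the role's trust policy.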

We have been able to authenticate tokens outside of the SDK in a pod using the CLI, so it seems to be an SDK issue.

I ran GetCredentials as below. Was that what you meant?

credentials = AssumeRoleWithWebIdentityCredentials.FromEnvironmentVariables();
var creds = credentials.GetCredentials();

We raised a support ticket with AWS and they confirmed some steps that should work. It was actually pretty similar to what I had tried above.

However, what I already had suddenly started working without my making any changes, so I'm not sure whether they also changed something to enable this feature in the eu-west-2 region.

Either way it confirmed it's not an SDK issue, so I'm closing my issue.
