What happened?
Running eksctl update cluster -f path/to/config.yml repeatedly logs this retryable error:
[2020-03-26T01:42:46Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:42:46Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 44.139577ms
before returning:
[2020-03-26T01:49:40Z] request expired, resigning
and then, roughly seven minutes after the first error, the cluster update completes successfully.
What you expected to happen?
Running eksctl update cluster -f path/to/config.yml runs successfully without any timeouts.
How to reproduce it?
Run eksctl on an EC2 instance (I'm using Amazon Linux 2)
Anything else we need to know?
The following IAM role policy was applied to the instance:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "iam:*User*",
        "iam:*Login*",
        "iam:*Group*",
        "iam:*Provider*",
        "aws-portal:*",
        "budgets:*",
        "config:*",
        "directconnect:*",
        "aws-marketplace:*",
        "aws-marketplace-management:*",
        "ec2:*ReservedInstances*"
      ],
      "Resource": "*",
      "Effect": "Deny"
    },
    {
      "Action": "*",
      "Resource": "*",
      "Effect": "Allow"
    }
  ]
}
Versions
Please paste in the output of these commands:
$ eksctl version
0.15.0
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:20:10Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15+", GitVersion:"v1.15.10-eks-bac369", GitCommit:"bac3690554985327ae4d13e42169e8b1c2f37226", GitTreeState:"clean", BuildDate:"2020-02-26T01:12:54Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
Logs
[2020-03-26T01:42:41Z] [ℹ] eksctl version 0.15.0
[2020-03-26T01:42:41Z] [ℹ] using region ap-southeast-2
[2020-03-26T01:42:46Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:42:46Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 44.139577ms
[2020-03-26T01:42:52Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:42:52Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 60.808434ms
[2020-03-26T01:42:57Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:42:57Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 176.985772ms
[2020-03-26T01:43:02Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:43:02Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 269.781184ms
[2020-03-26T01:43:07Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:43:07Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 494.672656ms
[2020-03-26T01:43:13Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:43:13Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 1.25120592s
[2020-03-26T01:43:19Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:43:19Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 3.73933344s
[2020-03-26T01:43:28Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:43:28Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 6.520018048s
[2020-03-26T01:43:39Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:43:39Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 9.686971392s
[2020-03-26T01:43:54Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:43:54Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 18.913449984s
[2020-03-26T01:44:18Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:44:18Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 53.410043904s
[2020-03-26T01:45:16Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:45:16Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 1m48.563943424s
[2020-03-26T01:47:10Z] [!] retryable error (RequestError: send request failed
[2020-03-26T01:47:10Z] caused by: Put http://169.254.169.254/latest/api/token: net/http: request canceled (Client.Timeout exceeded while awaiting headers)) from ec2metadata/GetToken - will retry after delay of 2m25.7145856s
[2020-03-26T01:49:40Z] request expired, resigning
[2020-03-26T01:49:41Z] [!] NOTE: config file is used for finding cluster name and region
[2020-03-26T01:49:41Z] [!] NOTE: cluster VPC (subnets, routing & NAT Gateway) configuration changes are not yet implemented
[2020-03-26T01:49:41Z] [ℹ] re-building cluster stack "eksctl-my-cluster-cluster"
[2020-03-26T01:49:41Z] [✔] all resources in cluster stack "eksctl-my-cluster-cluster" are up-to-date
[2020-03-26T01:49:41Z] [ℹ] checking security group configuration for all nodegroups
[2020-03-26T01:49:41Z] [ℹ] all nodegroups have up-to-date configuration
Same issue with both 0.15.0 and 0.16.0-rc.1 on a Mac.
@jcleal 👋
Same for me. Running eksctl as part of a CI pipeline on an ec2 instance with the same IAM policy above attached to the ec2 instance profile.
Works on eksctl 0.10.0.
Upgraded to 0.16.0 and I see same errors as above.
Due to above comment, tried 0.14.0 - same error.
Rolled back to 0.10.1 - works fine.
Same command run locally (using AWS_PROFILE set to a working profile) works on 0.16.0.
The issue is caused by a change in how the AWS SDK talks to the instance metadata service.
Before aws-sdk-go v1.25.38, the SDK used Instance Metadata Service Version 1 (IMDSv1), which only needs GET requests. Since then, the SDK requests an IMDSv2 session token with a PUT to /latest/api/token, which is the request that times out in the logs above.
By default, the response to PUT requests has a response hop limit (time to live) of 1 at the IP protocol level. You can adjust the hop limit using the modify-instance-metadata-options command if you need to make it larger. For example, you might need a larger hop limit for backward compatibility with container services running on the instance. For more information, see modify-instance-metadata-options in the AWS CLI Command Reference.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html
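To check whether the token request itself is what times out, the same kind of PUT can be reproduced with curl from the instance, or from inside the container where eksctl runs. This is only a rough sketch: the function name imds_token and the 5-second timeout are my own choices for illustration, not anything eksctl or the SDK uses.

```shell
# Request an IMDSv2 session token with a PUT, similar to what the SDK does.
# Prints the token on success; returns non-zero on timeout or HTTP error.
# $1 (optional): metadata endpoint, defaults to the link-local address.
imds_token() {
  endpoint="${1:-http://169.254.169.254}"
  curl --silent --show-error --fail --max-time 5 \
    -X PUT "${endpoint}/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"
}
```

Running imds_token on the host and then inside the container should show the difference: if it succeeds on the host but times out in the container, the hop limit is the likely culprit.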
To be able to run eksctl in a container and have it successfully authenticate, you will need to have Docker use the host network (to reduce the request's hop count) or increase the max allowed hops to 2 like so:
aws ec2 modify-instance-metadata-options \
--instance-id i-1234567898abcdef0 \
--http-put-response-hop-limit 2 \
--http-endpoint enabled
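After changing the hop limit it may be worth confirming that the new value took effect. A minimal sketch, assuming the AWS CLI is configured and allowed to call ec2:DescribeInstances; the function name hop_limit is hypothetical and only there for convenience:

```shell
# Print the current PUT response hop limit of an instance.
# $1: instance ID, e.g. i-1234567898abcdef0
hop_limit() {
  aws ec2 describe-instances \
    --instance-ids "$1" \
    --query 'Reservations[].Instances[].MetadataOptions.HttpPutResponseHopLimit' \
    --output text
}
```

Calling hop_limit i-1234567898abcdef0 should report 2 once the change above has been applied.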
Sources:
https://github.com/aws/aws-sdk-go/issues/2972
https://rtfm.co.ua/en/aws-eksctl-put-http-169-254-169-254-latest-api-token-net-http-request-canceled-2/
I think we can close this as the solution provided above should solve this issue.
Additionally, we now set the hop limit for instances created with eksctl to 2, so for clusters created with eksctl from now on, it should just work.