Containers-roadmap: [EKS] [request]: A reliable EKS AMI release process

Created on 7 Jun 2019 · 22 Comments · Source: aws/containers-roadmap

My request: The AMI release process needs to be reliable

We have been EKS users since the first preview and one significant pain point is issues with the AMI. From the outside it looks like the release process is inconsistent and/or unreliable.

Here are some notable examples:

  1. Simple log rotation missing in AMI: https://github.com/awslabs/amazon-eks-ami/pull/74
  2. Commits missing from a release: https://github.com/awslabs/amazon-eks-ami/issues/215
  3. Names of releases were not consistent in the beginning: https://github.com/awslabs/amazon-eks-ami/releases
  4. New AMI version breaking ulimit settings: https://github.com/awslabs/amazon-eks-ami/issues/193
  5. Typos in some init scripts: https://github.com/awslabs/amazon-eks-ami/pull/192
  6. Changelog incorrectly filled out: https://github.com/awslabs/amazon-eks-ami/pull/241
  7. AMI v20190329 released in the AWS console but not on GitHub: https://github.com/awslabs/amazon-eks-ami/issues/233#issuecomment-499590330
  8. Ulimit thing STILL not fixed, read this comment to get the history on this issue, it's crazy: https://github.com/awslabs/amazon-eks-ami/issues/278#issuecomment-498693960

Now, I know and expect some bugs, and I also recognise it is (or was) a new service, so of course there are a few kinks to work out, but the last 2 items are recent. This is not the quality that I have come to expect from my favourite cloud provider 💔

EKS Proposed

All 22 comments

@mhausenblas

EKS AMI releases appear to have simply ceased since 29 March. I could not find any newer AMIs, so important fixes to self-inflicted wounds like ulimit are not available.

The project home page and releases page list the latest AMIs as 27 March, and the AWS Marketplace says the latest AMI version is 20 February.

Additionally, if there is a supported method for subscribing to EKS AMI updates, I haven't found it in the documentation.

@wolverian you could watch https://github.com/awslabs/amazon-eks-ami in releases-only mode, that corresponded well with the releases so far.

> you could watch https://github.com/awslabs/amazon-eks-ami in releases-only mode

This is exactly what I do. But this only works when the release process is reliable, which is what this issue is about 🙏

@max-rocket-internet I meant by release-time, not necessarily by content ;-) Although I have to agree that the tagging in this case is slightly misleading.

> I meant by release-time, not necessarily by content

I get you, but number 7 on my list shows that there have been releases in the AWS console but not on GitHub. That's why I mentioned it 😃

Hi, sorry for the issues and trouble caused.

We are working on a more standardized AMI release process which can be used to release AMIs more frequently.
There will be SNS notifications for new AMI releases.
Also, there will be an SSM public parameter that references the latest available AMI.

BTW, the GitHub releases don't correlate with our AMI releases, e.g. we may release new AMIs in case of security patches.

Thank you, that sounds great! Sorry for nagging, but would it be possible to synchronize GitHub releases with AMI releases? If not, maybe disabling GitHub releases altogether would make sense, to reduce potential confusion.

Synchronized releases would be :heart:. We pull information about recent updates for all our services over RSS and GH fits nicely into this workflow.

> Synchronized releases would be ❤️

Exactly 💯

An SNS topic is not very user-friendly when much of our current software and workflow already revolves around GitHub.

@max-rocket-internet @szymonpk
Hi, could you describe how your current workflow builds the AMI? Do you rebuild the AMI with this repo or use our published AMI as a base image?

There are actually three components that can be involved in the release process here:

  1. The AMI build binary artifacts (binaries in our amazon-eks S3 bucket). These are updated when we release new Kubernetes versions or binary patches.
  2. The AMI build source code (build scripts/files in this `amazon-eks-ami` repo).
  3. The actual AMI releases. These can happen when either of the above two changes, or when a newer base Amazon Linux 2 AMI has been released (e.g. security patches).

For the GitHub release, I think it can be synced with 1 and 2 above.

@M00nF1sh that all sounds like a big improvement on the current situation. There is also an immediate need, as the latest available EKS AMIs are more than 3 months old and contain some pretty problematic bugs, like this ulimit issue. Would you be able to cut an updated AMI using the old process in the meantime?

@whereisaaron We'll release new AMIs on Monday 😄

@wolverian more context on what @M00nF1sh mentioned, we are working on launching an AMI SSM parameter that you can use along with an SNS topic that you can subscribe to. See https://github.com/aws/containers-roadmap/issues/231
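
For readers who want to see what consuming such a parameter might look like, here is a minimal Python sketch. The parameter path layout is an assumption: it follows the form AWS documented for the Amazon Linux 2 EKS-optimized AMI after the feature launched, and it is not stated anywhere in this thread. The helper names `eks_ami_parameter_name` and `latest_eks_ami_id` are hypothetical. The client argument is expected to behave like a boto3 SSM client, whose `get_parameter` call returns the value under `Parameter.Value`.

```python
def eks_ami_parameter_name(kubernetes_version: str) -> str:
    """Build the public SSM parameter name for the recommended EKS AMI.

    Assumption: this path layout matches what AWS documented for the
    Amazon Linux 2 EKS-optimized AMI; verify it against current docs.
    """
    return (
        "/aws/service/eks/optimized-ami/"
        f"{kubernetes_version}/amazon-linux-2/recommended/image_id"
    )


def latest_eks_ami_id(ssm_client, kubernetes_version: str) -> str:
    """Look up the recommended AMI ID for a Kubernetes minor version.

    `ssm_client` is anything exposing get_parameter(Name=...) the way a
    boto3 SSM client does, e.g. boto3.client("ssm", region_name="us-west-2").
    """
    response = ssm_client.get_parameter(
        Name=eks_ami_parameter_name(kubernetes_version)
    )
    return response["Parameter"]["Value"]
```

Passing the client in, rather than creating it inside the function, keeps the lookup region-agnostic and easy to exercise with a stub.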

@M00nF1sh

> Hi, could you describe how your current workflow builds the AMI? Do you rebuild the AMI with this repo or use our published AMI as a base image?

Sure: We don't build AMIs. We want to avoid this administrative overhead. We just want reliable, bug-free AMIs provided by AWS with:

  • A clear and consistent change log (6)
  • Releases published on GitHub (7)
  • Consistency between GitHub release, change log and AMI in AWS console (2, 3, 7)
  • No basic/obvious bugs (1, 5, 8)

(the numbers are references to my points in my first post).

> we are working on launching an AMI SSM parameter that you can use along with an SNS topic that you can subscribe to

This problem is non-existent for Terraform users 😃

@M00nF1sh
Last month, not every important security update (important from our point of view, not necessarily rated "important" in severity) got a new EKS AMI, or there was no clear notification about it. We are subscribed to ALAS2 and AWS security bulletins. Our approach varies: if there is an official AMI, we use it. If there is an important update without a new AMI, we build our own images. The last time we had to build our own images was when the Intel security issues were disclosed.

The best solution for us would be synchronized 1, 2 and 3. I see 3 may not happen, so it may be a good idea to have something similar to the ALAS2 bulletin but for EKS. However, I see it is rather unlikely that there will be separate security feeds for AWS services.

> This problem is non-existent for Terraform users

Well, it appears I spoke too soon 🙁

In the Hashicorp EKS Terraform module, we filter AMI IDs by name. This means we can easily specify a release by its name and not worry about what region we are in. That's why item number 3 is important. But in AWS region us-west-2 there are 2 AMIs with the name amazon-eks-gpu-node-1.13-v20190614: https://us-west-2.console.aws.amazon.com/ec2/v2/home?region=us-west-2#Images:visibility=public-images;ownerAlias=602401143452;search=amazon-eks-gpu-node-1.13-v20190614;sort=desc:creationDate

Even if you aren't using Terraform, how would you know what AMI is the correct one?

Noted in https://github.com/awslabs/amazon-eks-ami/issues/291

This is what I mean by "release process". I don't know what the process behind the scenes is, or how it works, but if it were automated properly, this should be impossible.
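
As an illustration of the kind of automated guard that would catch the duplicate-name case described above, here is a small Python sketch. It assumes the input is a list of image records shaped like the `Images` entries returned by the EC2 DescribeImages API (each with `Name` and `ImageId` keys). The function name `find_duplicate_ami_names` is hypothetical, and this is not a description of how AWS actually builds or publishes AMIs.

```python
from collections import defaultdict


def find_duplicate_ami_names(images):
    """Return {name: [image ids]} for every AMI name used more than once.

    `images` is assumed to look like the "Images" list of an EC2
    DescribeImages response: dicts with "Name" and "ImageId" keys.
    """
    ids_by_name = defaultdict(list)
    for image in images:
        ids_by_name[image["Name"]].append(image["ImageId"])
    # Keep only names that map to more than one image.
    return {
        name: sorted(ids)
        for name, ids in ids_by_name.items()
        if len(ids) > 1
    }
```

A release pipeline could run a check like this per region and refuse to publish when the result is non-empty, since anyone selecting AMIs by name would otherwise get an ambiguous match.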

> we are working on launching an AMI SSM parameter that you can use along with an SNS topic that you can subscribe to.

@tabern I just want to highlight one more point related to SSM. I hit an issue in which the SSM parameter was updated to amazon/amazon-eks-node-1.15-v20200312 around 3 weeks ago, but CF didn't support the new value for ReleaseVersion, which caused the stack update to fail.

I just checked and found that the newest AMI release at https://github.com/awslabs/amazon-eks-ami/releases is v20200406; I'm not sure whether it's supported by CF or not.

Not sure if it is entirely related to this issue, but I find the Kubernetes version matching AMIs quite uncomfortable. Preferably, I'd like to see a list of Kubernetes versions (including the patch version) and the AMI in a single list. As far as I can tell, there is no way to tell which exact Kubernetes version is used in an AMI.
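
To make the gap concrete, here is a short Python sketch that parses AMI names of the form seen in this thread (for example `amazon-eks-node-1.15-v20200312` and `amazon-eks-gpu-node-1.13-v20190614`). The pattern is inferred only from those two examples and may not cover every published name; `parse_eks_ami_name` and `EksAmiName` are hypothetical names. The name yields a Kubernetes minor version and a release date, but no patch version.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the AMI names quoted in this thread (an assumption).
_AMI_NAME = re.compile(
    r"^amazon-eks-(?P<gpu>gpu-)?node-(?P<k8s>\d+\.\d+)-v(?P<release>\d{8})$"
)


class EksAmiName(NamedTuple):
    kubernetes_minor: str  # e.g. "1.15"; the patch version is not in the name
    release: str           # e.g. "20200312"
    gpu: bool


def parse_eks_ami_name(name: str) -> Optional[EksAmiName]:
    """Split an EKS AMI name into its parts, or return None if it doesn't match."""
    match = _AMI_NAME.match(name)
    if match is None:
        return None
    return EksAmiName(
        kubernetes_minor=match["k8s"],
        release=match["release"],
        gpu=match["gpu"] is not None,
    )
```

Anything more precise than the minor version has to come from another source, which is exactly the list this comment asks for.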

Bumping up against the ulimit hard limit issue in EKS Fargate (see also: https://github.com/aws/containers-roadmap/issues/1013) which is rather painful because the whole point of Fargate is to abstract away node complexity. Now I'm wishing I was embracing that managed node complexity, because it would at least provide me an escape hatch.

Is it planned to support AWS AMI builder https://github.com/awslabs/amazon-eks-ami/issues/548 ?
