kops currently uses a Debian image with a custom kernel by default.
Unfortunately, the kernel doesn't appear to be installed from an APT repository or the repository hasn't been updated, so I can't see an easy way to update it.
The recent Stack Clash vulnerability shows that kops needs two things:
Two questions:
An automated way to deal with master and node security updates in general
You can perform a rolling update on the cluster to update all of your nodes. Not sure what else you would like.
Provide an up-to-date kernel or use an out-of-the-box one that's updated by someone else (like Debian)
https://github.com/kubernetes/kops/issues/874 and a couple of other issues, that I cannot find right now, document that m3 and m4 have had kernel panics with the scheduler. Because of that, we are running a custom kernel. I think 4.10 has the fixes in the kernel that we need, but I cannot find the issue.
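For anyone who wants to check which kernel their nodes are actually running against the 4.10 figure mentioned above, here is a minimal Python sketch. The 4.10 threshold is only the guess from this comment, not a confirmed fix version, and the helper names are made up for illustration. Distribution kernels often backport fixes, so a version comparison is a rough proxy at best.

```python
import platform
import re


def parse_kernel_version(release):
    """Extract (major, minor) from a release string such as '4.9.0-3-amd64'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_at_least(release, minimum=(4, 10)):
    """Return True if the release is at or above `minimum` (assumed threshold)."""
    return parse_kernel_version(release) >= minimum


if __name__ == "__main__":
    # platform.release() reports the kernel of the machine this runs on.
    release = platform.release()
    status = "at or above 4.10" if kernel_at_least(release) else "older than 4.10"
    print(f"{release}: {status}")
```

Run it on a node (or via SSH) to see where the running kernel falls relative to the assumed threshold.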
How can I help
We have developer office hours every other Friday; see the README.md for more details. Please swing by to discuss more.
The kernel is maintained by @justinsb and the image is built out of another project https://github.com/kubernetes/kube-deploy/tree/master/imagebuilder
As far as I understand, performing a rolling update doesn't update the operating system of the VMs. Rather, the old instance is destroyed and a new one is created from an AMI (if using AWS) that contains outdated packages. Manually running "apt update && apt upgrade" doesn't seem like a viable option, mainly because of autoscaling and because the nodes can't simply be rebooted. Correct?
[EDIT] I've found the answer to this question myself: the VMs use unattended upgrades, which is fine for updates that don't require a reboot. [/EDIT]
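To spot the updates that do need a reboot, a small Python sketch like the following could run on each node. It assumes the usual Debian convention that a flag file is created at /var/run/reboot-required (with package names in a sibling .pkgs file) when an installed update needs a reboot. I have not verified that the kops image follows this convention, and the function names are hypothetical.

```python
from pathlib import Path

# Assumed location of the Debian reboot flag; not verified on the kops image.
REBOOT_FLAG = "/var/run/reboot-required"


def reboot_required(flag=REBOOT_FLAG):
    """Return True if the reboot flag file exists."""
    return Path(flag).exists()


def packages_needing_reboot(flag=REBOOT_FLAG):
    """Return the package names listed next to the flag file, if any."""
    pkgs_file = Path(str(flag) + ".pkgs")
    if not pkgs_file.exists():
        return []
    lines = pkgs_file.read_text().splitlines()
    # Drop blank lines and duplicates while keeping a stable order.
    return sorted({line.strip() for line in lines if line.strip()})


if __name__ == "__main__":
    if reboot_required():
        print("reboot required for:", ", ".join(packages_needing_reboot()) or "unknown")
    else:
        print("no reboot pending")
```

A node that reports a pending reboot is a candidate for replacement through a rolling update rather than an in-place reboot.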
So we normally have automatic updates enabled, and so you should get package updates.
However, it looks like we do need to build a new kernel version to get these patches. There's been some interruption because of some folks discussing where that should live, but I'll kick off another build in the regular process - thanks for the prompt!
When we build a new kernel version, we also build a new AMI, so that the updates are then "baked in" to the image. And that can then be applied with a rolling update - we bump the channels/alpha file and then the channels/stable file. Some people do prefer to disable automatic updates and rely on this mechanism, but it is more manual.
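The image rollout described above can be sketched as a small Python wrapper around the kops CLI. This is only my understanding of the usual sequence (upgrade, update, rolling-update), not an official procedure. The function names are hypothetical, and it assumes kops is on the PATH and the state store is configured (for example through KOPS_STATE_STORE). Without --yes, kops only previews the changes, so the sketch defaults to a preview.

```python
import subprocess


def image_rollout_commands(cluster_name, apply=False):
    """Build the kops commands that pick up a new channel image and roll it out.

    With apply=False the commands omit --yes, so kops only shows a preview.
    """
    steps = [
        ["kops", "upgrade", "cluster", cluster_name],
        ["kops", "update", "cluster", cluster_name],
        ["kops", "rolling-update", "cluster", cluster_name],
    ]
    if apply:
        steps = [step + ["--yes"] for step in steps]
    return steps


def run_image_rollout(cluster_name, apply=False):
    """Run each step in order and stop at the first failure."""
    for command in image_rollout_commands(cluster_name, apply):
        print("running:", " ".join(command))
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the preview commands for a placeholder cluster name.
    for command in image_rollout_commands("k8s.example.com"):
        print(" ".join(command))
```

Calling run_image_rollout("k8s.example.com") previews the changes; passing apply=True would actually replace the instances, so check the preview first.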
@chrislovecnm @justinsb Thanks for the feedback!
So this seems more like a lack of documentation than a missing update mechanism. I think people should be aware of how security updates are supposed to work with kops and should know that rolling updates are not only necessary for configuration changes (incl. Kubernetes upgrades), but also for security updates (at least kernel updates, but sometimes it's also necessary to restart processes to receive fixes in shared library packages).
It could also make sense to build the AMIs regularly (e.g. daily) with all package updates installed, so people can update more easily. I'm not sure how that would fit into the current alpha/stable channels process, though. The decision on which packages to update immediately for security (OS security updates?) vs. which packages to keep at a specific version for stability (Kubernetes releases and non-security OS updates?) is probably not going to be easy.
I'll have a more detailed look at the image and kernel build process and might make a merge request for documentation on how I understand that software updates (mainly kernel and OS packages) work with kops. Unfortunately, we just changed our prioritization regarding Kubernetes in production (mostly related to the software we're going to deploy), so it might be a few weeks until I've got time for that.
@justinsb Would it be possible for us to use something like Packer to add additional software to the AMI? I would like to add a tool like AWS Inspector and a HIDS on the host to assist with my security monitoring and vulnerability scanning.
@hatityechindove you can use kube-deploy/imagebuilder to build updated AMIs. Seems to work well, albeit with a few quirks
So we normally have automatic updates enabled, and so you should get package updates.
I haven't found the mechanism by which packages are auto-updated in the code. Where can I find it?
In general, the topic of (security) updates is a big one. We don't want to run (Debian) stretch images that are over half a year old without being able to install updates.