https://github.com/nestybox/sysbox is open source now :upside_down_face:
This sits below docker, so we'll need to think about how this fits into the current abstractions.
I think to start, we can gate it behind KIND_EXPERIMENTAL_RUNTIME=sysbox-runc (in the absence of a standard env var for selecting the runtime in docker).
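A rough sketch of what that gating could look like, assuming node containers are created with `docker run` and that the runtime is passed through docker's `--runtime` flag. `runtimeArgs` is a hypothetical helper name, not something that exists in kind today:

```go
package main

import (
	"fmt"
	"os"
)

// runtimeEnv is the experimental variable proposed in this thread.
const runtimeEnv = "KIND_EXPERIMENTAL_RUNTIME"

// runtimeArgs returns the extra `docker run` arguments that select an
// alternate OCI runtime, or nil when the variable is unset or empty.
// Hypothetical helper for illustration only.
func runtimeArgs(getenv func(string) string) []string {
	rt := getenv(runtimeEnv)
	if rt == "" {
		return nil // fall back to docker's default runtime
	}
	return []string{"--runtime", rt}
}

func main() {
	// Only the runtime selection is shown; real node creation passes many more flags.
	args := append([]string{"run", "--detach"}, runtimeArgs(os.Getenv)...)
	fmt.Println(args)
}
```

Taking the env lookup as a parameter keeps the helper easy to test without touching the process environment.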
cc @ctalledo
we should also probe docker to see if sysbox is the default runtime and gracefully handle that.
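One possible way to do the probe, assuming the `docker` CLI is on the PATH and using the `DefaultRuntime` field that `docker info` exposes through `--format`. The function names here are made up for the sketch:

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// isSysbox reports whether a runtime name, as printed by docker, is sysbox-runc.
func isSysbox(runtime string) bool {
	return strings.TrimSpace(runtime) == "sysbox-runc"
}

// defaultRuntime asks the docker daemon for its default OCI runtime.
func defaultRuntime() (string, error) {
	out, err := exec.Command("docker", "info", "--format", "{{.DefaultRuntime}}").Output()
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(out)), nil
}

func main() {
	rt, err := defaultRuntime()
	if err != nil {
		fmt.Println("could not query docker:", err)
		return
	}
	fmt.Printf("default runtime %q, sysbox: %v\n", rt, isSysbox(rt))
}
```

What "gracefully handle" should mean once sysbox is detected (for example, skipping `--privileged`) is left open here.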
Thanks for opening the issue Ben, I will be glad to help add support for Sysbox in KinD.
cc @rmolina
Thanks @BenTheElder, will look into the KIND_EXPERIMENTAL_RUNTIME approach.
Likely to require https://github.com/kubernetes/kubernetes/commit/db0c4cbe9f797b85dfacb41daf2894ddf05e7b44 and https://github.com/kubernetes/kubernetes/commit/503cff058798342be74c12f077f1d85cb4205ad9 (https://github.com/kubernetes/kubernetes/pull/92863/commits), as sysbox uses user namespaces
@AkihiroSuda thanks for pointing that out, these changes make sense. But please keep in mind that they are not required by Sysbox to host K8s clusters (Sysbox already handles K8s sysctl write attempts). You can easily test it yourself by looking at this KinD fork (with very minimal changes) that we created as a prototype.
Do we have numbers to compare how sysbox can enhance kind performance?
@felipecrs: sysbox would not enhance kind performance; its benefits would mainly be functional, such as:
1) Removing the need for using privileged containers for the K8s node containers (i.e., enhancing isolation between the k8s node containers and the host).
2) Since sysbox does partial emulation of /proc and will support emulation of /proc/cpuinfo and /proc/meminfo in the near future, this enables the K8s scheduler running inside the kind cluster to better schedule workloads according to the resources consumed by each k8s node container. This is helpful in scenarios where users want more realistic scheduling in the kind cluster.
An additional benefit would be that sysbox removes the need for many of the actions taken in the KinD entrypoint script, but it's not clear to me that KinD would be able to take advantage of this given that it has to support the OCI runc which does require the entrypoint.
Finally, sysbox has some optimizations that save a lot of disk space when the inner containers/pods are spawned by running K8s + Docker inside the container. But KinD nodes use containerd only (not Docker) inside the container, and thus said optimizations don't apply.
Thank you so much @ctalledo for the great explanation.
I was asking because we have a very resource-intensive CI that installs many applications in a KinD cluster, so any kind of optimization would be welcome.
Even so, the benefits that sysbox brings already make it worth it.