Serving: Support Sidecar Processes

Created on 17 Oct 2019 · 26 Comments · Source: knative/serving

In what area(s)?

/area API

/area autoscale
/area build
/area monitoring
/area networking
/area test-and-release

/kind proposal

Describe the feature

It would be useful to allow developers to run additional sidecar processes next to their application code. Examples:

Abstraction layers like dapr or sidecars based on Go CDK
Cloud SQL proxy https://github.com/knative/serving/issues/4659
BeyondCorp proxies to other services
Injected logging/monitoring systems

area/API kind/feature lifecycle/frozen

Source

josephlewis42

👍9 ❤5

All 26 comments

@josephlewis42: The label(s) kind/proposal cannot be applied. These labels are supported: ``

knative-prow-robot on 17 Oct 2019

For another piece of data, Cloud Foundry supports sidecars: http://v3-apidocs.cloudfoundry.org/version/release-candidate/#sidecars

poy on 17 Oct 2019

I had the very same thought and started a proposal here.
It's still in progress but it captures our thoughts so far. I'll throw it on the agenda for the next API meeting

savitaashture on 17 Oct 2019

To provide context, here are two earlier issues along the same lines, plus a document on the previous decision to support only a single container.

Thanks for putting together a proposal and I think it would be great to discuss it further in a future API WG meeting.

https://github.com/knative/serving/issues/1794
https://github.com/knative/serving/issues/3384
https://docs.google.com/document/d/1p42n_WwQc1Z3FsXr5ih4NnMY3PRzY15Ro2EU-mk_EuA/edit?ts=5b9aa556

dgerd on 23 Oct 2019

@dgerd I think that doc is private, would you be able to make it world readable?

Thanks!

josephlewis42 on 24 Oct 2019

@josephlewis42 Looks like it was shared with the knative-dev group. I have added read access for knative-users group as well.

You can find directions to join those groups here: https://knative.dev/community/

dgerd on 30 Oct 2019

Is this possibly a change to the decision made in https://github.com/knative/serving/issues/3384#issuecomment-487188293, now that RevisionSpecTemplate has a containers[] field?

ahmetb on 31 Oct 2019

That comment was made after RevisionSpecTemplate had containers[].

The doc that @savitaashture put together explores exactly what changing that decision may look like.

dgerd on 31 Oct 2019

I'd like to also better understand how the two containers relate, and why two containers are better than two processes in one container.

What happens if one container dies and not the other?
One container will be scaled and driven by requests. If we turned off the CPU between requests, would that cause a problem?
Do you need any sort of shared disk/etc between the two processes?
How would you expect monitoring, etc to work for the secondary container?

With respect to workarounds, depending on what your sidecar container is for, it may make sense to use mutating admission controllers to inject the container into all pods that match a certain pattern (à la Istio sidecar injection), but that only works for certain scenarios.

evankanderson on 8 Nov 2019

why two containers are better than two processes in one container.

Application Dependencies - Applications can be written in multiple languages or the same language at multiple versions without concern of conflict (i.e. Python 2.7 and Python 3)
Separate Build Processes - Easier to reason about single repository -> single build -> single container.
Security Boundaries - Blast radius is smaller for compromise of a process separated by a container boundary
Resource Starvation - Guaranteed resources for the processes running in separate containers

I will leave the particular use-case answers to others, but without a specific use case in mind, here is my stab at your questions.

1. Today we have a readinessProbe that determines if the Pod is healthy, and I would imagine this would stay the same. So really this question boils down to: if the sidecar cannot become ready or dies, does the pod die and stop serving traffic, or can it continue? Atomic readiness seems like the easiest answer here, but I am not sure it is the right answer.
2. The impact here seems the same whether you have 2 processes or 2 containers. We don't do this today in Knative, so we will have to consider the impact of this change regardless of whether there is a second container. I could imagine that if turning off CPU between requests is important enough to build, it becomes a cluster-configurable setting to preserve backwards compatibility. In that case this problem can be controlled by the operator.
3. I am going to guess that a number of use cases will need shared disk, as it is one of the two ways to share state between containers.
4. I don't have anything to add here.

The workaround seems feasible, but still subject to the same questions you posed.

dgerd on 8 Nov 2019

I'll also take a stab at this:

I'd like to also better understand how the two containers relate, and why two containers are better than two processes in one container.

Along with the points @dgerd mentioned:

Licensing constraints -- your database might be under a viral license but your main server process isn't or vice-versa.
Separation of duties -- the database team, storage architects, platform operators, and security teams can all provide their own sidecars to unify access to those systems.
Lifecycle management -- sidecars need distinct lifecycle management from the apps if they're managed by 1p and 3p teams. Breaking apart containers means they can be independently tested/verified for industries that require audit compliance.
Two processes in one container means coupling the lifecycle via some other means. It's better to have visibility into that at the platform level rather than have everyone invent their own mechanisms.

What happens if one container dies and not the other?

The underlying serving pod should be considered unhealthy--ideally the serving container should also become unhealthy because it should be performing health checks on its dependencies.

One container will be scaled and driven by requests. If we turned off the CPU between requests, would that cause a problem?

I was under the impression that Serving scales by adding or removing entire pods based on request metrics, in which case both processes are scaled up and down together and each serving container has a copy of its sidecars.

Do you need any sort of shared disk/etc between the two processes?

I think this is up to the spec. Most of the sidecars I listed only need network accessibility. e.g. the serving process hits the sidecar which uses some logic to manipulate/forward the request.

How would you expect monitoring, etc to work for the secondary container?

Health checks should be done via the main service (e.g. for a MySQL forwarding sidecar, the main process should tie its health to the health of the service) or via k8s container health checks.

With respect to workarounds, depending on what your sidecar container is for, it may make sense to use mutating admission controllers to inject the container into all pods that match a certain pattern (à la Istio sidecar injection), but that only works for certain scenarios.

I don't believe this makes sense in the context of Knative Serving because--to the best of my knowledge--serving doesn't guarantee it's going to use pods underneath. Instead it would be a workaround for a specific version of serving. If additional containers were brought into the spec, then it could be used on any Knative Serving compatible platform.

josephlewis42 on 11 Nov 2019

👍2
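
The health-check answer above (the main process ties its own health to the health of the sidecar) could look roughly like the following sketch. It assumes the multi-container API shape discussed later in this thread, and it assumes the application exposes a /healthz endpoint that fails whenever it cannot reach the sidecar. The Service name, both image names, and the endpoint path are hypothetical.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: app-with-forwarder            # hypothetical name
spec:
  template:
    spec:
      containers:
      - name: app                     # serving container: the one that receives requests
        image: example.com/my-app:latest        # hypothetical image
        ports:
        - containerPort: 8080
        readinessProbe:
          httpGet:
            path: /healthz            # assumed endpoint; should fail when the sidecar is unreachable
      - name: forwarder               # sidecar, reached by the app over localhost
        image: example.com/db-forwarder:latest  # hypothetical image
```

With this arrangement the platform only needs to watch one probe: if the sidecar dies, the application's own check fails and the pod stops receiving traffic.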

Issues go stale after 90 days of inactivity.
Mark the issue as fresh by adding the comment /remove-lifecycle stale.
Stale issues rot after an additional 30 days of inactivity and eventually close.
If this issue is safe to close now please do so by adding the comment /close.

Send feedback to Knative Productivity Slack channel or file an issue in knative/test-infra.

/lifecycle stale

knative-housekeeping-robot on 10 Feb 2020

/remove-lifecycle stale

vagababov on 10 Feb 2020

/assign @markusthoemmes @savitaashture

vagababov on 10 Feb 2020

/lifecycle frozen

ahmetb on 10 Feb 2020

@evankanderson Here's a production use case for multiple containers in one Pod (in addition to metric exporters etc):

Maintaining e.g. a PHP stack with php-fpm and nginx.

I know this can be worked around with a fairly complex JSON patch injected by a MutatingAdmissionWebhook, but I don't see why Knative would impose this limit.

apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  volumes:
  - name: sock
    emptyDir: {}
  containers:
  - name: nginx
    image: test
    imagePullPolicy: Never
    command: ["/usr/sbin/nginx", "-g daemon off;"]
    volumeMounts:
      - name: sock
        mountPath: /sock/
  - name: php-fpm
    image: test
    imagePullPolicy: Never
    command: ["php-fpm"]
    volumeMounts:
      - name: sock
        mountPath: /sock/

xvzf on 27 Apr 2020

👍2

Sidecars are now alpha with the 0.16 release available here: https://github.com/knative/serving/releases/tag/v0.16.0

The next step is to revisit the feature track with respect to the exit criteria for the various phases.

dprotaso on 8 Jul 2020
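
For readers who want to try the alpha, the feature is gated behind a flag. The fragment below is a sketch based on my recollection that the flag is the multi-container key in the config-features ConfigMap in the knative-serving namespace; treat the key name and value as assumptions and check them against the config-features ConfigMap shipped with your release.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-features
  namespace: knative-serving
data:
  # Assumed flag name and value; verify against your release before applying.
  multi-container: "enabled"
```

Applying a whole ConfigMap like this would overwrite any other keys already set in it, so patching just the one key on an existing cluster is the safer route.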

I've structured the GitHub project for this feature with columns that'll contain issues for the exit criteria

https://github.com/knative/serving/projects/33

dprotaso on 8 Jul 2020

My current worry is that the use case is "sidecar" but it seems that the API is "provide multiple containers" (carried over from PodSpec).

That's probably OK, but we have to acknowledge that this is lower level than an explicit "sidecar" API (which would be a more explicit high-level abstraction), and that this probably expands the use cases to more than sidecars.

I made this comment in the API working group in the past, I'll try to join the next one to discuss this further.

steren on 8 Jul 2020

The extra containers do have restrictions (e.g. no exposed ports). This feature needs documentation describing those limitations.

dprotaso on 8 Jul 2020
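
As an illustration of the port restriction, here is a minimal sketch of a Service for the Cloud SQL proxy use case from the issue description. Only the serving container declares a port; the sidecar declares none and is reached over localhost. The Service name, the application image, the environment variable names, and the PROJECT:REGION:INSTANCE placeholder are hypothetical, and the credentials the proxy needs are omitted.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: app-with-sql-proxy            # hypothetical name
spec:
  template:
    spec:
      containers:
      - name: app                     # serving container: the only one with ports
        image: example.com/my-app:latest   # hypothetical image
        ports:
        - containerPort: 8080
        env:                          # hypothetical variables read by the app
        - name: DB_HOST
          value: "127.0.0.1"
        - name: DB_PORT
          value: "5432"
      - name: cloud-sql-proxy         # sidecar: no ports section
        image: gcr.io/cloudsql-docker/gce-proxy:1.17
        command: ["/cloud_sql_proxy"]
        args: ["-instances=PROJECT:REGION:INSTANCE=tcp:5432"]  # placeholder instance name
```

Nothing in the YAML other than the ports section marks the first container as the main one, which is worth keeping in mind when reading the API design comments that follow.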

Yes, I am mostly talking in terms of API design:

Looking at the proposed API, there is nothing that makes it obvious that one of these multiple containers is the "main" one and the others are sidecars. I need to read the docs to understand the API, which is usually not a good sign.

I know that there is consistency with the PodSpec, which has benefits when it comes to migrating workloads. But from the standpoint of KService API usability, it's not such a great choice. That's probably OK, as there are clients to help manipulate the YAMLs. Just wanted to call it out.

steren on 8 Jul 2020

Could somebody share the latest API design? This doc is marked as WIP and dated 10/09/2019. Is it the source of truth and the approved design?

It'd be good to share an example of a KService YAML that illustrates the use cases described in the first comment of this issue.

steren on 15 Jul 2020

/cc @savitaashture @markusthoemmes

vagababov on 15 Jul 2020

Could somebody share the latest API design? This doc is marked as WIP and dated 10/09/2019. Is it the source of truth and the approved design?

It'd be good to share an example of a KService YAML that illustrates the use cases described in the first comment of this issue.

Hi @steren

Apologies for the delayed reply.

Here is the public doc https://docs.google.com/document/d/1XjIRnOGaq9UGllkZgYXQHuTQmhbECNAOk6TT6RNfJMw/edit?ts=5e25d093#heading=h.n8a530nnrb

Let us know if you need any more information, or if you see gaps that should be covered in the doc, and we will take a look.

Thank you

savitaashture on 1 Sep 2020

Yes, I am mostly talking in terms of API design:

Looking at the proposed API, there is nothing that makes it obvious that one of these multiple containers is the "main" one and the others are sidecars. I need to read the docs to understand the API, which is usually not a good sign.

I know that there is consistency with the PodSpec, which has benefits when it comes to migrating workloads. But from the standpoint of KService API usability, it's not such a great choice. That's probably OK, as there are clients to help manipulate the YAMLs. Just wanted to call it out.

Right, I agree with you. We don't yet have documentation that explains what is supported and how the main container is distinguished from the sidecar containers.

We created an issue, https://github.com/knative/docs/issues/2787, to address the documentation.

In the meantime, some of that information is in the design doc.

savitaashture on 1 Sep 2020

I'll unassign here. @savitaashture has it all under control 🙂

/unassign