Contour: IngressRoute support for custom request timeout

Created on 20 Nov 2018  Â·  15Comments  Â·  Source: projectcontour/contour

Describe the solution you'd like
I'd like to be able to configure custom request timeouts per IngressRoute.

It seems from docs that you can use annotations on ingresses, but according to this comment

IngressRoutes don't really use annotations for anything

Docs regarding IngressRoute contain no reference to a _requests timeout config parameter_.

Environment:

  • Kubernetes version: v1.10.3 (EKS)
  • Cloud provider or hardware configuration: AWS

Contour daemon set

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  labels:
    app: contour
  name: contour
  namespace: heptio-contour
spec:
  selector:
    matchLabels:
      app: contour
  template:
    metadata:
      annotations:
        prometheus.io/format: prometheus
        prometheus.io/path: /stats
        prometheus.io/port: "9001"
        prometheus.io/scrape: "true"
      creationTimestamp: null
      labels:
        app: contour
    spec:
      containers:
      - args:
        - serve
        - --incluster
        command:
        - contour
        image: gcr.io/heptio-images/contour:latest
        imagePullPolicy: Always
        name: contour
        ports:
        - containerPort: 8000
          hostPort: 8000
          name: contour
          protocol: TCP
      - args:
        - --config-path /config/contour.yaml
        - --service-cluster cluster0
        - --service-node node0
        - --log-level info
        - --v2-config-only
        command:
        - envoy
        image: docker.io/envoyproxy/envoy-alpine:v1.7.0
        name: envoy
        ports:
        - containerPort: 8080
          hostPort: 8080
          name: http
          protocol: TCP
        - containerPort: 8443
          hostPort: 8443
          name: https
          protocol: TCP
        volumeMounts:
        - mountPath: /config
          name: contour-config
      dnsPolicy: ClusterFirst
      hostNetwork: true
      initContainers:
      - args:
        - bootstrap
        - /config/contour.yaml
        command:
        - contour
        image: gcr.io/heptio-images/contour:latest
        imagePullPolicy: Always
        name: envoy-initconfig
        volumeMounts:
        - mountPath: /config
          name: contour-config
      restartPolicy: Always
      serviceAccount: contour
      serviceAccountName: contour
      volumes:
      - emptyDir: {}
        name: contour-config
kinfeature prioritimportant-soon

Most helpful comment

I am willing to work on this if nobody is actively working. Will start with the design doc.

All 15 comments

I'd like to help with this if it has not been started already. Here's what I was thinking:

  • Some of the other annotations (i.e. contour.heptio.com/websocket-routes) have been added to the Route struct of the IngressRouteSpec which ends up in the Route portion of the dag.
  • It would seem the best place to define the retry and timeout annotations is in the Route struct.
  • The one caveat being this comment and this one would suggest all of the below would need to be defined on the Service (both dag and IngressRoute)

Below would be the additional fields (current fields omitted for brevity):

type Route struct {
  // RequestTimeout specified as a time.Duration set the upstream timeout for the route
  RequestTimeout string `json:"request-timeout,omitempty"`
  // RetryOn specifies the conditions under which retry takes place
  RetryOn string `json:"retry-on,omitempty"`
  // NumRetries defines the maximum number of retries Envoy should make before abandoning 
  // and returning an error to the client
  NumRetries int `json:"num-retries,omitempty"`
  // PerTryTimeout specificed as a time.Duration, set the timeout per retry attempt
  PreTryTimeout string `json:"per-try-timeout,omitempty"`
}

I'm open to also take on the work of moving these things around if that's what is needed for this to be done well.

It seems that the fields indicated by @jipperinbham need to be added here and mapped through to Envoy.

I'm currently using the following against Ingress, but also need them on IngressRoute:

  • ingress.kubernetes.io/force-ssl-redirect
  • contour.heptio.com/retry-on
  • contour.heptio.com/num-retries
  • contour.heptio.com/request-timeout
  • contour.heptio.com/per-try-timeout

Thank you for your comment. We won’t be adding these annotations but they will be added as fields on the ingress route object where applicable.

On 11 Jan 2019, at 17:00, lostllama notifications@github.com wrote:

To add my 2 cents to this, we're using the following annotations on Ingress that we'd like to use on IngressRoute too:

ingress.kubernetes.io/force-ssl-redirect
contour.heptio.com/retry-on
contour.heptio.com/num-retries
contour.heptio.com/request-timeout
contour.heptio.com/per-try-timeout
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.

Understood, I just wanted to add my 2 cents for the features that I need (obviously I don't need them as annotations, I just need the features). Thank you for your prompt response.

Hey @davecheney would you accept a PR adding these fields to IngressRoute and hooking them up to the Route in Envoy (as @jipperinbham suggested)?

    // RequestTimeoutSeconds specified as a time.Duration set the upstream timeout for the route
    RequestTimeoutSeconds int64 `json:"requestTimeoutSeconds,omitempty"`
    // RetryOn specifies the conditions under which retry takes place
    RetryOn string `json:"retryOn,omitempty"`
    // NumRetries defines the maximum number of retries Envoy should make before abandoning
    // and returning an error to the client
    NumRetries uint32 `json:"numRetries,omitempty"`
    // perTryTimeoutSeconds specificed as a time.Duration, set the timeout per retry attempt
    PreTryTimeoutSeconds int64 `json:"perTryTimeoutSeconds,omitempty"`

Thanks for your reply. This is something I want to look at for the 0.10 cycle starting just as soon as I get 0.9 out the door on Monday.

WRT to timeouts, I don’t want to add additional fields to route, but create a new timeout object that encapsulates all the elements of timeout behaviour which we can apply in all the places where timeouts are needed—which is turning out to be a growing set.

On 27 Jan 2019, at 09:19, Robert Syvarth notifications@github.com wrote:

Hey @davecheney would you accept a PR adding these fields to IngressRoute and hooking them up to the Route in Envoy (as @jipperinbham suggested)?

// RequestTimeoutSeconds specified as a time.Duration set the upstream timeout for the route
RequestTimeoutSeconds int64 json:"requestTimeoutSeconds,omitempty"
// RetryOn specifies the conditions under which retry takes place
RetryOn string json:"retryOn,omitempty"
// NumRetries defines the maximum number of retries Envoy should make before abandoning
// and returning an error to the client
NumRetries uint32 json:"numRetries,omitempty"
// perTryTimeoutSeconds specificed as a time.Duration, set the timeout per retry attempt
PreTryTimeoutSeconds int64 json:"perTryTimeoutSeconds,omitempty"
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

Great! I am mostly concerned with the RetryOn/NumRetries fields personally - we see in our cluster that envoy will often attempt a request on a closed connection (if a pod gets killed for example) and which returns a 503 immediately. We reduce our error rate pretty significantly by enabling the envoy-level retries.

I am not sure that retry logic error makes a ton of sense as a part of a generic timeout object, what do you think about just adding in these fields to the Route?

    // RetryOn specifies the conditions under which retry takes place
    RetryOn string `json:"retryOn,omitempty"`
    // NumRetries defines the maximum number of retries Envoy should make before abandoning
    // and returning an error to the client
    NumRetries uint32 `json:"numRetries,omitempty"`

(thanks for all of the work on this project btw! Currently migrating our cluster over from Ambassador and really appreciate the k8s-native design being used here).

At a first pass thinking about how this configuration should be exposed I
think Retry and Timeout parameters should be segregated into different
types; eg.

route:
match: /slow
retry:
count: 7
retryon:
- 50x
timeout:
connect: 500ms
response: 2s

On Sun, 27 Jan 2019 at 10:01, Robert Syvarth notifications@github.com
wrote:

Great! I am mostly concerned with the RetryOn/NumRetries fields personally

  • we see in our cluster that envoy will often attempt a request on a closed
    connection (if a pod gets killed for example) and which returns a 503
    immediately. We reduce our error rate pretty significantly by enabling the
    envoy-level retries.

I am not sure that retry logic error makes a ton of sense as a part of a
generic timeout object, what do you think about just adding in these fields
to the Route?

// RetryOn specifies the conditions under which retry takes place
RetryOn string json:"retryOn,omitempty"
// NumRetries defines the maximum number of retries Envoy should make before abandoning
// and returning an error to the client
NumRetries uint32 json:"numRetries,omitempty"

(thanks for all of the work on this project btw! Currently migrating our
cluster over from Ambassador and really appreciate the k8s-native design
being used here).

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/heptio/contour/issues/815#issuecomment-457873640, or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAAcAxM3ZQG12UfWJhjEl6UgLvRF5kKrks5vHN5bgaJpZM4YrPZ2
.

I am willing to work on this if nobody is actively working. Will start with the design doc.

All yours.

WRT to timeouts, I don’t want to add additional fields to route, but create a new timeout object that encapsulates all the elements of timeout behaviour which we can apply in all the places where timeouts are needed—which is turning out to be a growing set.

@davecheney do you mind be more specific about the last part? Are you referring to that growing set as something in Envoy API or within Contour, like HealthCheck?

Because of the need to release Contour 0.11 to address the security issue in Envoy 1.9.0, this issue has been reassigned to the 0.12 milestone.

Fixed via #1018

@stevesloka #1019 is still open. We need to update the documentation for these new parameters.

Docs implemented in #1042

Was this page helpful?
0 / 5 - 0 ratings