Zero-to-jupyterhub-k8s: `image-awaiter/daemonset.go` should actually retry HTTP calls if they fail, rather than immediately exiting if the first call fails.

Created on 2 Oct 2020 · 3 Comments · Source: jupyterhub/zero-to-jupyterhub-k8s

Bug description

Currently, the image-awaiter/daemonset.go logic attempts to contact the Kube API server on startup.

If this initial HTTP call fails, the image awaiter quits.

This causes problems in setups such as clusters with Istio sidecars. The Istio sidecar needs to be up before the pod can talk to the rest of the cluster, but the image-awaiter might try to contact the API server before the sidecar is ready. It then simply quits, treating a temporary failure as catastrophic.

Expected behaviour

To be resilient and well-behaved, the image-awaiter/daemonset.go code should add basic retry logic to its startup HTTP call to the Kube API server, rather than exiting after one failure.

Actual behaviour

image-awaiter/daemonset.go exits after one failure to contact the Kube API server.

How to reproduce

  1. Set up a cluster
  2. Install Istio
  3. Install the Jupyter helm chart
  4. Observe the hook-image-awaiter pod: if the initial HTTP call to the Kube API server fails, the pod gets stuck because the image-awaiter process exits on the first failure.

Your personal set up

Minikube on macOS, Jupyter Helm chart 0.9.1

Labels: bug, good first issue, help wanted

All 3 comments

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template, as it helps other community members contribute more effectively.

You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say hi! :wave:

Welcome to the Jupyter community! :tada:

I do plan to come with a PR shortly.

PR created: #1830
