Flux: Tracing a bad deployment/HelmRelease

Created on 9 May 2019 · 5 Comments · Source: fluxcd/flux

I am deploying HelmReleases through Flux. Flux often applies the manifest successfully, but HelmOperator then fails because of a bad configuration. How can I trace a bad deployment?

For example, say I update a chart version from 1.0 to 1.1.
Scenario 1:
Chart 1.1 does not exist. Flux applies the manifest successfully, but HelmOperator reports an error saying the chart does not exist.

Scenario 2:
Chart 1.1 gets deployed, but there is a configuration issue. Chart 1.0 has already been removed, and chart 1.1 cannot be deployed because of the configuration issue, which increases downtime.

Is there a way to track or trace these two scenarios without having to look through the logs manually, while still ensuring 100 percent uptime?

Could there be an alert, or some other mechanism, that tells me exactly when a Helm release fails, so that I can either roll back or fix it and avoid downtime?

question

All 5 comments

  • .status.conditions in the HelmRelease will usually suggest what is going on with a particular release -- especially if it's a problem fetching the chart
  • there's work somewhere on adding Prometheus metrics to helm-operator, so they can be used for alerts (sorry, I can't find it right now :-S)
  • meanwhile, we've some ongoing work to make the helm-operator deal with failed releases better -- there's #2008 which improved one particular case, and #2006 which may be better still

Does that answer your question @usamaahmadkhan ? If not, what would you like to see (i.e., what should be the next step here)?
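As a rough illustration of the first suggestion above, the conditions on a HelmRelease can be checked from a script instead of reading the operator logs. The sketch below assumes the object has already been fetched as JSON (for example with `kubectl get helmrelease <name> -o json`) and that each entry in `.status.conditions` carries the usual Kubernetes-style `type`, `status`, `reason` and `message` fields. The release name, condition reason and message in the sample are made up for the example, and `failed_conditions` is a hypothetical helper, not part of Flux or the Helm operator.

```python
import json


def failed_conditions(helm_release):
    """Return (type, reason, message) for every condition whose status is "False"."""
    conditions = helm_release.get("status", {}).get("conditions", [])
    return [
        (c.get("type", ""), c.get("reason", ""), c.get("message", ""))
        for c in conditions
        if c.get("status") == "False"
    ]


# Illustrative object only: the name, reason and message are hypothetical,
# not output copied from a real cluster.
sample = json.loads("""
{
  "kind": "HelmRelease",
  "metadata": {"name": "my-app"},
  "status": {
    "conditions": [
      {
        "type": "ChartFetched",
        "status": "False",
        "reason": "ChartFetchFailed",
        "message": "chart version 1.1 not found"
      }
    ]
  }
}
""")

for ctype, reason, message in failed_conditions(sample):
    print(f"{sample['metadata']['name']}: {ctype} is False ({reason}): {message}")
```

A check like this could run periodically or in CI after a commit and feed whatever alerting is already in place. It only reports what the operator has written to the status, so it would not prevent the downtime described in Scenario 2.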

@squaremo Even if it rolls back, there is no way for a developer to know that their latest change was not published, or why it failed, other than looking at the logs of the operator pod. I believe there should be a UI for this. I have created an issue for this.

Flux v2, based on the GitOps Toolkit, supports health assessment of deployments: https://toolkit.fluxcd.io/components/kustomize/kustomization/#health-assessment
