Charts: [stable/airflow] flower dashboard cannot be accessed and liveness probe failure after upgrade or new install

Created on 25 Jul 2019 · 14 Comments · Source: helm/charts

Describe the bug
The Flower dashboard cannot be accessed on port 5555, and the Flower pod is stuck in a restart loop.

Version of Helm and Kubernetes:
helm version

Client: &version.Version{SemVer:"v2.14.0", GitCommit:"05811b84a3f93603dd6c2fcfe57944dfa7ab7fd0", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.14.0", GitCommit:"05811b84a3f93603dd6c2fcfe57944dfa7ab7fd0", GitTreeState:"clean"}

kubectl version

Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.3", GitCommit:"721bfa751924da8d1680787490c54b9179b1fed0", GitTreeState:"clean", BuildDate:"2019-02-01T20:08:12Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.5", GitCommit:"2166946f41b36dea2c4626f90a77706f426cdea2", GitTreeState:"clean", BuildDate:"2019-03-25T15:19:22Z", GoVersion:"go1.11.5", Compiler:"gc", Platform:"linux/amd64"}

Which chart:
stable/airflow

What happened:
The Flower dashboard cannot be accessed on port 5555 using port-forward, either through the service or directly on the pod, and the liveness probe fails. After 5 failures the pod gets restarted, so the Flower pod is stuck in a restart loop.
The problem is in version 3.0.6 (currently the latest) and not in 3.0.2; after reinstalling 3.0.2 everything works fine.

What you expected to happen:
The liveness probe passes and the Flower dashboard is accessible.

How to reproduce it (as minimally and precisely as possible):
helm install --name air -f airflow/values.yaml --namespace airflow stable/airflow --version 3.0.6
kubectl logs -f air-flower-55cb567664-hxhm6 -n airflow

Anything else we need to know:
Logs

[2019-07-25 06:50:58,740] {{settings.py:182}} INFO - settings.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800, pid=1
[2019-07-25 06:50:59,080] {{__init__.py:51}} INFO - Using executor CeleryExecutor
[I 190725 06:50:59 command:136] Visit me at http://0.0.0.0:5555
[I 190725 06:50:59 command:141] Broker: redis://air-redis-master:6379/1
[I 190725 06:50:59 command:144] Registered tasks:
    ['celery.accumulate',
     'celery.backend_cleanup',
     'celery.chain',
     'celery.chord',
     'celery.chord_unlock',
     'celery.chunks',
     'celery.group',
     'celery.map',
     'celery.starmap']
[I 190725 06:56:14 command:44] SIGTERM detected, shutting down

lifecycle/stale

Source

deepaksood619

Most helpful comment

After some investigation I found out that the pods were having issues connecting to Redis and Postgres, and that's why the Flower dashboard wasn't reachable.

What I did was create the secrets for Redis and Postgres and include the environment variables in the values.yaml file, as follows:

Create secrets:
kubectl create secret generic airflow-postgres --from-literal=postgres-password=$(openssl rand -base64 13)
kubectl create secret generic airflow-redis --from-literal=redis-password=$(openssl rand -base64 13)

on values.yaml:

postgres:
  existingSecret: airflow-postgres

redis:
  existingSecret: airflow-redis

...

extraEnv:
  - name: POSTGRES_PASSWORD
    valueFrom:
      secretKeyRef:
        name: airflow-postgres
        key: postgres-password
  - name: REDIS_PASSWORD
    valueFrom:
      secretKeyRef:
        name: airflow-redis
        key: redis-password

This worked for me and I'm now able to reach Flower dashboard.

nfds89 on 1 Aug 2019

👍2
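
Since the comment above traces the problem to the pods failing to reach Redis and Postgres, a quick reachability check run from inside a pod can help confirm that diagnosis before changing any values. The following is only a minimal sketch: it tests plain TCP connectivity and does not verify passwords, so it cannot confirm that the secrets themselves are correct. The function name `can_connect` is hypothetical and not part of the chart.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only proves the port is reachable; it does not authenticate.
    """
    try:
        # create_connection resolves the name and opens the socket in one step.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_connect("air-redis-master", 6379)` uses the broker host shown in the Flower log in this issue. The Postgres service name depends on the release name and would need to be looked up with `kubectl get svc`.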

All 14 comments

I am new to this chart and have deployed it many times in the last week, but I am having the exact same issue, with no idea how to debug from here. (Edit: I have now tried 3.0.2, which also fails the same way for me.)

gabrielsyapse on 25 Jul 2019

Having exact same issue!

will-beta on 26 Jul 2019

Having the same issue here. For me, version 3.0.2 also fails to open the Flower dashboard.

nfds89 on 30 Jul 2019

@mohannadbanayosi Could your latest PR also solve this issue?

will-beta on 23 Aug 2019

@will-beta Yes, I think it should solve it.
@deepaksood619 Can you try out version 4.0.5 of the chart? It has a fix for this.

The keys used for the secrets should be airflow-postgres and redis-password.

mohannadbanayosi on 23 Aug 2019

@mohannadbanayosi
I was having issues with the Airflow web server failing to start because it could not resolve the host name airflow-postgres. I tried installing 4.0.5 and now get the following error:

ModuleNotFoundError: No module named 'werkzeug.wrappers.json'; 'werkzeug.wrappers' is not a package

benrifkind on 3 Sep 2019

@benrifkind

Are you using the default values? I ran the following commands and all components were initialised properly:
helm repo update
helm install --name=airflow --version 4.0.5 stable/airflow

But if you use the docker image tag 1.10.2 it will break.

mohannadbanayosi on 4 Sep 2019

@mohannadbanayosi

Thanks for the response. Unfortunately I still get the same error. Full stack trace:

executing webserver...
Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 21, in <module>
    from airflow import configuration
  File "/usr/local/lib/python3.6/site-packages/airflow/__init__.py", line 40, in <module>
    from flask_admin import BaseView
  File "/usr/local/lib/python3.6/site-packages/flask_admin/__init__.py", line 6, in <module>
    from .base import expose, expose_plugview, Admin, BaseView, AdminIndexView  # noqa: F401
  File "/usr/local/lib/python3.6/site-packages/flask_admin/base.py", line 6, in <module>
    from flask import Blueprint, current_app, render_template, abort, g, url_for
  File "/usr/local/lib/python3.6/site-packages/flask/__init__.py", line 21, in <module>
    from .app import Flask
  File "/usr/local/lib/python3.6/site-packages/flask/app.py", line 69, in <module>
    from .wrappers import Request
  File "/usr/local/lib/python3.6/site-packages/flask/wrappers.py", line 14, in <module>
    from werkzeug.wrappers.json import JSONMixin as _JSONMixin
ModuleNotFoundError: No module named 'werkzeug.wrappers.json'; 'werkzeug.wrappers' is not a package

Any other ideas?

Edit: How do I update the docker image tag? Do I need to alter this value https://github.com/helm/charts/blob/master/stable/airflow/values.yaml#L66
What should the value be?

benrifkind on 4 Sep 2019
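
The "is not a package" part of the traceback above means that the installed `werkzeug.wrappers` is a single module rather than a package, which suggests the Werkzeug in the image is older than the one the installed Flask expects. A small check like the one below, run with the same Python interpreter inside the container, can confirm this without starting the web server. This is a sketch only, and the helper name `import_error` is hypothetical.

```python
import importlib


def import_error(module_name: str):
    """Try to import module_name.

    Returns None if the import succeeds, otherwise the error message.
    """
    try:
        importlib.import_module(module_name)
        return None
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so it is caught here.
        return str(exc)
```

Calling `import_error("werkzeug.wrappers.json")` in the affected image should return the same message as the traceback, while an image with matching Flask and Werkzeug versions should return `None`.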

@benrifkind

Okay then just use version 4.0.7 of this chart and it should fix it. The latest version uses Airflow docker image 1.10.4 which fixes this issue (check this link).

So just run the following:
helm repo update
helm install --name=airflow --version 4.0.7 stable/airflow

and it should be okay. Just make sure you delete any previous chart installations so that you start from a fresh one.

mohannadbanayosi on 5 Sep 2019

@mohannadbanayosi

Just to close the loop on this. That install got rid of the ModuleNotFoundError but I am still running into the following error on the web pod:

sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) could not translate host name "airflow-postgresql" to address: Temporary failure in name resolution

I execed into the pod to check if I could curl that pod and I was able to. This error only occurs when I am running this helm chart locally on Minikube. Installing it on EKS on Amazon works.

benrifkind on 10 Sep 2019
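
The "could not translate host name" error above is a name resolution failure, which is separate from whether the database accepts connections. To tell the two apart, it can help to run a resolution-only check from inside the web pod, using the same host name that appears in the error. The sketch below assumes nothing about the chart; the function name `resolve` is hypothetical.

```python
import socket


def resolve(hostname: str):
    """Return the sorted list of addresses for hostname, or [] if lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Raised for any resolver failure, including temporary ones.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})
```

An empty list from `resolve("airflow-postgresql")` would point at cluster DNS or the service name rather than at Postgres itself.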

@benrifkind I replied to issue #16816 to move the discussion there.

mohannadbanayosi on 11 Sep 2019

👍1

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.