Airflow: Better unit tests for Helm Chart

Created on 19 Oct 2020 · 25Comments · Source: apache/airflow

Hello,

Overview

Currently, Helm Chart for Airflow has two types of tests: (Learn the best practices of Helm Chart testing)

Template testing (unit testing): these tests render the templates against various input values. These types of tests let us verify that the template rendered the expected resources in the manner you intended. These tests are fast to execute and can catch syntactic errors of your template
Integration testing: these tests take the rendered templates and deploy them to a real Kubernetes cluster. We can then verify the deployed infrastructure works as intended by hitting the endpoints or querying Kubernetes for the resources. These tests closely resemble an actual deployment and give us a close approximation of how it might behave when ready to push the chart to production.

In each case, we use a different framework:

For template testing, we use Helm-unittest
For integration testing, we use bash scripts and Pytest + kind cluster

Start reading here

Today I would like to talk about template testing. In my opinion, using helm-unittest is not a very good solution.

This framework is not very popular, so any contributor who wants to change must read the documentation first to add new tests. (People are lazy by nature, so they don't add these tests).
Another problem is the form - the YAML file. The YAML file is not very flexible and makes it very difficult to write more unusual tests. It would be more flexible to express the tests in a more normal programming language.

During the discussion with @dimberman, I prepared a small POC that shows how we can test the Helm Chart with Pytest + Python.
Here is link: https://github.com/apache/airflow/pull/11533#discussion_r504998933

import subprocess
import unittest
from tempfile import NamedTemporaryFile

import yaml
from typing import Dict, Optional

OBJECT_COUNT_IN_BASIC_DEPLOYMENT = 22


class TestBaseChartTest(unittest.TestCase):
    def render_chart(self, name, values: Optional[Dict] = None):
        values = values or {}
        with NamedTemporaryFile() as tmp_file:
            content = yaml.dump(values)
            tmp_file.write(content.encode())
            tmp_file.flush()
            templates = subprocess.check_output(["helm", "template", name, "..", '--values', tmp_file.name])
            k8s_objects = yaml.load_all(templates)
            k8s_objects = [k8s_object for k8s_object in k8s_objects if k8s_object]
            return k8s_objects

    def test_basic_deployments(self):
        k8s_objects = self.render_chart("TEST-BASIC", {"chart": {'metadata': 'AA'}})
        list_of_kind_names_tuples = [
            (k8s_object['kind'], k8s_object['metadata']['name'])
            for k8s_object in k8s_objects
        ]
        self.assertEqual(
            list_of_kind_names_tuples,
            [
                ('ServiceAccount', 'TEST-BASIC-scheduler'),
                ('ServiceAccount', 'TEST-BASIC-webserver'),
                ('ServiceAccount', 'TEST-BASIC-worker'),
                ('Secret', 'TEST-BASIC-postgresql'),
                ('Secret', 'TEST-BASIC-airflow-metadata'),
                ('Secret', 'TEST-BASIC-airflow-result-backend'),
                ('ConfigMap', 'TEST-BASIC-airflow-config'),
                ('ClusterRole', 'TEST-BASIC-pod-launcher-role'),
                ('ClusterRoleBinding', 'TEST-BASIC-pod-launcher-rolebinding'),
                ('Service', 'TEST-BASIC-postgresql-headless'),
                ('Service', 'TEST-BASIC-postgresql'),
                ('Service', 'TEST-BASIC-statsd'),
                ('Service', 'TEST-BASIC-webserver'),
                ('Deployment', 'TEST-BASIC-scheduler'),
                ('Deployment', 'TEST-BASIC-statsd'),
                ('Deployment', 'TEST-BASIC-webserver'),
                ('StatefulSet', 'TEST-BASIC-postgresql'),
                ('Secret', 'TEST-BASIC-fernet-key'),
                ('Secret', 'TEST-BASIC-redis-password'),
                ('Secret', 'TEST-BASIC-broker-url'),
                ('Job', 'TEST-BASIC-create-user'),
                ('Job', 'TEST-BASIC-run-airflow-migrations')
            ]
        )
        self.assertEqual(OBJECT_COUNT_IN_BASIC_DEPLOYMENT, len(k8s_objects))

    def test_basic_deployment_without_default_users(self):
        k8s_objects = self.render_chart("TEST-BASIC", {"webserver": {'defaultUser': {'enabled': False}}})
        list_of_kind_names_tuples = [
            (k8s_object['kind'], k8s_object['metadata']['name'])
            for k8s_object in k8s_objects
        ]
        self.assertNotIn(('Job', 'TEST-BASIC-create-user'), list_of_kind_names_tuples)
        self.assertEqual(OBJECT_COUNT_IN_BASIC_DEPLOYMENT - 1, len(k8s_objects))

Alternatively, we can also migrate to terratest, which has native Helm Chart integration but uses Go-lang.

Now there is a question for the community. What do you think about it? Should we migrate all tests to Pytestt? Is this the only reason contributors don't add unit tests? What else can we do to encourage contributors to write tests?

Best regards,
Kamil Breguła

helm-chart feature

Source

mik-laj

❤5

Most helpful comment

@mik-laj I played around with your python testing framework a bit more this morning and I LOVE it!!!

I made some modifications to clean up a bit and allow us to play with k8s objects instead of dicts but PTAL I think this will make everything MUCH easier :) https://github.com/apache/airflow/pull/11693

dimberman on 20 Oct 2020

❤2

All 25 comments

CC: @potiuk @ashb @kaxil @dimberman @turbaszek

mik-laj on 19 Oct 2020

CC: @OmairK I know you wanted to know more about Kubernetes.

mik-laj on 19 Oct 2020

❤1

I am absolutely for this. Comparing to the above proposal, it's heaven and earth.

I very much dislike the current yaml based testing, they're difficult to maintain, refactor, automate, run from the IDE, debug, duplicate to write new tests etc. (unless I do not now how - maybe I just need some training).

I've added a few yaml-based tests in the past and fixed a few failing ones when I added new features (like kerberos sidecar). And it was a pain. For me this is the main reason why I have not added tests to my changes to the helm chart - because I disliked the idea of being close to that part of the code.

Sorry for the rant, it's not really, really bad with those yaml tests, but it is pretty close. I certainly feel subconscious " let's not go anywhere near those tests" when I think about them :). And looking at the proposal above, it is so much better.

potiuk on 19 Oct 2020

CC: @schnie

mik-laj on 19 Oct 2020

This is an interesting problem. I personally lean more towards terratest than I do towards any home-bakes solution. This of course raises the question of "but is airflow a python-only project" to which my reply would be "helm is a golang-based system. You're using go templating and all major tooling around it is written in golang."

@kaxil @ashb would love your thoughts as well, but especially since the helm chart is basically a separate entity from airflow itself, and that any python solution would essentially be home-baked, I'm pretty heavily for terratest.

dimberman on 19 Oct 2020

Snap comment without reading much of the context (sorry!): Yeah, I have no love of yaml, but I value not having to write our own test framework more.

I'll read this in detail tomorrow

ashb on 19 Oct 2020

@dimberman DevOps world uses Bash, Python and Golang. I think anyone who knows Helm knows the basics of Python. However, this is not true the other way around. Not everyone who knows Python also knows Go.

We also don't need all Terratest features. If we have one goal - testing Helm templates, Then pytest will meet the expectations and we won't have to write a lot of our own code. It is enough to generate a template (render_chart method) and use the standard assertions available in Python.

@ashb I don't want to write my own framework, but using pytest instead of helm-unittest.

mik-laj on 19 Oct 2020

Yeah I am strongly with @mik-laj on that one. This has nothing to do with writing own framework. Pytest is there and we all know it well. Those tests are basically about rendering yaml files and comparing if they are what they are expected to be. We use precisely "0" other features of terratest.

potiuk on 19 Oct 2020

To be precise, terratest is just a library that provides a set of functions that make it easier to write tests, but it is not a full framework. You can use any framework, but the documentation recommends using Go’s built-in package testing. The module of interest for us is:
(modules/helm/template.go)[https://github.com/gruntwork-io/terratest/blob/master/modules/helm/template.go]

So now we have discussions as to which we want to use one of the following set of tools.

pytest as a test runner, unittest.TestCase as a framework (most likely) and one of our own functions,
go test command as a test runner, go/pkg/testing as a framework, and terratest with a ready function.

I don't think we use any unique golang features (good asynchronous libraries, security, etc.) on the other hand, Python is a language valued for its simple syntax and user-friendliness. I would like to add that currently the integration tests for Helm Chart are also written in Python.

What are the benefits of Golang in our case? The potential popularity with other contributors is a benefit, but I don't see them being active in the development of this Helm Chart. Rather, new features are being added by active contributors to the project who are already familiar with Python. However, if someone is not familiar with Python, writing new tests should not be a challenge for him, because Python is simpler than golang.

mik-laj on 19 Oct 2020

❤1 👍1

@mik-laj I played around with your python testing framework a bit more this morning and I LOVE it!!!

dimberman on 20 Oct 2020

❤2

(also it's a draft PR so tests will certainly fail/code quality needs clean-up but I think the basic idea will scale really well)

dimberman on 20 Oct 2020

Yes, in quickly skimming the issue last night on my phone I missed a lot of the context.

And an alternative version that uses pytest fixtures:

https://github.com/apache/airflow/pull/11694

ashb on 20 Oct 2020

Hello,

I come from https://github.com/apache/airflow/pull/11681#issuecomment-712790910 .

So a bit of feedback on the experience with the current setup:

The hardest part was for me to install the helm plugin (which doesn't seem to be maintained much ; I ended up on the fork https://github.com/quintush/helm-unittest)
Maybe add a bit of documentation in the chart's README (the Overview in the description of this issue is pretty nice) on the steps needed to run the tests locally,
It's really slow to run,
It's nice to be able to test template files individually,
I feel it's ok to test with yaml ; but it's not very flexible (let's say I wanted to add a test that checks that after rendering, if kerberos.enabled=false, then kerberos is not present in any of the templates, ...)

Going for something a bit more custom with golang or python will enable easier "advance" tests (this chart has some very complex logic, so I have a feeling that it's going to be useful).

And on Golang vs Python, I'd agree with:

What are the benefits of Golang in our case? The potential popularity with other contributors is a benefit, but I don't see them being active in the development of this Helm Chart. Rather, new features are being added by active contributors to the project who are already familiar with Python. However, if someone is not familiar with Python, writing new tests should not be a challenge for him, because Python is simpler than golang.

I think some small "internal" python testing logic like what I can see in #11693 is nice.
It will also enable easy debugging, etc.

FloChehab on 20 Oct 2020

@mik-laj @FloChehab @potiuk Do you prefer the unit test style, or the pytest fixture style? #11693 and #11694

ashb on 21 Oct 2020

@ashb I prefer unittest.TestCase, but I don't have a strong opinion. Any solution that is written in Python works for me.

mik-laj on 21 Oct 2020

Honestly I think my biggest complaint now is the self.assertEqual(a,b) vs assert a == b - the later I find easier to read, and now we are using pytest for the runner it also gives better diagnostics in case of failure.

I think pytest fixtures are powerful, but on first seeing them they are a bit "magic" I'll grant.

ashb on 21 Oct 2020

Honestly I think my biggest complaint now is the self.assertEqual(a,b) vs assert a == b - the later I find easier to read, and now we are using pytest for the runner it also gives better diagnostics in case of failure.

+1 for assert style, I think we should fix it in all database after 2.0 - there's a tool to do that.

I think pytest fixtures are powerful, but on first seeing them they are a bit "magic" I'll grant.

I think I'm ok with both - fixtures and mocking. Sometimes I prefer to use one way and in other situation I prefer the other one.

In general, I think the test style is something that we should discuss after 2.0 - we didn't want to do any changes/enforcement previously because we had a looooot of changes (pre-commits, pytest, AIP-21 and more).

turbaszek on 22 Oct 2020

I think pytest fixtures are powerful, but on first seeing them they are a bit "magic" I'll grant.

I like fixtures when used with the scope module or session. When I only need to use fixtures for one test, I prefer to create simple functions instead of using fixtures/magic.

mik-laj on 22 Oct 2020

I think I agree with all of you @ashb @turbaszek and @mik-laj :). Seems we all agree actually :)

+1 asserts are better (but let's not make it consistent just yet)
+1 on using both fixtures and mocks - and @mik-laj's case is a good example. I also prefer an explicit method where I do not have to something "common" to modules/session. And simple setup/tearDown is so much embedded now in my unit test thinking that in simple cases I prefer to use them. But this might change in the future as I get used to it.

potiuk on 22 Oct 2020

Pytest has setup/teardown equivalents too, even without fixtures https://docs.pytest.org/en/stable/xunit_setup.html#method-and-function-level-setup-teardown

ashb on 22 Oct 2020

Pytest has setup/teardown equivalents too, even without fixtures https://docs.pytest.org/en/stable/xunit_setup.html#method-and-function-level-setup-teardown

Correct. But it feels like joining the worst of both worlds. Classic unit-test setUp/tearDown are familiar and "natural" where pytest fixtures are modern and a bit less familiar. I'd rather move to fixtures rather than to pytest's setup/teardown.

potiuk on 22 Oct 2020

@mik-laj @FloChehab @potiuk Do you prefer the unit test style, or the pytest fixture style? #11693 and #11694

I don't have an opinion for going for the one or the other ; I am sure you'll make the good choice.

FloChehab on 23 Oct 2020

I like both of them - @ashb and @dimberman . However, if I had to choose one PR it would be @dimberman's PR because it includes additional k8s API validation.

mik-laj on 23 Oct 2020

Yeah Daniel has taken it further than I have - I was just creating a prototype.

I suspect once he's done and merged I may show converting it to pytest

ashb on 23 Oct 2020

I open this ticket because we need to migrate the rest of the tests to complete the migrations.

mik-laj on 28 Oct 2020

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Detect automatically the lack of reference to the guide in the operator descriptions

mik-laj · 4Comments

Loading logging configuration from .yaml or .ini file

d-lee · 4Comments

Create SendGridMovedRule to ease upgrade to Airflow 2.0

turbaszek · 3Comments

BigQueryHook broken by dependency google-cloud-bigquery-datatransfer

darwinyip · 3Comments

Kubernetes Executor - invalid hostname in the database

Labbs · 4Comments