flux does not work with kustomize

Created on 14 Aug 2019 · 16 Comments · Source: fluxcd/flux

Describe the bug
Flux does not seem to work with a kustomize setup. We used to have a dedicated repo per stage. These repos were merged into a single repo and modified to use kustomize based on these docs:
https://github.com/weaveworks/flux-kustomize-example

To Reproduce
I have installed flux using these values:

additionalArgs:
- --connect=ws://fluxcloud
- --manifest-generation=true
git:
  url: [email protected]:example/flux-repo-customer.git
  branch: 'immediate-deployment'
  path: patches/dev
rbac:
  create: true
helmOperator:
  create: true
  createCRD: false
prometheus:
  enabled: true

My folder structure looks like this:

$ find | grep -v git
.
./flux_values_dev.yaml
./flux_values_preprod.yaml
./custom-templates
./custom-templates/dev
./custom-templates/dev/kustomization.yaml
./custom-templates/prod
./custom-templates/prod/kustomization.yaml
./custom-templates/all
./custom-templates/all/kustomization.yaml
./custom-templates/all/phpmyadmin
./custom-templates/all/phpmyadmin/phpmyadmin-release.yaml
./custom-templates/all/phpmyadmin/phpmyadmin-ns.yaml
./custom-templates/all/kube-system
./custom-templates/all/kube-system/aws-alb-ingress-controller-release.yaml
./custom-templates/all/kube-system/fluentd.yaml
./custom-templates/preprod
./custom-templates/preprod/kustomization.yaml
./flux_values_prod.yaml
./base-templates
./base-templates/traefik
./base-templates/traefik/namespace.yaml
./base-templates/traefik/release.yaml
./base-templates/cert-manager
./base-templates/cert-manager/namespace.yaml
./base-templates/cert-manager/letsencrypt-prod-cluster-issuer.yaml
./base-templates/cert-manager/release.yaml
./base-templates/cert-manager/crd.yaml
./base-templates/kube-public
./base-templates/kube-public/mem-limit.yaml
./base-templates/kustomization.yaml
./base-templates/flux
./base-templates/flux/flux-ns.yaml
./base-templates/flux/fluxcloud.yaml
./base-templates/default
./base-templates/default/default-ns.yaml
./base-templates/weave
./base-templates/weave/namespace.yaml
./base-templates/weave/release.yaml
./base-templates/botkube
./base-templates/botkube/namespace.yaml
./base-templates/botkube/release.yaml
./base-templates/monitoring
./base-templates/monitoring/hpa-rules.yaml
./base-templates/monitoring/namespace.yaml
./base-templates/monitoring/pvc-rules.yaml
./base-templates/monitoring/graylog-rule.yaml
./base-templates/monitoring/pod-rule.yaml
./base-templates/monitoring/storage-rule.yaml
./base-templates/monitoring/logstash-rule.yaml
./base-templates/monitoring/prometheus-release.yaml
./base-templates/monitoring/elasticsearch-rule.yaml
./base-templates/monitoring/mongodb-rule.yaml
./base-templates/kube-system
./base-templates/kube-system/coredns-hpa.yaml
./base-templates/kube-system/namespace.yaml
./base-templates/kube-system/spot-termination-handler-release.yaml
./base-templates/kube-system/kubernetes-dashboard.yaml
./base-templates/kube-system/kubedb-operator-release.yaml
./base-templates/kube-system/external-dns-release.yaml
./base-templates/kube-system/storage-class-st1.yaml
./base-templates/kube-system/cluster-autoscaler-release.yaml
./base-templates/kube-system/kubedb-catalog-release.yaml
./base-templates/kube-system/kuberhealthy-release.yaml
./base-templates/kube-system/storage-class-sc1.yaml
./base-templates/kube-system/metrics-server.yaml
./base-templates/kube-system/external-auth-server.yaml
./base-templates/kube-system/kube2iam-release.yaml
./Readme.md
./patches
./patches/dev
./patches/dev/.env
./patches/dev/kustomization.yaml
./patches/dev/default-patch.yaml
./patches/dev/fluentd-patch.yaml
./patches/dev/fluxcloud-patch.yaml
./patches/dev/flux-patch.yaml
./patches/prod
./patches/prod/.env
./patches/prod/kustomization.yaml
./patches/prod/default-patch.yaml
./patches/prod/fluentd-patch.yaml
./patches/prod/fluxcloud-patch.yaml
./patches/prod/flux-patch.yaml
./patches/preprod
./patches/preprod/.env
./patches/preprod/kustomization.yaml
./patches/preprod/default-patch.yaml
./patches/preprod/fluentd-patch.yaml
./patches/preprod/fluxcloud-patch.yaml
./patches/preprod/flux-patch.yaml
./.flux.yaml

I can run kubectl apply in the folder:

$ kubectl apply --dry-run=true -k .
namespace/botkube configured (dry run)
namespace/cert-manager configured (dry run)
namespace/default configured (dry run)
namespace/flux configured (dry run)
namespace/kube-system configured (dry run)
namespace/monitoring configured (dry run)
namespace/phpmyadmin created (dry run)
namespace/traefik configured (dry run)
namespace/weave configured (dry run)
storageclass.storage.k8s.io/sc1 configured (dry run)
storageclass.storage.k8s.io/st1 configured (dry run)
customresourcedefinition.apiextensions.k8s.io/certificates.certmanager.k8s.io configured (dry run)
customresourcedefinition.apiextensions.k8s.io/challenges.certmanager.k8s.io configured (dry run)
customresourcedefinition.apiextensions.k8s.io/clusterissuers.certmanager.k8s.io configured (dry run)
customresourcedefinition.apiextensions.k8s.io/issuers.certmanager.k8s.io configured (dry run)
customresourcedefinition.apiextensions.k8s.io/orders.certmanager.k8s.io configured (dry run)
serviceaccount/fluentd configured (dry run)
clusterrole.rbac.authorization.k8s.io/fluentd configured (dry run)
clusterrolebinding.rbac.authorization.k8s.io/fluentd configured (dry run)
configmap/fluentd-config configured (dry run)
service/fluxcloud configured (dry run)
deployment.extensions/fluxcloud configured (dry run)
horizontalpodautoscaler.autoscaling/coredns configured (dry run)
clusterissuer.certmanager.k8s.io/letsencrypt-prod configured (dry run)
daemonset.extensions/fluentd-cloudwatch configured (dry run)
helmrelease.flux.weave.works/botkube configured (dry run)
helmrelease.flux.weave.works/cert-manager configured (dry run)
helmrelease.flux.weave.works/aws-alb-ingress-controller configured (dry run)
helmrelease.flux.weave.works/cluster-autoscaler configured (dry run)
helmrelease.flux.weave.works/external-auth-server configured (dry run)
helmrelease.flux.weave.works/external-dns configured (dry run)
helmrelease.flux.weave.works/k8s-spot-termination-handler configured (dry run)
helmrelease.flux.weave.works/kube2iam configured (dry run)
helmrelease.flux.weave.works/kubedb-catalog configured (dry run)
helmrelease.flux.weave.works/kubedb-operator configured (dry run)
helmrelease.flux.weave.works/kuberhealthy configured (dry run)
helmrelease.flux.weave.works/kubernetes-dashboard configured (dry run)
helmrelease.flux.weave.works/metrics-server configured (dry run)
helmrelease.flux.weave.works/prometheus-operator configured (dry run)
helmrelease.flux.weave.works/phpmyadmin created (dry run)
helmrelease.flux.weave.works/traefik configured (dry run)
helmrelease.flux.weave.works/weave-scope configured (dry run)
prometheusrule.monitoring.coreos.com/additional-storage.rules configured (dry run)
prometheusrule.monitoring.coreos.com/elasticsearch.rules configured (dry run)
prometheusrule.monitoring.coreos.com/graylog.rules configured (dry run)
prometheusrule.monitoring.coreos.com/hpa.rules configured (dry run)
prometheusrule.monitoring.coreos.com/logstash.rules configured (dry run)
prometheusrule.monitoring.coreos.com/mongodb.rules configured (dry run)
prometheusrule.monitoring.coreos.com/pod.rules configured (dry run)
prometheusrule.monitoring.coreos.com/pvc.rules configured (dry run)
limitrange/mem-limit-range configured (dry run)
limitrange/mem-limit-range configured (dry run)
limitrange/mem-limit-range configured (dry run)
limitrange/mem-limit-range configured (dry run)
limitrange/mem-limit-range configured (dry run)
limitrange/mem-limit-range configured (dry run)
limitrange/mem-limit-range configured (dry run)
limitrange/mem-limit-range created (dry run)
limitrange/mem-limit-range configured (dry run)

But flux does not seem to apply anything:

╰─ kubectl get helmrelease --all-namespaces
NAMESPACE      NAME                           AGE
botkube        botkube                        22h
cert-manager   cert-manager                   57d
kube-system    aws-alb-ingress-controller     62d
kube-system    cluster-autoscaler             84d
kube-system    external-auth-server           22h
kube-system    external-dns                   84d
kube-system    k8s-spot-termination-handler   84d
kube-system    kube2iam                       84d
kube-system    kubedb-catalog                 22h
kube-system    kubedb-operator                22h
kube-system    kuberhealthy                   22h
kube-system    kubernetes-dashboard           84d
kube-system    metrics-server                 84d
monitoring     prometheus-operator            27d
traefik        traefik                        56d
weave          weave-scope                    22h

As you can see, the phpmyadmin helmrelease is missing from this list. Changes are only applied if I manually run kubectl apply -k . in the dev folder.

**Expected behavior**
Flux should apply the same changes as kubectl apply would do.

**Logs**
If applicable, please provide logs of `fluxd` or the helm-operator. In a standard stand-alone installation of Flux, you'd get this by running `kubectl logs -n default deploy/flux`.

flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:53:26.512721079Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:53:26.512767633Z caller=images.go:27 component=sync-loop msg="no automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:54:12.638175561Z caller=warming.go:198 component=warmer info="refreshing image" image=765015417150.dkr.ecr.eu-central-1.amazonaws.com/wegweiserdigitalschule tag_count=28 to_update=2 of_which_refresh=2 of_which_missing=0
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:54:12.761597307Z caller=warming.go:206 component=warmer updated=765015417150.dkr.ecr.eu-central-1.amazonaws.com/wegweiserdigitalschule successful=2 attempted=2
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:54:24.023961973Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:54:24.024004941Z caller=images.go:27 component=sync-loop msg="no automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:55:24.4155126Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:55:24.415560091Z caller=images.go:27 component=sync-loop msg="no automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:55:59.891914621Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:55:59.891955042Z caller=images.go:27 component=sync-loop msg="no automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:56:30.200716696Z caller=loop.go:111 component=sync-loop event=refreshed [email protected]:example/flux-repo-customer.git branch=immediate-deployment HEAD=8a9eef1a38dc142c92529e2c3fc0427ee2d617f4
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:56:47.342211074Z caller=daemon.go:654 component=daemon event="Sync: 8a9eef1, no workloads changed" logupstream=true
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:56:50.614284325Z caller=loop.go:206 component=sync-loop tag=flux-sync old=72948de891e9723e9e9b922f8e942c2bdaacb9d5 new=8a9eef1a38dc142c92529e2c3fc0427ee2d617f4
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:56:51.799259623Z caller=loop.go:111 component=sync-loop event=refreshed [email protected]:example/flux-repo-customer.git branch=immediate-deployment HEAD=8a9eef1a38dc142c92529e2c3fc0427ee2d617f4
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:57:05.968502208Z caller=warming.go:198 component=warmer info="refreshing image" image=765015417150.dkr.ecr.eu-central-1.amazonaws.com/teachtoprotect tag_count=28 to_update=2 of_which_refresh=2 of_which_missing=0
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:57:06.096387652Z caller=warming.go:206 component=warmer updated=765015417150.dkr.ecr.eu-central-1.amazonaws.com/teachtoprotect successful=2 attempted=2
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:57:10.584530622Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:57:10.6104923Z caller=images.go:27 component=sync-loop msg="no automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:58:13.876056957Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:58:13.903995506Z caller=images.go:27 component=sync-loop msg="no automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:59:03.922219705Z caller=warming.go:198 component=warmer info="refreshing image" image=765015417150.dkr.ecr.eu-central-1.amazonaws.com/unterweisungplus tag_count=31 to_update=2 of_which_refresh=2 of_which_missing=0
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:59:04.027545089Z caller=warming.go:206 component=warmer updated=765015417150.dkr.ecr.eu-central-1.amazonaws.com/unterweisungplus successful=2 attempted=2
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:59:08.530941415Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
flux/flux-7dbbf48d4b-7lfm8[flux]: ts=2019-08-14T07:59:08.55352944Z caller=images.go:27 component=sync-loop msg="no automated workloads"

Additional context
Add any other context about the problem here, e.g

  • Flux version: 1.13.3 (also tested docker.io/fluxcd/flux-prerelease:master-1728c8d6-wip)
  • Helm Operator version: 0.10.1
  • Kubernetes version: 1.13.x
  • Git provider: github
  • Container registry provider: ecr
question

All 16 comments

@runningman84 can you please post here the .flux.yaml content?

yes of course:

cat .flux.yaml 
version: 1
patchUpdated:
  generators:
    - command: kustomize build .
  patchUpdated: flux-patch.yaml
  # updaters:
  #   # use https://github.com/squaremo/kubeyaml on flux-patch.yaml
  #   - containerImage:
  #       command: >-
  #         cat flux-patch.yaml |
  #         kubeyaml image --namespace $FLUX_WL_NS --kind $FLUX_WL_KIND --name $FLUX_WL_NAME --container $FLUX_CONTAINER --image "$FLUX_IMG:$FLUX_TAG"
  #         > new-flux-patch.yaml &&
  #         mv new-flux-patch.yaml flux-patch.yaml
  #     policy:
  #       command: >-
  #         cat flux-patch.yaml |
  #         kubeyaml annotate --namespace $FLUX_WL_NS --kind $FLUX_WL_KIND --name $FLUX_WL_NAME "flux.weave.works/$FLUX_POLICY=$FLUX_POLICY_VALUE"
  #         > new-flux-patch.yaml &&
  #         mv new-flux-patch.yaml flux-patch.yaml

OK, so Flux will use ./patches/dev/kustomization.yaml. If you do a dry run inside ./patches/dev/, what does it output?

that works just fine:

$ patches/dev kubectl kustomize . | head
apiVersion: v1
kind: Namespace
metadata:
  labels:
    name: botkube
  name: botkube
---
apiVersion: v1
kind: Namespace
metadata:

or

$ kubectl kustomize patches/dev | head
apiVersion: v1
kind: Namespace
metadata:
  labels:
    name: botkube
  name: botkube
---
apiVersion: v1
kind: Namespace
metadata:

Actually, I am deploying everything using "kubectl apply -k ." because flux does not work...

~Your fluxd config says --git-path=patches/dev but your .flux.yaml file is in the root directory of the repo -- fluxd will only look in the paths you tell it (or the root, if you tell it nothing).~

My mistake -- fluxd will look in parent directories for .flux.yaml, if it's not found in the target path.

do you think that this is a bug or is my configuration wrong?

Can you please delete the flux pod and capture all logs after the restart and post them here.

ok these are the logs generated from a new flux pod:

kubectl logs flux-7dbbf48d4b-psmkw -n flux
Flag --registry-poll-interval has been deprecated, changed to --automation-interval, use that instead
ts=2019-08-14T11:20:04.274678195Z caller=main.go:228 version=master-1728c8d6-wip
ts=2019-08-14T11:20:04.274735056Z caller=main.go:331 msg="using in cluster config to connect to the cluster"
ts=2019-08-14T11:20:04.302679695Z caller=main.go:410 component=cluster identity=/etc/fluxd/ssh/identity
ts=2019-08-14T11:20:04.302736395Z caller=main.go:411 component=cluster identity.pub="ssh-rsaxxxxxxxxxxxxxxxxxxxxxxxxxxx"
ts=2019-08-14T11:20:04.302768985Z caller=main.go:416 host=https://172.20.0.1:443 version=kubernetes-v1.13.8-eks-a977ba
ts=2019-08-14T11:20:04.302844166Z caller=main.go:428 kubectl=/usr/local/bin/kubectl
ts=2019-08-14T11:20:04.304195902Z caller=main.go:440 ping=true
ts=2019-08-14T11:20:04.306548157Z caller=main.go:577 [email protected]:example/flux-repo-customer.git user="Weave Flux" [email protected] signing-key= verify-signatures=false sync-tag=flux-sync notes-ref=flux set-author=false git-secret=false
ts=2019-08-14T11:20:04.306612998Z caller=main.go:618 component=upstream URL=ws://fluxcloud
ts=2019-08-14T11:20:04.310604752Z caller=upstream.go:133 component=upstream connecting=true
ts=2019-08-14T11:20:04.311218169Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
ts=2019-08-14T11:20:04.31126008Z caller=images.go:27 component=sync-loop msg="no automated workloads"
ts=2019-08-14T11:20:04.311328131Z caller=loop.go:85 component=sync-loop err="git repo not ready: git repo has not been cloned yet"
ts=2019-08-14T11:20:04.312913057Z caller=main.go:660 addr=:3030
ts=2019-08-14T11:20:04.324934231Z caller=upstream.go:147 component=upstream connected=true
ts=2019-08-14T11:20:04.898855892Z caller=aws.go:137 component=aws info="detected cluster region" source="EC2 metadata service" region=eu-central-1
ts=2019-08-14T11:20:04.898913002Z caller=aws.go:104 component=aws info="restricting ECR registry scans" regions=[eu-central-1] include-ids=[] exclude-ids=[111111111111]
ts=2019-08-14T11:20:04.898932643Z caller=aws.go:198 component=aws info="attempting to refresh auth tokens" region=eu-central-1 account-ids=22222222222
ts=2019-08-14T11:20:05.267463941Z caller=checkpoint.go:24 component=checkpoint msg="up to date" latest=1.13.3
ts=2019-08-14T11:20:20.281839131Z caller=loop.go:111 component=sync-loop event=refreshed [email protected]:example/flux-repo-customer.git branch=immediate-deployment HEAD=8a9eef1a38dc142c92529e2c3fc0427ee2d617f4
ts=2019-08-14T11:20:43.268106844Z caller=loop.go:206 component=sync-loop tag=flux-sync old=8a9eef1a38dc142c92529e2c3fc0427ee2d617f4 new=8a9eef1a38dc142c92529e2c3fc0427ee2d617f4
ts=2019-08-14T11:20:44.465668143Z caller=images.go:17 component=sync-loop msg="polling for new images for automated workloads"
ts=2019-08-14T11:20:44.492747257Z caller=images.go:27 component=sync-loop msg="no automated workloads"
ts=2019-08-14T11:20:44.49405478Z caller=loop.go:111 component=sync-loop event=refreshed [email protected]:example/flux-repo-customer.git branch=immediate-deployment HEAD=8a9eef1a38dc142c92529e2c3fc0427ee2d617f4

Is it possible to increase the loglevel? Do you need any other info in order to fix this issue?

@runningman84 would be great if you can make a public git repo that we can use to reproduce this.

@runningman84 Also, can you paste the (redacted if needed) yaml of the Flux deployment running in the cluster?

@2opremio I think I have the simplest possible scenario set up in a public repo, and I tested it with just k3d for a cluster. I set up a single namespace with a default memory limit, which differs between some fictitious environments, so it is part of the kustomize patch. I then changed it in a commit from 512 MB to 256 MB and ran fluxctl sync, which finishes, but afterwards the effective value is still 512 until I run kubectl apply -k . in the environment folder.

The repo can be found here: https://github.com/arvatoaws-labs/flux-demo. I basically set up flux with just the base setup instructions from here: https://github.com/fluxcd/helm-operator-get-started, but used a values file like this for the flux helm deployment:

additionalArgs:
- --manifest-generation=true
git:
  url: [email protected]:arvatoaws-labs/flux-demo.git
  branch: 'immediate-deployment'
  path: patches/demo
rbac:
  create: true
helmOperator:
  create: true
  createCRD: false
prometheus:
  enabled: true

Of course I also uploaded the deploy key; I'm even getting a flux-sync tag attached to the latest commit.

Here is what fluxctl says:

╰─ fluxctl sync
Synchronizing with [email protected]:arvatoaws-labs/flux-demo.git
HEAD of immediate-deployment is b604018
Waiting for b604018 to be applied ...
Done.

That commit did change the limit value, but I can wait an arbitrary amount of time and it doesn't actually apply the change.

@runningman84 I found what's wrong.
It took me a while to identify, but your .flux.yaml has a typo in it.
If you fix this, it works:

5c5
<   patchUpdated: flux-patch.yaml
---
>   patchFile: flux-patch.yaml

I also needed to copy your .flux.yaml to the patches/demo/ directory for it to properly generate manifests.
I'm not familiar with how the relative path to the patchFile is calculated when the .flux.yaml comes from a parent directory.
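For reference, the fix can be sketched as a complete file. The snippet below is only an illustration: it writes the corrected .flux.yaml into a scratch directory (an assumption made so the example is safe to run; in the real repo the file sits at the repository root) and keeps just the uncommented keys from the file posted earlier in the thread. The only change is that the second `patchUpdated` becomes `patchFile`.

```shell
# Scratch directory used only for this example (hypothetical location).
workdir="$(mktemp -d)"

# Corrected .flux.yaml: the nested key is patchFile, not patchUpdated.
cat > "$workdir/.flux.yaml" <<'EOF'
version: 1
patchUpdated:
  generators:
    - command: kustomize build .
  patchFile: flux-patch.yaml
EOF

# Show the line that carries the fix.
grep -n 'patchFile:' "$workdir/.flux.yaml"
```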

In general, we need to improve the error reporting for .flux.yaml files.
I had a similar experience where it wasn't generating my manifests, because I was using a .yml extension.
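Until the error reporting improves, a small local check could catch the two mistakes mentioned in this thread. The function below is a hypothetical helper, not part of Flux or fluxctl, and it only uses grep rather than parsing the YAML, so treat it as a rough sketch: it flags a `.flux.yml` file with the wrong extension and a `patchUpdated` section that has no `patchFile` key.

```shell
# check_flux_config DIR
# Hypothetical pre-commit style check; not a Flux feature.
# Prints one line describing the result and returns non-zero on a problem.
check_flux_config() {
  dir="$1"

  # The comment above reports that a .yml extension was not picked up.
  if [ ! -f "$dir/.flux.yaml" ]; then
    if [ -f "$dir/.flux.yml" ]; then
      echo "found .flux.yml; rename it to .flux.yaml"
    else
      echo "no .flux.yaml in $dir"
    fi
    return 1
  fi

  # The typo in this thread left the patchUpdated section without patchFile.
  if grep -q '^patchUpdated:' "$dir/.flux.yaml" &&
     ! grep -q '^[[:space:]]*patchFile:' "$dir/.flux.yaml"; then
    echo "patchUpdated section has no patchFile key"
    return 1
  fi

  echo "ok"
}
```

Running `check_flux_config .` from the repository root before pushing would have reported the missing `patchFile` key for the file posted in this issue.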

OK, I tried it out with just the typo fixed, and that seems to have fixed it. I don't think having a .flux.yaml in each patch folder is necessary; otherwise the base example in https://github.com/weaveworks/flux-kustomize-example wouldn't work either. While it uses just _env_/ rather than patches/_env_/, that is still a subdirectory, and it also has only one .flux.yaml.

@stealthybox if .flux.yaml is not found in the specified --git-path, Flux will look in parent directories for it.
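The lookup described here can be pictured with a short shell function. This is only an illustration of the behaviour, not Flux's actual code; the function name and its two arguments are made up for the example, and stopping at the repository root is an assumption.

```shell
# find_flux_yaml START_DIR REPO_ROOT
# Illustrative only: walks from START_DIR up towards REPO_ROOT and prints the
# first .flux.yaml it finds. Returns non-zero if none is found.
find_flux_yaml() {
  dir="$(cd "$1" && pwd)" || return 1
  root="$(cd "$2" && pwd)" || return 1

  while :; do
    if [ -f "$dir/.flux.yaml" ]; then
      echo "$dir/.flux.yaml"
      return 0
    fi
    # Stop once the repository root has been checked (assumed boundary).
    [ "$dir" = "$root" ] && return 1
    # Guard against START_DIR lying outside REPO_ROOT.
    [ "$dir" = "/" ] && return 1
    dir="$(dirname "$dir")"
  done
}
```

With the layout from this issue, `find_flux_yaml patches/dev .` run from the repository root would print the path of the root-level .flux.yaml, because patches/dev does not contain one of its own.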

It works for us now. We are looking forward to getting better log messages from flux. At the moment there are many "spam" messages, but real problems like these are not logged.
