While testing external-dns, we noticed that during its check loop it runs an upsert on all hosts even if the records already exist. This is currently happening with a private zone.
I was also seeing the same thing with Cloudflare. Rolling back to the image registry.opensource.zalan.do/teapot/external-dns:v0.5.16 stopped it.
This needs a bit more clarification, did you already turn on debug?
Could you add some more details incl. logs?
@njuettner We were only evaluating yesterday, so I'm new to the project. Is there just a --debug flag on the binary or some other method for enabling debug-level logging?
there is --log-level=debug
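As a minimal sketch, assuming external-dns runs as a Deployment container that receives its flags through args (the container name and the other flags shown are placeholders, not a required setup), the flag would be added like this:

```yaml
containers:
- name: external-dns              # placeholder container name
  image: registry.opensource.zalan.do/teapot/external-dns:latest
  args:
  - --source=service              # existing flags stay as they are
  - --provider=aws
  - --log-level=debug             # enables debug-level logging
```

After the pod restarts, the debug lines should show up in the container's normal log output alongside the info-level messages.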
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: external-dns
rules:
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get","watch","list"]
- apiGroups: ["extensions"]
  resources: ["ingresses"]
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["list","watch"]
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
- kind: ServiceAccount
  name: external-dns-provider
  namespace: default
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns-public
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns-public
  template:
    metadata:
      labels:
        app: external-dns-public
    spec:
      serviceAccountName: external-dns-provider
      containers:
      - name: external-dns-public
        image: registry.opensource.zalan.do/teapot/external-dns:latest
        args:
        - --source=service
        - --source=ingress
        - --domain-filter=dev.foo.com # will make ExternalDNS see only the hosted zones matching provided domain, omit to process all available hosted zones
        - --provider=aws
        #- --policy=upsert-only # would prevent ExternalDNS from deleting any records, omit to enable full synchronization
        - --aws-zone-type=public # only look at public hosted zones (valid values are public, private or no value for both)
        - --annotation-filter=kubernetes.io/ingress.class=external-ingress
        - --registry=txt
        - --txt-owner-id=XXXXXX
      securityContext:
        fsGroup: 65534 # For ExternalDNS to be able to read Kubernetes and AWS token files
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns-private
spec:
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: external-dns-private
  template:
    metadata:
      labels:
        app: external-dns-private
    spec:
      serviceAccountName: external-dns-provider
      containers:
      - name: external-dns-private
        image: registry.opensource.zalan.do/teapot/external-dns:latest
        args:
        - --source=service
        - --source=ingress
        - --domain-filter=dev.foo.com # will make ExternalDNS see only the hosted zones matching provided domain, omit to process all available hosted zones
        - --provider=aws
        #- --policy=upsert-only # would prevent ExternalDNS from deleting any records, omit to enable full synchronization
        - --aws-zone-type=private # only look at private hosted zones (valid values are public, private or no value for both)
        - --annotation-filter=kubernetes.io/ingress.class=internal-ingress
        - --registry=txt
        - --txt-owner-id=XXXXXX
      securityContext:
        fsGroup: 65534 # For ExternalDNS to be able to read Kubernetes and AWS token files
time="2020-02-12T13:57:43Z" level=debug msg="pod added"
time="2020-02-12T13:57:44Z" level=debug msg="Considering zone: /hostedzone/xxx (domain: us-east-1.dev.foo.com.)"
time="2020-02-12T13:57:44Z" level=debug msg="Considering zone: /hostedzone/xxx (domain: dev.foo.com.)"
time="2020-02-12T13:57:44Z" level=debug msg="Considering zone: /hostedzone/xxx (domain: mycluster.dev.foo.com.)"
time="2020-02-12T13:57:44Z" level=debug msg="Endpoints generated from service: elastic-stack/efk-kibana-lb: [logs.mycluster.dev.foo.com 60 IN CNAME internal-xxx-xxx.us-east-1.elb.amazonaws.com []]"
time="2020-02-12T13:57:44Z" level=debug msg="Endpoints generated from service: monitoring/grafana-lb: [metrics.mycluster.dev.foo.com 60 IN CNAME internal-xxx-xxx.us-east-1.elb.amazonaws.com []]"
time="2020-02-12T13:57:44Z" level=debug msg="Endpoints generated from service: kubernetes-dashboard/kubernetes-dashboard: [dashboard.mycluster.dev.foo.com 60 IN CNAME internal-xxx-xxx.us-east-1.elb.amazonaws.com []]"
time="2020-02-12T13:57:44Z" level=debug msg="Considering zone: /hostedzone/xxx (domain: mycluster.dev.foo.com.)"
time="2020-02-12T13:57:44Z" level=debug msg="Considering zone: /hostedzone/xxx (domain: us-east-1.dev.foo.com.)"
time="2020-02-12T13:57:44Z" level=debug msg="Considering zone: /hostedzone/xxx (domain: dev.foo.com.)"
time="2020-02-12T13:57:44Z" level=debug msg="Adding metrics.mycluster.dev.foo.com. to zone mycluster.dev.foo.com. [Id: /hostedzone/xxx]"
time="2020-02-12T13:57:44Z" level=debug msg="Adding dashboard.mycluster.dev.foo.com. to zone mycluster.dev.foo.com. [Id: /hostedzone/xxx]"
time="2020-02-12T13:57:44Z" level=debug msg="Adding logs.mycluster.dev.foo.com. to zone mycluster.dev.foo.com. [Id: /hostedzone/xxx]"
time="2020-02-12T13:57:44Z" level=debug msg="Adding metrics.mycluster.dev.foo.com. to zone mycluster.dev.foo.com. [Id: /hostedzone/xxx]"
time="2020-02-12T13:57:44Z" level=debug msg="Adding dashboard.mycluster.dev.foo.com. to zone mycluster.dev.foo.com. [Id: /hostedzone/xxx]"
time="2020-02-12T13:57:44Z" level=debug msg="Adding logs.mycluster.dev.foo.com. to zone mycluster.dev.foo.com. [Id: /hostedzone/xxx]"
time="2020-02-12T13:57:44Z" level=info msg="Desired change: UPSERT metrics.mycluster.dev.foo.com A [Id: /hostedzone/xxx]"
time="2020-02-12T13:57:44Z" level=info msg="Desired change: UPSERT dashboard.mycluster.dev.foo.com A [Id: /hostedzone/xxx]"
time="2020-02-12T13:57:44Z" level=info msg="Desired change: UPSERT logs.mycluster.dev.foo.com A [Id: /hostedzone/xxx]"
time="2020-02-12T13:57:44Z" level=info msg="Desired change: UPSERT metrics.mycluster.dev.foo.com TXT [Id: /hostedzone/xxx]"
time="2020-02-12T13:57:44Z" level=info msg="Desired change: UPSERT dashboard.mycluster.dev.foo.com TXT [Id: /hostedzone/xxx]"
time="2020-02-12T13:57:44Z" level=info msg="Desired change: UPSERT logs.mycluster.dev.foo.com TXT [Id: /hostedzone/xxx]"
time="2020-02-12T13:57:44Z" level=info msg="6 record(s) in zone mycluster.dev.foo.com. [Id: /hostedzone/xxx] were successfully updated"
That log just repeats over and over.
apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  name: grafana
  namespace: monitoring
spec:
  ports:
  - name: http
    port: 3000
    targetPort: http
  selector:
    app: grafana
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
    external-dns.alpha.kubernetes.io/hostname: metrics.mycluster.dev.foo.com
    external-dns.alpha.kubernetes.io/ttl: "60"
    kubernetes.io/ingress.class: external-ingress
  name: grafana
  namespace: monitoring
spec:
  type: LoadBalancer
  ports:
  - name: http
    port: 80
    targetPort: http
  selector:
    app: grafana
Using tag: 0.6.0-debian-10-r0 I see this in the log files constantly:
time="2020-02-12T14:21:27Z" level=info msg="Changing record." action=CREATE record=test.example.com targets=1 ttl=120 type=CNAME zone=0060f5ee41789fc2dd77dacda3fa3845
time="2020-02-12T14:21:27Z" level=info msg="Changing record." action=CREATE record=test.example.com targets=1 ttl=1 type=TXT zone=0060f5ee41789fc2dd77dacda3fa3845
time="2020-02-12T14:22:27Z" level=info msg="Changing record." action=UPDATE record=test.example.com targets=1 ttl=120 type=CNAME zone=0060f5ee41789fc2dd77dacda3fa3845
time="2020-02-12T14:22:29Z" level=info msg="Changing record." action=UPDATE record=test.example.com targets=1 ttl=1 type=TXT zone=0060f5ee41789fc2dd77dacda3fa3845
time="2020-02-12T14:23:26Z" level=info msg="Changing record." action=UPDATE record=test.example.com targets=1 ttl=120 type=CNAME zone=0060f5ee41789fc2dd77dacda3fa3845
time="2020-02-12T14:23:27Z" level=info msg="Changing record." action=UPDATE record=test.example.com targets=1 ttl=1 type=TXT zone=0060f5ee41789fc2dd77dacda3fa3845
time="2020-02-12T14:24:28Z" level=info msg="Changing record." action=UPDATE record=test.example.com targets=1 ttl=120 type=CNAME zone=0060f5ee41789fc2dd77dacda3fa3845
time="2020-02-12T14:24:29Z" level=info msg="Changing record." action=UPDATE record=test.example.com targets=1 ttl=1 type=TXT zone=0060f5ee41789fc2dd77dacda3fa3845
Doesn't seem like it should be "Changing record." every minute if nothing's different.
I guess the problem occurs when you use an ALIAS record and also set a TTL annotation: the TTL can't be applied, so external-dns sees a difference on every loop and tries to update the record again and again. You won't get any errors from AWS.
By default you can't specify a TTL on ALIAS records that point to an ALB/ELB.
Ah, that makes perfect sense. Thanks for looking at it.
(I'm going to leave this open in case you want to address it by adding some logic to your check loop to ignore this case, but if you want to close it as a "won't fix" then I won't complain)
@jmcclell I'll create a PR to circumvent this problem and will refer it to this issue. Thank you for sharing your information and addressing this problem 👍.
I guess this is a duplicate of https://github.com/kubernetes-sigs/external-dns/issues/992#issuecomment-585579081
I'm seeing the same error with Cloudflare. We are not using "ALIAS" records and there are no ALB/ELBs involved; we are using Cloudflare with Google Cloud Platform. When using 0.5.18 or higher, the external-dns controller deletes and creates DNS records every minute, continuously.
@njuettner does your fix include resolution to that problem?
Hi,
We have this on AWS with 0.5.14, 0.5.15, 0.5.16, 0.5.17, 0.5.18 and 0.6.0, except that it is a Create rather than an upsert. On 0.5.9 this does not happen, but there two calls to listHostedZones occur every second, which is going to flood our API call quota.
This should be fixed at least for cloudflare as of 0.7.2, could you check?
It seems to work well with 0.7.2 - thanks 🎉 I tested using the cloudflare provider.
OK, I'll consider this fixed. If not, please create another issue with steps to reproduce, or ideally a test in cloudflare_test.go or other affected providers. Tests for Cloudflare provider are really easy to write :)
/close
@sheerun: Closing this issue.
Just for the record, removing external-dns.alpha.kubernetes.io/ttl from my manifest worked. It seems that if you don't try to force the TTL, it doesn't try to change the record again and again.
@sheerun seeing this issue on AWS. Removing external-dns.alpha.kubernetes.io/ttl from Service LoadBalancer fixed it for that service, but it remains for Istio Gateway, which, AFAIK, has no TTL setting.
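For reference, the Service workaround described in the last two comments could look like the sketch below. It assumes the LoadBalancer Service posted earlier in this thread and simply drops the external-dns.alpha.kubernetes.io/ttl annotation; nothing else is changed, and it does not address the Istio Gateway case.

```yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app: grafana
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: 0.0.0.0/0
    external-dns.alpha.kubernetes.io/hostname: metrics.mycluster.dev.foo.com
    # external-dns.alpha.kubernetes.io/ttl removed: per the comments above,
    # the TTL cannot be applied to an ALIAS record pointing at an ELB
    kubernetes.io/ingress.class: external-ingress
  name: grafana
  namespace: monitoring
spec:
  type: LoadBalancer
  ports:
  - name: http
    port: 80
    targetPort: http
  selector:
    app: grafana
```

With the annotation gone, the repeated UPSERT lines should stop appearing for this Service, going by the reports above.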