external-dns cannot automatically change record type in Google DNS

Created on 4 Jan 2018 · 15 Comments · Source: kubernetes-sigs/external-dns

I'm migrating from managing DNS records using CNAMEs

i.e.

external-dns.alpha.kubernetes.io/target:

to a more standard setup of using an A record to point at a LoadBalancer.

When I removed the target annotation, I got this error:

time="2018-01-04T16:28:41Z" level=info msg="Del records: test-consul.dev.foo.io. CNAME [helm.dev.foo.io.]"
time="2018-01-04T16:28:41Z" level=info msg="Add records: test-consul.dev.foo.io. CNAME [10.1.1.8]"
time="2018-01-04T16:28:42Z" level=error msg="googleapi: Error 400: Invalid value for 'entity.change.additions[0].rrdata[0]': '10.1.1.8', invalid"

It looks like it's trying to replace the CNAME value with an IP, rather than delete the CNAME record and create a new A record.

Manually removing the old CNAME DNS record allows external-dns to do the right thing.

time="2018-01-04T16:31:44Z" level=info msg="Add records: test-consul.dev.foo.io. A [10.1.1.8]"
time="2018-01-04T16:31:44Z" level=info msg="Add records: helm-dev-test-consul.dev.foo.io. TXT ["heritage=external-dns,external-dns/owner=helm-dev"]"

Any hints on how to avoid a manual migration here?

Labels: closing-soon-if-no-response, help wanted, lifecycle/frozen

All 15 comments

Up to and including the current release, we deliberately keep the record type fixed once a record has been created, to avoid flapping behaviour: https://github.com/kubernetes-incubator/external-dns/blob/v0.4.8/plan/plan.go#L79-L81

However, it looks like this functionality has been removed during our latest (non-released) Plan refactorings: https://github.com/kubernetes-incubator/external-dns/blob/ec07f45c8e8d54bd6b6ed982fd4ce58acfe2f556/plan/plan.go#L99-L103

@ideahitme can you confirm this?

@james-masson therefore, you could try a version from master if you're brave: docker pull registry.opensource.zalan.do/teapot/external-dns:v0.4.8-13-g1ed025a

@james-masson In the currently released version you need to force ExternalDNS to recreate the record instead of updating it. The Del+Add dance is just an implementation detail on Google CloudDNS but for ExternalDNS it's still just an update.

Since you're modifying your Services anyway by removing the external-dns.alpha.kubernetes.io/target annotation, you might also be able to set the external-dns.alpha.kubernetes.io/hostname annotation to another value so that it becomes a different desired hostname. Alternatively, if you're using the --fqdn-template feature, you can just change the template.

Then let ExternalDNS do one more sync which will drop your existing records and create some bogus ones. Then change your Services/--fqdn-template back to the original values and you'll get back your desired records but with the correct A type set. This will only work if you're using the sync policy (default).
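
To make the difference between an update and a recreate concrete, here is a minimal sketch in Go. The `Record` and `Changes` types and the `planChange` function are hypothetical names invented for this illustration; they are not the types or the planner used by ExternalDNS. The sketch treats a change of record type as a delete followed by a create, and everything else as a plain update.

```go
package main

import "fmt"

// Record is a hypothetical, simplified DNS record. It is not the
// endpoint type used by ExternalDNS.
type Record struct {
	Name   string
	Type   string
	Target string
}

// Changes groups the operations a single sync would submit.
type Changes struct {
	Create, Update, Delete []Record
}

// planChange compares one existing record with the desired one.
// A type change is planned as delete + create, so the old record
// type is never reused for the new target.
func planChange(current, desired Record) Changes {
	var c Changes
	switch {
	case current == desired:
		// Nothing to do.
	case current.Type != desired.Type:
		c.Delete = append(c.Delete, current)
		c.Create = append(c.Create, desired)
	default:
		c.Update = append(c.Update, desired)
	}
	return c
}

func main() {
	// Values taken from the log lines in the original report.
	current := Record{"test-consul.dev.foo.io.", "CNAME", "helm.dev.foo.io."}
	desired := Record{"test-consul.dev.foo.io.", "A", "10.1.1.8"}

	c := planChange(current, desired)
	fmt.Printf("delete=%v create=%v update=%v\n", c.Delete, c.Create, c.Update)
}
```

The hostname or template workaround described above gets the same effect indirectly: the old name disappears from the desired state (delete), and the original name later reappears as a brand-new record (create).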

@ideahitme can you confirm this?

yes, that's right

However, it looks like this functionality has been removed during our latest (non-released) Plan refactorings:

This seemed like a bug to me rather than a feature: we "hard" forced preserving the record type without checking the actual target (whether it is a hostname or an IP), which leads to the bug the OP described. This is already fixed in the current "master".
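
As an illustration of what "checking the actual target" could look like, here is a small Go sketch. The function name `suitableType` and the AAAA branch are assumptions made for this example. It is not a copy of the fix in master.

```go
package main

import (
	"fmt"
	"net"
)

// suitableType is a hypothetical helper: it derives the record type from
// the target itself instead of reusing the type of the existing record.
func suitableType(target string) string {
	ip := net.ParseIP(target)
	switch {
	case ip == nil:
		return "CNAME" // not an IP address, so treat it as a hostname
	case ip.To4() != nil:
		return "A"
	default:
		return "AAAA" // assumption: IPv6 targets map to AAAA
	}
}

func main() {
	// Targets taken from the log lines in the original report.
	for _, target := range []string{"helm.dev.foo.io.", "10.1.1.8"} {
		fmt.Printf("%s -> %s\n", target, suitableType(target))
	}
}
```

With a check like this, removing the target annotation would yield an A record for 10.1.1.8 rather than a CNAME carrying an IP address.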

Our team just ran into this with the Google Provider.
The behavior is definitely buggy.

The use case was changing from Google Internal LoadBalancers to a single ILB fronting a shared Ingress Controller.

This meant that Services which previously had annotations for A records were removed and replaced by Ingress objects with annotations for CNAME records for the same hostnames.

What we found is that when you have an A record in CloudDNS that collides with a requested CNAME target on the Ingress object, external-dns sometimes does not do the Delete and Re-add of the record. (sometimes it works with no human intervention.)

What's concerning is that when it does not re-add, external-dns is unable to create new CNAME records. The controller seems to stop doing work, but the loop is still running and outputs logs like so:

time="2018-04-10T18:28:34Z" level=error msg="googleapi: Error 400: Invalid value for 'entity.change.additions[14].rrdata[0]': 'internal-ingress.companyci.com'
More details:
Reason: invalid, Message: Invalid value for 'entity.change.additions[14].rrdata[0]': 'internal-ingress.companyci.com'
Reason: invalid, Message: Invalid value for 'entity.change.additions[15].rrdata[0]': 'internal-ingress.companyci.com'

Manually deleting the colliding A records from CloudDNS seems to resolve the issue, and the controller picks up the necessary work and makes the new CNAME records.
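
The logs suggest that a single invalid addition is enough for the whole change request to be rejected, which would explain why unrelated records stop being created. That is an inference from the observed behaviour, not something confirmed in the provider code. Under that assumption, a pre-flight check could separate the additions that would be rejected from the rest. The `Addition` type and both functions below are hypothetical and only sketch the idea in Go.

```go
package main

import (
	"fmt"
	"net"
)

// Addition is a hypothetical, simplified stand-in for one record
// addition in a change request.
type Addition struct {
	Name, Type, Rrdata string
}

// valid reports whether the rrdata is plausible for the record type:
// an A record needs an IPv4 address, a CNAME needs a non-IP hostname.
func valid(a Addition) bool {
	ip := net.ParseIP(a.Rrdata)
	switch a.Type {
	case "A":
		return ip != nil && ip.To4() != nil
	case "CNAME":
		return ip == nil && a.Rrdata != ""
	default:
		return true // other record types are not checked in this sketch
	}
}

// splitAdditions separates additions that look valid from those that
// would likely be rejected, so the valid ones can be sent on their own.
func splitAdditions(all []Addition) (ok, rejected []Addition) {
	for _, a := range all {
		if valid(a) {
			ok = append(ok, a)
		} else {
			rejected = append(rejected, a)
		}
	}
	return ok, rejected
}

func main() {
	additions := []Addition{
		// The failing addition from the original report: a CNAME with an IP.
		{"test-consul.dev.foo.io.", "CNAME", "10.1.1.8"},
		// Hypothetical unrelated record that should not be blocked.
		{"other.dev.foo.io.", "A", "10.1.1.9"},
	}
	ok, rejected := splitAdditions(additions)
	fmt.Println("would submit:", ok)
	fmt.Println("would skip:  ", rejected)
}
```

Whether skipping or failing loudly is the better policy is a design decision; the sketch only shows that the mismatch can be detected before the request is sent.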

This might be a race condition in the controller code or a bug in the provider specific implementation.
I haven't dug into it.

App Version: 0.4.8
Chart Version: 0.5.1

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

/remove-lifecycle rotten
This doesn't appear to have been fixed.
( I could be wrong. )

I'd consider this a requirement for GA software.
IMHO, external-dns surpasses requirements for beta with its very wide adoption.

casual bump.

/help

@stealthybox:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

@stealthybox unfortunately we have limited to no access to Google Cloud (read: that would cost us money that we don't have), which makes it really hard to reproduce and fix those issues. I'm unsure if @linki or @njuettner can get a cluster on GKE to test that.

I totally second the "help wanted" label, please feel free to jump in.

One of the goals of #540 is to get ExternalDNS to a state where we can have resources to run such tests. I'm slowly working on it, hoping to get it to a final state by the end of the month.

The same issue happens in Azure DNS when the service has an AWS ELB as the EXTERNAL-IP.
ExternalDNS sets it as the CNAME DNS record in Azure DNS.
After I deleted and re-applied the service, the AWS ELB name changed, but ExternalDNS seemed unable to update the CNAME DNS record accordingly.
I had to manually delete the CNAME DNS record in Azure DNS; only then did ExternalDNS register the CNAME DNS record correctly.
This seems like a bug to me.

solved this problem by adding --txt-prefix

/lifecycle frozen

This happens in AWS as well: https://github.com/kubernetes-sigs/external-dns/issues/1852.

From an earlier comment in here, it sounds like this was intentional behavior. But I'm not quite sure why.
