external-dns creates an A record to IP instead of ALB

Created on 13 Dec 2017 · 21 Comments · Source: kubernetes-sigs/external-dns

May be related to #223

When creating an ingress resource (via alb-ingress-controller), we sometimes get an error where the RecordSet maps to an IP address.

time="2017-12-13T19:12:49Z" level=info msg="Changing records: CREATE {
  Action: "CREATE",
  ResourceRecordSet: {
    Name: "test.example.com",
    ResourceRecords: [{
        Value: "172.17.3.177"
      }],
    TTL: 300,
    Type: "A"
  }
} ..."
time="2017-12-13T19:12:49Z" level=info msg="Changing records: CREATE {
  Action: "CREATE",
  ResourceRecordSet: {
    Name: "test.example.com",
    ResourceRecords: [{
        Value: "\"heritage=external-dns,external-dns/owner=my-identifier\""
      }],
    TTL: 300,
    Type: "TXT"
  }
} ..."

After that, external-dns seems to be trying to update that record:

time="2017-12-13T19:19:50Z" level=info msg="Changing records: UPSERT {
  Action: "UPSERT",
  ResourceRecordSet: {
    Name: "test.example.com",
    ResourceRecords: [{
        Value: "xxx-yyy-afc4-111847805.us-east-1.elb.amazonaws.com"
      }],
    TTL: 300,
    Type: "A"
  }
} ..."
time="2017-12-13T19:19:50Z" level=error msg="InvalidChangeBatch: Invalid Resource Record: FATAL problem: ARRDATAIllegalIPv4Address (Value is not a valid IPv4 address) encountered with 'xxx-yyy-afc4-111847805.us-east-1.elb.amazonaws.com'
        status code: 400, request id: 96f84f28-e03a-11e7-85e7-1927c4da9ba3"
time="2017-12-13T19:20:50Z" level=info msg="Changing records: UPSERT {
  Action: "UPSERT",
  ResourceRecordSet: {
    Name: "test.example.com",
    ResourceRecords: [{
        Value: "xxx-yyy-afc4-111847805.us-east-1.elb.amazonaws.com"
      }],
    TTL: 300,
    Type: "A"
  }
} ..."

I tried to get my ingress resource:

$ kubectl get ing
NAME                  HOSTS                     ADDRESS            PORTS     AGE
test-ingress   test.example.com   xxx-yyy-afc4-...   80        18m

If I go into Route 53 and delete the A record pointing to 172.17.3.177 and the TXT record, then external-dns will correctly create the ALIAS record to my ALB and all seems well.
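
The manual cleanup described above could also be scripted. The following is a minimal sketch and not part of the original report: it assumes the stale records have exactly the name, TTL (300) and values shown in the log, and `delete_batch` is a hypothetical helper name. It only prints a Route 53 change batch; nothing is deleted until that batch is handed to the AWS CLI.

```shell
#!/usr/bin/env bash
# delete_batch NAME IP TXT_VALUE (hypothetical helper)
# Print a Route 53 change batch that deletes the stale A record and the
# ownership TXT record that external-dns created next to it.
# TTL 300 is taken from the log above; adjust it if your records differ.
delete_batch() {
    local name=$1 ip=$2 txt=$3
    # In an unquoted heredoc, \\ becomes a single backslash, so the TXT
    # value is emitted wrapped in escaped quotes, as in the log above.
    cat <<EOF
{
  "Changes": [
    {
      "Action": "DELETE",
      "ResourceRecordSet": {
        "Name": "$name",
        "Type": "A",
        "TTL": 300,
        "ResourceRecords": [{ "Value": "$ip" }]
      }
    },
    {
      "Action": "DELETE",
      "ResourceRecordSet": {
        "Name": "$name",
        "Type": "TXT",
        "TTL": 300,
        "ResourceRecords": [{ "Value": "\\"$txt\\"" }]
      }
    }
  ]
}
EOF
}
```

Redirect the output to a file such as batch.json, review it, and then apply it with `aws route53 change-resource-record-sets --hosted-zone-id ZONE_ID --change-batch file://batch.json`, where ZONE_ID is a placeholder for your hosted zone. Route 53 rejects a DELETE whose TTL or values do not match the existing record exactly.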

lifecycle/rotten

twang-rs

👍1

All 21 comments

Note, the DNS names and ALB names were changed above.

Versions:
kubernetes v1.7.10
alb-ingress-controller v1.0-alpha.3
external-dns v0.4.2

twang-rs on 13 Dec 2017

@twang-rs please do kubectl get ingress blah -o yaml and paste output here

ideahitme on 14 Dec 2017

That's strange.

@twang-rs @ksindi Can you please closely monitor the ADDRESS column of your test-ingress while this is happening? Or even better, for more details, like @ideahitme suggests via

kubectl get ing test-ingress -o json | jq .status.loadBalancer.ingress

It seems the field is populated with some IP before being populated with the ELB cname. In that case it would be an issue somewhere else. Please also paste your full ingress definition.

linki on 2 Jan 2018

@linki it only happened to me once and I can't reproduce it anymore. Will let you know if I see it happening again.

ksindi on 2 Jan 2018

$ kubectl get ing alertmanager-system -o yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:xxxxxxxxxx:certificate/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80,"HTTPS": 443}]'
    alb.ingress.kubernetes.io/scheme: internet-facing
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"extensions/v1beta1","kind":"Ingress","metadata":{"annotations":{"alb.ingress.kubernetes.io/certificate-arn":"arn:aws:acm:us-east-1:xxxxxxxxxx:certificate/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx","alb.ingress.kubernetes.io/listen-ports":"[{\"HTTP\": 80,\"HTTPS\": 443}]","alb.ingress.kubernetes.io/scheme":"internet-facing"},"name":"alertmanager-system","namespace":"prometheus-system"},"spec":{"rules":[{"host":"test.example.com","http":{"paths":[{"backend":{"serviceName":"alertmanager-system","servicePort":9093},"path":"/"}]}}]}}
  creationTimestamp: 2018-01-02T21:28:32Z
  generation: 1
  name: alertmanager-system
  namespace: prometheus-system
  resourceVersion: "5482879"
  selfLink: /apis/extensions/v1beta1/namespaces/prometheus-system/ingresses/alertmanager-system
  uid: e1d5318f-f003-11e7-8638-0ef1ee2132d2
spec:
  rules:
  - host: test.example.com
    http:
      paths:
      - backend:
          serviceName: alertmanager-system
          servicePort: 9093
        path: /
status:
  loadBalancer:
    ingress:
    - hostname: xxxyyyyzzzz-prometheussyst-afc4-xxxxxxxx.us-east-1.elb.amazonaws.com

twang-rs on 2 Jan 2018

So, polling kubectl get ing blah -o json | jq .status.loadBalancer.ingress, I do see that as I create and destroy the ingress resource, it occasionally shows an IP address:

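#!/bin/bash

# catch STDOUT_VAR STDERR_VAR CMD...: run CMD and store its stdout and stderr
# in the two named variables, preserving CMD's exit status.
catch()
{
eval "$({
__2="$(
  { __1="$("${@:3}")"; } 2>&1;
  ret=$?;
  printf '%q=%q\n' "$1" "$__1" >&2;
  exit $ret
  )"
ret="$?";
printf '%s=%q\n' "$2" "$__2" >&2;
printf '( exit %q )' "$ret" >&2;
} 2>&1 )";
}

# q: print the current load balancer status of the ingress.
q() {
    kubectl get ing alertmanager-system -o json | jq .status.loadBalancer.ingress
}

LAST=
NL=0
# Poll twice a second: print the response when it changes, otherwise a dot.
while true; do
    catch OUT ERR q
    RESP="${OUT}${ERR}"
    if [ "$RESP" != "$LAST" ]; then
        if [ "$NL" -eq 1 ]; then
            NL=0
            echo
        fi
        # Left unquoted, so word splitting collapses the JSON onto one line.
        echo $RESP
    else
        NL=1
        echo -n .
    fi
    LAST=$RESP
    sleep 0.5
done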

Error from server (NotFound): ingresses.extensions "alertmanager-system" not found
.......
null
...................
[ { "hostname": "xxxyyyyzzzz-prometheussyst-afc4-1111111111.us-east-1.elb.amazonaws.com" } ]
...........................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Error from server (NotFound): ingresses.extensions "alertmanager-system" not found
...............................................
null
.....
[ { "hostname": "xxxyyyyzzzz-prometheussyst-afc4-1111111111.us-east-1.elb.amazonaws.com" } ]
.....................................
[ { "hostname": "xxxyyyyzzzz-prometheussyst-afc4-2222222222.us-east-1.elb.amazonaws.com" } ]
.............................................................................................................................................................................
Error from server (NotFound): ingresses.extensions "alertmanager-system" not found
.........................................................................
null
..
[ { "ip": "172.17.3.177" } ]
.....................................
[ { "hostname": "xxxyyyyzzzz-prometheussyst-afc4-3333333333.us-east-1.elb.amazonaws.com" } ]
.................................................................................................................................................

twang-rs on 2 Jan 2018

In the last case above, external-dns didn't create records until the ingress resource changed from ip to hostname, therefore the records created were correct (A records to the DNS name of the ELB).

Also note, that one time the hostname was actually the name of the previous ELB and then updated to the new ELB (1111111111 -> 2222222222). I'm pretty sure the ELB had been fully deprovisioned from the previous delete before I re-added the ingress resource.

So, it seems that it can be racy in terms of when external-dns wakes up to make the changes, versus whatever is setting the address of the load balancer (I assume alb-ingress-controller, in my case).
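
One way to see from a script when the status has settled is to poll until the reported value is no longer an IPv4 address. The sketch below is an illustration under assumptions, not a fix for the race: `wait_for_hostname` is a hypothetical helper, the retry count and delay are arbitrary, and it does not change what external-dns does with an IP it has already read.

```shell
#!/usr/bin/env bash
# wait_for_hostname TRIES DELAY CMD... (hypothetical helper)
# Run CMD up to TRIES times, sleeping DELAY seconds between attempts, until
# it prints a non-empty value that does not look like an IPv4 address.
# Prints that value and returns 0, or returns 1 if it never appears.
wait_for_hostname() {
    local tries=$1 delay=$2 value
    local ipv4_re='^[0-9]+(\.[0-9]+){3}$'
    shift 2
    while (( tries-- > 0 )); do
        # A failing command (for example, ingress not found) counts as "not yet".
        value=$("$@" 2>/dev/null) || value=""
        if [[ -n $value && ! $value =~ $ipv4_re ]]; then
            printf '%s\n' "$value"
            return 0
        fi
        sleep "$delay"
    done
    return 1
}
```

For example, `wait_for_hostname 120 0.5 kubectl get ing alertmanager-system -o 'jsonpath={.status.loadBalancer.ingress[0].hostname}'` would wait up to about a minute for the hostname field to be populated. If an IP was visible before that point, the record in Route 53 may still be the stale A record and is worth checking.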

twang-rs on 2 Jan 2018

Thanks @twang-rs for the detailed log.

To me this seems to be an issue with your ingress controller. ExternalDNS never modifies any Ingress objects. It only reads several attributes including .status.loadBalancer.ingress to construct the desired DNS records.

Switching between DNS record types isn't supported nor is having multiple values in the .status.loadBalancer.ingress field (I think you have two values in there in your first post). This will lead to ExternalDNS printing several errors.

However, the underlying problem seems to be coming from the ingress controller putting those fluctuating values in the .status.loadBalancer.ingress field. I'm not sure if that's expected behaviour, though. If it is, ExternalDNS should handle it better.

linki on 4 Jan 2018

I had the same issue. It occurred when I first deployed nginx-ingress without publishing the service (so external-dns created an A record to an internal IP) and then updated the nginx-ingress configuration to publish the service. After manually removing the old entry in Route 53, everything went fine.

chicco785 on 10 Jan 2018

This is occurring for me as well. It happens when you don't enable 'publish service' for the nginx ingress controller. The Ingress endpoint instead gets set to the node IP of the nginx controller pod. When you enable 'publish service', the ingress endpoint is updated to the ALB hostname, but when external-dns attempts an upsert, it fails because it does not create the records as ALIAS.
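
The distinction behind that failed upsert can be illustrated with a small check. This is a sketch only: `is_ipv4` and `record_kind` are hypothetical helpers, not external-dns code, and anything that is not an IPv4 address is simply treated as a hostname here.

```shell
#!/usr/bin/env bash
# is_ipv4 VALUE (hypothetical helper)
# Succeed if VALUE is a dotted-quad IPv4 address, the only kind of value
# Route 53 accepted for the plain A record in the log above.
is_ipv4() {
    local value=$1 octet
    local -a octets
    local re='^[0-9]{1,3}(\.[0-9]{1,3}){3}$'
    [[ $value =~ $re ]] || return 1
    IFS=. read -r -a octets <<< "$value"
    for octet in "${octets[@]}"; do
        # 10# forces base 10 so that "08" is not parsed as octal.
        (( 10#$octet <= 255 )) || return 1
    done
}

# record_kind VALUE (hypothetical helper)
# Print "A" for an IPv4 address and "ALIAS" for anything else, which is
# assumed to be a load balancer DNS name.
record_kind() {
    if is_ipv4 "$1"; then
        echo "A"
    else
        echo "ALIAS"
    fi
}
```

With the values from this thread, `record_kind 172.17.3.177` gives A and `record_kind xxx-yyy-afc4-111847805.us-east-1.elb.amazonaws.com` gives ALIAS. The error appears when the target moves from the first kind to the second while a plain A record already exists.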

jay-rob on 15 Jan 2018

@jrthrawny Thank you for that! Adding the correct publish-service option to my Nginx daemon set solved my issues too.

dieterrosch on 1 Mar 2018

@dieterrosch what did you set it to?

I'm having a very similar issue using alb-ingress-controller with external-dns.
I don't have a specific kubernetes service associated with ingress, so I'm not sure what to set it to.

rexroof on 8 Mar 2018

👍2

Here is part of the JSON definition of my Nginx Daemonset:

      "spec": {
        "containers": [
          {
            "name": "nginx",
            "image": "gcr.io/google_containers/nginx-ingress-controller:0.9.0-beta.11",
            "args": [
              "/nginx-ingress-controller",
              "--default-backend-service=nginx-ingress/default-http-backend",
              "--configmap=nginx-ingress/nginx",
              "--tcp-services-configmap=nginx-ingress/tcp-ports",
              "--publish-service=$(POD_NAMESPACE)/nginx"
            ],

I needed:
"--publish-service=$(POD_NAMESPACE)/nginx"

dieterrosch on 8 Mar 2018

👍1

thanks for following up.
I guess I'm confused about what service that namespace/nginx is.
my ingress is a deployment without a service.
I have lots of other kubernetes services that are exposed via ingress, not sure which service I'd point towards here.

rexroof on 8 Mar 2018

My NGinx pods are deployed in a namespace named nginx-ingress.
In that same namespace I have two services named nginx and default-http-backend.

All of these were created by Kops for me (This cluster is running on AWS).
The ingress does not need a service. It looks like this:

Outside internet -> Ingress -> Nginx Service -> Nginx Pods.

You would point it to the service that is sitting in front of your nginx containers, ie the service that load balances calls to your nginx.
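
To check which address such a service would publish, you can ask Kubernetes for its load balancer hostname. A minimal sketch, assuming kubectl is on the PATH and the service is of type LoadBalancer; `publish_service_hostname` is a hypothetical helper, and the KUBECTL variable exists only so that the binary can be swapped out.

```shell
#!/usr/bin/env bash
# Command used to talk to the cluster; override KUBECTL to use another binary.
KUBECTL=${KUBECTL:-kubectl}

# publish_service_hostname NAMESPACE SERVICE (hypothetical helper)
# Print the load balancer hostname reported in the status of the service
# that --publish-service refers to. Empty output means none is set (yet).
publish_service_hostname() {
    local namespace=$1 service=$2
    "$KUBECTL" --namespace "$namespace" get service "$service" \
        --output "jsonpath={.status.loadBalancer.ingress[0].hostname}"
}
```

With the names from this thread the call would be `publish_service_hostname nginx-ingress nginx`. If it prints an ELB DNS name, that is the value the ingress controller should copy into the status of each Ingress once --publish-service points at that service.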

dieterrosch on 8 Mar 2018

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot on 23 Apr 2019

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot on 23 May 2019

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

fejta-bot on 22 Jun 2019

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot on 22 Jun 2019

has anyone found the solution to this for alb-ingress-controller (not using nginx-ingress controller)? external-dns pointing to the internal IPs instead of the published alb DNS name occurs frequently but doesn't always happen.

i've confirmed that the alb DNS name exists from:
kubectl get ingress -o json | jq .status.loadBalancer

jtai-omniex on 21 Aug 2019

@jtai-omniex Please check that the value in kubectl get ingress -o json | jq .status.loadBalancer doesn't change in a similar way. Normally ExternalDNS only reads that value and the ingress controller actually causes the changes.

linki on 21 Aug 2019
