External-dns: "failed to sync cache: timed out waiting for the condition" for Istio Gateways

Created on 20 Oct 2020 · 6 Comments · Source: kubernetes-sigs/external-dns

What happened:
External DNS is unable to sync Istio Gateways

On v0.7.4 I get:

time="2020-10-20T06:20:58Z" level=info msg="config: {APIServerURL: KubeConfig: RequestTimeout:30s ContourLoadBalancerService:heptio-contour/contour SkipperRouteGroupVersion:zalando.org/v1 Sources:[istio-gateway] Namespace: AnnotationFilter: LabelFilter: FQDNTemplate: CombineFQDNAndAnnotation:false IgnoreHostnameAnnotation:false IgnoreIngressTLSSpec:false Compatibility: PublishInternal:false PublishHostIP:false AlwaysPublishNotReadyAddresses:false ConnectorSourceServer:localhost:8080 Provider:cloudflare GoogleProject: GoogleBatchChangeSize:1000 GoogleBatchChangeInterval:1s DomainFilter:[robertsmieja.com] ExcludeDomains:[] ZoneNameFilter:[] ZoneIDFilter:[] AlibabaCloudConfigFile:/etc/kubernetes/alibaba-cloud.json AlibabaCloudZoneType: AWSZoneType: AWSZoneTagFilter:[] AWSAssumeRole: AWSBatchChangeSize:1000 AWSBatchChangeInterval:1s AWSEvaluateTargetHealth:true AWSAPIRetries:3 AWSPreferCNAME:false AWSZoneCacheDuration:0s AzureConfigFile:/etc/kubernetes/azure.json AzureResourceGroup: AzureSubscriptionID: AzureUserAssignedIdentityClientID: CloudflareProxied:false CloudflareZonesPerPage:50 CoreDNSPrefix:/skydns/ RcodezeroTXTEncrypt:false AkamaiServiceConsumerDomain: AkamaiClientToken: AkamaiClientSecret: AkamaiAccessToken: InfobloxGridHost: InfobloxWapiPort:443 InfobloxWapiUsername:admin InfobloxWapiPassword: InfobloxWapiVersion:2.3.1 InfobloxSSLVerify:true InfobloxView: InfobloxMaxResults:0 DynCustomerName: DynUsername: DynPassword: DynMinTTLSeconds:0 OCIConfigFile:/etc/kubernetes/oci.yaml InMemoryZones:[] OVHEndpoint:ovh-eu OVHApiRateLimit:20 PDNSServer:http://localhost:8081 PDNSAPIKey: PDNSTLSEnabled:false TLSCA: TLSClientCert: TLSClientCertKey: Policy:upsert-only Registry:txt TXTOwnerID:default TXTPrefix: TXTSuffix: Interval:1m0s Once:false DryRun:false UpdateEvents:true LogFormat:text MetricsAddress::7979 LogLevel:trace TXTCacheInterval:0s ExoscaleEndpoint:https://api.exoscale.ch/dns ExoscaleAPIKey: ExoscaleAPISecret: CRDSourceAPIVersion:externaldns.k8s.io/v1alpha1 
CRDSourceKind:DNSEndpoint ServiceTypeFilter:[] CFAPIEndpoint: CFUsername: CFPassword: RFC2136Host: RFC2136Port:0 RFC2136Zone: RFC2136Insecure:false RFC2136TSIGKeyName: RFC2136TSIGSecret: RFC2136TSIGSecretAlg: RFC2136TAXFR:false RFC2136MinTTL:0s NS1Endpoint: NS1IgnoreSSL:false NS1MinTTLSeconds:0 TransIPAccountName: TransIPPrivateKeyFile: DigitalOceanAPIPageSize:50}"
time="2020-10-20T06:20:58Z" level=info msg="Instantiating new Kubernetes client"
time="2020-10-20T06:20:58Z" level=debug msg="apiServerURL: "
time="2020-10-20T06:20:58Z" level=debug msg="kubeConfig: "
time="2020-10-20T06:20:58Z" level=info msg="Using inCluster-config based on serviceaccount-token"
time="2020-10-20T06:20:58Z" level=info msg="Created Kubernetes client https://10.43.0.1:443"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:20:58Z" level=debug msg="service added"
time="2020-10-20T06:21:59Z" level=fatal msg="failed to sync cache: timed out waiting for the condition"
failed to sync cache: timed out waiting for the condition

On v0.7.3 I get:

time="2020-10-20T06:25:59Z" level=info msg="config: {APIServerURL: KubeConfig: RequestTimeout:30s IstioIngressGatewayServices:[] ContourLoadBalancerService:heptio-contour/contour SkipperRouteGroupVersion:zalando.org/v1 Sources:[istio-gateway] Namespace: AnnotationFilter: FQDNTemplate: CombineFQDNAndAnnotation:false IgnoreHostnameAnnotation:false Compatibility: PublishInternal:false PublishHostIP:false AlwaysPublishNotReadyAddresses:false ConnectorSourceServer:localhost:8080 Provider:cloudflare GoogleProject: GoogleBatchChangeSize:1000 GoogleBatchChangeInterval:1s DomainFilter:[robertsmieja.com] ExcludeDomains:[] ZoneIDFilter:[] AlibabaCloudConfigFile:/etc/kubernetes/alibaba-cloud.json AlibabaCloudZoneType: AWSZoneType: AWSZoneTagFilter:[] AWSAssumeRole: AWSBatchChangeSize:1000 AWSBatchChangeInterval:1s AWSEvaluateTargetHealth:true AWSAPIRetries:3 AWSPreferCNAME:false AzureConfigFile:/etc/kubernetes/azure.json AzureResourceGroup: AzureSubscriptionID: AzureUserAssignedIdentityClientID: CloudflareProxied:false CloudflareZonesPerPage:50 CoreDNSPrefix:/skydns/ RcodezeroTXTEncrypt:false AkamaiServiceConsumerDomain: AkamaiClientToken: AkamaiClientSecret: AkamaiAccessToken: InfobloxGridHost: InfobloxWapiPort:443 InfobloxWapiUsername:admin InfobloxWapiPassword: InfobloxWapiVersion:2.3.1 InfobloxSSLVerify:true InfobloxView: InfobloxMaxResults:0 DynCustomerName: DynUsername: DynPassword: DynMinTTLSeconds:0 OCIConfigFile:/etc/kubernetes/oci.yaml InMemoryZones:[] OVHEndpoint:ovh-eu OVHApiRateLimit:20 PDNSServer:http://localhost:8081 PDNSAPIKey: PDNSTLSEnabled:false TLSCA: TLSClientCert: TLSClientCertKey: Policy:upsert-only Registry:txt TXTOwnerID:default TXTPrefix: TXTSuffix: Interval:1m0s Once:false DryRun:false UpdateEvents:true LogFormat:text MetricsAddress::7979 LogLevel:trace TXTCacheInterval:0s ExoscaleEndpoint:https://api.exoscale.ch/dns ExoscaleAPIKey: ExoscaleAPISecret: CRDSourceAPIVersion:externaldns.k8s.io/v1alpha1 CRDSourceKind:DNSEndpoint ServiceTypeFilter:[] 
CFAPIEndpoint: CFUsername: CFPassword: RFC2136Host: RFC2136Port:0 RFC2136Zone: RFC2136Insecure:false RFC2136TSIGKeyName: RFC2136TSIGSecret: RFC2136TSIGSecretAlg: RFC2136TAXFR:false RFC2136MinTTL:0s NS1Endpoint: NS1IgnoreSSL:false TransIPAccountName: TransIPPrivateKeyFile: DigitalOceanAPIPageSize:50}"
time="2020-10-20T06:25:59Z" level=info msg="Instantiating new Kubernetes client"
time="2020-10-20T06:25:59Z" level=debug msg="apiServerURL: "
time="2020-10-20T06:25:59Z" level=debug msg="kubeConfig: "
time="2020-10-20T06:25:59Z" level=info msg="Using inCluster-config based on serviceaccount-token"
time="2020-10-20T06:25:59Z" level=info msg="Created Kubernetes client https://10.43.0.1:443"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:25:59Z" level=debug msg="service added"
time="2020-10-20T06:26:05Z" level=debug msg="no zoneIDFilter configured, looking at all zones"
time="2020-10-20T06:26:07Z" level=error msg="v1alpha3.GatewayList.Items: []v1alpha3.Gateway: v1alpha3.Gateway.v1alpha3.Gateway.Spec: unmarshalerDecoder: unknown field \"targetPort\" in v1alpha3.Port, error found in #10 byte of ...|:25565}}]}},{\"apiVer|..., bigger context ...|ber\":25565,\"protocol\":\"TCP\",\"targetPort\":25565}}]}},{\"apiVersion\":\"networking.istio.io/v1alpha3\",\"ki|..."
time="2020-10-20T06:27:06Z" level=debug msg="no zoneIDFilter configured, looking at all zones"
time="2020-10-20T06:27:08Z" level=error msg="v1alpha3.GatewayList.Items: []v1alpha3.Gateway: v1alpha3.Gateway.v1alpha3.Gateway.Spec: unmarshalerDecoder: unknown field \"targetPort\" in v1alpha3.Port, error found in #10 byte of ...|:25565}}]}},{\"apiVer|..., bigger context ...|ber\":25565,\"protocol\":\"TCP\",\"targetPort\":25565}}]}},{\"apiVersion\":\"networking.istio.io/v1alpha3\",\"ki|..."

I'm assuming something changed with the CRD in Istio 1.7.X that is causing issues.
I know the apiVersion is now networking.istio.io/v1beta1 for Gateways.
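For illustration, the v0.7.3 error above says the decoder rejected an unknown field "targetPort" in v1alpha3.Port. The following is a minimal Python sketch of that failure mode, not external-dns's actual decoder: it assumes a strict decoder that only accepts a fixed set of fields. `KNOWN_PORT_FIELDS`, `decode_port_strict`, and `decode_port_lenient` are hypothetical names invented for this example, and the field set is an assumption rather than the real Istio schema.

```python
# Illustration only: a strict decoder fails on any field it does not know,
# while a lenient one silently drops it.
# KNOWN_PORT_FIELDS is an assumption for this sketch, not the real Istio schema.
KNOWN_PORT_FIELDS = {"number", "name", "protocol"}


def decode_port_strict(raw: dict) -> dict:
    """Return the port unchanged, or raise ValueError on an unknown field."""
    unknown = sorted(set(raw) - KNOWN_PORT_FIELDS)
    if unknown:
        raise ValueError(f'unknown field "{unknown[0]}" in Port')
    return raw


def decode_port_lenient(raw: dict) -> dict:
    """Keep only the known fields instead of failing."""
    return {k: v for k, v in raw.items() if k in KNOWN_PORT_FIELDS}


# Port values taken from the error message in the log above.
port = {"number": 25565, "protocol": "TCP", "targetPort": 25565}

print(decode_port_lenient(port))
try:
    decode_port_strict(port)
except ValueError as err:
    print(err)
```

If the client behaves like the strict variant, a single Gateway carrying a field that the compiled-in types do not know is enough to fail the whole list call.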

Oddly enough, I can confirm --source=istio-virtualservice has no issues.

What you expected to happen:

No error messages when using istio-gateway, or more specific error messages as to what failed.

How to reproduce it (as minimally and precisely as possible):
I haven't tested this, but I'm assuming:

  1. Install Istio 1.7.3 via Istio Operator
  2. Install External-DNS with --source=istio-gateway

Anything else we need to know?:
My external-dns config:

      containers:
      - name: external-dns
        image: k8s.gcr.io/external-dns/external-dns:v0.7.3
        args:
        - --events
        - --domain-filter=<omitted> 
        - --provider=cloudflare
        - --policy=upsert-only 
        - --log-level=trace

My ClusterRole for external-dns:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: external-dns
rules:
- apiGroups: [""]
  resources: ["services","endpoints","pods"]
  verbs: ["get","watch","list"]
- apiGroups: ["extensions","networking.k8s.io"]
  resources: ["ingresses"] 
  verbs: ["get","watch","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["watch","list"]
- apiGroups: ["networking.istio.io"]
  resources: ["gateways", "virtualservices"]
  verbs: ["get","watch","list"]

Environment:

  • External-DNS version (use external-dns --version): v0.7.4
  • DNS provider: CloudFlare
  • Others: k3s - v1.19.3+k3s1, Istio 1.7.3, RBAC enabled

If there's anything I left out that would be useful, please let me know and I'd be happy to share. Thanks for looking at this.

kind/bug

All 6 comments

I'm seeing this same issue, but with v0.7.3, using AWS Route53 provider, on EKS 1.18 with Istio 1.7.1

After looking into this and similar issues here, I found it to be a problem with the ClusterRoleBinding.
I install external-dns into a specific namespace, but the ClusterRoleBinding binds a ServiceAccount in the default namespace:

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/instance: external-dns
  name: external-dns-viewer
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: external-dns
subjects:
  - kind: ServiceAccount
    name: external-dns
    namespace: default

Fixing that issue solved it for me.
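A quick way to catch this kind of mismatch is to compare the binding's subjects with the namespace external-dns actually runs in. The sketch below is only an illustration: `binding_covers` is a hypothetical helper, and `dns-system` is a made-up namespace standing in for wherever external-dns is installed.

```python
import json


def binding_covers(binding: dict, namespace: str, name: str) -> bool:
    """True if the binding lists ServiceAccount namespace/name as a subject."""
    for subject in binding.get("subjects") or []:
        if (subject.get("kind") == "ServiceAccount"
                and subject.get("name") == name
                and subject.get("namespace") == namespace):
            return True
    return False


# The ClusterRoleBinding from the comment above, expressed as JSON.
binding = json.loads("""
{
  "kind": "ClusterRoleBinding",
  "metadata": {"name": "external-dns-viewer"},
  "roleRef": {
    "apiGroup": "rbac.authorization.k8s.io",
    "kind": "ClusterRole",
    "name": "external-dns"
  },
  "subjects": [
    {"kind": "ServiceAccount", "name": "external-dns", "namespace": "default"}
  ]
}
""")

# "dns-system" is a hypothetical namespace used only for this example.
print(binding_covers(binding, "default", "external-dns"))
print(binding_covers(binding, "dns-system", "external-dns"))
```

The same check can be run against a live cluster by loading the output of `kubectl get clusterrolebinding external-dns-viewer -o json` with `json.load` in place of the inline document.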

I'm able to use --source=istio-gateway with no issues, so it's not specific to RBAC in my case.

Oddly enough I couldn't reproduce the issue on KinD 0.9.0, while trying to come up with steps to reproduce.

Figured out the root cause:

I had an operator syncing in changes from a Git repo - https://github.com/fluxcd/toolkit
Apparently that causes external-dns to time out while trying to read my istio-gateway resources.

Deleting the operator causes external-dns to work correctly.

Is this an issue I should open over at that repo?
Is there any way to improve the error messages so the root cause can be narrowed down more quickly?

EDIT:
The following versions do NOT work:

  • v0.7.4
  • v0.7.3
  • v0.7.2

The following versions do work:

  • v0.7.1
  • v0.7.0

I had the same issue today. I used the Istio operator to install Istio and the installation failed. The CRD for the Istio Gateway resource was missing. After reinstalling Istio, it worked.

I also use the fluxcd toolkit but I think it has nothing to do with the toolkit itself.

Version: v0.7.5

Interestingly, I get the same issue on v0.7.4+ after moving from Istio (Anthos Service Mesh version) 1.7.3 to 1.7.6; only the timeout message from the OP is logged.

After moving external-dns back to v0.7.3, I get the following:

time="2021-01-25T08:39:36Z" level=debug msg="No endpoints could be generated from service bookinfo/productpage"
time="2021-01-25T08:39:36Z" level=error msg="v1alpha3.GatewayList.Items: []v1alpha3.Gateway: v1alpha3.Gateway.v1alpha3.Gateway.Spec: unmarshalerDecoder: unknown field \"targetPort\" in v1alpha3.Port, error found in #10 byte of ...|\":8080}}]}},{\"apiVer|..., bigger context ...|number\":80,\"protocol\":\"HTTP\",\"targetPort\":8080}}]}},{\"apiVersion\":\"networking.istio.io/v1alpha3\",\"ki|..."

The api-versions may be relevant, given the Istio API changes mentioned above, or the error above may point to a config problem. I'll look into that:

$ k api-versions | grep istio
authentication.istio.io/v1alpha1
config.istio.io/v1alpha2
install.istio.io/v1alpha1
networking.istio.io/v1alpha3
networking.istio.io/v1beta1
rbac.istio.io/v1alpha1
security.istio.io/v1beta1

Update: Someone had added a Gateway resource with a "targetPort" field, which seems to break this. After removing it, all versions of external-dns work for me, including 0.7.6.

So 0.7.6 was not producing the error message in this case. Rolling back to v0.7.3 identified the problem in my environment; I have since moved forward again and the latest version works.

I don't know why it has an issue with targetPort, though; I believe that field is valid. For reference, this is the Gateway:

```
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: kubei-gw
  namespace: kubei
  labels:
    app: kubei-gw
spec:
  selector:
    istio: ingressgateway # use Istio default gateway implementation
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
      targetPort: 8080
    hosts:
    - kubei-gcp.example.com
```
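To find which Gateways carry the field, something like the following Python sketch could scan a Gateway list. It assumes the input has the shape produced by `kubectl get gateways.networking.istio.io --all-namespaces -o json`; `find_target_ports` is a hypothetical helper, and the second Gateway in the sample (`demo/plain-gw`) is made up for contrast.

```python
import json


def find_target_ports(gateway_list: dict) -> list:
    """Return (namespace, name, port number, targetPort) per offending server."""
    hits = []
    for gw in gateway_list.get("items") or []:
        meta = gw.get("metadata") or {}
        for server in (gw.get("spec") or {}).get("servers") or []:
            port = server.get("port") or {}
            if "targetPort" in port:
                hits.append((meta.get("namespace"), meta.get("name"),
                             port.get("number"), port["targetPort"]))
    return hits


# Sample input: the Gateway above plus a hypothetical one without targetPort.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"namespace": "kubei", "name": "kubei-gw"},
      "spec": {"servers": [
        {"port": {"number": 80, "name": "http", "protocol": "HTTP", "targetPort": 8080},
         "hosts": ["kubei-gcp.example.com"]}
      ]}
    },
    {
      "metadata": {"namespace": "demo", "name": "plain-gw"},
      "spec": {"servers": [
        {"port": {"number": 443, "name": "https", "protocol": "HTTPS"},
         "hosts": ["demo.example.com"]}
      ]}
    }
  ]
}
""")

print(find_target_ports(sample))
```

Each reported Gateway is a candidate for the decode error shown in the v0.7.3 logs.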
