Istio: 1.5.2: istio-proxy with policy & telemetry pods crashing.

Created on 3 May 2020 · 2 Comments · Source: istio/istio

Bug description
The pods keep restarting.

Expected behavior
The pods should not crash.

Steps to reproduce the bug
Install Istio 1.5.2 with istioctl, using the configuration below.

Version (include the output of istioctl version --remote and kubectl version and helm version if you used Helm)
istioctl:
client version: 1.5.2
control plane version: 1.5.2
data plane version: 1.5.2 (2 proxies)

How was Istio installed?
istioctl

Environment where bug was observed (cloud vendor, OS, etc)
AWS EKS 1.16.

Configuration

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  addonComponents:
    kiali:
      enabled: true
    prometheus:
      enabled: false
  components:
    ingressGateways:
    - enabled: true
      k8s:
        hpaSpec:
          minReplicas: 2
        overlays:
        - apiVersion: v1
          kind: Service
          name: istio-ingressgateway
          patches:
          - path: spec.ports
            value:
            - name: https
              port: 443
              targetPort: 443
    pilot:
      enabled: true
      k8s:
        hpaSpec:
          minReplicas: 2
        overlays:
        - apiVersion: policy/v1beta1
          kind: PodDisruptionBudget
          name: istiod
          patches:
          - path: spec.selector.matchLabels
            value:
              app: istiod
              istio: pilot
    policy:
      enabled: true # required for rate limit
    telemetry:
      enabled: true
  values:
    gateways:
      istio-ingressgateway:
        applicationPorts: ""
        type: LoadBalancer
        ports:
        - name: https
          port: 443
          targetPort: 80
        serviceAnnotations:
          service.beta.kubernetes.io/aws-load-balancer-type: "elb"
    global:
      controlPlaneSecurityEnabled: true
      defaultNodeSelector:
        role: internal
      defaultPodDisruptionBudget:
        enabled: true
      istioNamespace: istio-system
      mtls:
        auto: false
        enabled: false
      policyCheckFailOpen: true
      proxy:
        accessLogEncoding: JSON
        accessLogFile: "/dev/stdout"
        accessLogFormat: ""
        autoInject: enabled

istio-proxy log dump

2020-05-03T17:32:51.948905Z info    FLAG: --binaryPath="/usr/local/bin/envoy"
2020-05-03T17:32:51.948928Z info    FLAG: --concurrency="0"
2020-05-03T17:32:51.948933Z info    FLAG: --configPath="/etc/istio/proxy"
2020-05-03T17:32:51.948939Z info    FLAG: --connectTimeout="1s"
2020-05-03T17:32:51.948942Z info    FLAG: --controlPlaneAuthPolicy="MUTUAL_TLS"
2020-05-03T17:32:51.948962Z info    FLAG: --controlPlaneBootstrap="true"
2020-05-03T17:32:51.948965Z info    FLAG: --customConfigFile=""
2020-05-03T17:32:51.948968Z info    FLAG: --datadogAgentAddress=""
2020-05-03T17:32:51.948972Z info    FLAG: --disableInternalTelemetry="false"
2020-05-03T17:32:51.948975Z info    FLAG: --discoveryAddress="istio-pilot:15010"
2020-05-03T17:32:51.948979Z info    FLAG: --dnsRefreshRate="300s"
2020-05-03T17:32:51.948983Z info    FLAG: --domain="istio-system.svc.cluster.local"
2020-05-03T17:32:51.948986Z info    FLAG: --drainDuration="45s"
2020-05-03T17:32:51.948990Z info    FLAG: --envoyAccessLogService=""
2020-05-03T17:32:51.948993Z info    FLAG: --envoyMetricsService=""
2020-05-03T17:32:51.948996Z info    FLAG: --help="false"
2020-05-03T17:32:51.949000Z info    FLAG: --id=""
2020-05-03T17:32:51.949003Z info    FLAG: --ip=""
2020-05-03T17:32:51.949006Z info    FLAG: --lightstepAccessToken=""
2020-05-03T17:32:51.949009Z info    FLAG: --lightstepAddress=""
2020-05-03T17:32:51.949013Z info    FLAG: --lightstepCacertPath=""
2020-05-03T17:32:51.949016Z info    FLAG: --lightstepSecure="false"
2020-05-03T17:32:51.949019Z info    FLAG: --log_as_json="false"
2020-05-03T17:32:51.949022Z info    FLAG: --log_caller=""
2020-05-03T17:32:51.949026Z info    FLAG: --log_output_level="default:info"
2020-05-03T17:32:51.949029Z info    FLAG: --log_rotate=""
2020-05-03T17:32:51.949032Z info    FLAG: --log_rotate_max_age="30"
2020-05-03T17:32:51.949036Z info    FLAG: --log_rotate_max_backups="1000"
2020-05-03T17:32:51.949040Z info    FLAG: --log_rotate_max_size="104857600"
2020-05-03T17:32:51.949043Z info    FLAG: --log_stacktrace_level="default:none"
2020-05-03T17:32:51.949050Z info    FLAG: --log_target="[stdout]"
2020-05-03T17:32:51.949054Z info    FLAG: --mixerIdentity=""
2020-05-03T17:32:51.949057Z info    FLAG: --outlierLogPath=""
2020-05-03T17:32:51.949060Z info    FLAG: --parentShutdownDuration="1m0s"
2020-05-03T17:32:51.949063Z info    FLAG: --pilotIdentity=""
2020-05-03T17:32:51.949068Z info    FLAG: --proxyAdminPort="15000"
2020-05-03T17:32:51.949073Z info    FLAG: --proxyComponentLogLevel="misc:error"
2020-05-03T17:32:51.949076Z info    FLAG: --proxyLogLevel="warning"
2020-05-03T17:32:51.949080Z info    FLAG: --serviceCluster="istio-telemetry"
2020-05-03T17:32:51.949083Z info    FLAG: --serviceregistry="Kubernetes"
2020-05-03T17:32:51.949087Z info    FLAG: --statsdUdpAddress=""
2020-05-03T17:32:51.949090Z info    FLAG: --statusPort="0"
2020-05-03T17:32:51.949094Z info    FLAG: --stsPort="0"
2020-05-03T17:32:51.949098Z info    FLAG: --templateFile="/var/lib/envoy/envoy.yaml.tmpl"
2020-05-03T17:32:51.949102Z info    FLAG: --tokenManagerPlugin="GoogleTokenExchange"
2020-05-03T17:32:51.949105Z info    FLAG: --trust-domain="cluster.local"
2020-05-03T17:32:51.949108Z info    FLAG: --zipkinAddress=""
2020-05-03T17:32:51.949134Z info    Version 1.5.2-68d381dde45b34f82a7247f20840829f1ee56fc1-Clean
2020-05-03T17:32:51.949236Z info    Obtained private IP [10.7.94.62]
2020-05-03T17:32:51.949277Z info    Proxy role: &model.Proxy{ClusterID:"", Type:"sidecar", IPAddresses:[]string{"10.7.94.62", "10.7.94.62"}, ID:"istio-telemetry-9b6d64c85-8ghzr.istio-system", Locality:(*envoy_api_v2_core.Locality)(nil), DNSDomain:"istio-system.svc.cluster.local", ConfigNamespace:"", Metadata:(*model.NodeMetadata)(nil), SidecarScope:(*model.SidecarScope)(nil), MergedGateway:(*model.MergedGateway)(nil), ServiceInstances:[]*model.ServiceInstance(nil), WorkloadLabels:labels.Collection(nil), IstioVersion:(*model.IstioVersion)(nil)}
2020-05-03T17:32:51.949288Z info    PilotSAN []string{"spiffe://cluster.local/ns/istio-system/sa/istio-pilot-service-account"}
2020-05-03T17:32:51.949294Z info    MixerSAN []string{"spiffe://cluster.local/ns/istio-system/sa/istio-mixer-service-account"}
2020-05-03T17:32:51.949745Z info    Effective config: binaryPath: /usr/local/bin/envoy
configPath: /etc/istio/proxy
connectTimeout: 1s
controlPlaneAuthPolicy: MUTUAL_TLS
discoveryAddress: istio-pilot:15010
drainDuration: 45s
envoyAccessLogService: {}
envoyMetricsService: {}
parentShutdownDuration: 60s
proxyAdminPort: 15000
proxyBootstrapTemplatePath: /var/lib/envoy/envoy.yaml.tmpl
serviceCluster: istio-telemetry
statNameLength: 189

2020-05-03T17:32:51.949754Z info    JWT policy is third-party-jwt
2020-05-03T17:32:51.949794Z warn    Missing JWT token, can't use in process SDS ./var/run/secrets/tokens/istio-tokenstat ./var/run/secrets/tokens/istio-token: no such file or directory
2020-05-03T17:32:51.949805Z info    Monitored certs: []string{"/etc/certs/cert-chain.pem", "/etc/certs/key.pem", "/etc/certs/root-cert.pem"}
2020-05-03T17:32:51.949812Z info    waiting 2m0s for /etc/certs/cert-chain.pem
2020-05-03T17:32:52.950980Z info    waiting for file
2020-05-03T17:32:53.051141Z info    waiting for file

--> snip <---

2020-05-03T17:38:51.995709Z info    waiting for file
2020-05-03T17:38:52.095835Z info    waiting for file
2020-05-03T17:38:52.196263Z warn    file still not available after2m0s
2020-05-03T17:38:52.196819Z info    Static config:
admin:
  access_log_path: /dev/null
  address:
    socket_address:
      address: 127.0.0.1
      port_value: 15000
stats_config:
  use_all_default_tags: false
  stats_tags:
  - tag_name: cluster_name
    regex: '^cluster\.((.+?(\..+?\.svc\.cluster\.local)?)\.)'
  - tag_name: tcp_prefix
    regex: '^tcp\.((.*?)\.)\w+?$'
  - tag_name: response_code
    regex: '_rq(_(\d{3}))$'
  - tag_name: response_code_class
    regex: '_rq(_(\dxx))$'
  - tag_name: http_conn_manager_listener_prefix
    regex: '^listener(?=\.).*?\.http\.(((?:[_.[:digit:]]*|[_\[\]aAbBcCdDeEfF[:digit:]]*))\.)'
  - tag_name: http_conn_manager_prefix
    regex: '^http\.(((?:[_.[:digit:]]*|[_\[\]aAbBcCdDeEfF[:digit:]]*))\.)'
  - tag_name: listener_address
    regex: '^listener\.(((?:[_.[:digit:]]*|[_\[\]aAbBcCdDeEfF[:digit:]]*))\.)'

static_resources:
  clusters:
  - name: prometheus_stats
    type: STATIC
    connect_timeout: 0.250s
    lb_policy: ROUND_ROBIN
    hosts:
    - socket_address:
        protocol: TCP
        address: 127.0.0.1
        port_value: 15000

  - name: inbound_9092
    circuit_breakers:
      thresholds:
      - max_connections: 100000
        max_pending_requests: 100000
        max_requests: 100000
        max_retries: 3
    connect_timeout: 1.000s
    hosts:
    - pipe:
        path: /sock/mixer.socket
    http2_protocol_options: {}

  - name: out.galley.15019
    http2_protocol_options: {}
    connect_timeout: 1.000s
    type: STRICT_DNS

    circuit_breakers:
      thresholds:
        - max_connections: 100000
          max_pending_requests: 100000
          max_requests: 100000
          max_retries: 3

    tls_context:
      common_tls_context:
        tls_certificates:
        - certificate_chain:
            filename: /etc/certs/cert-chain.pem
          private_key:
            filename: /etc/certs/key.pem
        validation_context:
          trusted_ca:
            filename: /etc/certs/root-cert.pem
          verify_subject_alt_name:
          - spiffe://cluster.local/ns/istio-system/sa/istio-galley-service-account

    hosts:
      - socket_address:
          address: istio-galley.istio-system
          port_value: 15019


  listeners:
  - name: "15090"
    address:
      socket_address:
        protocol: TCP
        address: 0.0.0.0
        port_value: 15090
    filter_chains:
    - filters:
      - name: envoy.http_connection_manager
        config:
          codec_type: AUTO
          stat_prefix: stats
          route_config:
            virtual_hosts:
            - name: backend
              domains:
              - '*'
              routes:
              - match:
                  prefix: /stats/prometheus
                route:
                  cluster: prometheus_stats
          http_filters:
          - name: envoy.router

  - name: "15004"
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 15004
    filter_chains:
    - filters:
      - config:
          codec_type: HTTP2
          http2_protocol_options:
            max_concurrent_streams: 1073741824
          generate_request_id: true
          http_filters:
          - config:
              default_destination_service: istio-telemetry.istio-system.svc.cluster.local
              service_configs:
                istio-telemetry.istio-system.svc.cluster.local:
                  disable_check_calls: true
                  mixer_attributes:
                    attributes:
                      destination.service.host:
                        string_value: istio-telemetry.istio-system.svc.cluster.local
                      destination.service.uid:
                        string_value: istio://istio-system/services/istio-telemetry
                      destination.service.name:
                        string_value: istio-telemetry
                      destination.service.namespace:
                        string_value: istio-system
                      destination.uid:
                        string_value: kubernetes://istio-telemetry-9b6d64c85-8ghzr.istio-system
                      destination.namespace:
                        string_value: istio-system
                      destination.ip:
                        bytes_value: AAAAAAAAAAAAAP//CgdePg==
                      destination.port:
                        int64_value: 15004
                      context.reporter.kind:
                        string_value: inbound
                      context.reporter.uid:
                        string_value: kubernetes://istio-telemetry-9b6d64c85-8ghzr.istio-system
              transport:
                check_cluster: mixer_check_server
                report_cluster: inbound_9092
            name: mixer
          - name: envoy.router
          route_config:
            name: "15004"
            virtual_hosts:
            - domains:
              - '*'
              name: istio-telemetry.istio-system.svc.cluster.local
              routes:
              - decorator:
                  operation: Report
                match:
                  prefix: /
                route:
                  cluster: inbound_9092
                  timeout: 0.000s
          stat_prefix: "15004"
        name: envoy.http_connection_manager
      tls_context:
        common_tls_context:
          alpn_protocols:
          - h2
          tls_certificates:
          - certificate_chain:
              filename: /etc/certs/cert-chain.pem
            private_key:
              filename: /etc/certs/key.pem
          validation_context:
            trusted_ca:
              filename: /etc/certs/root-cert.pem
        require_client_certificate: true

  - name: "9091"
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 9091
    filter_chains:
    - filters:
      - config:
          codec_type: HTTP2
          http2_protocol_options:
            max_concurrent_streams: 1073741824
          generate_request_id: true
          http_filters:
          - config:
              default_destination_service: istio-telemetry.istio-system.svc.cluster.local
              service_configs:
                istio-telemetry.istio-system.svc.cluster.local:
                  disable_check_calls: true
                  mixer_attributes:
                    attributes:
                      destination.service.host:
                        string_value: istio-telemetry.istio-system.svc.cluster.local
                      destination.service.uid:
                        string_value: istio://istio-system/services/istio-telemetry
                      destination.service.name:
                        string_value: istio-telemetry
                      destination.service.namespace:
                        string_value: istio-system
                      destination.uid:
                        string_value: kubernetes://istio-telemetry-9b6d64c85-8ghzr.istio-system
                      destination.namespace:
                        string_value: istio-system
                      destination.ip:
                        bytes_value: AAAAAAAAAAAAAP//CgdePg==
                      destination.port:
                        int64_value: 9091
                      context.reporter.kind:
                        string_value: inbound
                      context.reporter.uid:
                        string_value: kubernetes://istio-telemetry-9b6d64c85-8ghzr.istio-system
              transport:
                check_cluster: mixer_check_server
                report_cluster: inbound_9092
            name: mixer
          - name: envoy.router
          route_config:
            name: "9091"
            virtual_hosts:
            - domains:
              - '*'
              name: istio-telemetry.istio-system.svc.cluster.local
              routes:
              - decorator:
                  operation: Report
                match:
                  prefix: /
                route:
                  cluster: inbound_9092
                  timeout: 0.000s
          stat_prefix: "9091"
        name: envoy.http_connection_manager

  - name: "local.15019"
    address:
      socket_address:
        address: 127.0.0.1
        port_value: 15019
    filter_chains:
      - filters:
          - name: envoy.http_connection_manager
            config:
              codec_type: HTTP2
              stat_prefix: "15019"
              stream_idle_timeout: 0s
              http2_protocol_options:
                max_concurrent_streams: 1073741824

              access_log:
                - name: envoy.file_access_log
                  config:
                    path: /dev/stdout

              http_filters:
                - name: envoy.router

              route_config:
                name: "15019"

                virtual_hosts:
                  - name: istio-galley

                    domains:
                      - '*'

                    routes:
                      - match:
                          prefix: /
                        route:
                          cluster: out.galley.15019
                          timeout: 0.000s
2020-05-03T17:38:52.197098Z info    PilotSAN []string{"spiffe://cluster.local/ns/istio-system/sa/istio-pilot-service-account"}
2020-05-03T17:38:52.197120Z info    Starting proxy agent
2020-05-03T17:38:52.197488Z info    Received new config, creating new Envoy epoch 0
2020-05-03T17:38:52.197552Z info    watching /etc/certs for changes
2020-05-03T17:38:52.197568Z info    Epoch 0 starting
2020-05-03T17:38:52.197631Z info    Envoy command: [-c /etc/istio/proxy/envoy.yaml --restart-epoch 0 --drain-time-s 45 --parent-shutdown-time-s 60 --service-cluster istio-telemetry --service-node sidecar~10.7.94.62~istio-telemetry-9b6d64c85-8ghzr.istio-system~istio-system.svc.cluster.local --max-obj-name-len 189 --local-address-ip-version v4 --log-format [Envoy (Epoch 0)] [%Y-%m-%d %T.%e][%t][%l][%n] %v -l warning --component-log-level misc:error]
[Envoy (Epoch 0)] [2020-05-03 17:38:52.227][14][critical][main] [external/envoy/source/server/server.cc:96] error initializing configuration '/etc/istio/proxy/envoy.yaml': Invalid path: /etc/certs/root-cert.pem
Invalid path: /etc/certs/root-cert.pem
2020-05-03T17:38:52.229140Z warn    Failed to delete config file /etc/istio/proxy/envoy-rev0.json for 0, remove /etc/istio/proxy/envoy-rev0.json: no such file or directory
2020-05-03T17:38:52.229198Z error   Epoch 0 exited with error: exit status 1
2020-05-03T17:38:52.229206Z info    No more active epochs, terminating
Labels: area/environments, area/extensions and telemetry

All 2 comments

In 1.5, controlPlaneSecurityEnabled with Mixer requires Citadel to be deployed as well.
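
For readers hitting the same crash, the answer above could translate into an operator overlay along these lines. This is a minimal sketch, not a verified fix: it assumes the 1.5 IstioOperator API exposes Citadel under `spec.components.citadel`, and that Citadel is what provisions the `/etc/certs/*.pem` files the Mixer sidecar is waiting for in the log. Only the keys relevant to the answer are shown; the rest of the reporter's configuration would stay as posted.

```yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    # Assumption: enabling the Citadel component deploys the CA that issues
    # the certificates mounted at /etc/certs in the policy/telemetry pods.
    citadel:
      enabled: true
    policy:
      enabled: true # required for rate limit (from the original config)
    telemetry:
      enabled: true
  values:
    global:
      # Unchanged from the original config; per the answer, this setting
      # combined with Mixer is what makes Citadel necessary in 1.5.
      controlPlaneSecurityEnabled: true
```

If control plane mTLS is not needed, setting `controlPlaneSecurityEnabled: false` may be an alternative, though nothing in this thread confirms that it avoids the crash.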

I believe we can close this, given @howardjohn's answer. Feel free to reopen if you disagree or need more detail.
