Test-infra: prow.k8s.io: build cluster failing to start jobs

Created on 14 Jun 2019  ·  18 Comments  ·  Source: kubernetes/test-infra

Example job link.

Failing to start the pod with:

  Warning  FailedCreatePodSandBox  4m9s (x153 over 37m)  kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "6769374aadfa0a00be6c6e7b2be9b6da812dcc496c1c2ca0ab4b8ec1c611fe15": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254
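
For scale, the range named in that event can be sized directly. The following is a minimal sketch, not part of the original report; the helper name `range_size` is hypothetical. It only counts the addresses in the inclusive `first-last` string printed by the error, which indicates how many pod IPs the IPAM plugin has to hand out on this node before it reports exhaustion.

```python
import ipaddress


def range_size(range_set: str) -> int:
    """Count the addresses in an inclusive 'first-last' range string."""
    first, last = (ipaddress.ip_address(part) for part in range_set.split("-"))
    # Both ends are inclusive, hence the +1.
    return int(last) - int(first) + 1


# Range string copied from the FailedCreatePodSandBox event above.
print(range_size("10.60.28.1-10.60.28.254"))
```

If far fewer pods than that are actually running on the node, the allocator's bookkeeping is probably the thing to inspect rather than the pod count itself.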

Debug info


kubectl get prowjob

apiVersion: prow.k8s.io/v1
kind: ProwJob
metadata:
  annotations:
    prow.k8s.io/job: pull-test-infra-bazel
  creationTimestamp: 2019-06-14T16:28:09Z
  generation: 3
  labels:
    created-by-prow: "true"
    event-GUID: 64edf720-8ec1-11e9-917a-ebca143d37d5
    preset-bazel-scratch-dir: "true"
    preset-service-account: "true"
    prow.k8s.io/id: 6555d8aa-8ec1-11e9-a433-9ee7625cd1db
    prow.k8s.io/job: pull-test-infra-bazel
    prow.k8s.io/refs.org: kubernetes
    prow.k8s.io/refs.pull: "13038"
    prow.k8s.io/refs.repo: test-infra
    prow.k8s.io/type: presubmit
  name: 6555d8aa-8ec1-11e9-a433-9ee7625cd1db
  namespace: default
  resourceVersion: "235042395"
  selfLink: /apis/prow.k8s.io/v1/namespaces/default/prowjobs/6555d8aa-8ec1-11e9-a433-9ee7625cd1db
  uid: 655622fe-8ec1-11e9-8ebe-42010a800112
spec:
  agent: kubernetes
  cluster: default
  context: pull-test-infra-bazel
  decoration_config:
    gcs_configuration:
      bucket: kubernetes-jenkins
      default_org: kubernetes
      default_repo: kubernetes
      path_strategy: legacy
    gcs_credentials_secret: service-account
    grace_period: 15s
    timeout: 2h0m0s
    utility_images:
      clonerefs: gcr.io/k8s-prow/clonerefs:v20190610-3be53b072
      entrypoint: gcr.io/k8s-prow/entrypoint:v20190610-3be53b072
      initupload: gcr.io/k8s-prow/initupload:v20190610-3be53b072
      sidecar: gcr.io/k8s-prow/sidecar:v20190610-3be53b072
  job: pull-test-infra-bazel
  namespace: test-pods
  pod_spec:
    containers:
    - args:
      - test
      - --config=ci
      - --nobuild_tests_only
      - //...
      command:
      - hack/bazel.sh
      env:
      - name: GOOGLE_APPLICATION_CREDENTIALS
        value: /etc/service-account/service-account.json
      - name: E2E_GOOGLE_APPLICATION_CREDENTIALS
        value: /etc/service-account/service-account.json
      - name: TEST_TMPDIR
        value: /bazel-scratch/.cache/bazel
      image: launcher.gcr.io/google/bazel:0.26.0
      name: ""
      resources: {}
      volumeMounts:
      - mountPath: /etc/service-account
        name: service
        readOnly: true
      - mountPath: /bazel-scratch/.cache
        name: bazel-scratch
    volumes:
    - name: service
      secret:
        secretName: service-account
    - emptyDir: {}
      name: bazel-scratch
  refs:
    base_link: https://github.com/kubernetes/test-infra/commit/f1b872102fc8841673644eca2f32c30b16b4f6ca
    base_ref: master
    base_sha: f1b872102fc8841673644eca2f32c30b16b4f6ca
    org: kubernetes
    pulls:
    - author: stevekuznetsov
      author_link: https://github.com/stevekuznetsov
      commit_link: https://github.com/kubernetes/test-infra/pull/13038/commits/410c774b4a2fbf4e6f1b150e307fc8f9a12082a5
      link: https://github.com/kubernetes/test-infra/pull/13038
      number: 13038
      sha: 410c774b4a2fbf4e6f1b150e307fc8f9a12082a5
    repo: test-infra
    repo_link: https://github.com/kubernetes/test-infra
  report: true
  rerun_command: /test pull-test-infra-bazel
  type: presubmit
status:
  build_id: "1139570293107331073"
  description: Job triggered.
  pod_name: 6555d8aa-8ec1-11e9-a433-9ee7625cd1db
  prev_report_states:
    github-reporter: pending
  startTime: 2019-06-14T16:28:09Z
  state: pending
  url: https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/test-infra/13038/pull-test-infra-bazel/1139570293107331073


kubectl get pod

apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubernetes.io/limit-ranger: 'LimitRanger plugin set: memory request for container
      test; memory request for container sidecar; memory request for init container
      clonerefs; memory request for init container initupload; memory request for
      init container place-entrypoint'
    prow.k8s.io/job: pull-test-infra-bazel
  creationTimestamp: 2019-06-14T16:28:24Z
  labels:
    created-by-prow: "true"
    event-GUID: 64edf720-8ec1-11e9-917a-ebca143d37d5
    preset-bazel-scratch-dir: "true"
    preset-service-account: "true"
    prow.k8s.io/id: 6555d8aa-8ec1-11e9-a433-9ee7625cd1db
    prow.k8s.io/job: pull-test-infra-bazel
    prow.k8s.io/refs.org: kubernetes
    prow.k8s.io/refs.pull: "13038"
    prow.k8s.io/refs.repo: test-infra
    prow.k8s.io/type: presubmit
  name: 6555d8aa-8ec1-11e9-a433-9ee7625cd1db
  namespace: test-pods
  resourceVersion: "1169775978"
  selfLink: /api/v1/namespaces/test-pods/pods/6555d8aa-8ec1-11e9-a433-9ee7625cd1db
  uid: 6ea12de0-8ec1-11e9-a2d6-42010a8000bd
spec:
  automountServiceAccountToken: false
  containers:
  - command:
    - /tools/entrypoint
    env:
    - name: GOOGLE_APPLICATION_CREDENTIALS
      value: /etc/service-account/service-account.json
    - name: E2E_GOOGLE_APPLICATION_CREDENTIALS
      value: /etc/service-account/service-account.json
    - name: TEST_TMPDIR
      value: /bazel-scratch/.cache/bazel
    - name: ARTIFACTS
      value: /logs/artifacts
    - name: BUILD_ID
      value: "1139570293107331073"
    - name: BUILD_NUMBER
      value: "1139570293107331073"
    - name: GOPATH
      value: /home/prow/go
    - name: JOB_NAME
      value: pull-test-infra-bazel
    - name: JOB_SPEC
      value: '{"type":"presubmit","job":"pull-test-infra-bazel","buildid":"1139570293107331073","prowjobid":"6555d8aa-8ec1-11e9-a433-9ee7625cd1db","refs":{"org":"kubernetes","repo":"test-infra","repo_link":"https://github.com/kubernetes/test-infra","base_ref":"master","base_sha":"f1b872102fc8841673644eca2f32c30b16b4f6ca","base_link":"https://github.com/kubernetes/test-infra/commit/f1b872102fc8841673644eca2f32c30b16b4f6ca","pulls":[{"number":13038,"author":"stevekuznetsov","sha":"410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","link":"https://github.com/kubernetes/test-infra/pull/13038","commit_link":"https://github.com/kubernetes/test-infra/pull/13038/commits/410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","author_link":"https://github.com/stevekuznetsov"}]}}'
    - name: JOB_TYPE
      value: presubmit
    - name: PROW_JOB_ID
      value: 6555d8aa-8ec1-11e9-a433-9ee7625cd1db
    - name: PULL_BASE_REF
      value: master
    - name: PULL_BASE_SHA
      value: f1b872102fc8841673644eca2f32c30b16b4f6ca
    - name: PULL_NUMBER
      value: "13038"
    - name: PULL_PULL_SHA
      value: 410c774b4a2fbf4e6f1b150e307fc8f9a12082a5
    - name: PULL_REFS
      value: master:f1b872102fc8841673644eca2f32c30b16b4f6ca,13038:410c774b4a2fbf4e6f1b150e307fc8f9a12082a5
    - name: REPO_NAME
      value: test-infra
    - name: REPO_OWNER
      value: kubernetes
    - name: ENTRYPOINT_OPTIONS
      value: '{"timeout":7200000000000,"grace_period":15000000000,"artifact_dir":"/logs/artifacts","args":["hack/bazel.sh","test","--config=ci","--nobuild_tests_only","//..."],"process_log":"/logs/process-log.txt","marker_file":"/logs/marker-file.txt","metadata_file":"/logs/artifacts/metadata.json"}'
    image: launcher.gcr.io/google/bazel:0.26.0
    imagePullPolicy: IfNotPresent
    name: test
    resources:
      requests:
        memory: 1Gi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/service-account
      name: service
      readOnly: true
    - mountPath: /bazel-scratch/.cache
      name: bazel-scratch
    - mountPath: /logs
      name: logs
    - mountPath: /tools
      name: tools
    - mountPath: /home/prow/go
      name: code
    workingDir: /home/prow/go/src/github.com/kubernetes/test-infra
  - command:
    - /sidecar
    env:
    - name: JOB_SPEC
      value: '{"type":"presubmit","job":"pull-test-infra-bazel","buildid":"1139570293107331073","prowjobid":"6555d8aa-8ec1-11e9-a433-9ee7625cd1db","refs":{"org":"kubernetes","repo":"test-infra","repo_link":"https://github.com/kubernetes/test-infra","base_ref":"master","base_sha":"f1b872102fc8841673644eca2f32c30b16b4f6ca","base_link":"https://github.com/kubernetes/test-infra/commit/f1b872102fc8841673644eca2f32c30b16b4f6ca","pulls":[{"number":13038,"author":"stevekuznetsov","sha":"410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","link":"https://github.com/kubernetes/test-infra/pull/13038","commit_link":"https://github.com/kubernetes/test-infra/pull/13038/commits/410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","author_link":"https://github.com/stevekuznetsov"}]}}'
    - name: SIDECAR_OPTIONS
      value: '{"gcs_options":{"items":["/logs/artifacts"],"bucket":"kubernetes-jenkins","path_strategy":"legacy","default_org":"kubernetes","default_repo":"kubernetes","gcs_credentials_file":"/secrets/gcs/service-account.json","dry_run":false},"entries":[{"args":["hack/bazel.sh","test","--config=ci","--nobuild_tests_only","//..."],"process_log":"/logs/process-log.txt","marker_file":"/logs/marker-file.txt","metadata_file":"/logs/artifacts/metadata.json"}]}'
    image: gcr.io/k8s-prow/sidecar:v20190610-3be53b072
    imagePullPolicy: IfNotPresent
    name: sidecar
    resources:
      requests:
        memory: 1Gi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /logs
      name: logs
    - mountPath: /secrets/gcs
      name: gcs-credentials
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - command:
    - /clonerefs
    env:
    - name: CLONEREFS_OPTIONS
      value: '{"src_root":"/home/prow/go","log":"/logs/clone.json","git_user_name":"ci-robot","git_user_email":"[email protected]","refs":[{"org":"kubernetes","repo":"test-infra","repo_link":"https://github.com/kubernetes/test-infra","base_ref":"master","base_sha":"f1b872102fc8841673644eca2f32c30b16b4f6ca","base_link":"https://github.com/kubernetes/test-infra/commit/f1b872102fc8841673644eca2f32c30b16b4f6ca","pulls":[{"number":13038,"author":"stevekuznetsov","sha":"410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","link":"https://github.com/kubernetes/test-infra/pull/13038","commit_link":"https://github.com/kubernetes/test-infra/pull/13038/commits/410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","author_link":"https://github.com/stevekuznetsov"}]}]}'
    image: gcr.io/k8s-prow/clonerefs:v20190610-3be53b072
    imagePullPolicy: IfNotPresent
    name: clonerefs
    resources:
      requests:
        memory: 1Gi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /logs
      name: logs
    - mountPath: /home/prow/go
      name: code
  - command:
    - /initupload
    env:
    - name: INITUPLOAD_OPTIONS
      value: '{"bucket":"kubernetes-jenkins","path_strategy":"legacy","default_org":"kubernetes","default_repo":"kubernetes","gcs_credentials_file":"/secrets/gcs/service-account.json","dry_run":false,"log":"/logs/clone.json"}'
    - name: JOB_SPEC
      value: '{"type":"presubmit","job":"pull-test-infra-bazel","buildid":"1139570293107331073","prowjobid":"6555d8aa-8ec1-11e9-a433-9ee7625cd1db","refs":{"org":"kubernetes","repo":"test-infra","repo_link":"https://github.com/kubernetes/test-infra","base_ref":"master","base_sha":"f1b872102fc8841673644eca2f32c30b16b4f6ca","base_link":"https://github.com/kubernetes/test-infra/commit/f1b872102fc8841673644eca2f32c30b16b4f6ca","pulls":[{"number":13038,"author":"stevekuznetsov","sha":"410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","link":"https://github.com/kubernetes/test-infra/pull/13038","commit_link":"https://github.com/kubernetes/test-infra/pull/13038/commits/410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","author_link":"https://github.com/stevekuznetsov"}]}}'
    image: gcr.io/k8s-prow/initupload:v20190610-3be53b072
    imagePullPolicy: IfNotPresent
    name: initupload
    resources:
      requests:
        memory: 1Gi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /logs
      name: logs
    - mountPath: /secrets/gcs
      name: gcs-credentials
  - args:
    - /entrypoint
    - /tools/entrypoint
    command:
    - /bin/cp
    image: gcr.io/k8s-prow/entrypoint:v20190610-3be53b072
    imagePullPolicy: IfNotPresent
    name: place-entrypoint
    resources:
      requests:
        memory: 1Gi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /tools
      name: tools
  nodeName: gke-prow-containerd-pool-bigger-170c3937-fzx9
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - name: service
    secret:
      defaultMode: 420
      secretName: service-account
  - emptyDir: {}
    name: bazel-scratch
  - emptyDir: {}
    name: logs
  - emptyDir: {}
    name: tools
  - name: gcs-credentials
    secret:
      defaultMode: 420
      secretName: service-account
  - emptyDir: {}
    name: code
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2019-06-14T16:28:24Z
    message: 'containers with incomplete status: [clonerefs initupload place-entrypoint]'
    reason: ContainersNotInitialized
    status: "False"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: 2019-06-14T16:28:24Z
    message: 'containers with unready status: [test sidecar]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: 2019-06-14T16:28:24Z
    message: 'containers with unready status: [test sidecar]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: 2019-06-14T16:28:24Z
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: gcr.io/k8s-prow/sidecar:v20190610-3be53b072
    imageID: ""
    lastState: {}
    name: sidecar
    ready: false
    restartCount: 0
    state:
      waiting:
        reason: PodInitializing
  - image: launcher.gcr.io/google/bazel:0.26.0
    imageID: ""
    lastState: {}
    name: test
    ready: false
    restartCount: 0
    state:
      waiting:
        reason: PodInitializing
  hostIP: 10.128.0.121
  initContainerStatuses:
  - image: gcr.io/k8s-prow/clonerefs:v20190610-3be53b072
    imageID: ""
    lastState: {}
    name: clonerefs
    ready: false
    restartCount: 0
    state:
      waiting:
        reason: PodInitializing
  - image: gcr.io/k8s-prow/initupload:v20190610-3be53b072
    imageID: ""
    lastState: {}
    name: initupload
    ready: false
    restartCount: 0
    state:
      waiting:
        reason: PodInitializing
  - image: gcr.io/k8s-prow/entrypoint:v20190610-3be53b072
    imageID: ""
    lastState: {}
    name: place-entrypoint
    ready: false
    restartCount: 0
    state:
      waiting:
        reason: PodInitializing
  phase: Pending
  qosClass: Burstable
  startTime: 2019-06-14T16:28:24Z


kubectl describe pod

Name:               6555d8aa-8ec1-11e9-a433-9ee7625cd1db
Namespace:          test-pods
Priority:           0
PriorityClassName:  <none>
Node:               gke-prow-containerd-pool-bigger-170c3937-fzx9/10.128.0.121
Start Time:         Fri, 14 Jun 2019 09:28:24 -0700
Labels:             created-by-prow=true
                    event-GUID=64edf720-8ec1-11e9-917a-ebca143d37d5
                    preset-bazel-scratch-dir=true
                    preset-service-account=true
                    prow.k8s.io/id=6555d8aa-8ec1-11e9-a433-9ee7625cd1db
                    prow.k8s.io/job=pull-test-infra-bazel
                    prow.k8s.io/refs.org=kubernetes
                    prow.k8s.io/refs.pull=13038
                    prow.k8s.io/refs.repo=test-infra
                    prow.k8s.io/type=presubmit
Annotations:        kubernetes.io/limit-ranger:
                      LimitRanger plugin set: memory request for container test; memory request for container sidecar; memory request for init container clonere...
                    prow.k8s.io/job: pull-test-infra-bazel
Status:             Pending
IP:                 
Init Containers:
  clonerefs:
    Container ID:  
    Image:         gcr.io/k8s-prow/clonerefs:v20190610-3be53b072
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /clonerefs
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      memory:  1Gi
    Environment:
      CLONEREFS_OPTIONS:  {"src_root":"/home/prow/go","log":"/logs/clone.json","git_user_name":"ci-robot","git_user_email":"[email protected]","refs":[{"org":"kubernetes","repo":"test-infra","repo_link":"https://github.com/kubernetes/test-infra","base_ref":"master","base_sha":"f1b872102fc8841673644eca2f32c30b16b4f6ca","base_link":"https://github.com/kubernetes/test-infra/commit/f1b872102fc8841673644eca2f32c30b16b4f6ca","pulls":[{"number":13038,"author":"stevekuznetsov","sha":"410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","link":"https://github.com/kubernetes/test-infra/pull/13038","commit_link":"https://github.com/kubernetes/test-infra/pull/13038/commits/410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","author_link":"https://github.com/stevekuznetsov"}]}]}
    Mounts:
      /home/prow/go from code (rw)
      /logs from logs (rw)
  initupload:
    Container ID:  
    Image:         gcr.io/k8s-prow/initupload:v20190610-3be53b072
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /initupload
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      memory:  1Gi
    Environment:
      INITUPLOAD_OPTIONS:  {"bucket":"kubernetes-jenkins","path_strategy":"legacy","default_org":"kubernetes","default_repo":"kubernetes","gcs_credentials_file":"/secrets/gcs/service-account.json","dry_run":false,"log":"/logs/clone.json"}
      JOB_SPEC:            {"type":"presubmit","job":"pull-test-infra-bazel","buildid":"1139570293107331073","prowjobid":"6555d8aa-8ec1-11e9-a433-9ee7625cd1db","refs":{"org":"kubernetes","repo":"test-infra","repo_link":"https://github.com/kubernetes/test-infra","base_ref":"master","base_sha":"f1b872102fc8841673644eca2f32c30b16b4f6ca","base_link":"https://github.com/kubernetes/test-infra/commit/f1b872102fc8841673644eca2f32c30b16b4f6ca","pulls":[{"number":13038,"author":"stevekuznetsov","sha":"410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","link":"https://github.com/kubernetes/test-infra/pull/13038","commit_link":"https://github.com/kubernetes/test-infra/pull/13038/commits/410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","author_link":"https://github.com/stevekuznetsov"}]}}
    Mounts:
      /logs from logs (rw)
      /secrets/gcs from gcs-credentials (rw)
  place-entrypoint:
    Container ID:  
    Image:         gcr.io/k8s-prow/entrypoint:v20190610-3be53b072
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /bin/cp
    Args:
      /entrypoint
      /tools/entrypoint
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      memory:     1Gi
    Environment:  <none>
    Mounts:
      /tools from tools (rw)
Containers:
  test:
    Container ID:  
    Image:         launcher.gcr.io/google/bazel:0.26.0
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /tools/entrypoint
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      memory:  1Gi
    Environment:
      GOOGLE_APPLICATION_CREDENTIALS:      /etc/service-account/service-account.json
      E2E_GOOGLE_APPLICATION_CREDENTIALS:  /etc/service-account/service-account.json
      TEST_TMPDIR:                         /bazel-scratch/.cache/bazel
      ARTIFACTS:                           /logs/artifacts
      BUILD_ID:                            1139570293107331073
      BUILD_NUMBER:                        1139570293107331073
      GOPATH:                              /home/prow/go
      JOB_NAME:                            pull-test-infra-bazel
      JOB_SPEC:                            {"type":"presubmit","job":"pull-test-infra-bazel","buildid":"1139570293107331073","prowjobid":"6555d8aa-8ec1-11e9-a433-9ee7625cd1db","refs":{"org":"kubernetes","repo":"test-infra","repo_link":"https://github.com/kubernetes/test-infra","base_ref":"master","base_sha":"f1b872102fc8841673644eca2f32c30b16b4f6ca","base_link":"https://github.com/kubernetes/test-infra/commit/f1b872102fc8841673644eca2f32c30b16b4f6ca","pulls":[{"number":13038,"author":"stevekuznetsov","sha":"410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","link":"https://github.com/kubernetes/test-infra/pull/13038","commit_link":"https://github.com/kubernetes/test-infra/pull/13038/commits/410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","author_link":"https://github.com/stevekuznetsov"}]}}
      JOB_TYPE:                            presubmit
      PROW_JOB_ID:                         6555d8aa-8ec1-11e9-a433-9ee7625cd1db
      PULL_BASE_REF:                       master
      PULL_BASE_SHA:                       f1b872102fc8841673644eca2f32c30b16b4f6ca
      PULL_NUMBER:                         13038
      PULL_PULL_SHA:                       410c774b4a2fbf4e6f1b150e307fc8f9a12082a5
      PULL_REFS:                           master:f1b872102fc8841673644eca2f32c30b16b4f6ca,13038:410c774b4a2fbf4e6f1b150e307fc8f9a12082a5
      REPO_NAME:                           test-infra
      REPO_OWNER:                          kubernetes
      ENTRYPOINT_OPTIONS:                  {"timeout":7200000000000,"grace_period":15000000000,"artifact_dir":"/logs/artifacts","args":["hack/bazel.sh","test","--config=ci","--nobuild_tests_only","//..."],"process_log":"/logs/process-log.txt","marker_file":"/logs/marker-file.txt","metadata_file":"/logs/artifacts/metadata.json"}
    Mounts:
      /bazel-scratch/.cache from bazel-scratch (rw)
      /etc/service-account from service (ro)
      /home/prow/go from code (rw)
      /logs from logs (rw)
      /tools from tools (rw)
  sidecar:
    Container ID:  
    Image:         gcr.io/k8s-prow/sidecar:v20190610-3be53b072
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /sidecar
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Requests:
      memory:  1Gi
    Environment:
      JOB_SPEC:         {"type":"presubmit","job":"pull-test-infra-bazel","buildid":"1139570293107331073","prowjobid":"6555d8aa-8ec1-11e9-a433-9ee7625cd1db","refs":{"org":"kubernetes","repo":"test-infra","repo_link":"https://github.com/kubernetes/test-infra","base_ref":"master","base_sha":"f1b872102fc8841673644eca2f32c30b16b4f6ca","base_link":"https://github.com/kubernetes/test-infra/commit/f1b872102fc8841673644eca2f32c30b16b4f6ca","pulls":[{"number":13038,"author":"stevekuznetsov","sha":"410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","link":"https://github.com/kubernetes/test-infra/pull/13038","commit_link":"https://github.com/kubernetes/test-infra/pull/13038/commits/410c774b4a2fbf4e6f1b150e307fc8f9a12082a5","author_link":"https://github.com/stevekuznetsov"}]}}
      SIDECAR_OPTIONS:  {"gcs_options":{"items":["/logs/artifacts"],"bucket":"kubernetes-jenkins","path_strategy":"legacy","default_org":"kubernetes","default_repo":"kubernetes","gcs_credentials_file":"/secrets/gcs/service-account.json","dry_run":false},"entries":[{"args":["hack/bazel.sh","test","--config=ci","--nobuild_tests_only","//..."],"process_log":"/logs/process-log.txt","marker_file":"/logs/marker-file.txt","metadata_file":"/logs/artifacts/metadata.json"}]}
    Mounts:
      /logs from logs (rw)
      /secrets/gcs from gcs-credentials (rw)
Conditions:
  Type              Status
  Initialized       False 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  service:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  service-account
    Optional:    false
  bazel-scratch:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  logs:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  tools:
    Type:    EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:  
  gcs-credentials:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  service-account
    Optional:    false
  code:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:      
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason                  Age                   From                                                    Message
  ----     ------                  ----                  ----                                                    -------
  Normal   Scheduled               39m                   default-scheduler                                       Successfully assigned test-pods/6555d8aa-8ec1-11e9-a433-9ee7625cd1db to gke-prow-containerd-pool-bigger-170c3937-fzx9
  Warning  FailedCreatePodSandBox  39m                   kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b44007428cd156be1435328e60724d53bda421d19861469d1619c84963e2310c": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254
  Warning  FailedCreatePodSandBox  39m                   kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "3492fd3b74f05259384f65c919d09220cb91cc1a7158dfefd501e5ce87a54312": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254
  Warning  FailedCreatePodSandBox  38m                   kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "da1db319f6c19f0fced3404ca9eba9231868d151d0a27216ee119dc5a212353c": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254
  Warning  FailedCreatePodSandBox  38m                   kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b79a0319862c99e133bc677597e17d6ccaadb06574dc05d7df65c60f602dfd13": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254
  Warning  FailedCreatePodSandBox  38m                   kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "2a841c485d2cefe61623c15451365fb09b4bc99ab9610e582592ad0087b957e8": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254
  Warning  FailedCreatePodSandBox  38m                   kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "d90b25d56dc0c66db6bff7c51e5342f1ec8b1466e379b223248fc62309c2ba1f": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254
  Warning  FailedCreatePodSandBox  37m                   kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "27e31071c1dd28633a21cf256aa2917b961694500bf9562e24cf4eef3e258ce4": fork/exec /home/containerd/opt/cni/bin/host-local: resource temporarily unavailable
  Warning  FailedCreatePodSandBox  37m                   kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "3248bb997030fdc2db7598c0b5a0ec580f610d79aa49bee6e0371abfd10522f3": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254
  Warning  FailedCreatePodSandBox  37m                   kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "107b9a35a2d07fb2fd1b458ab80a63766a2cbaa7613aef94311823b0cab0f5e7": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254
  Warning  FailedCreatePodSandBox  4m9s (x153 over 37m)  kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "6769374aadfa0a00be6c6e7b2be9b6da812dcc496c1c2ca0ab4b8ec1c611fe15": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254

kind/bug lifecycle/rotten priority/important-soon sig/node

All 18 comments

There might be something wrong with the CNI? I'm seeing three distinct types of errors:

  • failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254
  • netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
  • fork/exec /home/containerd/opt/cni/bin/host-local: resource temporarily unavailable
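
To see how often each of these error types occurs, the event messages can be bucketed by substring. This is a rough sketch under stated assumptions: the labels (`ip_exhausted` and so on) and the function names `classify` and `tally` are made up for illustration, and the matching relies only on the message fragments quoted above.

```python
from collections import Counter

# Substrings taken from the three error types listed above.
# The labels on the left are hypothetical names chosen for this sketch.
PATTERNS = {
    "ip_exhausted": "no IP addresses available in range set",
    "netplugin_empty_diagnostic": "netplugin failed but error parsing its diagnostic message",
    "fork_exec_unavailable": "resource temporarily unavailable",
}


def classify(message: str) -> str:
    """Return the label of the first known error fragment found in message."""
    for label, needle in PATTERNS.items():
        if needle in message:
            return label
    return "other"


def tally(lines):
    """Count error types across lines that mention FailedCreatePodSandBox."""
    return Counter(
        classify(line) for line in lines if "FailedCreatePodSandBox" in line
    )


# Shortened sample lines in the shape of the events in this thread.
sample = [
    'Warning FailedCreatePodSandBox ... failed to allocate for range 0: '
    'no IP addresses available in range set: 10.60.28.1-10.60.28.254',
    'Warning FailedCreatePodSandBox ... netplugin failed but error parsing '
    'its diagnostic message "": unexpected end of JSON input',
    'Normal Scheduled ... Successfully assigned',
]
for label, count in tally(sample).most_common():
    print(label, count)
```

Against a live cluster, the same `tally` function could be fed the lines of the `kubectl -n test-pods get events -o wide` output instead of the hard-coded sample.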


kubectl -n test-pods get events -o wide

kubectl -n test-pods get events -o wide | grep 'FailedCreatePodSandBox'
27m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "91986dde1f7454ab5a2017e84fd13989daf31234b7ae4c9d8df6db01434e3bab": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                           27m          1       3dcae290-8ec4-11e9-982b-d256d20a15b8.15a81e9dba729542
27m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "6804e34e63bd0469efb027d80908a61e1af775adde030f287536e5d192312674": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input                                                                                27m          1       3dcae290-8ec4-11e9-982b-d256d20a15b8.15a81ea05e0188e0
27m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "23da118535fa5a6dcb4c482eedf79de85d4050540791a995240f0f2a8c0e757a": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                           27m          1       3dcae290-8ec4-11e9-982b-d256d20a15b8.15a81ea392b8be36
27m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "35bbadddbb9a15c0decb65a8c34f30587ac680d50d0f090b18bd4168cc6c4bda": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input                                                                                27m          1       3dcae290-8ec4-11e9-982b-d256d20a15b8.15a81ea6e2841d2a
26m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "40a2b8b804efc8c122e687b8d1bf4c759760932ccce5a8ab81211f3ddd7fc3d4": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                           26m          1       3dcae290-8ec4-11e9-982b-d256d20a15b8.15a81ea9a01750f4
26m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "7538b0f9293489152e5fdc87fb6324e92ed0356f4e6717792cf0bcd1dd490feb": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                           26m          1       3dcae290-8ec4-11e9-982b-d256d20a15b8.15a81eaca6ae2212
26m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "6412fca9f9f2ac49e92a689b0441efdeb63f4c423a727c10dafa1fe4cbcc2c75": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input                                                                                26m          1       3dcae290-8ec4-11e9-982b-d256d20a15b8.15a81eaff6722718
26m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "384dc4f65d9793d166bf97a061b52ca07a39797ebd9ee6999e0c2c345454a696": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                           26m          1       3dcae290-8ec4-11e9-982b-d256d20a15b8.15a81eb2f3957397
26m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "36350985112f4c0761020d4aa4d46a919ccc2f809f6b6b571e5f14750f9bf342": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input                                                                                26m          1       3dcae290-8ec4-11e9-982b-d256d20a15b8.15a81eb58d20106b
2m40s       Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "42bf93c90d996c112f711d9bddbd40fbb03e0a5a8eeccee5f38a98c4a494ca90": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                           25m          106     3dcae290-8ec4-11e9-982b-d256d20a15b8.15a81eb81cd8b454
48m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b44007428cd156be1435328e60724d53bda421d19861469d1619c84963e2310c": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                                            48m          1       6555d8aa-8ec1-11e9-a433-9ee7625cd1db.15a81d7fb1be4274
48m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "3492fd3b74f05259384f65c919d09220cb91cc1a7158dfefd501e5ce87a54312": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                                            48m          1       6555d8aa-8ec1-11e9-a433-9ee7625cd1db.15a81d8266bc91bd
47m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "da1db319f6c19f0fced3404ca9eba9231868d151d0a27216ee119dc5a212353c": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                                            47m          1       6555d8aa-8ec1-11e9-a433-9ee7625cd1db.15a81d85e339b0f0
47m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b79a0319862c99e133bc677597e17d6ccaadb06574dc05d7df65c60f602dfd13": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                                            47m          1       6555d8aa-8ec1-11e9-a433-9ee7625cd1db.15a81d8873af15e4
47m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "2a841c485d2cefe61623c15451365fb09b4bc99ab9610e582592ad0087b957e8": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                                            47m          1       6555d8aa-8ec1-11e9-a433-9ee7625cd1db.15a81d8b02ae3140
47m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "d90b25d56dc0c66db6bff7c51e5342f1ec8b1466e379b223248fc62309c2ba1f": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                                            47m          1       6555d8aa-8ec1-11e9-a433-9ee7625cd1db.15a81d8dce8a0a8b
46m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "27e31071c1dd28633a21cf256aa2917b961694500bf9562e24cf4eef3e258ce4": fork/exec /home/containerd/opt/cni/bin/host-local: resource temporarily unavailable                                                                                                        46m          1       6555d8aa-8ec1-11e9-a433-9ee7625cd1db.15a81d910fdcb3b1
46m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "3248bb997030fdc2db7598c0b5a0ec580f610d79aa49bee6e0371abfd10522f3": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                                            46m          1       6555d8aa-8ec1-11e9-a433-9ee7625cd1db.15a81d948f007d95
46m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "107b9a35a2d07fb2fd1b458ab80a63766a2cbaa7613aef94311823b0cab0f5e7": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                                                            46m          1       6555d8aa-8ec1-11e9-a433-9ee7625cd1db.15a81d971e98b946
3m1s        Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "d95583cbeed6d3b9d62b20b2368abe148dc810a2e2af55388f74623abc2ee9d5": failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254                                                            46m          199     6555d8aa-8ec1-11e9-a433-9ee7625cd1db.15a81d9a60a5019d
37m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-st39   Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container: failed to create containerd task: ttrpc: client shutting down: ttrpc: closed: unknown                                                                                                                                                                                                                                                                                                              37m          3       d7fcb59f-8ec2-11e9-a433-9ee7625cd1db.15a81e1210fc75bf
37m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-st39   Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container: failed to start sandbox container task "c6affc36b33decd5b3441b280e3397497a45b82e64e3a6f3686a77ea0f648f6c": OCI runtime start failed: fork/exec /home/containerd/usr/local/sbin/runc: resource temporarily unavailable: : unknown                                                                                                                                                                   37m          1       d7fcb59f-8ec2-11e9-a433-9ee7625cd1db.15a81e1b79147ef3
36m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-st39   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "e2b141ff2ac5c55af166323ae33587f3eb77df59ddcd5b89c7511014b8d9f1f9": fork/exec /home/containerd/opt/cni/bin/ptp: resource temporarily unavailable                                                                                                                                                                                                                                                  36m          1       d7fcb59f-8ec2-11e9-a433-9ee7625cd1db.15a81e1ee484a86f
36m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-st39   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "d02f4ae81eaff5e2c1ba364d6a7e9d425f052c929ecdd823feeb844e8f3f6dba": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input                                                                                                                                                                                                                                    36m          1       d7fcb59f-8ec2-11e9-a433-9ee7625cd1db.15a81e21c1329301
36m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-st39   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "6dab0e0278a9d94d31caf41b0fc9e19c3589411671d106003ff461bfffa192e0": fork/exec /home/containerd/opt/cni/bin/ptp: resource temporarily unavailable                                                                                                                                                                                                                                                  36m          1       d7fcb59f-8ec2-11e9-a433-9ee7625cd1db.15a81e24f501d34f
36m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-st39   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "5ce128c2b6de773e82ac3c4c73e8b4576a55a501b8db4641e17fcdb3de894f96": fork/exec /home/containerd/opt/cni/bin/ptp: resource temporarily unavailable                                                                                                                                                                                                                                                  36m          1       d7fcb59f-8ec2-11e9-a433-9ee7625cd1db.15a81e2873029881
35m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-st39   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "a54f96ed994cac8cd0de178974cc6b21a6e0d244f5611addeb101df34a2d6952": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input                                                                                                                                                                                                                                    35m          1       d7fcb59f-8ec2-11e9-a433-9ee7625cd1db.15a81e2b0d190386
35m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-st39   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "331ea5217fe402cd23a488f76d29397ee2c50cfe03ad8bbded6543fddb9af4ba": fork/exec /home/containerd/opt/cni/bin/ptp: resource temporarily unavailable                                                                                                                                                                                                                                                  35m          1       d7fcb59f-8ec2-11e9-a433-9ee7625cd1db.15a81e2dcccad84b
35m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-st39   Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "946f13508ad7ae099bd48bc03163ad3b7084542a6eb4bf8bb911635355b179d6": fork/exec /home/containerd/opt/cni/bin/ptp: resource temporarily unavailable                                                                                                                                                                                                                                                  35m          1       d7fcb59f-8ec2-11e9-a433-9ee7625cd1db.15a81e310eaf5334
2m31s       Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-st39   (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b3004a16918d3369107a5691743675e1dc8da90fa970e22dead0267a9eac33aa": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input                                                                                                                                                                                                    35m          150     d7fcb59f-8ec2-11e9-a433-9ee7625cd1db.15a81e33e80be89b
24m         Warning   FailedCreatePodSandBox   Pod    kubelet, gke-prow-containerd-pool-bigger-170c3937-fzx9   (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "b755a6282f14214fee42c35a39da3efa7e5bcf5ae612b9c7f08893518fc64723": fork/exec /home/containerd/opt/cni/bin/host-local: resource temporarily unavailable                                                            77m          242     f73e291c-8ebc-11e9-8958-2a4039bc9d1b.15a81be2d908b537
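
To get a rough breakdown of the failure modes in an event dump like the one above, the messages can be bucketed by substring. This is only a sketch: the bucket names and the `classify` / `summarize` helpers are made up for illustration, and the sample messages are shortened copies of the events above.

```python
from collections import Counter

# Hypothetical bucket names; the substrings are copied from the event messages.
PATTERNS = [
    ("ip_exhausted", "no IP addresses available in range set"),
    ("fork_eagain", "resource temporarily unavailable"),
    ("netplugin_empty_output", "netplugin failed but error parsing its diagnostic message"),
    ("ttrpc_closed", "ttrpc: closed"),
]


def classify(message: str) -> str:
    """Return the first bucket whose substring occurs in the message."""
    for bucket, needle in PATTERNS:
        if needle in message:
            return bucket
    return "other"


def summarize(messages):
    """Count how many messages fall into each bucket."""
    return Counter(classify(m) for m in messages)


# Shortened samples of the messages seen in the events above.
SAMPLE = [
    "failed to allocate for range 0: no IP addresses available in range set: 10.60.28.1-10.60.28.254",
    'netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input',
    "fork/exec /home/containerd/opt/cni/bin/host-local: resource temporarily unavailable",
    "fork/exec /home/containerd/opt/cni/bin/ptp: resource temporarily unavailable",
    "failed to create containerd task: ttrpc: client shutting down: ttrpc: closed: unknown",
]

if __name__ == "__main__":
    for bucket, count in summarize(SAMPLE).most_common():
        print(bucket, count)
```

Feeding it the message column of `kubectl get events` for a single node should show which failure mode dominates there.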

fork/exec /home/containerd/opt/cni/bin/host-local: resource temporarily unavailable

That sounds an awful lot like we're hitting I/O issues again; we should check the I/O against the new quotas.
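
For reference, "resource temporarily unavailable" is the standard errno text for EAGAIN (the events above show it in lower case). The snippet below only demonstrates that mapping on a Linux host; it says nothing about which resource actually ran out on the node.

```python
import errno
import os

# Text the OS associates with EAGAIN; on Linux this matches the suffix
# of the fork/exec errors quoted above (apart from capitalization).
message = os.strerror(errno.EAGAIN)
print(errno.errorcode[errno.EAGAIN], "->", message)
```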

Does look like disk is being slammed with read/write IOPS and throughput at some points, though only for certain nodes and very spikily. Graphs for gke-prow-containerd-pool-bigger-170c3937-0s3l:

Disk IO

(Thanks @cjwagner, @Katharine for debugging and info)

If there are more occurrences, link them here (and encourage others to); will continue looking in

Do we have any way to mitigate this besides increasing the boot disk capacity even further? Perhaps some way to throttle bazel's disk I/O? (I assume bazel is to blame.)

How frequently are we seeing this kind of failure? It is unclear to me how significant of a problem this is right now.

https://cloud.google.com/compute/docs/disks/performance#size_price_performance has a table explaining how I/O performance scales with disk size. We have 500GB disks and should have limits of:

Limit | Read | Write
Sustained random IOPS limit | 375.00 | 750.00
Sustained throughput limit (MB/s) | 60.00 | 60.00

Spikes are, for instance:

  • Read throughput (MB): 99.49
  • Write throughput (MB): 12.56
  • Read ops: 885
  • Write ops: 752
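
As a quick check of the spike figures against the limits quoted above, here is a minimal sketch. The dictionary keys and helper names are hypothetical, and it assumes the spike throughput values are in MB/s like the limits.

```python
# Sustained limits quoted above for a 500GB disk.
LIMITS = {
    "read_iops": 375.0,
    "write_iops": 750.0,
    "read_mb_per_s": 60.0,
    "write_mb_per_s": 60.0,
}

# Spike values quoted above (throughput assumed to be MB/s).
SPIKE = {
    "read_iops": 885.0,
    "write_iops": 752.0,
    "read_mb_per_s": 99.49,
    "write_mb_per_s": 12.56,
}


def utilization(observed, limits):
    """Return observed / limit for every metric in limits."""
    return {name: observed[name] / limits[name] for name in limits}


def over_limit(observed, limits):
    """Return the sorted names of metrics whose observed value exceeds the limit."""
    ratios = utilization(observed, limits)
    return sorted(name for name, ratio in ratios.items() if ratio > 1.0)


if __name__ == "__main__":
    for name, ratio in utilization(SPIKE, LIMITS).items():
        print(f"{name}: {ratio:.2f}x of limit")
    print("over limit:", over_limit(SPIKE, LIMITS))
```

With these numbers, read IOPS and read throughput are well over the sustained limits, write IOPS is marginally over, and write throughput is far below.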

This doesn't seem to be high impact at the moment, and we're unsure how to address or fix it, so for now we'll keep an eye out for more reports of this happening. (I think we can drop the urgent label.)

It shouldn't be a problem to bump Prow either, so going to bump and monitor Prow.

/remove-priority critical-urgent
/priority important-soon

Getting more of these, but without a spike in Disk IOPS or throughput. :/ (Thanks @cjwagner!)

https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/test-infra/13078/pull-test-infra-verify-file-perms/1141118142936780802

kubectl describe pod -n test-pods 90713123-921c-11e9-ac17-16331ef92eac:

713123-921c-11e9-ac17-16331ef92eac to gke-prow-containerd-pool-bigger-170c3937-glwj
  Warning  FailedCreatePodSandBox  43m                    kubelet, gke-prow-containerd-pool-bigger-170c3937-glwj  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "0ae58c9651cfc164d0e48d2c42faef3f16a5962fe2919acf76adfd4a7ef42ea7": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
  Warning  FailedCreatePodSandBox  43m                    kubelet, gke-prow-containerd-pool-bigger-170c3937-glwj  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "7a55ca6430218e323309e2ec9edc0c1cecce29b96d711913bae9f63a601a79a5": fork/exec /home/containerd/opt/cni/bin/loopback: resource temporarily unavailable
  Warning  FailedCreatePodSandBox  42m                    kubelet, gke-prow-containerd-pool-bigger-170c3937-glwj  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "5043a15e8cb3ded675b91f1656727b0c6abab2952d72acf722d5f5d5d7f7c675": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
  Warning  FailedCreatePodSandBox  42m                    kubelet, gke-prow-containerd-pool-bigger-170c3937-glwj  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "c22dc99a76efee6f9a11c04902ba8a135a4254d74cc2903a00a4fb4be58f5b98": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
  Warning  FailedCreatePodSandBox  42m                    kubelet, gke-prow-containerd-pool-bigger-170c3937-glwj  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "982f7cf5ac0d62feefc226d43e830c56ad79cb662b41f95b5eaf83e654a39d65": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
  Warning  FailedCreatePodSandBox  42m                    kubelet, gke-prow-containerd-pool-bigger-170c3937-glwj  Failed create pod sandbox: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: connection refused"
  Warning  FailedCreatePodSandBox  42m                    kubelet, gke-prow-containerd-pool-bigger-170c3937-glwj  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "e4af98b15761929487423e8a3b9c780c11dc9147bcc2e03ccdabdc4037ec68e5": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
  Warning  FailedCreatePodSandBox  41m                    kubelet, gke-prow-containerd-pool-bigger-170c3937-glwj  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "d6e24979203779a322e7d6240dbfa92c95694bbdfeed7966330d1a4bb77b96dd": netplugin failed but error parsing its diagnostic message "": unexpected end of JSON input
  Warning  FailedCreatePodSandBox  41m                    kubelet, gke-prow-containerd-pool-bigger-170c3937-glwj  Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "f5527a6c05e6d185f0f5e56b5804fb7e4a6eacf5a7c51e316197c949076cca07": fork/exec /home/containerd/opt/cni/bin/ptp: resource temporarily unavailable
  Warning  FailedCreatePodSandBox  3m12s (x177 over 41m)  kubelet, gke-prow-containerd-pool-bigger-170c3937-glwj  (combined from similar events): Failed create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "db42c5b264a2f65fa92a2c86b6ae6fc14e045010fb91642abf72d77720727d55": fork/exec /home/containerd/opt/cni/bin/loopback: resource temporarily unavailable

This should have had impact since around 4 PM Pacific Time, but the graphs appear flat.

image

Taking a further look.

This appears to be the only affected node atm; couldn't find these errors happening for any other pods.

We definitely have spikes surpassing this that last long enough to probably count as sustained usage:
image

/cc @Katharine

Update:

It looks like this is the most common error pairing:

  1. FailedCreateSandBox: Failed create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod "[...]": operation timeout: context deadline exceeded
  2. FailedCreateSandBox: Failed create pod sandbox: rpc error: code = Unknown desc = failed to create a sandbox for pod "[...]": Error response from daemon: Conflict. The container name "[...]" is already in use by container "[...]". You have to remove (or rename) that container to be able to reuse that name.

We have at this point tried:

  • Adding more disk capacity: didn't help. At this point we are substantially overprovisioned on IOPS. We are still hitting disk throughput limits, but I doubt that is the cause.
  • Switching away from containerd: didn't help
  • Switching from COS to Ubuntu: didn't help

We can still observe:

  • Periodically hitting our disk throughput cap
  • Pod creation failures seem to occur when a node is around 100% CPU utilisation

A reasonable next step is probably to start enforcing CPU limits on pods.

@kubernetes/sig-node-bugs could we please get some guidance as to how to root-cause this issue and find a remedy?

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@fejta-bot: Closing this issue.


What (if anything) was done to finally resolve this?
