Serving: x509 certificate errors when pulling image / deploying the example application

Created on 24 Jan 2019 · 12 Comments · Source: knative/serving

/kind bug

Expected Behavior

Successfully deploying the Go example application

Actual Behavior

Unable to deploy the pod / x509 cert errors.

$ kubectl apply -f examples/knative-example.yaml
service.serving.knative.dev/helloworld-go created

$ kubectl get ksvc
NAME            DOMAIN   LATESTCREATED         LATESTREADY   READY     REASON
helloworld-go            helloworld-go-00001                 Unknown   RevisionMissing

$ kubectl describe ksvc/helloworld-go
Name:         helloworld-go
Namespace:    default
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"serving.knative.dev/v1alpha1","kind":"Service","metadata":{"annotations":{},"name":"helloworld-go","namespace":"default"},"...
API Version:  serving.knative.dev/v1alpha1
Kind:         Service
Metadata:
  Creation Timestamp:  2019-01-24T19:52:52Z
  Generation:          1
  Resource Version:    24480
  Self Link:           /apis/serving.knative.dev/v1alpha1/namespaces/default/services/helloworld-go
  UID:                 a258f698-2011-11e9-a8a7-52540046b08b
Spec:
  Generation:  1
  Run Latest:
    Configuration:
      Revision Template:
        Spec:
          Container:
            Env:
              Name:         TARGET
              Value:        Go Sample v1
            Image:          gcr.io/knative-samples/helloworld-go
          Timeout Seconds:  300
Status:
  Conditions:
    Last Transition Time:        2019-01-24T19:53:00Z
    Message:                     Revision "helloworld-go-00001" failed with message: "Unable to fetch image \"gcr.io/knative-samples/helloworld-go\": Get https://gcr.io/v2/: x509: certificate has expired or is not yet valid".
    Reason:                      RevisionFailed
    Severity:                    Error
    Status:                      False
    Type:                        ConfigurationsReady
    Last Transition Time:        2019-01-24T19:53:00Z
    Message:                     Configuration "helloworld-go" does not have any ready Revision.
    Reason:                      RevisionMissing
    Severity:                    Error
    Status:                      False
    Type:                        Ready
    Last Transition Time:        2019-01-24T19:53:00Z
    Message:                     Configuration "helloworld-go" does not have any ready Revision.
    Reason:                      RevisionMissing
    Severity:                    Error
    Status:                      False
    Type:                        RoutesReady
  Latest Created Revision Name:  helloworld-go-00001
  Observed Generation:           1
Events:
  Type    Reason   Age              From                Message
  ----    ------   ----             ----                -------
  Normal  Created  9s               service-controller  Created Configuration "helloworld-go"
  Normal  Created  9s               service-controller  Created Route "helloworld-go"
  Normal  Updated  1s (x5 over 9s)  service-controller  Updated Service "helloworld-go"
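
For anyone triaging a similar failure, the useful part of the describe output above is the list of conditions whose Status is not True. Below is a minimal Python sketch of pulling those out. It assumes the resource has been fetched as JSON (for example with kubectl get ksvc helloworld-go -o json) and that conditions sit under status.conditions with type, status, reason and message keys. The sample dict is hand-copied from the output above, and failing_conditions is a hypothetical helper name, not part of any Knative tooling.

```python
def failing_conditions(resource):
    """Return (type, reason, message) for each condition whose status is not "True".

    "Unknown" is treated the same as "False": the resource is not ready either way.
    """
    conditions = resource.get("status", {}).get("conditions", [])
    return [
        (c.get("type"), c.get("reason"), c.get("message"))
        for c in conditions
        if c.get("status") != "True"
    ]


# Subset of the conditions shown above, in the assumed JSON shape.
sample = {
    "status": {
        "conditions": [
            {
                "type": "ConfigurationsReady",
                "status": "False",
                "reason": "RevisionFailed",
                "message": r'Revision "helloworld-go-00001" failed with message: '
                           r'"Unable to fetch image \"gcr.io/knative-samples/helloworld-go\": '
                           r'Get https://gcr.io/v2/: x509: certificate has expired or is not yet valid".',
            },
            {
                "type": "Ready",
                "status": "False",
                "reason": "RevisionMissing",
                "message": 'Configuration "helloworld-go" does not have any ready Revision.',
            },
        ]
    }
}

for cond_type, reason, message in failing_conditions(sample):
    print(f"{cond_type}: {reason}: {message}")
```

The first failing condition carries the underlying cause (the x509 error); the later ones only report that no Revision became ready.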

Steps to Reproduce the Problem

  1. Use the example application here: https://github.com/knative/docs/blob/master/install/getting-started-knative-app.md#configuring-your-deployment
  2. kubectl get ksvc/helloworld-go

Additional Info

kind/bug

All 12 comments

I'm not sure whether I was supposed to post this here or at https://github.com/knative/docs

Things I've tried doing so far:

  • Resetting my entire infrastructure (removing and redeploying kubernetes)
  • Configuring and enabling NTP
  • Redeploying Istio / knative
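
Checking NTP is a reasonable first step here, because the message "certificate has expired or is not yet valid" is what a TLS client reports when its own clock falls outside the certificate's NotBefore/NotAfter window. The Python sketch below illustrates only that comparison. The certificate dates and the validity_status helper are hypothetical values chosen for illustration, not the real gcr.io certificate.

```python
from datetime import datetime, timedelta, timezone


def validity_status(not_before, not_after, now):
    """Classify `now` against a certificate validity window."""
    if now < not_before:
        return "not yet valid"
    if now > not_after:
        return "expired"
    return "valid"


# Hypothetical certificate window, for illustration only.
not_before = datetime(2019, 1, 1, tzinfo=timezone.utc)
not_after = datetime(2019, 4, 1, tzinfo=timezone.utc)

# Time of the failure reported in this issue.
true_now = datetime(2019, 1, 24, 19, 53, tzinfo=timezone.utc)

# The same certificate looks different to a client whose clock is skewed.
for skew in (timedelta(0), timedelta(days=-30), timedelta(days=90)):
    print(skew, "->", validity_status(not_before, not_after, true_now + skew))
```

If the node and pod clocks are correct and the error persists, as it did here, the client is probably being shown a different certificate than expected.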

It also doesn't matter which example I use; the result is the same:


$ kubectl describe revision helloworld-php-00001
Name:         helloworld-php-00001
Namespace:    default
Labels:       serving.knative.dev/configuration=helloworld-php
              serving.knative.dev/configurationGeneration=1
              serving.knative.dev/configurationMetadataGeneration=1
              serving.knative.dev/service=helloworld-php
Annotations:  <none>
API Version:  serving.knative.dev/v1alpha1
Kind:         Revision
Metadata:
  Creation Timestamp:  2019-01-25T15:46:54Z
  Generation:          1
  Owner References:
    API Version:           serving.knative.dev/v1alpha1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Configuration
    Name:                  helloworld-php
    UID:                   703d778e-20b8-11e9-9a5d-52540046b08b
  Resource Version:        5602
  Self Link:               /apis/serving.knative.dev/v1alpha1/namespaces/default/revisions/helloworld-php-00001
  UID:                     703f15a0-20b8-11e9-9a5d-52540046b08b
Spec:
  Container:
    Env:
      Name:   TARGET
      Value:  HELLO WORLD!
    Image:    gcr.io/knative-samples/helloworld-php
    Name:
    Resources:
  Generation:       1
  Timeout Seconds:  300
Status:
  Conditions:
    Last Transition Time:  2019-01-25T15:46:54Z
    Severity:              Error
    Status:                True
    Type:                  BuildSucceeded
    Last Transition Time:  2019-01-25T15:46:54Z
    Message:               Unable to fetch image "gcr.io/knative-samples/helloworld-php": Get https://gcr.io/v2/: x509: certificate has expired or is not yet valid
    Reason:                ContainerMissing
    Severity:              Error
    Status:                False
    Type:                  ContainerHealthy
    Last Transition Time:  2019-01-25T15:46:54Z
    Message:               Unable to fetch image "gcr.io/knative-samples/helloworld-php": Get https://gcr.io/v2/: x509: certificate has expired or is not yet valid
    Reason:                ContainerMissing
    Severity:              Error
    Status:                False
    Type:                  Ready
    Last Transition Time:  2019-01-25T15:46:54Z
    Severity:              Error
    Status:                Unknown
    Type:                  ResourcesAvailable
  Log URL:                 http://localhost:8001/api/v1/namespaces/knative-monitoring/services/kibana-logging/proxy/app/kibana#/discover?_a=(query:(match:(kubernetes.labels.knative-dev%2FrevisionUID:(query:'703f15a0-20b8-11e9-9a5d-52540046b08b',type:phrase))))

Events:  <none>

This looks a lot like a general issue in your environment (given that others are not seeing failures). Can you try running curl https://gcr.io/v2/ from somewhere in your infrastructure to narrow things down? Is this a Kubernetes cluster from one of the big cloud providers or something "manual"?

@markusthoemmes This is bare-metal with kubeadm running on Debian 9.

Curl'ing works perfectly fine on each node:

server@labboi:~$ curl -k https://gcr.io/v2/
{"errors":[{"code":"UNAUTHORIZED","message":"Unauthorized access."}]}

k8s@k8s-master:~$ curl -k https://gcr.io/v2/
{"errors":[{"code":"UNAUTHORIZED","message":"Unauthorized access."}]}

k8s@k8s-node-2:~$ curl https://gcr.io/v2/
{"errors":[{"code":"UNAUTHORIZED","message":"Unauthorized access."}]}

@cdrage does it also work without the -k? 🙂 It's kinda the point that we actually want certificate checks here.
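
Since curl -k skips certificate verification, a successful response with -k says nothing about whether the certificate would be accepted. As a rough sketch of the verified check being asked for, the Python below performs a TLS handshake with verification enabled and reports either the certificate's validity dates or the verification error. probe_tls is a hypothetical helper name; it uses only the standard ssl and socket modules and the system trust store.

```python
import socket
import ssl


def probe_tls(host, port=443, timeout=5.0):
    """Attempt a verified TLS handshake and return (ok, detail)."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
                return True, (
                    f"notBefore={cert['notBefore']} notAfter={cert['notAfter']}"
                )
    except ssl.SSLCertVerificationError as err:
        # Raised for expired, not-yet-valid, untrusted or mismatched certificates.
        return False, f"certificate verification failed: {err.verify_message}"
    except OSError as err:
        # DNS failures, refused connections and timeouts all land here.
        return False, f"connection failed: {err}"
```

Calling probe_tls("gcr.io") is only informative when run from the place that actually fails. In this issue that would be inside a pod in the cluster rather than on the node's shell, since the two can use different DNS and network paths.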

EDIT: What's the best way to debug / see what knative is doing to the images so I could possibly dive into the code and fix it? I'm at a loss here (spent 2 days working on this so far haha)

Whoops. Yup. Same message everywhere:

k8s@k8s-master:~$ curl https://gcr.io/v2/
{"errors":[{"code":"UNAUTHORIZED","message":"Unauthorized access."}]}

Another interesting point is that pulling gcr.io images is otherwise absolutely fine. The ONLY part of the Kubernetes cluster that cannot pull them is... well... the Knative containers.

For example, this is the description of pod "knative-serving/activator", showing that k8s successfully pulls the gcr.io images.

Events:
  Type    Reason     Age   From                 Message
  ----    ------     ----  ----                 -------
  Normal  Scheduled  24m   default-scheduler    Successfully assigned knative-serving/activator-69b8474d6b-dhkdz to k8s-node-1
  Normal  Pulled     24m   kubelet, k8s-node-1  Container image "docker.io/istio/proxy_init:1.0.2" already present on machine
  Normal  Created    24m   kubelet, k8s-node-1  Created container
  Normal  Started    24m   kubelet, k8s-node-1  Started container
  Normal  Pulled     24m   kubelet, k8s-node-1  Container image "gcr.io/knative-releases/github.com/knative/serving/cmd/activator@sha256:8d1696bb0e5fe143b0cb273f169b1f2841f71e48490247b8cad35bc65b2b2d6e" already present on machine
  Normal  Created    24m   kubelet, k8s-node-1  Created container
  Normal  Started    24m   kubelet, k8s-node-1  Started container
  Normal  Pulled     24m   kubelet, k8s-node-1  Container image "docker.io/istio/proxyv2:1.0.2" already present on machine
  Normal  Created    24m   kubelet, k8s-node-1  Created container
  Normal  Started    24m   kubelet, k8s-node-1  Started container

EDIT:

Docker pull also works fine on all nodes:

k8s@k8s-node-1:~$ sudo docker pull gcr.io/knative-samples/helloworld-go
Using default tag: latest
latest: Pulling from knative-samples/helloworld-go
bc9ab73e5b14: Downloading [======================================>            ] 34.86 MB/45.31 MB
193a6306c92a: Download complete
e5c3f8c317dc: Download complete
a587a86c9dcb: Downloading [===============>                                   ] 15.22 MB/50.07 MB
1bc310ac474b: Downloading [=====>                                             ] 5.913 MB/57.59 MB
87ab348d90cc: Waiting
786bc4873ebc: Waiting
bc6a2cf36a2e: Waiting
6a297b1cfcdb: Waiting

/cc @jonjohnsonjr

Jon, does this look like an issue specific to images built with ko?

Quite possibly a bug in this PR loading an invalid cert somehow? https://github.com/knative/serving/pull/2481

@tcnghia What'd be the best way to test that? Deploy 0.2.1 of knative and see what happens?

@tcnghia @jonjohnsonjr

After 3 days of trying to fix this error, I found out what it was...

I'll leave you guys with this wonderful picture:

[image: "it was dns"]

In the end, DNS was set up incorrectly and wasn't returning the correct IP(s) for gcr.io.
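
For anyone who lands here with the same symptom, one way to test the DNS theory is to compare what the resolver returns for gcr.io in the failing environment against a vantage point you trust. The Python sketch below assumes you run resolve() in both places and compare the results by hand. Both helper names are hypothetical, and the sample addresses are from the reserved documentation ranges, not real gcr.io addresses.

```python
import socket


def resolve(host, port=443):
    """Return the sorted set of IP addresses the local resolver gives for host."""
    infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def unexpected_addresses(observed, trusted):
    """Addresses seen in the failing environment that the trusted one did not return."""
    return sorted(set(observed) - set(trusted))


# Illustrative values only: paste in the output of resolve("gcr.io") from each side.
from_pod = ["192.0.2.10", "198.51.100.7"]
from_trusted_host = ["192.0.2.10"]

print(unexpected_addresses(from_pod, from_trusted_host))
```

A difference is a hint rather than proof, since large services can legitimately return different addresses to different clients. An address that belongs to a private range or to an unrelated host is a strong sign that the resolver is misconfigured.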

I have the same issue with Knative 0.4 and Istio 1.0.6. Any resolution? Thanks.

@jeffcai Honestly, try checking your DNS haha
