Knative Version: 0.12.0
/area API
/area autoscale
/area networking
Knative Serving (Rest API Example) should be able to pull from a private Google Container Registry (GCR) when using a GCP Service Account to authorize with GCR on kind.
Even tho normal Pods and even Knative Pods can use the GCR registry, Knative Revisions cannot.
kubectl get ksvc stock-service-example --output yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: stock-service-example
  namespace: default
...
status:
  conditions:
  - lastTransitionTime: "2020-02-19T00:50:26Z"
    message: 'Revision "stock-service-example-5wlbf" failed with message: Unable to
      fetch image "gcr.io/<redacted>/rest-api-go": failed to resolve image
      to digest: failed to fetch image information: UNAUTHORIZED: You don''t have
      the needed permissions to perform this operation, and you may have invalid credentials.
      To authenticate your request, follow the steps in: https://cloud.google.com/container-registry/docs/advanced-authentication.'
    reason: RevisionFailed
    status: "False"
    type: ConfigurationsReady
My assumption is that the Knative Revision is not using the same docker auth config as the kubelet. But I'm not sure if it's a Knative issue, a Kubernetes issue, or a kind issue.
mkdir ~/workspace
cd ~/workspace
git clone git@github.com:kubernetes-sigs/kind.git
cd kind
make build
sudo cp ./bin/kind /usr/local/bin/
mkdir $HOME/.config/kind/
cat > $HOME/.config/kind/cluster.yaml << EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  # mount docker config that can pull from private GCR
  extraMounts:
  - containerPath: /var/lib/kubelet/config.json
    hostPath: $HOME/.config/kind/docker-config.json
EOF
REGISTRY=gcr.io/<redacted>
SA_NAME=kubernetes-kind
PROJECT_ID=<redacted>
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"
# Set project to own the service account
gcloud config set project "${PROJECT_ID}"
# Create GCP service account
gcloud iam service-accounts --format='value(email)' create "${SA_NAME}"
# Grant service account permission to read from GCR
gsutil iam ch "serviceAccount:${SA_EMAIL}:objectViewer" "gs://artifacts.${PROJECT_ID}.appspot.com"
DOCKER_CONFIG=$(mktemp -d)
export DOCKER_CONFIG
# Generate SA Credentials
gcloud iam service-accounts keys create "${DOCKER_CONFIG}/key.json" --iam-account "${SA_EMAIL}"
# Log into GCR
cat "${DOCKER_CONFIG}/key.json" | docker login -u _json_key --password-stdin https://gcr.io
# Copy config to kind bind-mount path
cp "${DOCKER_CONFIG}/config.json" "$HOME/.config/kind/docker-config.json"
rm -r "${DOCKER_CONFIG}"
unset DOCKER_CONFIG
DOCKER_CONFIG=$(mktemp -d)
# Generate Credentials
gcloud iam service-accounts keys create "${DOCKER_CONFIG}/key.json" --iam-account=$SA_EMAIL
# Generate Docker Config
# Newlines are stripped from the base64 output so the JSON string stays on one line
cat > "$HOME/.config/kind/docker-config.json" << EOF
{
  "auths": {
    "gcr.io": {
      "auth": "$( (printf '%s' "_json_key:" && cat "${DOCKER_CONFIG}/key.json") | base64 | tr -d '\n')"
    }
  },
  "HttpHeaders": {
    "User-Agent": "Docker-Client/19.03.5 (linux)"
  }
}
EOF
rm -r "$DOCKER_CONFIG"
unset DOCKER_CONFIG
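The `auth` field in the Docker config above is the base64 encoding of `user:password`, where the user is `_json_key` and the password is the content of the key file. Below is a minimal sketch of that encoding as a reusable function. The name `gcr_auth_value` is hypothetical, and stripping newlines is an assumption that matters on systems where `base64` wraps long output.

```shell
# Build the base64 "auth" value for GCR's _json_key user from a key file.
# gcr_auth_value is a hypothetical helper name, not part of any tool.
gcr_auth_value() {
  key_file="$1"
  # Concatenate "_json_key:" with the key file, encode, and drop line wraps.
  { printf '%s' '_json_key:'; cat "$key_file"; } | base64 | tr -d '\n'
}

# Usage (assumes key.json was created with gcloud as shown earlier):
#   gcr_auth_value "${DOCKER_CONFIG}/key.json"
```

Decoding the value with `base64 --decode` is a quick way to confirm that it starts with `_json_key:` before handing the file to kind.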
kind create cluster --config $HOME/.config/kind/cluster.yaml
cd ~/workspace
curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.4.3 sh -
export PATH="$PATH:$HOME/workspace/istio-1.4.3/bin"
istioctl manifest apply --set profile=demo
kubectl wait pods --all -n istio-system --for=condition=Ready --timeout 2m
kubectl apply --selector knative.dev/crd-install=true \
--filename https://github.com/knative/serving/releases/download/v0.12.0/serving.yaml \
--filename https://github.com/knative/eventing/releases/download/v0.12.0/eventing.yaml \
--filename https://github.com/knative/serving/releases/download/v0.12.0/monitoring.yaml
kubectl apply \
--filename https://github.com/knative/serving/releases/download/v0.12.0/serving.yaml \
--filename https://github.com/knative/eventing/releases/download/v0.12.0/eventing.yaml \
--filename https://github.com/knative/serving/releases/download/v0.12.0/monitoring.yaml
kubectl wait pods --all -n knative-serving --for=condition=Ready --timeout 2m
kubectl wait pods --all -n knative-eventing --for=condition=Ready --timeout 2m
kubectl wait pods --all -n knative-monitoring --for=condition=Ready --timeout 2m
cd ~/workspace
git clone -b "release-0.12" https://github.com/knative/docs knative-docs
cd knative-docs/
gcloud auth login
gcloud auth configure-docker
export REPO=gcr.io/<redacted>
docker build \
--tag "${REPO}/rest-api-go" \
--file docs/serving/samples/rest-api-go/Dockerfile .
docker push "${REPO}/rest-api-go"
envsubst \
< docs/serving/samples/rest-api-go/sample-template.yaml \
> docs/serving/samples/rest-api-go/sample.yaml
kubectl apply --filename docs/serving/samples/rest-api-go/sample.yaml
kubectl describe ksvc stock-service-example
...
Status:
Conditions:
Last Transition Time: 2020-02-19T00:50:26Z
Message: Revision "stock-service-example-5wlbf" failed with message: Unable to fetch image "gcr.io/<redacted>/rest-api-go:latest": failed to resolve image to digest: failed to fetch image information: UNAUTHORIZED: You don't have the needed permissions to perform this operation, and you may have invalid credentials. To authenticate your request, follow the steps in: https://cloud.google.com/container-registry/docs/advanced-authentication.
Reason: RevisionFailed
Status: False
Type: ConfigurationsReady
I'm betting this is related to why Cloud Run on GKE requires Workload Identity...
Knative Revisions probably tries to log into GCR using GCE instance metadata to retrieve the instance service account, and Workload Identity presents custom metadata and service accounts to each workload, while kind does neither and just configures the kubelet directly...
digestResolver.Resolve errors with failed to fetch image information after using remote.WithAuthFromKeychain:
https://github.com/knative/serving/blob/v0.12.0/pkg/reconciler/revision/resolve.go#L115-L118
https://github.com/google/go-containerregistry/blob/master/pkg/v1/remote/options.go#L100
It looks like it gets fed this k8schain.Options:
https://github.com/knative/serving/blob/master/pkg/reconciler/revision/revision.go#L72-L76
And if I'm reading this right... those options only support ImagePullSecrets?
That would suck... Do I need to configure all Knative Revisions with an ImagePullSecret even tho the kubelet can pull from GCR?
Paging @mattmoor, since it sounds like you worked on k8schain and helped resolve auth issues in https://github.com/knative/serving/issues/4435
Also paging @markusthoemmes, since it sounds like you added the clarified error messages in https://github.com/knative/serving/pull/5920
In Deploying images from a private container registry it says to create an imagePullSecret (using docker username & password) and attach this to the default service account in the default namespace. This approach has a number of problems:
_json_key (e.g. for a GCP Service Account).
So... is there a plan to make this work without requiring a service account and imagePullSecret in every namespace? Like maybe some way to make a central controller do it so that the platform operator can configure it instead of every namespace tenant?
FWIW, I found how to make a _json_key secret from: http://docs.heptio.com/content/private-registries/pr-gcr.html
Basically:
TEMP_DIR=$(mktemp -d)
# Generate Credentials
gcloud iam service-accounts keys create "${TEMP_DIR}/key.json" --iam-account=$SA_EMAIL
# Create Secret
kubectl create secret docker-registry gcr-pull \
--namespace default \
--docker-server https://gcr.io \
--docker-username _json_key \
--docker-email "$SA_EMAIL" \
--docker-password "$(cat "${TEMP_DIR}/key.json")"
rm -r $TEMP_DIR
unset TEMP_DIR
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ServiceAccount
metadata:
  name: demo-account
  namespace: default
imagePullSecrets:
- name: gcr-pull
EOF
cat <<EOF | kubectl apply -f -
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: stock-service-example
  namespace: default
spec:
  template:
    spec:
      serviceAccountName: demo-account
      containers:
      - image: gcr.io/<redacted>/rest-api-go:latest
        env:
        - name: RESOURCE
          value: stock
        readinessProbe:
          httpGet:
            path: /
          initialDelaySeconds: 0
          periodSeconds: 3
          timeoutSeconds: 11
          failureThreshold: 3
EOF
_(skimming, so forgive any gaps I may have missed)_
> I'm betting this is related to why Cloud Run on GKE requires Workload Identity...
If this is true, it is news to me (and a change since I left google). The Cloud Run on GKE controller should actually be excluded from the metadata-spoofing that workload identity does, so that it can use the Node's identity (as the kubelet does).
cc @cjcullen @mikedanese
> And if I'm reading this right... those options only support ImagePullSecrets?
The imagePullSecrets are copied into the k8schain.Options, but so too is the serviceAccountName:
If imagePullSecrets are linked to the named service account then those should get pulled in as well (see here)
Right, I got it working by making a docker-registry Secret and putting it on a ServiceAccount as an imagePullSecret. The problem is that every user of Knative Serving has to do the same thing, rather than have it be some platform/controller level configuration.
It's tedious and means at least managing a Secret and ServiceAccount per namespace.
In our case it's even worse, because every workload already has its own ServiceAccount, which it uses to auth to Vault to download/inject secrets. Now, they all also need to have their own imagePullSecret and GCP service accounts in order for them to show up separately in audit logs. Some workloads were already using Vault to generate temporary credentials for GCP Service Accounts on-demand, but that doesn't really make sense when you have to generate permanent GCP SA credentials for the imagePullSecret.
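To illustrate the per-namespace overhead described here, the following is a rough sketch that prints one ServiceAccount manifest per tenant namespace. The function name `pull_sa_manifest` and the namespaces `team-a` and `team-b` are hypothetical; `demo-account` and `gcr-pull` are the names used earlier in this thread. The Secret itself would also have to exist in each namespace.

```shell
# Print a ServiceAccount manifest that links an imagePullSecret.
# pull_sa_manifest is a hypothetical helper; arguments are
# namespace, service account name, and secret name.
pull_sa_manifest() {
  namespace="$1"
  sa_name="$2"
  secret_name="$3"
  cat <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ${sa_name}
  namespace: ${namespace}
imagePullSecrets:
- name: ${secret_name}
EOF
}

# Usage (hypothetical namespaces; requires kubectl and cluster access):
#   for ns in team-a team-b; do
#     pull_sa_manifest "$ns" demo-account gcr-pull | kubectl apply -f -
#   done
```

Every Knative Service in those namespaces would still need `spec.template.spec.serviceAccountName` set to the account, which is the usability cost being discussed.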
@karlkfi I think I missed something. How are the workloads authenticating with GCR to pull images today?
Workloads don't pull images. The kubelet/docker pulls images.
In GKE, the nodes have a GCE service account attached to them that the kubelet finds using the GCE metadata endpoint.
In kind, you mount a docker auth json file into one of many paths on the nodes where the kubelet can find it (e.g. /var/lib/kubelet/config.json), per https://kubernetes.io/docs/concepts/containers/images/#configuring-nodes-to-authenticate-to-a-private-registry
But with knative serving, the revisions need the workload itself to pull image metadata from GCR, which it otherwise never has to do. So you have to either configure a k8s ServiceAccount with an imagePullSecret or the workload needs access to the GCE metadata server so it can pull GCP service account credentials from there (which doesn't exist in a kind cluster, unless you run it on GCE).
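When debugging which credentials a node or a container can actually see, it can help to check the candidate config paths in order. This is a small sketch of such a check. The helper name `first_docker_config` is hypothetical, and the paths in the usage comment are assumed from the Kubernetes documentation linked above with the default kubelet root directory.

```shell
# Print the first existing Docker config file among the paths given as arguments.
# Returns non-zero if none of them exists. Hypothetical helper name.
first_docker_config() {
  for candidate in "$@"; do
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Usage on a node or inside a container (paths are assumptions, see above):
#   first_docker_config /var/lib/kubelet/config.json \
#     "$HOME/.docker/config.json" /.docker/config.json
```

Running the same check inside the kind node and inside the workload's container shows whether both see a credentials file at all.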
Ack. In a previous life I started GCR and pkg/credentialprovider in K8s.
k8schain links in pkg/credentialprovider to mimic the kubelet's auth behavior as closely as we can (that's why we use it). There are a few environment circumstances where the chroot interferes with the fidelity of this, but in most cases we've seen you can get around this by exposing similar information to the workload. Here is one place we document that, but for environments like kind it may make the most sense to add a hostPath volume to the controller to the key you are providing to kind.
Probably fixed by https://github.com/kubernetes/kubernetes/pull/87912 but also please stop importing packages in k8s.io/kubernetes: https://github.com/google/go-containerregistry/issues/496. Those packages are not meant to be depended on.
Pretty sure this fixes that, at the cost of losing the "same code" fidelity: https://github.com/knative/serving/pull/7106
Honestly, K8s needs to fix the digest resolution problem :(
Thanks for pointing that out. Not exactly what I had in mind but I suppose it's one way... I'll ask around and see if there's a faster way to fix this.
I'd say the same thing about imagePullPolicy, Mike! 🤣
> for environments like kind it may make the most sense to add a hostPath volume to the controller to the key you are providing to kind.
@mattmoor What path would I need to bind the key to in the knative controller container?
Given that it's linking the kubelet code, I'd probably give the same path in the container a shot first (then go digging if that doesn't work).
> for environments like kind it may make the most sense to add a hostPath volume to the controller to the key you are providing to kind.
If you're customizing the deployment anyhow it might instead make sense to mount a secret? (not that kind is the bastion of secure systems! ... but you could do this consistently across providers)
TBH, I don't like either option. Bind mounts are a security hole and imagePullSecrets are a usability disaster. I wish the controller handled polling the registry for metadata so only the platform operator needed runtime permissions on the registry.
Er, I mean mount a normal secret containing the json that would otherwise have been bind mounted?
Why not mount the secret on the controller and have the Revision poll the controller as proxy to the registry or have the controller otherwise inject the metadata into the Revision?
Maybe we're talking past each other...
The doc about tag to digest resolution says the CONTROLLER needs access to the container registry at runtime. But I had to grant ALL KNATIVE SERVICES access to the container registry at runtime to get it to work...
That doc also only talks about proxies and certs, not service accounts and image pull secrets, so maybe I'm the one confused.
> Why not mount the secret on the controller ...
Yes, I believe that's what I was trying to ask.
> ... and have the Revision poll the controller as proxy to the registry or have the controller otherwise inject the metadata into the Revision?
This is more Knative than I know about currently... so many fun things to play with one of these days.
_ducks out of the way so the experts can handle it_
> But I had to grant ALL KNATIVE SERVICES access to the container registry at runtime to get it to work...
We resolve tag->digest in the controller, but the kubelet still needs to be able to pull the image by digest. Perhaps it's worth a call to avoid talking past one another?
Deploying images from a private container registry describes how to make an imagePullSecret on the default service account so it can be picked up by pods in the default namespace.
I got something similar working on kind by making a non-default service account with an imagePullSecret and explicitly setting spec.template.spec.serviceAccountName on Knative Services. It gets passed to the Revisions and then they can read metadata.
I also tried attaching an imagePullSecret to the controller service account used by the Knative controller deployment in the knative-serving namespace, but this didn't allow the controller to pull image metadata on behalf of the Revisions.
I'd really like the controller to be able to read image metadata on behalf of revisions in other namespaces so that each tenant in my multi-tenant system doesn't need to make their own GCP service account, secret, K8s service account, and then also set spec.template.spec.serviceAccountName on every Knative Service they make.
Per @mattmoor's suggestion, I tried mounting the config.json into the controller pod, but that doesn't seem to work:
kubectl patch Deployment controller -n knative-serving --type json --patch '[
{
"op": "add",
"path": "/spec/template/spec/volumes",
"value": [
{
"name": "docker-config",
"hostPath": {
"path": "/var/lib/kubelet/config.json",
"type": "File"
}
}
]
},
{
"op": "add",
"path": "/spec/template/spec/containers/0/volumeMounts",
"value": [
{
"name": "docker-config",
"mountPath": "/var/lib/kubelet/config.json"
}
]
}
]'
Still get the following error in the controller when creating a Service without an image digest specified:
{"level":"error","ts":"2020-03-05T23:14:36.543Z","logger":"controller.revision-controller","caller":"controller/controller.go:376","msg":"Reconcile error","commit":"bf0a848","knative.dev/controller":"revision-controller","error":"failed to resolve image to digest: failed to fetch image information: UNAUTHORIZED: You don't have the needed permissions to perform this operation, and you may have invalid credentials. To authenticate your request, follow the steps in: https://cloud.google.com/container-registry/docs/advanced-authentication","stacktrace":"knative.dev/serving/vendor/knative.dev/pkg/controller.(*Impl).handleErr\n\t/home/prow/go/src/knative.dev/serving/vendor/knative.dev/pkg/controller/controller.go:376\nknative.dev/serving/vendor/knative.dev/pkg/controller.(*Impl).processNextWorkItem\n\t/home/prow/go/src/knative.dev/serving/vendor/knative.dev/pkg/controller/controller.go:362\nknative.dev/serving/vendor/knative.dev/pkg/controller.(*Impl).Run.func2\n\t/home/prow/go/src/knative.dev/serving/vendor/knative.dev/pkg/controller/controller.go:310"}
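The error above is reported for a Service whose image is referenced by tag. A reference that already carries a digest should not need the tag-to-digest lookup; that is an assumption based on the "without an image digest specified" remark above, not something verified against the controller source. A tiny sketch for telling the two forms apart, with the hypothetical helper name `image_is_pinned`:

```shell
# Succeed if the image reference is pinned by digest (contains "@sha256:").
# image_is_pinned is a hypothetical helper name.
image_is_pinned() {
  case "$1" in
    *@sha256:*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (my-project is a placeholder for the redacted project name):
#   image_is_pinned "gcr.io/my-project/rest-api-go:latest" \
#     || echo "tag only: the controller has to resolve it against the registry"
```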
@jonjohnsonjr I think the best way to test this would be to add a configuration representative of what Karl is trying to do here: https://github.com/google/go-containerregistry/tree/master/pkg/authn/k8schain/tests WDYT?
/lifecycle frozen
I recently saw a Twitter thread from @kelseyhightower and noticed that @karlkfi pointed out that he got stuck on this issue when he tried Knative. This made me very, very 😢
> I tried to use Knative, but got stuck on having to specify
> spec.template.spec.serviceAccountName on every Knative Service
> with a serviceAccount and imagePullSecret in every namespace just
> to be able to log into a custom image registry.
After doing some reading on the controller source code and k8schain I found a solution for this issue.
I created a tutorial on how to configure the Controller with the registry secret in different ways to be able to resolve the digest of the image provided.
https://github.com/csantanapr/knative-private-images
@mattmoor I would like to review it tomorrow at Office Hours. @karlkfi you're welcome to join to see if this solves your issue.
I would like @karlkfi to give Knative another try.
After discussion, I can take the next steps and document how to provide registry secrets to the controller and close this issue.
Thanks @csantanapr !
Haven't tested it yet, but I'm glad to know there's a way to do it that doesn't require Knative changes. Sounds like it could just be a documentation enhancement that says where to mount the Docker config into the controller.
TBH, I'm not working on the project that inspired this rabbit hole any more, but I am glad to know I can finally use KinD for local Knative Service development now.
Of course, for the record, the other workaround is to have a webhook do the image sha lookup or enforcement for you, which is something we added to K-rail a while back.
And for those who find this later, it sounds like the magic file path is ".docker/config.json" in the "controller" deployment in the "knative-serving" namespace.
Hi @karlkfi, what if we point the image at a private insecure registry, say john.com:31320/knative/hello-world:latest? We often get:
Error: Unable to fetch image "john.com:31320/knative/hello-world:latest ": failed to resolve image to digest: Get "https://john.com:31320/v2/": http: server gave HTTP response to HTTPS client
Does Knative support pulling images from a private insecure registry, and if so, what configuration or flags do we need to pass?
Hi @VIjayHP
Knative only needs access to the image registry to resolve tags to digests.
If you're using HTTPS with a custom private SSL certificate, you can configure the controller to trust that certificate (just the certs, not the CA).
For example, to make the controller trust https://john.com
Another option is to disable tag resolution, and then the controller will not need access to your registry.
You can see the details on how to set up the certs in the docs here: https://knative.dev/docs/serving/tag-resolution/
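As a sketch of the "disable tag resolution" option, the snippet below builds a merge patch for the `config-deployment` ConfigMap in the `knative-serving` namespace. The key name `registriesSkippingTagResolving` is an assumption based on the tag-resolution docs of that era and may be spelled differently in other Knative versions. The helper name `skip_tag_resolution_patch` is hypothetical.

```shell
# Print a JSON merge patch that lists registries to skip during tag resolution.
# The argument is a comma-separated list of registry hosts.
# skip_tag_resolution_patch is a hypothetical helper name.
skip_tag_resolution_patch() {
  printf '{"data":{"registriesSkippingTagResolving":"%s"}}\n' "$1"
}

# Usage (requires kubectl and cluster access; note that this overwrites any
# registries already listed under that key):
#   kubectl patch configmap config-deployment -n knative-serving \
#     --type merge --patch "$(skip_tag_resolution_patch 'john.com:31320')"
```

With resolution skipped for that registry, only the kubelet and container runtime need to reach it, at the cost of Revisions no longer being pinned to a digest.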
Hi @karlkfi
I know you're with Google now, but I didn't want to have the tweet out there and have people get the impression that Knative doesn't support the different use cases of private images.
I will update docs soon, and also link the tag resolution doc to private registries.
Thank you for looking into this and confirming that the solution I provided works for your original use case
@VIjayHP one more clarification: Knative doesn't pull the whole image, it just fetches the manifest for the tag to retrieve the digest sha.
The actual image pull is done via a pod and Kubernetes (kubelet & container runtime)
> doesn't pull the whole image it just fetch the manifest for the tag to retrieve the digest sha
In fact, it doesn't _really_ even fetch the manifest anymore. It just sends a HEAD request and gets the digest from headers.
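The HEAD-based lookup described here can be approximated from the command line. The sketch below assumes the Docker Registry HTTP API v2 layout (`/v2/<name>/manifests/<reference>`) and the `Docker-Content-Digest` response header. The helper names `manifest_url` and `digest_from_headers` and the project `my-project` are hypothetical, and this is not the controller's actual code path.

```shell
# Print the manifest URL for a registry host, repository, and tag.
manifest_url() {
  printf 'https://%s/v2/%s/manifests/%s\n' "$1" "$2" "$3"
}

# Read HTTP response headers on stdin and print the Docker-Content-Digest value.
# Header names are matched case-insensitively; carriage returns are removed.
digest_from_headers() {
  tr -d '\r' | awk 'tolower($1) == "docker-content-digest:" { print $2; exit }'
}

# Usage (needs network access, and credentials for a private repository):
#   curl -sS -I \
#     -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
#     "$(manifest_url gcr.io my-project/rest-api-go latest)" | digest_from_headers
```

An UNAUTHORIZED response to this request would point at the same credentials problem reported at the top of the thread.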
@VIjayHP in which way do the tests fail? I feel like your issues are different from the ones at the top of this issue.
> @VIjayHP in which way do the tests fail? I feel like your issues are different from the ones at the top of this issue.
Yes, right! This issue is already resolved thanks to your input on tag resolution. Thanks.