Hello OpenShift folks!
I have a private registry on docker.io (or hub.docker.com, aka Docker Hub) and I'm running into a particularly funky issue. OpenShift is able to import the image when creating an image stream (and to see the image metadata in the UI when going to Add to Project -> Image Name -> 🔍), but then fails spectacularly when the newly created pod tries to pull the same image from the registry.
Yes, I did make sure the image pull secret is made available to the pod by editing the deployment config and specifying it to be used (see commands in repro steps below), and verified that it is being set with kubectl get pod/my-image-failing-pod -o yaml | grep -i imagePullSecrets: -A1.
Funny thing, this exact setup was working two weeks ago when I created a deployment the same way.
$ oc version
oc v3.9.1
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://api.starter-us-west-1.openshift.com:443
openshift v3.9.1
kubernetes v1.9.1+a0ce1bc657
docker login into docker.io
oc secrets new-dockercfg sosecret --docker-server=docker.io --docker-username=user --docker-password=password --docker-email=email
oc create secret docker-registry sosecret --docker-server=docker.io --docker-username=user --docker-password=password --docker-email=email
oc secrets link default sosecret --for=pull
oc import-image image:tag --from=docker.io/user/private-reg:tag --confirm
oc new-app --name=great-expectations --image-stream=image:tag
oc patch dc/great-expectations -p '{"spec": {"template": {"spec": {"imagePullSecrets": [{"name": "sosecret"}]}}}}'
oc get po -o yaml | grep -A1 imagePullSecrets:
oc describe po
The pod never comes to life, and instead emits a grab-bag of events like:
Error: ImagePullBackOff
Back-off pulling image "docker.io/user/private-reg@sha256:hash"
Failed to pull image "docker.io/user/private-reg@sha256:hash": rpc error: code = Unknown desc = unauthorized: authentication required
What I expected: the pod is able to pull the image and start doing those fun things pods do when they come to life.
@openshift/sig-developer-experience
@php-coder this is pods/deployments pulling from dockerhub, so @openshift/sig-master or @openshift/sig-pod
@mxxk one possibility is some issues w/ the secret creation command generating invalid docker secret formats. Have you tried creating a dockerconfig secret by hand (supplying your .docker/config.json as a key named .dockerconfigjson in a secret)? You can see an example of how to create the secret in this fashion here: https://docs.openshift.org/latest/dev_guide/builds/build_inputs.html#using-docker-credentials-for-private-registries
oc create secret generic dockerhub \
--from-file=.dockerconfigjson=<path/to/.docker/config.json> \
--type=kubernetes.io/dockerconfigjson
Thanks @php-coder, @bparees. I definitely tried both docker secret formats, and at one point embedded more than one entry (i.e. multiple hostnames) in both .dockerconfigjson and .dockercfg formats to see if it would help.
Ultimately I determined the problem to not be with docker secret formats, since I was able to pull images from a private registry hosted on gitlab.com. It's docker.io private registries which were having issues. It would be good to repeat the test with a third private docker registry host (I'm thinking codefresh.io), to have another data point.
Okay, for that third data point, I just confirmed that OpenShift is able to pull from a private registry on codefresh.io. I did not test multiple types of secrets, and instead used the newer .dockerconfigjson format (rather than .dockercfg).
Here's a little more information about how I found out I needed two secrets in order to get the gitlab.com repo to work... Since I pushed my image to registry.gitlab.com, I used curl to inspect what the authentication with the registry might look like:
curl -i https://registry.gitlab.com/v2/
HTTP/1.1 401 Unauthorized
Content-Type: application/json; charset=utf-8
Docker-Distribution-Api-Version: registry/2.0
Www-Authenticate: Bearer realm="https://gitlab.com/jwt/auth",service="container_registry"
X-Content-Type-Options: nosniff
Date: Tue, 13 Mar 2018 17:58:23 GMT
Content-Length: 87
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}
What caught my attention is the realm="https://gitlab.com/jwt/auth" bit and how it uses the gitlab.com domain instead of registry.gitlab.com. So I created a new image pull secret with the same credentials as registry.gitlab.com, except for gitlab.com, and added both image pull secrets to my deployment config. (What will also work is to manually shove multiple secrets in whatever JSON format you end up using, .dockercfg or .dockerconfigjson, into the same image pull secret and add that one secret to the deployment config.)
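For illustration, the "shove multiple secrets into one" variant would be a single .dockerconfigjson whose auths map lists both hostnames (the auth values below are placeholders for the base64-encoded user:password string):

```json
{
  "auths": {
    "registry.gitlab.com": {
      "auth": "BASE64_USER_COLON_PASSWORD"
    },
    "gitlab.com": {
      "auth": "BASE64_USER_COLON_PASSWORD"
    }
  }
}
```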
docker.io is confusing because their actual domain name for registry pulls is index.docker.io (can anyone confirm?), and their curl output looks like this:
curl -i https://index.docker.io/v2/
HTTP/1.1 401 Unauthorized
Content-Type: application/json; charset=utf-8
Docker-Distribution-Api-Version: registry/2.0
Www-Authenticate: Bearer realm="https://auth.docker.io/token",service="registry.docker.io"
Date: Tue, 13 Mar 2018 18:07:22 GMT
Content-Length: 87
Strict-Transport-Security: max-age=31536000
{"errors":[{"code":"UNAUTHORIZED","message":"authentication required","detail":null}]}
Looks like they use auth.docker.io for their auth realm. In spite of all this, adding secrets to OpenShift for three domain names simultaneously (docker.io, index.docker.io, auth.docker.io), did not make for a successful deployment 😞
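For comparison, when docker login succeeds against Docker Hub it stores the credential under the legacy v1 endpoint key, which is the one entry I'd expect a hand-built secret to need (auth value is a placeholder):

```json
{
  "auths": {
    "https://index.docker.io/v1/": {
      "auth": "BASE64_USER_COLON_PASSWORD"
    }
  }
}
```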
I can confirm (for OpenShift Online) that pods cannot pull private images from docker.io. I tried just about every instruction I could find on the internet, but none of them worked 😞.
Interestingly, at some point it did work (around the 6th of February), but a few days later it stopped working again. It also works on a dedicated instance that I have access to, which runs version v3.6.173.0.49, although it didn't work there a few days ago either (maybe because a different version was deployed then; I didn't check).
@sjenning @mfojtik @derekwaynecarr @eparis who can look into this reported issue that openshift online pods can't pull images from private docker.io registries?
I struggle with the same, very new to OpenShift.
I cannot create the credentials with oc; I have to create them with the UI after docker login.
Using Authentication Type = Image Registry Credentials, it also doesn't work from the UI.
(Using OpenShift Origin v3.7.0+7ed6862)
I also tried to use the dockercfg with oc secrets new dockerhub-secret .dockercfg=$HOME/.docker/config.json
-> it creates a secret of type kubernetes.io/dockercfg
OK, thanks @bparees, that command works for me:
$ oc create secret generic dockerhub-secret \
--from-file=.dockerconfigjson=$HOME/.docker/config.json \
--type=kubernetes.io/dockerconfigjson
$ oc secrets link default dockerhub-secret --for=pull
$ oc import-image myimage --from=docker.io/myorg/myimage --confirm
But as a Docker4Mac user I had to disable the General tab preference "Securely store docker logins in macOS keychain" and re-run docker login to create a config.json with auth entries.
@StefanScherer ok, so this sounds like the known issues w/ oc create secret docker-registry not creating an appropriately formatted docker secret. @juanvallejo can point you to where this was resolved in upstream kubernetes.
@mxxk can you confirm the same workaround works for you? it's not clear to me from your comment in https://github.com/openshift/origin/issues/18932#issuecomment-372741904 if you actually tried to create the secret from a working .docker/config.json file on disk?
@StefanScherer – there are two places where the docker image pull secret gets used:
1. oc import-image (or in the UI: Add to Project -> Image Name -> 🔍)
2. oc new-app per my original description (or in the UI: Add to Project -> Image Name -> 🔍 -> Deploy)
In this issue I described how (1) works fine but (2) fails. In your two comments it sounds like you had an issue with (1) regardless of using oc or the UI, then you fixed it with the workaround of using oc create secret generic. However, I don't see any indication that (2) works for you, and I suspect you'll run into a problem there. Can you verify whether you can successfully deploy the private docker.io image and have the pod created?
@bparees – Though I believe @StefanScherer's observation is not related to this issue (see above), I tried the workaround for posterity and can confirm that the deployment still fails at pod creation, step (2). And yes, here are all of the permutations of image pull secret creation I've tried:
- oc secret create ... (.dockercfg format)
- oc create secret docker-registry ... (.dockerconfigjson format)
- oc create secret generic --type=kubernetes.io/dockerconfigjson ... (.dockerconfigjson format)
- oc create secret generic --type=kubernetes.io/dockercfg ... (.dockercfg format)
```
{
  "auths": {
    "https://index.docker.io/v1/": {
      "auth": REDACTED
    },
    "auth.docker.io": {
      "auth": SAME AS ABOVE
    },
    "registry.docker.io": {
      "auth": SAME AS ABOVE
    },
    "docker.io": {
      "auth": SAME AS ABOVE
    }
  }
}
```
All of the above permutations produce valid, correctly-formatted docker image pull secrets, and I can verify this because step (1) image stream creation succeeds and imports image metadata. But all fail at step (2) 😢
Thanks @mxxk
Well, I also created a template with oc create -f template.yml and oc new-app --template=myapptemplate, and it pulled the private images. I only have a problem with one image that doesn't pull the specified tag, but maybe there's an issue within the template itself. I had to manually pull that tag and then the deployment starts. But it's my first day with OpenShift and I'll have to explore and learn a lot.
@bparees
@juanvallejo can point you to where this was resolved in upstream kubernetes
Sure, fix for this was merged in https://github.com/kubernetes/kubernetes/pull/57463 upstream
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale
I'm having the same issue, where a pod is getting the error not found: does not exist or no pull access when pulling the image.
Has anyone been able to resolve this?
@PatKayongo This is an example that works for me in OpenShift 3.9.
```yaml
- kind: ImageStream
  apiVersion: v1
  metadata:
    name: yourimage
    labels:
      app: yourimage
      tier: application
  spec:
    dockerImageRepository: docker.io/yourorg/yourimage
    tags:
      - annotations: null
        from:
          kind: DockerImage
          name: docker.io/yourorg/yourimage:yourtag
        generation: 2
        importPolicy: {}
        name: yourtag
        referencePolicy:
          type: Source
- kind: DeploymentConfig
  apiVersion: v1
  metadata:
    name: yourservice
    generation: 1
    labels:
      app: yourservice
      tier: application
  spec:
    replicas: 1
    selector:
      app: yourservice
      deploymentconfig: yourservice
      tier: application
    strategy:
      activeDeadlineSeconds: 21600
      resources: {}
      rollingParams:
        intervalSeconds: 1
        maxSurge: 25%
        maxUnavailable: 25%
        timeoutSeconds: 600
        updatePeriodSeconds: 1
      type: Rolling
    template:
      metadata:
        labels:
          app: yourservice
          deploymentconfig: yourservice
          tier: application
      spec:
        containers:
          - name: yourservice
            image: yourimage
            imagePullPolicy: Always
            env:
              - name: SOME_ENVS
                value: somevalue
            ports:
              - containerPort: 3000
                protocol: TCP
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            volumeMounts:
              - name: data
                mountPath: /data
        dnsPolicy: ClusterFirst
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext: {}
        terminationGracePeriodSeconds: 30
        volumes:
          - name: data
            emptyDir: {}
    test: false
    triggers:
      - imageChangeParams:
          automatic: true
          containerNames:
            - yourservice
          from:
            kind: ImageStreamTag
            name: yourimage:yourtag
        type: ImageChange
      - type: ConfigChange
  status: {}
```
I still seem to get the same error:
Failed to pull image "myorg/myimage": rpc error: code = Unknown desc = repository docker.io/myorg/myimage not found: does not exist or no pull access
This works fine on an OpenShift Origin cluster, but it seems to be failing on an OpenShift Container Platform cluster. I've created and recreated the secrets to no avail.
@PatKayongo what command are you using to create your docker-registry secret (oc create secret ..., oc secret create ...)?
Also, what version of oc are you using?
This is a known issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1561989
We've made a few attempts to fix it but have always introduced regressions elsewhere.
tl;dr the cause is an interaction between the dockershim in the kubelet stripping out the docker.io domain from the image path before making the call to docker and our docker (projectatomic/docker) supporting registry search lists.
I could not pull from a private Docker Hub repo either:
docker version: Version 17.09.1-ce-mac42 (21090)
OpenShift version: v3.9.33
I tried turning off 'Securely store docker logins in macOS keychain' in Docker for Mac preferences,
then run
oc create secret generic secjson --from-file=.dockerconfigjson=/Users/myuser/.docker/config.json --type=kubernetes.io/dockerconfigjson
oc secrets link default secjson --for=pull
oc new-app myrepo/training:tag
oc describe po
and got:
Failed to pull image "myrepo/training@sha256:a99aa5bc64...": rpc error: code = Unknown desc = unauthorized: authentication required
oc describe sa default
Name: default
Namespace: michal-test
Labels: <none>
Annotations: <none>
Image pull secrets: default-dockercfg-q9cw5
secjson
Mountable secrets: default-token-f2b7q
default-dockercfg-q9cw5
secjson
Tokens: default-token-5bs9b
default-token-f2b7q
Events: <none>
Any advice?
@michalrabinowitch
Are you using oc secrets link default secjson --for=pull, or is this option the default?
@seal-ss sorry I've used it too but didn't copy it, still does not work :/
@michalrabinowitch ok, my only idea is that you check ~/.docker/config.json again to see if it looks like
{
  "auths": {
    "https://index.docker.io/v1/": {
      "auth": "here-is-your-base64-encoded-user:pass-string"
    }
  }
}
Maybe you have to run docker login again to create the JSON file with the new settings.
Or just create the JSON file manually and generate the auth value with base64 — taking care not to encode a trailing newline, since that corrupts the token.
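As a quick sketch of that manual route (the credentials here are made up):

```shell
# Encode hypothetical credentials; printf '%s' avoids the trailing
# newline that `echo` or an interactive `base64` session would add.
auth=$(printf '%s' 'myuser:mypassword' | base64)
echo "$auth"   # → bXl1c2VyOm15cGFzc3dvcmQ=

# Write a minimal Docker Hub config.json using the encoded token.
cat > config.json <<EOF
{
  "auths": {
    "https://index.docker.io/v1/": {
      "auth": "$auth"
    }
  }
}
EOF
```

The resulting file can then be fed to oc create secret generic --from-file=.dockerconfigjson=config.json as shown earlier in the thread.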
@seal-ss I've decoded 'auth' from my $HOME/.docker/config.json
it returns username:password% with correct values and also the auths entry you've listed:
"auths" : {
"https://index.docker.io/v1/" : {
@michalrabinowitch So everything looks good. Now I've run out of ideas ¯\_(ツ)_/¯ :-)
I am still seeing this same problem on
$ oc version
oc v3.10.0+dd10d17
kubernetes v1.10.0+b81c8f8
features: Basic-Auth
Server https://ec2-18-144-36-71.us-west-1.compute.amazonaws.com:8443
openshift v3.9.31
kubernetes v1.9.1+a0ce1bc657
Does anyone have any workaround?
Here is the workaround from Red Hat support. The following steps worked by adding docker.io (which hosts my private registry) to the docker registry configuration.
Add the following line to /etc/sysconfig/docker on the OpenShift master and worker nodes.
```
ADD_REGISTRY='--add-registry docker.io --add-registry registry.access.redhat.com'
```
Restart docker service.
# systemctl restart docker
Still seeing this issue in 3.11; I tried every possible option.
This did not help:
ADD_REGISTRY='--add-registry docker.io --add-registry registry.access.redhat.com'
chakwifi:gitlab-oc-cli cjonagam$ oc import-image wel --from=registry.gitlab.com/debianmaster/nodejs-welcome --all --confirm
imagestream.image.openshift.io/wel imported with errors
Name: wel
Namespace: nodejs-welcome
Created: 29 minutes ago
Labels: <none>
Annotations: openshift.io/image.dockerRepositoryCheck=2018-11-07T11:10:05Z
Docker Pull Spec: docker-registry.default.svc:5000/nodejs-welcome/wel
Image Lookup: local=false
Unique Images: 0
Tags: 1
latest
tagged from registry.gitlab.com/debianmaster/nodejs-welcome:master
! error: Import failed (InternalError): Internal error occurred: Get https://registry.gitlab.com/v2/debianmaster/nodejs-welcome/manifests/master: denied: access forbidden
29 minutes ago
Found the fix.
Based on this https://stackoverflow.com/questions/47993222/can-not-pull-image-from-gitlab-private-registryopenshift
```
So GitLab does the authentication in two steps: first gitlab.com, then registry.gitlab.com. The error we got came from the first step, which was being dropped.
Just duplicate what you've done for registry.gitlab.com, but for gitlab.com
```
I created another secret referring to gitlab.com, similar to the registry.gitlab.com one, and that fixed the issue.
I don't know if this will help, but make sure your password doesn't have slashes or asterisks (*). You should still be able to use a strong password without them. I had a similar issue with k8s refusing to pull images with 401 and 429 errors, and when I eliminated those characters from my password for my private intranet registry it worked. It should work with Docker Hub as well.
I struggled with this issue for a day and verified that the solution suggested by @StefanScherer works:
Use a secret of type __generic__ instead of __docker-registry__.
Step 1. Make sure that the Docker engines on all the nodes trust the external registry.
Add an entry for your external registry to the /etc/docker/certs.d folder on all nodes. More on this at https://docs.docker.com/engine/security/certificates/
Step 2. Add the credentials to your OpenShift project so that it can pull images from the external registry.
e.g.
oc login ..
oc project <your project>
export reg_username=xxxuser
export reg_password=xxxpass
export authtoken=$(echo -n ${reg_username}:${reg_password} | base64 )
cat >./config.json <<EOL
{
  "auths": {
    "<docker_registry_hostname e.g. nexusrepo.xyz.com>:<docker_registry_port e.g 5000>": {
      "auth": "${authtoken}"
    }
  }
}
EOL
oc create secret generic external-registry \
--from-file=.dockerconfigjson=./config.json \
--type=kubernetes.io/dockerconfigjson
oc secrets link default external-registry --for=pull
Now you should be able to deploy an image from your external registry.
oc new-app <registry_host>:<registry_port>/<image_path> --name <application name>
@ajarv it's not working for me.
If you use a custom service account in your deployment config (in my case I used spserviceuser), that service account has to have access to pull from your private registry:
oc secrets link spserviceuser dockerauth --for=pull
If you don't specify any service account, then link the secret to the default service account:
oc secrets link default dockerauth --for=pull
In my case oc create secret docker-registry worked fine.
```yaml
apiVersion: v1
kind: DeploymentConfig
*********
spec:
  replicas: 1
  selector:
    app: splessons-abc
  strategy:
    type: Rolling
  template:
    metadata:
      labels:
        app: splessons-abc
    spec:
      containers:
        - name: splessons-abc
      serviceAccountName: spserviceuser
```
This problem occurs only when you don't use the default service account in the deployment config.
Thanks,
Sree
Still not working at version 3.11.
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
/remove-lifecycle stale
Are there any updates on this for OpenShift 3.11? I have tried all workarounds mentioned here and couldn't pull an image from a private repository on Docker Hub. Pulling the image using docker pull with the same credentials works.
We have moved away from Docker Hub to use Quay instead. Works like a charm on both 3.9 and 3.11.
Are there any updates on this for OpenShift 3.11? I have tried all workarounds mentioned here and couldn't pull an image from a private repository on Docker Hub. Pulling the image using
docker pull with the same credentials works.
I'm also lost with pulling images from private docker repos.
We are using Openshift Online on version 3.11.43
Tried every suggested secret creation, but still no luck:
Error: ErrImagePull
Failed to pull image "docker.io/<org>/<repo>@sha256:<...>": rpc error: code = Unknown desc = repository docker.io/<org>/<repo> not found: does not exist or no pull access
I also spent hours trying to get an image from private registries (github then gitlab), until I found this thread.
I just added a new secret for gitlab.com (I already had one for registry.gitlab.com) and now the image gets pulled and deployed.
This is quite frustrating but now I am able to try out openshift a bit further!
Thanks.
For anyone else that winds up here due to image pull issues when running a build that is based on an ImageStream tied to a private repo -- I struggled getting a builder pod to be able to pull from a private repo because I had only linked the secret to builder with --for=pull. I could import the image to an ImageStream just fine but the build would fail with the error:
error: build error: failed to pull image: unauthorized: access to the requested resource is not authorized
Once I linked with --for=mount the pull worked within the build pod:
oc secrets link builder my-docker-secret --for=pull,mount
Incidentally, the default value for --for if it is not specified is mount ... by specifying --for=pull I caused myself some trouble :)
We also encountered problems when trying to pull a private builder image from Dockerhub, which is why I'm here.
Our setup works fine on a 3.7 cluster, but fails on 3.10 and 3.11.
We did find a solution; however, I do have questions about why our setup actually works.
First of all, our OpenShift Version:
openshift v3.11.146
kubernetes v1.11.0+d4cacc0
We're setting up the following resources (which works perfectly on 3.7):
oc create secret docker-registry dockerhub-secret \
--docker-server=docker.io \
--docker-username=${DOCKER_USERNAME} \
--docker-password=${DOCKER_PASSWORD} \
--docker-email=unused
oc secrets link builder dockerhub-secret
oc import-image ${IMAGE_STREAM} \
--from=docker.io/our-org/our-image \
--confirm \
--scheduled=true \
--all
Unfortunately, on 3.10 and 3.11 every build fails:
pulling image error : repository docker.io/our-org/our-image not found: does not exist or no pull access
error: build error: unable to get docker.io/our-org/our-image@sha256:...
Possible solutions mentioned above did not work for us.
Lots of trial and error later we created our image stream with --reference-policy=local, which resolved the error and our build runs just fine.
oc import-image ${IMAGE_STREAM} \
--from=docker.io/our-org/our-image \
--confirm \
--scheduled=true \
--all \
--reference-policy=local
Problem is, we do not really understand why the change from reference policy 'source' to 'local' makes it possible to pull the image from Dockerhub.
Maybe someone with a deeper understanding of OpenShift can shed some light on this behaviour?
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
@mxxk Is this still an issue? I don't see this as a podman problem.
@rhatdan I haven't checked back on this issue for a very long time, since I stopped using OpenShift shortly after reporting it. Other folks on this thread have a more up-to-date handle on whether it has been fixed or not.
/remove-lifecycle stale
The workaround that @s1hofmann mentioned seems to work. Removing lifecycle/stale as the issue still persists in Azure Red Hat OpenShift v3.11.
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
@openshift-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.