Origin: Missing documentation on how to redeploy broken registry.

Created on 23 Aug 2016 · 15 comments · Source: openshift/origin

Missing documentation on how to redeploy a broken registry. An OpenShift install shows the following output:

oc status
svc/docker-registry - 172.30.61.89:5000
dc/docker-registry deploys registry.access.redhat.com/openshift3/ose-docker-registry:v1.2.1
deployment #1 failed 2 hours ago

But the registry says it's working

openshift admin registry
Docker registry "docker-registry" service exists

oc get svc
NAME              CLUSTER-IP      EXTERNAL-IP   PORT(S)                   AGE
docker-registry   172.30.61.89                  5000/TCP                  2h
kubernetes        172.30.0.1                    443/TCP,53/UDP,53/TCP     4d
router            172.30.48.169                 80/TCP,443/TCP,1936/TCP   1h

There is no documentation about how to redeploy or rebuild a broken registry. This is causing new container builds to fail.

Version

oc v1.2.1
kubernetes v1.2.0-36-g4a3f9c5

Additional Information

[Note] Running diagnostic: ClusterRegistry
Description: Check that there is a working Docker registry

ERROR: [DClu1006 from diagnostic ClusterRegistry@openshift/origin/pkg/diagnostics/cluster/registry.go:203]
The "docker-registry" service exists but has no associated pods, so it
is not available. Builds and deployments that use the registry will fail.

ERROR: [DClu1001 from diagnostic ClusterRegistry@openshift/origin/pkg/diagnostics/cluster/registry.go:173]
The "docker-registry" service exists but no pods currently running, so it
is not available. Builds and deployments that use the registry will fail.

area/documentation component/imageregistry lifecycle/frozen priority/P2

Most helpful comment

What works for me is the following
oc rollout latest docker-registry

All 15 comments

Yes, I am facing the same issue as well. Can someone please help?

FWIW I had the same or a similar issue... I don't recall what led to it, but here is how I solved it.

I deleted everything related to the docker-registry deployment: the service, the router, and all of the failed deployments. Then I ran

oc deploy docker-registry --latest -n default

That errored because of a few existing objects, primarily the service account, but the registry itself came up.
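The delete-and-redeploy steps described above can be sketched roughly as follows. This is a sketch only: it assumes cluster-admin access, the `default` namespace, and the oc v1.x-era CLI used in this thread; the exact set of leftover objects may differ on your cluster.

```shell
# Delete the broken registry objects; ignore "not found" errors for
# anything that was already removed.
oc delete svc/docker-registry dc/docker-registry -n default

# Re-create the registry (v1.x-era command, as used earlier in this
# report; newer releases use `oc adm registry`).
openshift admin registry

# Kick off a deployment and confirm a registry pod comes up.
oc deploy docker-registry --latest -n default
oc get pods -n default
```

Note the ordering: the service is re-created before the deployment config is rolled out, which matters for the environment-variable issue described in the next comment.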

A note to myself for documenting this.

The deployment config needs to be re-created after the service. Otherwise, the registry pod will lack the environment variables

${DOCKER_REGISTRY_SERVICE_HOST}
${DOCKER_REGISTRY_SERVICE_PORT}

To test it:

$ oc rsh dc/docker-registry bash -c 'echo ${DOCKER_REGISTRY_SERVICE_HOST}:${DOCKER_REGISTRY_SERVICE_PORT}'
172.30.30.30:5000
# If a pod is started before the service exists, it will look like this
$ oc rsh dc/docker-registry bash -c 'echo ${DOCKER_REGISTRY_SERVICE_HOST}:${DOCKER_REGISTRY_SERVICE_PORT}'
:

If these are undefined, DOCKER_REGISTRY_URL will be empty, causing the following problems:

Pushing image 172.30.91.135:5000/haowang/ruby-ex:latest ...
Pushed 4/5 layers, 82% complete
Pushed 5/5 layers, 100% complete
Registry server Address:
Registry server User Name: serviceaccount
Registry server Email: [email protected]
Registry server Password: <<non-empty>>
error: build error: Failed to push image: received unexpected HTTP status: 500 Internal Server Error

$ oc logs -f dc/docker-registry
...
time="2017-02-14T08:57:23.804381606Z" level=error msg="error creating ImageStreamMapping: ImageStreamMapping \"ruby-ex\" is invalid: image.dockerImageReference: Invalid value: \":/zhouy/ruby-ex@sha256:79884cc0d892dd8096d3f7ca9b2484045c5210ef0e488755ce4b635f231f809a\": invalid reference format" go.version=go1.7.4 http.request.contenttype="application/vnd.docker.distribution.manifest.v1+prettyjws" http.request.host="172.30.91.135:5000" http.request.id=d49a6588-c7b4-4426-bf17-8933dbef9780 http.request.method=PUT http.request.remoteaddr="10.129.0.1:51862" http.request.uri="/v2/zhouy/ruby-ex/manifests/latest"
time="2017-02-14T08:57:23.804494035Z" level=error msg="response completed with error" err.code=unknown err.detail="ImageStreamMapping \"ruby-ex\" is invalid: image.dockerImageReference: Invalid value: \":/zhouy/ruby-ex@sha256:79884cc0d892dd8096d3f7ca9b2484045c5210ef0e488755ce4b635f231f809a\": invalid reference format" err.message="unknown error" go.version=go1.7.4 http.request.contenttype="application/vnd.docker.distribution.manifest.v1+prettyjws" http.request.host="172.30.91.135:5000" http.request.id=d49a6588-c7b4-4426-bf17-8933dbef9780 http.request.method=PUT http.request.remoteaddr="10.129.0.1:51862" http.request.uri="/v2/zhouy/ruby-ex/manifests/latest"
...

Update: since 3.9, the following will be printed if the variables aren't set:

level=fatal msg="error parsing configuration file: configuration error in openshift.server.addr: REGISTRY_OPENSHIFT_SERVER_ADDR variable must be set when running outside of Kubernetes cluster"

I got it resolved by modifying the line below in the Ansible inventory and re-running the OpenShift (OSO) install.

[nodes]
master.example.com openshift_node_labels="{'region':'infra','zone':'default'}" openshift_schedulable=true

openshift_schedulable=true is the important parameter: if it is set to "false", the registry pod cannot be scheduled in the case where you have a single node in the infra region.

Once done, it took a couple of minutes for my OSO registry to start working.

Regards, Sandeep
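If scheduling is the suspect, the node state can be inspected before touching the inventory. A sketch, assuming cluster-admin access and the registry in the `default` project:

```shell
# Show each node's labels and status; a node marked SchedulingDisabled
# cannot host the registry pod.
oc get nodes --show-labels

# Check the registry deployment config's node selector and recent events
# for scheduling failures.
oc describe dc/docker-registry -n default
```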

What works for me is the following
oc rollout latest docker-registry
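Assuming the registry lives in the `default` project, the rollout can then be verified (the `oc rollout` subcommands are available from oc 1.3 onward):

```shell
# Trigger a fresh deployment of the registry.
oc rollout latest dc/docker-registry -n default

# Wait for the rollout to finish, then confirm a registry pod is running.
oc rollout status dc/docker-registry -n default
oc get pods -n default
```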

@walidshaari thanks, you helped me out, your command worked for me 👍

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@walidshaari THX, worked for me

/remove-lifecycle stale

The doc work I started needs a rewrite. But the registry operator will change things in a way that makes all current guidance obsolete. This could be resolved with documentation for the operator.


Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale


/lifecycle frozen
