Openshift-ansible: Normalize image registry, repository, prefix and deprecate oreg_url?

Created on 12 Jun 2017 · 17 Comments · Source: openshift/openshift-ansible

We have a very confusing system to define patterns for determining which images are used.

The oreg_url inventory variable sets the openshift.node.registry_url and openshift.master.registry_url facts. If oreg_url is not set, those two default to openshift/origin-${component}:${version} or openshift3/ose-${component}:${version}, depending on openshift_deployment_type.

Various roles compute registry_host: "{{ registry_url.split('/')[0] if '.' in registry_url.split('/')[0] else '' }}" and that's used to replace 'registry.access.redhat.com' in image streams and templates if oreg_url includes a hostname.
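For illustration, here is a Python equivalent of that Jinja expression. It is only a sketch of the heuristic, not code from the roles, and registry.example.com is a made-up hostname. One consequence worth noticing is that a registry given as localhost:5000 has no dot in its first segment, so it is not treated as a host.

```python
def registry_host(registry_url):
    """Return the first path segment if it looks like a hostname, else ''.

    Mirrors the Jinja expression: a segment counts as a hostname only
    when it contains a dot.
    """
    first = registry_url.split('/')[0]
    return first if '.' in first else ''


# Hypothetical private registry: the host part is detected.
with_host = registry_host('registry.example.com/openshift3/ose-${component}:${version}')

# Default-style pattern without a host: nothing to substitute.
without_host = registry_host('openshift3/ose-${component}:${version}')

# No dot in 'localhost:5000', so the heuristic reports no host.
local_host = registry_host('localhost:5000/openshift3/ose-${component}:${version}')
```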

The inventory variables osm_image, osn_image, osn_ovs_image, and osm_etcd_image set openshift.master.master_image, openshift.node.node_image, openshift.node.ovs_image, and openshift.master.etcd_image, respectively, if defined; otherwise the following defaults are applied.

    if deployment_type in ['enterprise', 'openshift-enterprise']:
        master_image = 'openshift3/ose'
        cli_image = master_image
        node_image = 'openshift3/node'
        ovs_image = 'openshift3/openvswitch'
        etcd_image = 'registry.access.redhat.com/rhel7/etcd'
        pod_image = 'openshift3/ose-pod'
        router_image = 'openshift3/ose-haproxy-router'
        registry_image = 'openshift3/ose-docker-registry'
        deployer_image = 'openshift3/ose-deployer'
    else:
        master_image = 'openshift/origin'
        cli_image = master_image
        node_image = 'openshift/node'
        ovs_image = 'openshift/openvswitch'
        etcd_image = 'registry.access.redhat.com/rhel7/etcd'
        pod_image = 'openshift/origin-pod'
        router_image = 'openshift/origin-haproxy-router'
        registry_image = 'openshift/origin-docker-registry'
        deployer_image = 'openshift/origin-deployer'

Finally, we have system_images_registry, which is used as a prefix to the above images, but only when using system containers.

Should we normalize everything to
"{{ openshift_image_registry + "/" if openshift_image_registry | default('') != "" else "" }}{{ openshift_image_repository }}/{{ openshift_image_prefix }}{{ openshift_image_format }}"

Enterprise
openshift_image_registry = 'registry.access.redhat.com'
openshift_image_repository = 'openshift3'
openshift_image_prefix = 'ose'
openshift_image_format = '${component}:${version}'

Origin
openshift_image_repository = 'openshift'
openshift_image_prefix = 'origin'
openshift_image_format = '${component}:${version}'
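A minimal Python sketch of how the proposed variables might compose into a single pattern. The function name is hypothetical. Note one assumption: read literally, the template above would render ose${component}, so this sketch joins prefix and format with a hyphen to reproduce the existing openshift3/ose-${component}:${version} convention. Setting the prefix to 'ose-' instead would be an equally plausible reading.

```python
def image_format(repository, prefix, registry='', fmt='${component}:${version}'):
    """Compose an image pattern from the proposed inventory variables.

    Assumption: prefix and format are joined with '-' so the result
    matches the current 'ose-${component}' naming.
    """
    # An empty registry must not leave a leading slash behind.
    head = registry + '/' if registry else ''
    return head + repository + '/' + prefix + '-' + fmt


enterprise = image_format('openshift3', 'ose', registry='registry.access.redhat.com')
origin = image_format('openshift', 'origin')
```

With the values listed above, enterprise comes out prefixed with registry.access.redhat.com/ and origin has no registry part at all.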

lifecycle/rotten

sdodson

👍3

Most helpful comment

I'd like one documented way to specify images, with consistent names, so this seems to go in the right direction. I suspect a lot of these methods of specifying the image are new or undocumented and can just be replaced wholesale. oreg_url is probably seeing the widest usage? I think we could translate it (if provided) into the new parameter names in openshift_sanitize_inventory and just use the standardized names everywhere. We also have openshift_image_tag, which I think is used directly for containerized images.

sosiouxme on 12 Jun 2017

👍2

All 17 comments

@abutcher @sosiouxme @giuseppe thoughts?

sdodson on 12 Jun 2017

One alternative is we keep having people set oreg_url and then we compute all the other values based on patterns from that value and then use those values internally.
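As a rough sketch of that alternative, the pieces could be derived from an oreg_url-style value along these lines. The function name is hypothetical, the hostname check reuses the dot heuristic the roles already apply, and the code assumes the last path segment has the form <prefix>-${component}...; anything else is rejected rather than guessed at.

```python
def split_oreg_url(oreg_url):
    """Split an oreg_url-style pattern into the proposed variables."""
    parts = oreg_url.split('/')
    registry = ''
    # Same heuristic as registry_host: a dot in the first segment means a hostname.
    if len(parts) > 1 and '.' in parts[0]:
        registry = parts.pop(0)
    name = parts.pop()  # e.g. 'ose-${component}:${version}'
    repository = '/'.join(parts)
    prefix, sep, rest = name.partition('-${component}')
    if not sep:
        raise ValueError('unsupported image pattern: ' + oreg_url)
    return {
        'openshift_image_registry': registry,
        'openshift_image_repository': repository,
        'openshift_image_prefix': prefix,
        'openshift_image_format': '${component}' + rest,
    }


# registry.example.com is a made-up private registry.
private = split_oreg_url('registry.example.com/openshift3/ose-${component}:${version}')
default = split_oreg_url('openshift/origin-${component}:${version}')
```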

sdodson on 12 Jun 2017

sosiouxme on 12 Jun 2017

👍2

This has immediate relevance for the docker_image_availability pre-install check for whether the images you need are available. So I need to be able to easily figure out in one place what those images needed are.

sosiouxme on 12 Jun 2017

Added note: logging has a separate way of determining image and version that doesn't even look at openshift_image_tag or oreg_url. I wouldn't be surprised if it's the same for metrics.

Part of the problem here is that oreg_url and the --images flag were meant for use with oc client/server which filled in ${component} and ${version} automatically, but this capability isn't readily available in any other context, and so we have a bunch of other ways of deploying images that have had to do their own thing.

sosiouxme on 29 Jun 2017

I'm all for trying to reduce the number of variables we have to deal with. However, with system containers we have to have containers in different repositories:

On Fedora: Use fedora registry
On RHEL: Use access registry
On CentOS: Use dockerhub

The reason for this is that these system containers are used inside and outside of OpenShift. As an example, container-engine may be used on Fedora on its own. In fact, the docker container/system container in Fedora is based on container-engine 😄

ashcrow on 19 Jul 2017

Summarizing discussion in today's arch call...

Typically, system containers like docker / etcd / cri-o will come from an OS-specific registry, but OpenShift images come from Docker Hub or registry.access.redhat.com
Consider breaking _tag out of openshift_image_format
openshift_image_format="${component}"
openshift_image_tag="${version}"
Overrides for specific images should be fully qualified image references
Needs to be able to handle empty strings for all variables, e.g.: openshift_image_registry="" yields openshift3/ose-${component}:${version}
Create a filter plugin that mimics the openshift imageFormat parsing for a provided component and version
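The last item might look roughly like the sketch below. It uses the standard Ansible filter plugin shape (a FilterModule class whose filters() method returns a name-to-function mapping), but the filter name expand_image_format is hypothetical. Python's string.Template is used here only as an approximation of how OpenShift expands ${component} and ${version}; it is not the actual imageFormat parser.

```python
from string import Template


def expand_image_format(image_format, component, version):
    """Fill ${component} and ${version} in an image pattern.

    safe_substitute leaves any other placeholder untouched instead of
    raising, which keeps partially templated values usable.
    """
    return Template(image_format).safe_substitute(
        component=component, version=version)


class FilterModule(object):
    """Expose the helper to Jinja2 under a hypothetical filter name."""

    def filters(self):
        return {'expand_image_format': expand_image_format}


# Example values; 'pod' and 'v3.9' are illustrative only.
pod_image = expand_image_format('openshift3/ose-${component}:${version}', 'pod', 'v3.9')
```

In a template this could then be called as {{ some_pattern | expand_image_format('pod', openshift_image_tag) }}, assuming the plugin file is placed in a filter_plugins directory.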

sosiouxme on 19 Jul 2017

Clearly there can be no single variable (or family of variables) to specify all images. Logic and overrides for specific cases will have to remain.

Unless we centralize all that logic (and I don't think anyone wants to), we can't determine outside a given role what images it will end up using without duplicating its logic, so the docker_image_availability check must be limited to images where the logic is "simple enough" that duplicating is worth the effort and risk of dual maintenance.

As far as user experience, though, what would be clearest and most useful?

For images that don't use our naming convention (like cockpit/kubernetes, etcd and so forth) or even our versions, we can't reuse anything -- they will always need their own defaults and overrides. But for everything else, could we specify them with a family of variables as discussed here, with defaults using this base?

So for instance, say you have the defaults:

openshift_image_registry = 'registry.access.redhat.com'
openshift_image_repository = 'openshift3'
openshift_image_prefix = 'ose'
openshift_image_format = '${component}'
openshift_image_tag = '${version}'

For logging define a parallel set of variables, openshift_logging_image_registry and so forth, and instead of logging having its own set of defaults like it does now (e.g. openshift_logging_image_version defaults to latest so you deploy alpha builds in origin unless you know to set this) all of them default to the above defaults. The user can still override everything for logging specifically, using these variables. The logging role builds image specs it needs from these variables and just varies the component getting filled in.
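To make the fallback concrete, here is a small sketch of the lookup order. The helper name and the dictionary-based inventory are assumptions for illustration, and registry.example.com is made up. A component-scoped variable wins, then the user's base variable, then the shared default.

```python
BASE_DEFAULTS = {
    'openshift_image_registry': 'registry.access.redhat.com',
    'openshift_image_repository': 'openshift3',
    'openshift_image_prefix': 'ose',
    'openshift_image_format': '${component}',
    'openshift_image_tag': '${version}',
}


def resolve(inventory, suffix, scope=None):
    """Resolve openshift_<scope>_image_<suffix>, falling back to the base variable."""
    base = 'openshift_image_' + suffix
    if scope:
        scoped = 'openshift_{}_image_{}'.format(scope, suffix)
        if scoped in inventory:
            return inventory[scoped]
    return inventory.get(base, BASE_DEFAULTS[base])


# The user overrides only the registry, and only for logging.
inventory = {'openshift_logging_image_registry': 'registry.example.com'}
logging_registry = resolve(inventory, 'registry', scope='logging')
logging_tag = resolve(inventory, 'tag', scope='logging')
node_registry = resolve(inventory, 'registry')
```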

Then in openshift_sanitize_inventory, translate pre-existing variables the users set (e.g. oreg_url) into this scheme, refactor all the code to use the new variables (with a filter plugin to construct the string from the pieces and replace ${component} and ${version}), and deprecate the old variables.

Having said all that... I'm wondering if we actually need to specify registry, repository, and prefix separately. Is there really a use case where you would be using this scheme and want to vary those independently? For logging and metrics we saw fit to simply jam them together into prefix (and so there's the small matter that there's already openshift_logging_image_prefix with that semantic).

sosiouxme on 19 Jul 2017

👍1

Although it may be too late to comment, I think what @sdodson proposed in https://github.com/openshift/openshift-ansible/issues/4415#issue-235312986 was very clear. As of today (26/02/2018) nothing has changed at all, same old ... and confusing

DanyC97 on 26 Feb 2018

Not too late to comment and it's something that we'd like to fix in 3.10.

sdodson on 26 Feb 2018

Defaulting to registry.access.redhat.com seems to make it difficult to install openshift-enterprise in an air-gapped network. I have tried setting oreg_url to our internal docker registry, and when doing so containers fail to pull during installation. Looking at the deployment config, it still seems to be trying to pull from registry.access.redhat.com as opposed to our internal server.

I'm not entirely sure if this is correct but it seemed to work for us so far:

oreg_url=index.docker.io/openshift3/ose-${component}:${version}  # <---- This does not seem to work

### Since the above line did not work I had to change registry.access.redhat.com in the following files to point to our internal docker.io server

# /usr/share/ansible/openshift-ansible/roles/ansible_service_broker/vars/openshift-enterprise.yml
# /usr/share/ansible/openshift-ansible/roles/openshift_hosted_templates/files/v3.9/enterprise/registry-console.yaml
# /usr/share/ansible/openshift-ansible/roles/openshift_service_catalog/vars/openshift-enterprise.yml
# /usr/share/ansible/openshift-ansible/roles/openshift_web_console/defaults/main.yml
# /usr/share/ansible/openshift-ansible/roles/template_service_broker/defaults/main.yml

ericlake on 10 May 2018

👍1

The workaround mentioned by @ericlake is a bad solution, but not as bad as the OpenShift documentation for disconnected installs using a private registry. We're working through a ticket with support at this moment, but it's been a real pain. One thing I did was run the pre-install-check playbook for an RPM install, and it finished fine, so I know it still doesn't check the oreg_url.

canit00 on 31 Oct 2018

I actually opened a ticket with RH support about this and learned that there are a few undocumented variables that do what I needed. So I reverted my solution.

ericlake on 1 Nov 2018

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot on 29 May 2020

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot on 28 Jun 2020

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-bot on 28 Jul 2020

@openshift-bot: Closing this issue.
