OpenShift 3.11 install fails and complains that required Docker images are missing.
I got rid of this by disabling the docker_image_availability check in the inventory YAML file.
The actual issue appears to be a too-short timeout hard-coded in the skopeo check command (10 seconds). See line below:
This hard-coded value should be removed or parameterized.
FYI: latency on the skopeo command is higher from where I tested this install (Bangalore). I am sure there are other locations where this will fail because of the short timeout.
I have the same issue, but in my case it is related to credentials.
When I run the command that the Ansible plugin produces (with creds) on the machine, I get this error:
FATA[0001] unable to retrieve auth token: invalid username/password
When I remove the creds, everything works correctly, as Docker already has these credentials configured. To be more precise, I passed the same credentials as in the configuration.
Just ran into this issue today too. So +1 for the timeout of the skopeo command to be parameterized.
Workaround by setting openshift_disable_check="docker_image_availability"
Running into the same timeout issue here in Beijing, China, so +1 for the timeout of the skopeo command to be parameterized. Thanks.
I can confirm this from Germany, +1 for that.
I am also hitting this problem. Does anyone know how to handle it?
@paddy667 Which file should openshift_disable_check="docker_image_availability" be set in?
@liuyatao add it to your Ansible inventory. It should go under the [OSEv3:vars] section. Here is some example code:
[OSEv3:vars]
timeout=60
ansible_user=root
ansible_become=yes
openshift_deployment_type=openshift-enterprise
openshift_disable_check="docker_image_availability"
Installing a new OpenShift cluster checks for Docker images with commands like skopeo inspect --tls-verify=true docker://docker.io/openshift/origin-haproxy-router:v3.11. However, this can take more than 10 seconds to complete, even on a good internet connection.
$ time skopeo inspect --tls-verify=true docker://docker.io/openshift/origin-haproxy-router:v3.11
{
"Name": "docker.io/openshift/origin-haproxy-router",
"Digest": "sha256:3415fcc585945cf0eee230a0031c154edc4f6b83bca1f31f85d69a9982f159b3",
...
}
real 0m20.555s
user 0m0.078s
sys 0m0.038s
Full output of failed check: https://gist.github.com/jozefizso/cb053e880dfa7abc6a2f1c5831122195
The only way to prevent this is to skip the docker_image_availability check.
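To see how close a given host is to the limit, the timing above can be repeated with a small helper. This is a rough sketch and not part of openshift-ansible: the helper names `elapsed_seconds` and `verdict` are made up here, and the 10-second figure is the limit reported in this thread.

```shell
#!/bin/sh
# Rough sketch: time a command and compare it with the 10-second limit
# reported in this thread. Helper names are hypothetical.
LIMIT=10

elapsed_seconds() {
    # Prints the whole seconds the given command took; its output is discarded.
    start=$(date +%s)
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s)
    echo $((end - start))
}

verdict() {
    # $1 = elapsed seconds, $2 = limit in seconds
    if [ "$1" -ge "$2" ]; then
        echo "too slow"
    else
        echo "ok"
    fi
}

# Example (needs skopeo and network access):
# t=$(elapsed_seconds skopeo inspect --tls-verify=true \
#     docker://docker.io/openshift/origin-haproxy-router:v3.11)
# echo "${t}s: $(verdict "$t" "$LIMIT")"
```

Running the commented example from each node gives a quick idea of whether the check is likely to pass there before starting the playbook.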
Same issue here when deploying a 3.11.69 cluster. I have set "timeout 70" in the inventory file and in ansible.cfg, but it is not being picked up. This timeout should be parameterized just like the ones for other commands.
My logs with debug output read " ...timeout 10 skopeo inspect --tls-verify=true ..."
As for this error from skopeo:
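For anyone who wants to reproduce the logged command by hand with a longer limit, here is a minimal shell sketch. It assumes GNU coreutils `timeout`, which exits with status 124 when the limit is hit. The function name `run_with_limit` is made up for illustration and is not an openshift-ansible setting.

```shell
#!/bin/sh
# Hypothetical helper: run a command under a caller-supplied time limit.
# Usage: run_with_limit SECONDS COMMAND [ARGS...]
run_with_limit() {
    limit="$1"
    shift
    rc=0
    # GNU timeout returns 124 when it had to kill the command.
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$rc"
}

# Example (needs skopeo and network access):
# run_with_limit 60 skopeo inspect --tls-verify=true \
#     docker://docker.io/openshift/origin-haproxy-router:v3.11
```

If the same command succeeds with a limit of 60 but fails with 10, that points at latency rather than at a missing image.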
FATA[0001] unable to retrieve auth token: invalid username/password
In my case it was caused by the presence of a token in the /root/.docker/ directory. Either correct the token there or remove the directory if you are not using authentication to the registry.
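A quick way to see which user a stored token belongs to, as a sketch only: Docker normally keeps the token as an "auth" value in a config.json file, encoded as base64 of "username:password". The file name config.json is an assumption here (the comment above only names the directory), the helper name `decode_auth_user` is hypothetical, and hosts that use a credential store may have no "auth" value at all. It relies on GNU `base64 -d`.

```shell
#!/bin/sh
# Hypothetical helper: print only the username part of a Docker "auth" value.
# $1 = the base64 string copied from the "auth" field (assumed to sit in
#      /root/.docker/config.json; adjust if your setup differs).
decode_auth_user() {
    printf '%s' "$1" | base64 -d | cut -d: -f1
}

# Example with a made-up token for user "alice":
# decode_auth_user "YWxpY2U6c2VjcmV0"
```

If the printed username is not the one configured for the registry, the stale token is a likely cause of the "invalid username/password" error.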
+1 for the latency issue on skopeo. I am trying from India
Same here. Running into the issue because of the timeout. It takes 11+ seconds from an internet connection in Canada. I had to disable the check in the inventory file.
I've found that disabling the login test via oreg_test_login=False helps alleviate the issue for me. But you should only set this when you're sure that authentication works.
Is there a way to configure the inspect timeout? I couldn't tell from @bortek's comments. It seems like no, since I can't find it in the docs.
same issue here
I started hitting a similar issue in my setup today; until yesterday it was passing. Did any change go in?
TASK [openshift_node : Create credentials for registry auth] *****
FAILED - RETRYING: Create credentials for registry auth (3 retries left).
FAILED - RETRYING: Create credentials for registry auth (3 retries left).
FAILED - RETRYING: Create credentials for registry auth (3 retries left).
FAILED - RETRYING: Create credentials for registry auth (2 retries left).
FAILED - RETRYING: Create credentials for registry auth (2 retries left).
FAILED - RETRYING: Create credentials for registry auth (2 retries left).
FAILED - RETRYING: Create credentials for registry auth (1 retries left).
FAILED - RETRYING: Create credentials for registry auth (1 retries left).
FAILED - RETRYING: Create credentials for registry auth (1 retries left).
fatal: [10.172.182.97]: FAILED! => {"attempts": 3, "changed": false, "msg": "time=\"2019-08-21T02:16:14-04:00\" level=fatal msg=\"unable to retrieve auth token: invalid username/password\" \n", "state": "unknown"}
fatal: [10.172.182.119]: FAILED! => {"attempts": 3, "changed": false, "msg": "time=\"2019-08-21T02:16:14-04:00\" level=fatal msg=\"unable to retrieve auth token: invalid username/password\" \n", "state": "unknown"}
fatal: [10.172.181.69]: FAILED! => {"attempts": 3, "changed": false, "msg": "time=\"2019-08-21T02:16:14-04:00\" level=fatal msg=\"unable to retrieve auth token: invalid username/password\" \n", "state": "unknown"}
to retry, use: --limit @/root/openshift-ansible/playbooks/deploy_cluster.retry
Same issue here!
/remove-lifecycle stale
@openshift-bot: Closing this issue.