Openshift-ansible: Openshift Origin Installation fails if cloud provider (AWS) used in inventory file

Created on 6 Oct 2017 · 22 Comments · Source: openshift/openshift-ansible

Description

When I use AWS as the cloud provider for an OpenShift cluster, the installation fails while trying to start the node service on each node. If I don't use any cloud provider, the installation appears to succeed.

Version

I am using RPM Installation

- ansible 2.3.2.0
- oc v3.6.0+c4dd4cf
- kubernetes v1.6.1+5115d708d7
- features: Basic-Auth GSSAPI Kerberos SPNEGO

Steps To Reproduce

Copy the inventory file shown below to /etc/ansible/hosts
Run ansible-playbook -i /etc/ansible/hosts /root/openshift-ansible/playbooks/byo/config.yml -vvv

Expected Results

The node service should start successfully, and oc get nodes should list the nodes as Ready rather than NotReady.

fatal: [osnode04.bdteam.local]: FAILED! => {
    "attempts": 3,
    "changed": false,
    "failed": true,
    "invocation": {
        "module_args": {
            "daemon_reload": false,
            "enabled": null,
            "masked": null,
            "name": "origin-node",
            "no_block": false,
            "state": "restarted",
            "user": false
        }
    },
    "msg": "Unable to restart service origin-node: Job for origin-node.service failed because the control process exited with error code. See \"systemctl status origin-node.service\" and \"journalctl -xe\" for details.\n"
}

RUNNING HANDLER [openshift_node : reload systemd units] ************************************************************************************************************************************************************************************************************************
META: ran handlers
    to retry, use: --limit @/root/openshift-ansible/playbooks/byo/config.retry

PLAY RECAP *********************************************************************************************************************************************************************************************************************************************************************
localhost                  : ok=21   changed=0    unreachable=0    failed=0
openshift-etcd.bdteam.local : ok=97   changed=34   unreachable=0    failed=0
osmaster01.bdteam.local    : ok=365  changed=110  unreachable=0    failed=1
osmaster02.bdteam.local    : ok=314  changed=95   unreachable=0    failed=1
osnode01.bdteam.local      : ok=146  changed=38   unreachable=0    failed=1
osnode02.bdteam.local      : ok=146  changed=38   unreachable=0    failed=1
osnode03.bdteam.local      : ok=146  changed=38   unreachable=0    failed=1
osnode04.bdteam.local      : ok=146  changed=38   unreachable=0    failed=1


INSTALLER STATUS ***************************************************************************************************************************************************************************************************************************************************************
Initialization             : Complete
etcd Install               : Complete
NFS Install                : Not Started
Load balancer Install      : Not Started
Master Install             : Complete
Master Additional Install  : Complete
Node Install               : In Progress
    This phase can be restarted by running: playbooks/byo/openshift-node/config.yml
GlusterFS Install          : Not Started
Hosted Install             : Not Started
Metrics Install            : Not Started
Logging Install            : Not Started
Service Catalog Install    : Not Started



Failure summary:


  1. Hosts:    osmaster01.bdteam.local, osmaster02.bdteam.local, osnode01.bdteam.local, osnode02.bdteam.local, osnode03.bdteam.local, osnode04.bdteam.local
     Play:     Configure nodes
     Task:     restart node
     Message:  Unable to restart service origin-node: Job for origin-node.service failed because the control process exited with error code. See "systemctl status origin-node.service" and "journalctl -xe" for details.

[root@osmaster01 ~]# packet_write_wait: Connection to 10.X.X.X port 22: Broken pipe

Observed Results

The node service is unable to restart on any of the nodes or masters.

[root@osmaster01 centos]# oc get nodes
NAME                                        STATUS     AGE       VERSION
ip-10-30-1-200.us-west-1.compute.internal   NotReady   2h        v1.6.1+5115d708d7
ip-10-30-1-27.us-west-1.compute.internal    NotReady   2h        v1.6.1+5115d708d7
ip-10-30-1-43.us-west-1.compute.internal    NotReady   2h        v1.6.1+5115d708d7
ip-10-30-2-109.us-west-1.compute.internal   NotReady   2h        v1.6.1+5115d708d7
ip-10-30-2-182.us-west-1.compute.internal   NotReady   2h        v1.6.1+5115d708d7
ip-10-30-2-251.us-west-1.compute.internal   NotReady   2h        v1.6.1+5115d708d7

Kubectl describe node output

[root@osmaster01 centos]# kubectl describe node ip-10-30-2-251.us-west-1.compute.internal
Name:           ip-10-30-2-251.us-west-1.compute.internal
Role:
Labels:         beta.kubernetes.io/arch=amd64
            beta.kubernetes.io/instance-type=m4.xlarge
            beta.kubernetes.io/os=linux
            failure-domain.beta.kubernetes.io/region=us-west-1
            failure-domain.beta.kubernetes.io/zone=us-west-1a
            kubernetes.io/hostname=osnode04.bdteam.local
            region=primary
            zone=west
Annotations:        volumes.kubernetes.io/controller-managed-attach-detach=true
Taints:         <none>
CreationTimestamp:  Fri, 06 Oct 2017 18:10:56 +0000
Phase:
Conditions:
  Type          Status  LastHeartbeatTime           LastTransitionTime          Reason              Message
  ----          ------  -----------------           ------------------          ------              -------
  OutOfDisk         False   Fri, 06 Oct 2017 20:37:50 +0000     Fri, 06 Oct 2017 18:10:56 +0000     KubeletHasSufficientDisk    kubelet has sufficient disk space available
  MemoryPressure    False   Fri, 06 Oct 2017 20:37:50 +0000     Fri, 06 Oct 2017 18:10:56 +0000     KubeletHasSufficientMemory  kubelet has sufficient memory available
  DiskPressure      False   Fri, 06 Oct 2017 20:37:50 +0000     Fri, 06 Oct 2017 18:10:56 +0000     KubeletHasNoDiskPressure    kubelet has no disk pressure
  Ready         False   Fri, 06 Oct 2017 20:37:50 +0000     Fri, 06 Oct 2017 18:10:56 +0000     KubeletNotReady         runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Addresses:      10.30.2.251,10.30.2.251,ip-10-30-2-251.bdteam.local,osnode04.bdteam.local
Capacity:
 cpu:       4
 memory:    16266720Ki
 pods:      40
Allocatable:
 cpu:       4
 memory:    16164320Ki
 pods:      40
System Info:
 Machine ID:            8bd05758fdfc1903174c9fcaf82b71ca
 System UUID:           EC2798A7-3C88-0538-2A95-D28F2BCCDF96
 Boot ID:           5d7f71a8-95f8-4ed6-a7ba-07977e2dc926
 Kernel Version:        3.10.0-693.2.2.el7.x86_64
 OS Image:          CentOS Linux 7 (Core)
 Operating System:      linux
 Architecture:          amd64
 Container Runtime Version: docker://1.12.6
 Kubelet Version:       v1.6.1+5115d708d7
 Kube-Proxy Version:        v1.6.1+5115d708d7
ExternalID:         i-08ae279780695c5f7
Non-terminated Pods:        (0 in total)
  Namespace         Name        CPU Requests    CPU Limits  Memory Requests Memory Limits
  ---------         ----        ------------    ----------  --------------- -------------
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  CPU Requests  CPU Limits  Memory Requests Memory Limits
  ------------  ----------  --------------- -------------
  0 (0%)    0 (0%)      0 (0%)      0 (0%)
Events:
  FirstSeen LastSeen    Count   From                            SubObjectPath   Type        Reason          Message
  --------- --------    -----   ----                            -------------   --------    ------          -------
  1h        1h      1   kubelet, ip-10-30-2-251.us-west-1.compute.internal          Warning     ImageGCFailed       unable to find data for container /
  1h        1h      1   kubelet, ip-10-30-2-251.us-west-1.compute.internal          Normal      NodeHasSufficientDisk   Node ip-10-30-2-251.us-west-1.compute.internal status is now: NodeHasSufficientDisk
  1h        1h      1   kubelet, ip-10-30-2-251.us-west-1.compute.internal          Normal      NodeHasSufficientMemory Node ip-10-30-2-251.us-west-1.compute.internal status is now: NodeHasSufficientMemory
  1h        1h      1   kubelet, ip-10-30-2-251.us-west-1.compute.internal          Normal      NodeHasNoDiskPressure   Node ip-10-30-2-251.us-west-1.compute.internal status is now: NodeHasNoDiskPressure
  1h        1h      1   kubelet, ip-10-30-2-251.us-west-1.compute.internal          Normal      Starting        Starting kubelet.

If I don't configure any cloud provider, my installation works fine, but I need to resolve this for AWS or any other cloud provider.

Systemctl output of node service on a particular node

[root@osnode01 centos]# systemctl status origin-node.service
● origin-node.service - OpenShift Node
   Loaded: loaded (/etc/systemd/system/origin-node.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/origin-node.service.d
           └─openshift-sdn-ovs.conf
   Active: activating (start) since Fri 2017-10-06 20:39:54 UTC; 8s ago
     Docs: https://github.com/openshift/origin
  Process: 56362 ExecStopPost=/usr/bin/dbus-send --system --dest=uk.org.thekelleys.dnsmasq /uk/org/thekelleys/dnsmasq uk.org.thekelleys.SetDomainServers array:string: (code=exited, status=0/SUCCESS)
  Process: 56360 ExecStopPost=/usr/bin/rm /etc/dnsmasq.d/node-dnsmasq.conf (code=exited, status=0/SUCCESS)
  Process: 56368 ExecStartPre=/usr/bin/dbus-send --system --dest=uk.org.thekelleys.dnsmasq /uk/org/thekelleys/dnsmasq uk.org.thekelleys.SetDomainServers array:string:/in-addr.arpa/127.0.0.1,/cluster.local/127.0.0.1 (code=exited, status=0/SUCCESS)
  Process: 56365 ExecStartPre=/usr/bin/cp /etc/origin/node/node-dnsmasq.conf /etc/dnsmasq.d/ (code=exited, status=0/SUCCESS)
 Main PID: 56370 (openshift)
   Memory: 42.2M
   CGroup: /system.slice/origin-node.service
           ├─56370 /usr/bin/openshift start node --config=/etc/origin/node/node-config.yaml --loglevel=2
           └─56415 journalctl -k -f

Oct 06 20:39:59 osnode01.bdteam.local origin-node[56370]: W1006 20:39:59.546363   56370 pod_container_deletor.go:77] Container "2f4c53551f7b6e654cc1de1159d44856f81b6d16f4ed5d1eb580c9cb3a9bc575" not found in pod's containers
Oct 06 20:39:59 osnode01.bdteam.local origin-node[56370]: W1006 20:39:59.546425   56370 pod_container_deletor.go:77] Container "851f6503d78acd135e3a4b87009d4163a808856f14757f6123c1cf625123504d" not found in pod's containers
Oct 06 20:39:59 osnode01.bdteam.local origin-node[56370]: W1006 20:39:59.546448   56370 pod_container_deletor.go:77] Container "88a45a9147f05a0bd9e05ed712069f10b4cea6c2af3ccd0eb1601166f3ccf679" not found in pod's containers
Oct 06 20:39:59 osnode01.bdteam.local origin-node[56370]: W1006 20:39:59.546460   56370 pod_container_deletor.go:77] Container "a3ef9c2922877e2f25bd4814fd1f4e371fd98a19ad36b54371fd0b1bc51e255b" not found in pod's containers
Oct 06 20:39:59 osnode01.bdteam.local origin-node[56370]: W1006 20:39:59.546472   56370 pod_container_deletor.go:77] Container "c5102f50c2e01a2100e1dcb025096967e31134c43ffdb1655827b908e5b29f77" not found in pod's containers
Oct 06 20:39:59 osnode01.bdteam.local origin-node[56370]: W1006 20:39:59.546483   56370 pod_container_deletor.go:77] Container "d68f9392b34c6410e6154c95febcfb55dac109725750ae5c20671c39279c9730" not found in pod's containers
Oct 06 20:39:59 osnode01.bdteam.local origin-node[56370]: W1006 20:39:59.546494   56370 pod_container_deletor.go:77] Container "eb04adc0b544c64e20ac3c847e03de048f7c7a26ce4d4a6b46282817d0df8e10" not found in pod's containers
Oct 06 20:39:59 osnode01.bdteam.local origin-node[56370]: W1006 20:39:59.710842   56370 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
Oct 06 20:39:59 osnode01.bdteam.local origin-node[56370]: E1006 20:39:59.710981   56370 kubelet.go:2072] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Oct 06 20:40:00 osnode01.bdteam.local origin-node[56370]: W1006 20:40:00.816290   56370 sdn_controller.go:38] Could not find an allocated subnet for node: osnode01.bdteam.local, Waiting...
[root@osnode01 centos]#

Log output from one of the nodes (/var/log/messages)

Oct  6 20:41:15 osnode01 NetworkManager[18586]: <info>  [1507322475.5434] dhcp4 (eth0):   address 10.30.1.43
Oct  6 20:41:15 osnode01 NetworkManager[18586]: <info>  [1507322475.5434] dhcp4 (eth0):   plen 24 (255.255.255.0)
Oct  6 20:41:15 osnode01 NetworkManager[18586]: <info>  [1507322475.5434] dhcp4 (eth0):   gateway 10.30.1.1
Oct  6 20:41:15 osnode01 NetworkManager[18586]: <info>  [1507322475.5434] dhcp4 (eth0):   lease time 3600
Oct  6 20:41:15 osnode01 NetworkManager[18586]: <info>  [1507322475.5434] dhcp4 (eth0):   hostname 'ip-10-30-1-43'
Oct  6 20:41:15 osnode01 NetworkManager[18586]: <info>  [1507322475.5435] dhcp4 (eth0):   nameserver '10.21.0.251'
Oct  6 20:41:15 osnode01 NetworkManager[18586]: <info>  [1507322475.5435] dhcp4 (eth0):   domain name 'bdteam.local'
Oct  6 20:41:15 osnode01 NetworkManager[18586]: <info>  [1507322475.5435] dhcp4 (eth0): state changed bound -> bound
Oct  6 20:41:15 osnode01 dbus-daemon: dbus[632]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Oct  6 20:41:15 osnode01 dbus[632]: [system] Activating via systemd: service name='org.freedesktop.nm_dispatcher' unit='dbus-org.freedesktop.nm-dispatcher.service'
Oct  6 20:41:15 osnode01 systemd: Starting Network Manager Script Dispatcher Service...
Oct  6 20:41:15 osnode01 dhclient[18622]: bound to 10.30.1.43 -- renewal in 1686 seconds.
Oct  6 20:41:15 osnode01 dbus[632]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Oct  6 20:41:15 osnode01 dbus-daemon: dbus[632]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Oct  6 20:41:15 osnode01 systemd: Started Network Manager Script Dispatcher Service.
Oct  6 20:41:15 osnode01 nm-dispatcher: req:1 'dhcp4-change' [eth0]: new request (6 scripts)
Oct  6 20:41:15 osnode01 nm-dispatcher: req:1 'dhcp4-change' [eth0]: start running ordered scripts...
Oct  6 20:41:15 osnode01 nm-dispatcher: + cd /etc/sysconfig/network-scripts
Oct  6 20:41:15 osnode01 nm-dispatcher: + . ./network-functions
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ PATH=/sbin:/usr/sbin:/bin:/usr/bin
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ export PATH
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ hostname
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ HOSTNAME=osnode01.bdteam.local
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ '[' -z '' ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ . /etc/init.d/functions
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ TEXTDOMAIN=initscripts
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ umask 022
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ PATH=/sbin:/usr/sbin:/bin:/usr/bin
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ export PATH
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' 56720 -ne 1 -a -z '' ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' -d /run/systemd/system ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ case "$0" in
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' -z '' ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ COLUMNS=80
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' -z '' ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' -c /dev/stderr -a -r /dev/stderr ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ CONSOLETYPE=serial
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' -z '' ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' -z '' ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' -f /etc/sysconfig/i18n -o -f /etc/locale.conf ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ . /etc/profile.d/lang.sh
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ unset LANGSH_SOURCED
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' -z '' ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' -f /etc/sysconfig/init ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ . /etc/sysconfig/init
Oct  6 20:41:15 osnode01 nm-dispatcher: ++++ BOOTUP=color
Oct  6 20:41:15 osnode01 nm-dispatcher: ++++ RES_COL=60
Oct  6 20:41:15 osnode01 nm-dispatcher: ++++ MOVE_TO_COL='echo -en \033[60G'
Oct  6 20:41:15 osnode01 nm-dispatcher: ++++ SETCOLOR_SUCCESS='echo -en \033[0;32m'
Oct  6 20:41:15 osnode01 nm-dispatcher: ++++ SETCOLOR_FAILURE='echo -en \033[0;31m'
Oct  6 20:41:15 osnode01 nm-dispatcher: ++++ SETCOLOR_WARNING='echo -en \033[0;33m'
Oct  6 20:41:15 osnode01 nm-dispatcher: ++++ SETCOLOR_NORMAL='echo -en \033[0;39m'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' serial = serial ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ BOOTUP=serial
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ MOVE_TO_COL=
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ SETCOLOR_SUCCESS=
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ SETCOLOR_FAILURE=
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ SETCOLOR_WARNING=
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ SETCOLOR_NORMAL=
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ __sed_discard_ignored_files='/\(~\|\.bak\|\.orig\|\.rpmnew\|\.rpmorig\|\.rpmsave\)$/d'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' '' = 1 ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: ++++ cat /proc/cmdline
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ strstr 'BOOT_IMAGE=/boot/vmlinuz-3.10.0-693.2.2.el7.x86_64 root=UUID=29342a0b-e20f-4676-9ecf-dfdf02ef6683 ro console=tty0 console=ttyS0,115200n8 crashkernel=auto console=ttyS0,115200 LANG=en_US.UTF-8' rc.debug
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ '[' 'BOOT_IMAGE=/boot/vmlinuz-3.10.0-693.2.2.el7.x86_64 root=UUID=29342a0b-e20f-4676-9ecf-dfdf02ef6683 ro console=tty0 console=ttyS0,115200n8 crashkernel=auto console=ttyS0,115200 LANG=en_US.UTF-8' = 'BOOT_IMAGE=/boot/vmlinuz-3.10.0-693.2.2.el7.x86_64 root=UUID=29342a0b-e20f-4676-9ecf-dfdf02ef6683 ro console=tty0 console=ttyS0,115200n8 crashkernel=auto console=ttyS0,115200 LANG=en_US.UTF-8' ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ return 1
Oct  6 20:41:15 osnode01 nm-dispatcher: +++ return 0
Oct  6 20:41:15 osnode01 nm-dispatcher: + '[' -f ../network ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: + . ../network
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ NETWORKING=yes
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ NOZEROCONF=yes
Oct  6 20:41:15 osnode01 nm-dispatcher: + [[ dhcp4-change =~ ^(up|dhcp4-change|dhcp6-change)$ ]]
Oct  6 20:41:15 osnode01 nm-dispatcher: + NEEDS_RESTART=0
Oct  6 20:41:15 osnode01 nm-dispatcher: + UPSTREAM_DNS=/etc/dnsmasq.d/origin-upstream-dns.conf
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ mktemp
Oct  6 20:41:15 osnode01 nm-dispatcher: + UPSTREAM_DNS_TMP=/tmp/tmp.5DzdaQo1tn
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ mktemp
Oct  6 20:41:15 osnode01 nm-dispatcher: + UPSTREAM_DNS_TMP_SORTED=/tmp/tmp.Ie4FFsjAgL
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ mktemp
Oct  6 20:41:15 osnode01 nm-dispatcher: + CURRENT_UPSTREAM_DNS_SORTED=/tmp/tmp.0ZlG7MgcgO
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ mktemp
Oct  6 20:41:15 osnode01 nm-dispatcher: + NEW_RESOLV_CONF=/tmp/tmp.293w7YIsqD
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ mktemp
Oct  6 20:41:15 osnode01 nm-dispatcher: + NEW_NODE_RESOLV_CONF=/tmp/tmp.D9exxlKVYt
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ /sbin/ip route list match 0.0.0.0/0
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ awk '{print $3 }'
Oct  6 20:41:15 osnode01 nm-dispatcher: + def_route=10.30.1.1
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ /sbin/ip route get to 10.30.1.1
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ awk '{print $3}'
Oct  6 20:41:15 osnode01 nm-dispatcher: + def_route_int=eth0
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ /sbin/ip route get to 10.30.1.1
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ awk '{print $5}'
Oct  6 20:41:15 osnode01 nm-dispatcher: + def_route_ip=10.30.1.43
Oct  6 20:41:15 osnode01 nm-dispatcher: + [[ eth0 == eth0 ]]
Oct  6 20:41:15 osnode01 nm-dispatcher: + '[' '!' -f /etc/dnsmasq.d/origin-dns.conf ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: + grep -q 99-origin-dns.sh /etc/resolv.conf
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ systemctl -q is-active dnsmasq.service
Oct  6 20:41:15 osnode01 nm-dispatcher: + '[' 0 -eq 1 ']'
Oct  6 20:41:15 osnode01 nm-dispatcher: ++ systemctl -q is-active dnsmasq.service
Oct  6 20:41:15 osnode01 nm-dispatcher: + grep -q 99-origin-dns.sh /etc/resolv.conf
Oct  6 20:41:15 osnode01 nm-dispatcher: + sed -e '/^nameserver.*$/d' /etc/resolv.conf
Oct  6 20:41:15 osnode01 nm-dispatcher: + echo 'nameserver 10.30.1.43'
Oct  6 20:41:15 osnode01 nm-dispatcher: + grep -q 'search.*cluster.local' /tmp/tmp.293w7YIsqD
Oct  6 20:41:15 osnode01 nm-dispatcher: + grep -qw search /tmp/tmp.293w7YIsqD
Oct  6 20:41:15 osnode01 nm-dispatcher: + cp -Z /tmp/tmp.293w7YIsqD /etc/resolv.conf
Oct  6 20:41:15 osnode01 nm-dispatcher: + rm -f /tmp/tmp.5DzdaQo1tn /tmp/tmp.Ie4FFsjAgL /tmp/tmp.0ZlG7MgcgO /tmp/tmp.293w7YIsqD
Oct  6 20:41:18 osnode01 origin-node: I1006 20:41:18.210035   56657 aws.go:936] Could not determine public DNS from AWS metadata.
Oct  6 20:41:18 osnode01 origin-node: W1006 20:41:18.246426   56657 cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d
Oct  6 20:41:18 osnode01 origin-node: E1006 20:41:18.246581   56657 kubelet.go:2072] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
Oct  6 20:41:20 osnode01 origin-node: W1006 20:41:20.737092   56657 sdn_controller.go:38] Could not find an allocated subnet for node: osnode01.bdteam.local, Waiting...
Oct  6 20:41:20 osnode01 origin-node: F1006 20:41:20.737146   56657 node.go:309] error: SDN node startup failed: failed to get subnet for this host: osnode01.bdteam.local, error: timed out waiting for the condition
Oct  6 20:41:20 osnode01 systemd: origin-node.service: main process exited, code=exited, status=255/n/a
Oct  6 20:41:20 osnode01 dnsmasq[18837]: setting upstream servers from DBus
Oct  6 20:41:20 osnode01 dnsmasq[18837]: using nameserver 10.21.0.251#53
Oct  6 20:41:20 osnode01 dbus-daemon: dbus[632]: [system] Rejected send message, 0 matched rules; type="method_return", sender=":1.7943" (uid=0 pid=18837 comm="/usr/sbin/dnsmasq -k ") interface="(unset)" member="(unset)" error name="(unset)" requested_reply="0" destination=":1.9458" (uid=0 pid=56795 comm="/usr/bin/dbus-send --system --dest=uk.org.thekelle")
Oct  6 20:41:20 osnode01 dbus[632]: [system] Rejected send message, 0 matched rules; type="method_return", sender=":1.7943" (uid=0 pid=18837 comm="/usr/sbin/dnsmasq -k ") interface="(unset)" member="(unset)" error name="(unset)" requested_reply="0" destination=":1.9458" (uid=0 pid=56795 comm="/usr/bin/dbus-send --system --dest=uk.org.thekelle")
Oct  6 20:41:20 osnode01 systemd: Failed to start OpenShift Node.
Oct  6 20:41:20 osnode01 systemd: Unit origin-node.service entered failed state.
Oct  6 20:41:20 osnode01 systemd: origin-node.service failed.
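
In a log this long, the line that actually stops the service is easy to miss. The following is a minimal sketch, not part of the original report, that pulls out the fatal entries. It assumes the glog-style prefix visible in the origin-node lines above (severity letter, MMDD, time, PID, source file and line). The names `GLOG_RE`, `fatal_messages`, and `SAMPLE` are hypothetical.

```python
import re

# Assumed glog-style prefix, as seen in the origin-node lines above:
# <severity letter><MMDD> <HH:MM:SS.micros> <pid> <file>:<line>] <message>
GLOG_RE = re.compile(
    r"(?P<severity>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<pid>\d+) (?P<source>[\w.]+:\d+)\] (?P<message>.*)"
)


def fatal_messages(log_text):
    """Return (source, message) pairs for every fatal (F-level) entry."""
    found = []
    for line in log_text.splitlines():
        match = GLOG_RE.search(line)
        if match and match.group("severity") == "F":
            found.append((match.group("source"), match.group("message")))
    return found


# Two lines copied from the /var/log/messages excerpt above.
SAMPLE = (
    "Oct  6 20:41:18 osnode01 origin-node: W1006 20:41:18.246426   56657 "
    "cni.go:157] Unable to update cni config: No networks found in /etc/cni/net.d\n"
    "Oct  6 20:41:20 osnode01 origin-node: F1006 20:41:20.737146   56657 "
    "node.go:309] error: SDN node startup failed: failed to get subnet for this "
    "host: osnode01.bdteam.local, error: timed out waiting for the condition\n"
)

for source, message in fatal_messages(SAMPLE):
    print(source, "->", message)
```

On the excerpt above, the only fatal entry is the SDN startup failure from node.go:309. The CNI warnings come before it but are not what terminates the process.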

Additional Information

CentOS Linux release 7.4.1708 (Core)
The inventory file is shown below:

[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
ansible_ssh_user=root
openshift_deployment_type=origin
openshift_master_cluster_method=native
openshift_master_cluster_hostname=osmasterelb.bdteam.local
openshift_master_cluster_public_hostname=osmasterelb.bdteam.local
openshift_clock_enabled=true
openshift_master_default_subdomain= apps.bdteam.local
openshift_cloudprovider_kind=aws
openshift_cloudprovider_aws_access_key=XXXXXXX
openshift_cloudprovider_aws_secret_key=XXXXXXXXX

# host group for masters
[masters]
osmaster01.bdteam.local openshift_hostname=osmaster01.bdteam.local
osmaster02.bdteam.local openshift_hostname=osmaster02.bdteam.local

[etcd]
openshift-etcd.bdteam.local openshift_hostname=openshift-etcd.bdteam.local

[nodes]
osmaster01.bdteam.local openshift_hostname=osmaster01.bdteam.local
osmaster02.bdteam.local openshift_hostname=osmaster02.bdteam.local
osnode01.bdteam.local openshift_node_labels="{'region': 'infra', 'zone': 'west'}"  openshift_hostname=osnode01.bdteam.local
osnode03.bdteam.local openshift_node_labels="{'region': 'infra', 'zone': 'west'}" openshift_hostname=osnode03.bdteam.local
osnode02.bdteam.local openshift_node_labels="{'region': 'primary', 'zone': 'west'}" openshift_hostname=osnode02.bdteam.local
osnode04.bdteam.local openshift_node_labels="{'region': 'primary', 'zone': 'west'}" openshift_hostname=osnode04.bdteam.local
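
Every host in the inventory above sets openshift_hostname. As a rough aid for auditing such overrides, here is a minimal sketch that lists them per host for one group. It is a simplified line-based reader written for this illustration, not Ansible's own inventory parser, and the names `hostname_overrides` and `SAMPLE_INVENTORY` are hypothetical.

```python
import shlex


def hostname_overrides(inventory_text, group="nodes"):
    """Map each host in `group` to its openshift_hostname value, if set."""
    overrides = {}
    current = None
    for raw in inventory_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # entering a new section such as [nodes]
            continue
        if current != group:
            continue
        # shlex keeps the double-quoted label dictionary as a single token.
        host, *pairs = shlex.split(line)
        for pair in pairs:
            key, _, value = pair.partition("=")
            if key == "openshift_hostname":
                overrides[host] = value
    return overrides


# Shortened copy of the [nodes] section from the inventory above.
SAMPLE_INVENTORY = """
[masters]
osmaster01.bdteam.local openshift_hostname=osmaster01.bdteam.local

[nodes]
osmaster01.bdteam.local openshift_hostname=osmaster01.bdteam.local
osnode01.bdteam.local openshift_node_labels="{'region': 'infra', 'zone': 'west'}"  openshift_hostname=osnode01.bdteam.local
"""

print(hostname_overrides(SAMPLE_INVENTORY))
```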

lifecycle/rotten

Source

apooniajjn

👍1

Most helpful comment

Hello,
I can confirm that the problem exists with OpenShift Origin v3.6 and openshift-ansible at git tag openshift-ansible-3.6.173.0.9-1 on Amazon Web Services (AWS).
The problem occurs when you have custom host names or a custom domain configured, e.g. mymaster1.example.internal and so on.

The AWS cloud provider works fine only when the hostname/domain in your Ansible inventory *.hosts file is the same as the one displayed in the AWS instance's Private DNS field (in the EC2 instance description), e.g.:

Private DNS: ip-10-212-31-117.eu-west-1.compute.internal

To do so, you must have the VPC DHCP options configured with an empty domain-name, e.g.:

{
  "DhcpOptions": [
    {
      "DhcpConfigurations": [
        {
          "Values": [ { "Value": "AmazonProvidedDNS" } ],
          "Key": "domain-name-servers"
        }
      ],
      "DhcpOptionsId": "dopt-<lkjlkfdj>"
    }
  ]
}
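
A minimal sketch of how one might check this condition programmatically, assuming the JSON has the shape shown above. The function name `custom_domain_names` and both sample documents are hypothetical. The second sample adds a domain-name entry purely for illustration, using the bdteam.local domain that appears in the original report's DHCP log.

```python
import json


def custom_domain_names(dhcp_options_json):
    """Return every value configured under the "domain-name" key."""
    document = json.loads(dhcp_options_json)
    names = []
    for option_set in document.get("DhcpOptions", []):
        for config in option_set.get("DhcpConfigurations", []):
            if config.get("Key") == "domain-name":
                names.extend(value["Value"] for value in config.get("Values", []))
    return names


# Same structure as the DHCP options shown above: no domain-name entry.
WITHOUT_DOMAIN = """
{ "DhcpOptions": [ { "DhcpConfigurations": [
    { "Values": [ { "Value": "AmazonProvidedDNS" } ], "Key": "domain-name-servers" }
  ], "DhcpOptionsId": "dopt-<lkjlkfdj>" } ] }
"""

# Illustrative variant with a custom domain-name added (an assumption).
WITH_DOMAIN = """
{ "DhcpOptions": [ { "DhcpConfigurations": [
    { "Values": [ { "Value": "bdteam.local" } ], "Key": "domain-name" },
    { "Values": [ { "Value": "AmazonProvidedDNS" } ], "Key": "domain-name-servers" }
  ], "DhcpOptionsId": "dopt-<lkjlkfdj>" } ] }
"""

print(custom_domain_names(WITHOUT_DOMAIN))
print(custom_domain_names(WITH_DOMAIN))
```

An empty list corresponds to the "empty domain-name" case described above. Any returned value means instances will be handed a custom domain by DHCP.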

The hostname in CentOS Linux must be the same as above: ip-10-212-31-117.eu-west-1.compute.internal.

The following commands also must return ip-10-212-31-117.eu-west-1.compute.internal:

A similar problem is also mentioned in this issue: https://github.com/kubernetes/kubernetes/issues/11543
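
To make the naming constraint concrete, here is a minimal sketch that derives the Private DNS name from an instance's private IP and region and compares a host name against it. It assumes the default IP-based naming scheme seen in this thread (ip-a-b-c-d.<region>.compute.internal, with us-east-1 using ec2.internal). The function names are hypothetical.

```python
def expected_private_dns(private_ip, region):
    """Build the IP-based Private DNS name for an EC2 instance."""
    labels = "ip-" + private_ip.replace(".", "-")
    if region == "us-east-1":
        # us-east-1 uses the shorter ec2.internal suffix.
        return labels + ".ec2.internal"
    return f"{labels}.{region}.compute.internal"


def hostname_matches(hostname, private_ip, region):
    """True when `hostname` equals the expected Private DNS name."""
    return hostname.lower() == expected_private_dns(private_ip, region)


print(expected_private_dns("10.212.31.117", "eu-west-1"))
print(hostname_matches("osnode01.bdteam.local", "10.30.1.43", "us-west-1"))
```

On a running instance, the first argument could be the output of `socket.getfqdn()`. A custom name such as osnode01.bdteam.local does not match, which is the situation described in the original report.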

I'm looking forward to a fix or workaround that allows custom domains and hostnames with the AWS cloud provider.

Regards,
Pawel

bemnum on 19 Oct 2017

👍2

All 22 comments

Same error here since end of August with RHEL7.4 + Openshift Enterprise on AWS. (See also #5691)

Nodes are registered in the cluster with the AWS DNS domain suffix (.compute.internal) instead of the public_dns_domain we provide. Forcing the hostname in the inventory file with 'openshift_hostname' doesn't help.

Support request opened with Red Hat (case 01937377), still waiting for resolution.

$ oc get nodes
NAME                                         STATUS     AGE       VERSION
ip-10-0-132-148.eu-west-1.compute.internal   NotReady   15d       v1.6.1+5115d708d7
ip-10-0-132-201.eu-west-1.compute.internal   NotReady   15d       v1.6.1+5115d708d7
ip-10-0-132-38.eu-west-1.compute.internal    NotReady   15d       v1.6.1+5115d708d7
ip-10-0-133-100.eu-west-1.compute.internal   NotReady   15d       v1.6.1+5115d708d7
ip-10-0-133-173.eu-west-1.compute.internal   NotReady   15d       v1.6.1+5115d708d7
ip-10-0-134-180.eu-west-1.compute.internal   NotReady   15d       v1.6.1+5115d708d7
ip-10-0-134-31.eu-west-1.compute.internal    NotReady   15d       v1.6.1+5115d708d7

Same for networks.

$ oc get hostsubnets
NAME                                         HOST                                         HOST IP        SUBNET
ip-10-0-132-148.eu-west-1.compute.internal   ip-10-0-132-148.eu-west-1.compute.internal   10.0.132.148   172.16.14.0/23
ip-10-0-132-201.eu-west-1.compute.internal   ip-10-0-132-201.eu-west-1.compute.internal   10.0.132.201   172.16.10.0/23
ip-10-0-132-38.eu-west-1.compute.internal    ip-10-0-132-38.eu-west-1.compute.internal    10.0.132.38    172.16.0.0/23
ip-10-0-133-100.eu-west-1.compute.internal   ip-10-0-133-100.eu-west-1.compute.internal   10.0.133.100   172.16.12.0/23
ip-10-0-133-173.eu-west-1.compute.internal   ip-10-0-133-173.eu-west-1.compute.internal   10.0.133.173   172.16.16.0/23
ip-10-0-134-180.eu-west-1.compute.internal   ip-10-0-134-180.eu-west-1.compute.internal   10.0.134.180   172.16.6.0/23
ip-10-0-134-31.eu-west-1.compute.internal    ip-10-0-134-31.eu-west-1.compute.internal    10.0.134.31    172.16.8.0/23
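
A minimal sketch for checking whether a given node name has a subnet in output like the above. It assumes the SDN looks the subnet up under the configured host name, as the "Could not find an allocated subnet for node" log line in the original report suggests. The function names and `SAMPLE_OUTPUT` are hypothetical.

```python
def hostsubnet_names(oc_output):
    """Return the NAME column from `oc get hostsubnets` text output."""
    rows = [line.split() for line in oc_output.splitlines() if line.strip()]
    return [row[0] for row in rows[1:]]  # rows[0] is the header line


def has_subnet(node_name, oc_output):
    """True when a host subnet is registered under exactly this name."""
    return node_name in hostsubnet_names(oc_output)


# Header and first two rows copied from the output above.
SAMPLE_OUTPUT = """\
NAME                                         HOST                                         HOST IP        SUBNET
ip-10-0-132-148.eu-west-1.compute.internal   ip-10-0-132-148.eu-west-1.compute.internal   10.0.132.148   172.16.14.0/23
ip-10-0-132-201.eu-west-1.compute.internal   ip-10-0-132-201.eu-west-1.compute.internal   10.0.132.201   172.16.10.0/23
"""

print(has_subnet("ip-10-0-132-148.eu-west-1.compute.internal", SAMPLE_OUTPUT))
print(has_subnet("node1.example.com", SAMPLE_OUTPUT))  # hypothetical custom name
```

If a node starts under a custom name while its subnet is registered under the AWS name, the lookup by custom name finds nothing.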

patlachance on 7 Oct 2017

👍1

It seems to be a timing issue. I get similar error messages when installing on AWS. After the installation failed, I waited several minutes and was then able to start the origin-node service on the machine. After that, running the installation a second time seems to work.

j00p34 on 9 Oct 2017

@j00p34 I don't think I have a timing issue in my setup. I tried manually restarting the node service on each machine after a while, and it threw the same error.

apooniajjn on 9 Oct 2017

@j00p34 / @poonia0arun Same for me. Restarting the installation doesn't help.

patlachance on 10 Oct 2017

@poonia0arun Sorry to hear that. It would have been easy to work around otherwise. I must say that I am using the 3.7 alpha version of OpenShift; I haven't tried it with 3.6 yet. Another big difference I see in your config is that you're specifying AWS keys. I am using IAM roles for my instances, so they have rights to the AWS API without specifying keys:

```
# Create an OSEv3 group that contains the masters and nodes groups
[OSEv3:children]
masters
nodes

# Set variables common for all OSEv3 hosts
[OSEv3:vars]

# SSH user, this user should allow ssh based auth without requiring a password
ansible_ssh_user=centos

# If ansible_ssh_user is not root, ansible_become must be set to true
ansible_become=true

# Debug level for all OpenShift components (Defaults to 2)
debug_level=5

openshift_deployment_type=origin
use_manageiq=true
openshift_cfme_install_app=True
openshift_repos_enable_testing=True
openshift_disable_check=memory_availability,docker_storage,disk_availability,docker_image_availability
enable_excluders=false
openshift_hosted_logging_deploy=true
openshift_hosted_logging_storage_kind=dynamic

# AWS (Using IAM Profiles)
openshift_cloudprovider_kind=aws

# Note: IAM roles must exist before launching the instances.

# We need a wildcard DNS setup for our public access to services, fortunately
# we can use the superb xip.io to get one for free.
openshift_master_default_subdomain=pub.lic.ip.here.xip.io

# uncomment the following to enable htpasswd authentication; defaults to DenyAllPasswordIdentityProvider
openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', 'challenge': 'true', 'kind': 'HTPasswdPasswordIdentityProvider', 'filename': '/etc/origin/master/htpasswd'}]
openshift_master_htpasswd_users={'username': '$apr1$incrediblysecrethash/'}

# Uncomment the line below to enable metrics for the cluster.

# that as the node name.

[masters]
master.openshift.local

# host group for etcd
[etcd]
master.openshift.local

# host group for nodes, includes region info
[nodes]
master.openshift.local openshift_node_labels="{'region': 'infra', 'zone': 'default'}" openshift_schedulable=true
node1.openshift.local openshift_node_labels="{'region': 'primary', 'zone': 'east'}"
node2.openshift.local openshift_node_labels="{'region': 'primary', 'zone': 'west'}"
```
@patlachance Your setup is quite different, I guess, as you are on the Enterprise version.

I have used terraform to set up the machines and configure everything in AWS. I've got openshift running except for the registry. The registry can't start because it's trying to use a base image from docker hub that doesn't exist. I did find this:
aws-ansible

That seems to configure your complete environment so it could be a better option to know everything is configured ok. I think I'll look into that this week.

This is also an interesting read: reference architecture 3.6

j00p34 on 10 Oct 2017

@j00p34 You're right, I'm trying to install the Enterprise version, following the instructions from the link you provided. The only difference is that I'm trying to deploy OpenShift in a private VPC behind custom proxy/reverse proxy instances.

patlachance on 10 Oct 2017

@poonia0arun There's one thing I remember from a previous installation: when I provided openshift_hostname, my cluster couldn't start either. I can't remember exactly what the problem was, but it had something to do with Kubernetes resolving the hostname while the node names are different. Maybe you should try it without the openshift_hostname=osmaster01.bdteam.local settings. You get the AWS names then, but it worked for me.

j00p34 on 10 Oct 2017

@j00p34 If I run my ansible playbook without the openshift_hostname value, the API on the master doesn't restart because it tries to resolve the hostname ip-10-30-1-248.bdteam.local, which is not a record on my DNS server, so the API service on the master fails.

[root@osmaster01 centos]# systemctl status origin-master-api.service
● origin-master-api.service - Atomic OpenShift Master API
   Loaded: loaded (/usr/lib/systemd/system/origin-master-api.service; enabled; vendor preset: disabled)
   Active: activating (start) since Tue 2017-10-10 16:53:28 UTC; 23s ago
     Docs: https://github.com/openshift/origin
 Main PID: 31254 (openshift)
   Memory: 25.4M
   CGroup: /system.slice/origin-master-api.service
           └─31254 /usr/bin/openshift start master api --config=/etc/origin/master/master-config.yaml --loglevel=2 --listen=https://0.0.0.0:8443 --master=https://ip-10-30-1-27.bdteam.local:8443

Oct 10 16:53:38 osmaster01.bdteam.local openshift[31254]: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup ip-10-30-1-248.bdteam.local: no such host"; Reconnecting to {ip-10-30-1-2...m.local:2379 <nil>}
Oct 10 16:53:42 osmaster01.bdteam.local openshift[31254]: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup ip-10-30-1-248.bdteam.local: no such host"; Reconnecting to {ip-10-30-1-2...m.local:2379 <nil>}
Oct 10 16:53:43 osmaster01.bdteam.local openshift[31254]: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup ip-10-30-1-248.bdteam.local: no such host"; Reconnecting to {ip-10-30-1-2...m.local:2379 <nil>}
Oct 10 16:53:44 osmaster01.bdteam.local openshift[31254]: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup ip-10-30-1-248.bdteam.local: no such host"; Reconnecting to {ip-10-30-1-2...m.local:2379 <nil>}
Oct 10 16:53:44 osmaster01.bdteam.local openshift[31254]: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup ip-10-30-1-248.bdteam.local: no such host"; Reconnecting to {ip-10-30-1-2...m.local:2379 <nil>}
Oct 10 16:53:44 osmaster01.bdteam.local openshift[31254]: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup ip-10-30-1-248.bdteam.local: no such host"; Reconnecting to {ip-10-30-1-2...m.local:2379 <nil>}
Oct 10 16:53:44 osmaster01.bdteam.local openshift[31254]: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup ip-10-30-1-248.bdteam.local: no such host"; Reconnecting to {ip-10-30-1-2...m.local:2379 <nil>}
Oct 10 16:53:44 osmaster01.bdteam.local openshift[31254]: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup ip-10-30-1-248.bdteam.local: no such host"; Reconnecting to {ip-10-30-1-2...m.local:2379 <nil>}
Oct 10 16:53:45 osmaster01.bdteam.local openshift[31254]: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup ip-10-30-1-248.bdteam.local: no such host"; Reconnecting to {ip-10-30-1-2...m.local:2379 <nil>}
Oct 10 16:53:51 osmaster01.bdteam.local openshift[31254]: grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup ip-10-30-1-248.bdteam.local: no such host"; Reconnecting to {ip-10-30-1-2...m.local:2379 <nil>}
Hint: Some lines were ellipsized, use -l to show in full.
[root@osmaster01 centos]#

I am using an ELB in front of my HA pair of masters.
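
Before rerunning the playbook, it may help to check which node names the resolver actually knows. The following is a minimal sketch, not part of the installer: `unresolvable` is a hypothetical helper name, and the `resolve` parameter is only there so the lookup can be swapped out for testing.

```python
import socket
from typing import Callable, Iterable, List


def unresolvable(hostnames: Iterable[str],
                 resolve: Callable[[str], str] = socket.gethostbyname) -> List[str]:
    """Return the hostnames that the resolver cannot look up."""
    missing = []
    for name in hostnames:
        try:
            resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing
```

Calling `unresolvable(["ip-10-30-1-248.bdteam.local"])` on the master would return that name while the "no such host" error in the log above persists, and an empty list once a matching DNS or hosts-file record exists.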

apooniajjn on 10 Oct 2017

I fixed my problem, seems unrelated after all. I am running 3.7 and I noticed origin-master-controllers.service was crash looping because in this version you need to set ClusterID when in aws. While running the playbook I added

[Global]
KubernetesClusterTag=mytestcluster
KubernetesClusterID=mytestcluster

to /etc/origin/cloudprovider/aws.conf

After that the install proceeded without a problem. The reason it worked after a while was probably because I was starting it at the right moment
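
For anyone scripting this step, here is a minimal sketch that renders the same `[Global]` section as text. `render_aws_conf` is a hypothetical helper name; the sketch only builds the string and leaves writing `/etc/origin/cloudprovider/aws.conf` (and merging with whatever the playbook already put there) to the caller.

```python
import configparser
import io


def render_aws_conf(cluster_id: str) -> str:
    """Render a [Global] section with the two cluster keys from the comment above."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the mixed-case key names as written
    parser["Global"] = {
        "KubernetesClusterTag": cluster_id,
        "KubernetesClusterID": cluster_id,
    }
    buf = io.StringIO()
    parser.write(buf, space_around_delimiters=False)
    return buf.getvalue()
```

`render_aws_conf("mytestcluster")` produces the three lines shown above, followed by a blank line.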

j00p34 on 11 Oct 2017

@j00p34 .. oh okay.. I can't find the right solution for this ... I am still waiting for a solution

apooniajjn on 11 Oct 2017

@abutcher @sdodson any pointers for this issue?

rushabh268 on 12 Oct 2017

@sdodson do you happen to have any pointer on this issue ?

apooniajjn on 18 Oct 2017

Private DNS: ip-10-212-31-117.eu-west-1.compute.internal

To do so, you must have the VPC DHCP options configured with an empty domain-name, e.g.:

{
  "DhcpOptions": [
    {
      "DhcpConfigurations": [
        {
          "Values": [ { "Value": "AmazonProvidedDNS" } ],
          "Key": "domain-name-servers"
        }
      ],
      "DhcpOptionsId": "dopt-<lkjlkfdj>"
    }
  ]
}
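
A quick way to confirm that no custom domain-name is set is to scan the DHCP options JSON for a `domain-name` key. The sketch below assumes the JSON has the shape shown above; `domain_names` is a hypothetical helper name, and the sample reuses the placeholder option-set ID from the comment unchanged.

```python
import json
from typing import List

# Sample taken from the comment above; the ID is the comment's own placeholder.
SAMPLE = """
{ "DhcpOptions": [ { "DhcpConfigurations": [
    { "Values": [ { "Value": "AmazonProvidedDNS" } ], "Key": "domain-name-servers" }
  ], "DhcpOptionsId": "dopt-<lkjlkfdj>" } ] }
"""


def domain_names(dhcp_options_json: str) -> List[str]:
    """Collect every 'domain-name' value found in the DHCP options JSON."""
    doc = json.loads(dhcp_options_json)
    found = []
    for option_set in doc.get("DhcpOptions", []):
        for config in option_set.get("DhcpConfigurations", []):
            if config.get("Key") == "domain-name":
                found.extend(v["Value"] for v in config.get("Values", []))
    return found
```

An empty result for `domain_names(SAMPLE)` means the option set carries no custom domain, which is the configuration described here.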

The hostname in CentOS Linux must be the same as above: ip-10-212-31-117.eu-west-1.compute.internal.

The following commands also must return ip-10-212-31-117.eu-west-1.compute.internal:

A similar problem is also mentioned in this issue: https://github.com/kubernetes/kubernetes/issues/11543

I'm looking forward to a fix or workaround that allows a custom domain and hostnames when using the aws cloud provider.

Regards,
Pawel

bemnum on 19 Oct 2017

One of my colleagues spent some time on this issue. He suggested creating an A record in Route 53 named ip-X-X-X-X.local.domain for each master and node and assigning the matching IP to each record. In my setup I am using an ELB in front of the masters, so I created a classic load balancer listening on port 8443 of each master.

I made three changes to make it work on my current setup, even though I still can't use a proper custom hostname:

Create a Route 53 A record for ip-X-X-X-X.local.domain
Create local hosts-file records so the AWS private DNS names are resolvable locally
Set openshift_hostname to the AWS private DNS name

Hosts file

[root@osmaster01 master]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.30.1.121 ip-10-30-1-121.us-west-1.compute.internal ip-10-30-1-121.bdteam.local ip-10-30-1-121
10.30.2.212 ip-10-30-2-212.us-west-1.compute.internal ip-10-30-2-212.bdteam.local ip-10-30-2-212
10.30.1.64  ip-10-30-1-64.us-west-1.compute.internal  ip-10-30-1-64.bdteam.local ip-10-30-1-64
10.30.2.235 ip-10-30-2-235.us-west-1.compute.internal ip-10-30-2-235.bdteam.local ip-10-30-2-235
10.30.1.221 ip-10-30-1-221.us-west-1.compute.internal ip-10-30-1-221.bdteam.local ip-10-30-1-221
10.30.2.209 ip-10-30-2-209.us-west-1.compute.internal ip-10-30-2-209.bdteam.local ip-10-30-2-209
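
The hosts entries above follow a regular pattern, so they can be generated instead of typed. This is a minimal sketch with hypothetical helper names; it relies on the EC2 private DNS naming scheme (`ip-a-b-c-d.<region>.compute.internal`, or `ip-a-b-c-d.ec2.internal` in us-east-1) and takes the custom domain as a parameter.

```python
import ipaddress


def private_dns_name(ip: str, region: str) -> str:
    """Build the EC2-style private DNS name for a private IPv4 address."""
    ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    suffix = "ec2.internal" if region == "us-east-1" else f"{region}.compute.internal"
    return f"ip-{ip.replace('.', '-')}.{suffix}"


def hosts_line(ip: str, region: str, custom_domain: str) -> str:
    """Render one /etc/hosts line: IP, private DNS name, custom-domain alias, short name."""
    short = "ip-" + ip.replace(".", "-")
    return f"{ip} {private_dns_name(ip, region)} {short}.{custom_domain} {short}"


for addr in ["10.30.1.121", "10.30.2.212"]:
    print(hosts_line(addr, "us-west-1", "bdteam.local"))
```

The printed lines match the first two entries of the hosts file above.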

Inventory File:

[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
ansible_ssh_user=root
openshift_master_cluster_method=native
openshift_master_cluster_hostname=openshift-master.bdteam.local
openshift_master_cluster_public_hostname=openshift-master.bdteam.local
openshift_master_default_subdomain=apps.bdteam.local
openshift_clock_enabled=true
openshift_hosted_manage_registry=false
openshift_hosted_manage_router=false
openshift_override_hostname_check=true
deployment_type=openshift-enterprise
openshift_disable_check=memory_availability,disk_availability,docker_storage
openshift_cloudprovider_kind=aws
openshift_cloudprovider_aws_access_key=XXXXXXXXX
openshift_cloudprovider_aws_secret_key=XXXXXXXX

[nodes]
10.30.1.121  openshift_hostname=ip-10-30-1-121.us-west-1.compute.internal
10.30.2.212  openshift_hostname=ip-10-30-2-212.us-west-1.compute.internal
10.30.1.64   openshift_hostname=ip-10-30-1-64.us-west-1.compute.internal  openshift_node_labels="{'region': 'infra', 'zone': 'west'}"
10.30.2.235  openshift_hostname=ip-10-30-2-235.us-west-1.compute.internal openshift_node_labels="{'region': 'infra', 'zone': 'west'}"
10.30.1.221  openshift_hostname=ip-10-30-1-221.us-west-1.compute.internal openshift_node_labels="{'region': 'primary', 'zone': 'west'}"
10.30.2.209  openshift_hostname=ip-10-30-2-209.us-west-1.compute.internal openshift_node_labels="{'region': 'primary', 'zone': 'west'}"

[masters]
10.30.1.121  openshift_hostname=ip-10-30-1-121.us-west-1.compute.internal
10.30.2.212  openshift_hostname=ip-10-30-2-212.us-west-1.compute.internal

[etcd]
10.30.1.121  openshift_hostname=ip-10-30-1-121.us-west-1.compute.internal
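
The per-host lines in this inventory can be produced the same way. A minimal sketch, assuming the host's private DNS name is already known; `inventory_line` is a hypothetical helper and emits single spaces rather than the aligned columns used above.

```python
from typing import Dict, Optional


def inventory_line(ip: str, hostname: str,
                   labels: Optional[Dict[str, str]] = None) -> str:
    """Render one inventory host line with openshift_hostname and optional node labels."""
    parts = [ip, f"openshift_hostname={hostname}"]
    if labels:
        rendered = ", ".join(f"'{key}': '{value}'" for key, value in labels.items())
        parts.append(f'openshift_node_labels="{{{rendered}}}"')
    return " ".join(parts)


print(inventory_line(
    "10.30.1.64",
    "ip-10-30-1-64.us-west-1.compute.internal",
    {"region": "infra", "zone": "west"},
))
```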

No error occurred:

[root@osmaster01 master]# oc get nodes
NAME                                        STATUS                     AGE       VERSION
ip-10-30-1-121.us-west-1.compute.internal   Ready,SchedulingDisabled   59m       v1.6.1+5115d708d7
ip-10-30-1-221.us-west-1.compute.internal   Ready                      59m       v1.6.1+5115d708d7
ip-10-30-1-64.us-west-1.compute.internal    Ready                      59m       v1.6.1+5115d708d7
ip-10-30-2-209.us-west-1.compute.internal   Ready                      59m       v1.6.1+5115d708d7
ip-10-30-2-212.us-west-1.compute.internal   Ready,SchedulingDisabled   59m       v1.6.1+5115d708d7
ip-10-30-2-235.us-west-1.compute.internal   Ready                      59m       v1.6.1+5115d708d7

Hopefully this will help someone who is still trying to make it work.

apooniajjn on 27 Oct 2017

@liggitt Not sure if you got any notification, since the referenced issue is closed.

Any chance you can comment on whether what I've done here should be the expected fix for the current issue?

I'll try to see whether this makes any difference and report back.

DanyC97 on 18 Mar 2018

@DanyC97 the kubeletPreferredAddressTypes arg goes in master config under apiserver arguments
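
As an illustration of where such a setting would live, the sketch below adds the argument to a master config that has already been loaded into a Python dict. It assumes the upstream kube-apiserver flag name `kubelet-preferred-address-types` (hyphenated, without leading dashes) under `kubernetesMasterConfig.apiServerArguments`; the helper name and the address order are only examples. None of this is confirmed in this thread, so verify it against your OpenShift version.

```python
from typing import Any, Dict, List


def set_kubelet_address_types(master_config: Dict[str, Any],
                              types: List[str]) -> Dict[str, Any]:
    """Set the assumed API server argument on an in-memory master config."""
    kube = master_config.setdefault("kubernetesMasterConfig", {})
    api_args = kube.setdefault("apiServerArguments", {})
    # Assumption: argument values are lists of strings, here one comma-joined entry.
    api_args["kubelet-preferred-address-types"] = [",".join(types)]
    return master_config


config = set_kubelet_address_types({}, ["InternalIP", "Hostname"])
print(config)
```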

liggitt on 18 Mar 2018

Thanks a bunch @liggitt, I'll give it a try and report back.

Initially I tried https://github.com/kubernetes/kubernetes/issues/11543#issuecomment-373978371 but without much luck.

DanyC97 on 19 Mar 2018

@liggitt Something is not right. I applied the "kubeletPreferredAddressTypes" change as suggested and saw the following error:
Mar 19 19:46:05 ip-10-0-0-197 origin-master-controllers: Invalid MasterConfig /etc/origin/master/master-config.yaml
Mar 19 19:46:05 ip-10-0-0-197 origin-master-controllers: flag: Invalid value: "kubeletPreferredAddressTypes": is not a valid flag
Mar 19 19:46:05 ip-10-0-0-197 systemd: origin-master-controllers.service: main process exited, code=exited, status=255/n/a
Mar 19 19:46:05 ip-10-0-0-197 systemd: Failed to start Atomic OpenShift Master Controllers.

Any ideas?

DanyC97 on 19 Mar 2018

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot on 22 May 2020

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot on 21 Jun 2020

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

openshift-bot on 21 Jul 2020

@openshift-bot: Closing this issue.
