Unable to create vSphere storage with origin 3.9.0
Error-Message: "Kubernetes node nodeVmDetail details is empty. nodeVmDetails : []"
```
oc v3.9.0+ba7faec-1
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://cp-lb-01.cloud.mycompany.com:443
openshift v3.9.0+ba7faec-1
kubernetes v1.9.1+a0ce1bc657
```
I found this bug ticket.
The fix for this bug consists of the ClusterRole "system:vsphere-cloud-provider" and the ClusterRoleBinding "system:vsphere-cloud-provider", so I have listed the contents of my current ClusterRole and ClusterRoleBinding below.
These issues may be related to my problem:
https://github.com/kubernetes/kubernetes/issues/58927
https://github.com/vmware/kubernetes/issues/450
If that issue is related, then the fix is in Kubernetes 1.9.4 with this commit.
I experimented a lot with the OpenShift configuration after the Ansible deployment, so I am including all relevant snippets.
I rewrote my configuration to the new style, using this documentation.
```
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRole
  metadata:
    annotations:
      authorization.openshift.io/system-only: "true"
      openshift.io/reconcile-protect: "false"
      rbac.authorization.kubernetes.io/autoupdate: "true"
    creationTimestamp: 2018-04-26T16:32:27Z
    labels:
      kubernetes.io/bootstrapping: rbac-defaults
    name: system:vsphere-cloud-provider
    namespace: ""
    resourceVersion: "1675333"
    selfLink: /apis/rbac.authorization.k8s.io/v1/clusterroles/system%3Avsphere-cloud-provider
    uid: 6896110e-496f-11e8-a170-00505694394e
  rules:
  - apiGroups:
    - ""
    resources:
    - nodes
    verbs:
    - get
    - list
    - watch
  - apiGroups:
    - ""
    resources:
    - events
    verbs:
    - create
    - patch
    - update
- apiVersion: rbac.authorization.k8s.io/v1
  kind: ClusterRoleBinding
  metadata:
    annotations:
      openshift.io/reconcile-protect: "false"
      rbac.authorization.kubernetes.io/autoupdate: "true"
    creationTimestamp: 2018-04-26T16:32:27Z
    labels:
      kubernetes.io/bootstrapping: rbac-defaults
    name: system:vsphere-cloud-provider
    namespace: ""
    resourceVersion: "1674944"
    selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/system%3Avsphere-cloud-provider
    uid: 6897dfcb-496f-11e8-a170-00505694394e
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: ClusterRole
    name: system:vsphere-cloud-provider
  subjects:
  - kind: ServiceAccount
    name: vsphere-cloud-provider
    namespace: kube-system
```
...
```
kubernetesMasterConfig:
  apiServerArguments:
    cloud-provider:
    - "vsphere"
    cloud-config:
    - "/etc/origin/cloudprovider/vsphere.conf"
    runtime-config:
    - apis/settings.k8s.io/v1alpha1=true
    storage-backend:
    - etcd3
    storage-media-type:
    - application/vnd.kubernetes.protobuf
  controllerArguments:
    cloud-config:
    - /etc/origin/cloudprovider/vsphere.conf
    cloud-provider:
    - vsphere
```
...
/etc/origin/cloudprovider/vsphere.conf
```
[Global]
user = "MyAdminUser"
password = "MySuperSecurePassword"
port = "443"
insecure-flag = "1"
datacenters = "OCP-Datacenter"
datastore = "iscsi-hdd"
[VirtualCenter "10.y.y.xxx"]
[Workspace]
server = "10.y.y.xxx"
datacenter = "OCP-Datacenter"
default-datastore = "iscsi-hdd"
folder = "/OCP-Datacenter/vm"
[Disk]
scsicontrollertype = pvscsi
```
```
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
  creationTimestamp: 2018-04-26T16:25:02Z
  name: slow
  resourceVersion: "43413"
  selfLink: /apis/storage.k8s.io/v1/storageclasses/slow
  uid: 5ee47fb3-496e-11e8-a170-00505694394e
parameters:
  datastore: iscsi-hdd
  diskformat: thin
  fstype: ext3
provisioner: kubernetes.io/vsphere-volume
reclaimPolicy: Delete
```
...
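For anyone reproducing this, here is a sketch of a minimal PVC against the `slow` class above. The claim name and the 3Gi size mirror the failing claim from the logs but are otherwise arbitrary, and the target namespace in the `oc apply` hint is an assumption:

```shell
# Write a minimal PVC manifest targeting the "slow" StorageClass above.
# Name, namespace, and size are illustrative, not required values.
cat <<'EOF' > test-storage-pvc.yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-storage
spec:
  storageClassName: slow
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
EOF

# Apply it (assumes an existing project/namespace called test-storage):
#   oc apply -n test-storage -f test-storage-pvc.yaml
```

As soon as the claim is pending, the controller logs below show the provisioner picking it up.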
```
kubeletArguments:
  cloud-provider:
  - "vsphere"
```
...
Provisioning Failed: Failed to provision volume with StorageClass "fast": Kubernetes node nodeVmDetail details is empty. nodeVmDetails : []
Log from origin-master-controllers:
```
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.482751 2728 pv_controller_base.go:402] resyncing PV controller
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.482821 2728 pv_controller_base.go:529] storeObjectUpdate updating claim "openshift-ansible-service-broker/etcd" with version 7264
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.482844 2728 pv_controller.go:228] synchronizing PersistentVolumeClaim[openshift-ansible-service-broker/etcd]: phase: Pending, bound to: "", bindCompleted: false, boundByController: false
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.482865 2728 pv_controller.go:310] synchronizing unbound PersistentVolumeClaim[openshift-ansible-service-broker/etcd]: no volume found
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.482892 2728 pv_controller.go:648] updating PersistentVolumeClaim[openshift-ansible-service-broker/etcd] status: set phase Pending
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.482907 2728 pv_controller.go:693] updating PersistentVolumeClaim[openshift-ansible-service-broker/etcd] status: phase Pending already set
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.482921 2728 pv_controller_base.go:529] storeObjectUpdate updating claim "test-storage/test-storage" with version 1875650
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.482932 2728 pv_controller.go:228] synchronizing PersistentVolumeClaim[test-storage/test-storage]: phase: Pending, bound to: "", bindCompleted: false, boundByController: false
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.482940 2728 pv_controller.go:310] synchronizing unbound PersistentVolumeClaim[test-storage/test-storage]: no volume found
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.482947 2728 pv_controller.go:1315] provisionClaim[test-storage/test-storage]: started
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.482954 2728 pv_controller.go:1523] scheduleOperation[provision-test-storage/test-storage[0aa90544-4eb0-11e8-a35a-005056943169]]
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.482975 2728 pv_controller.go:1334] provisionClaimOperation [test-storage/test-storage] started, class: "slow"
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.483463 2728 event.go:218] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"openshift-ansible-service-broker", Name:"etcd", UID:"e65f983f-4953-11e8-bfa6-00505694394e", APIVersion:"v1", ResourceVersion:"7264", FieldPath:""}): type: 'Normal' reason: 'FailedBinding' no persistent volumes available for this claim and no storage class is set
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.493299 2728 vsphere_volume_util.go:114] Setting fstype as "ext3"
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.493314 2728 vsphere_volume_util.go:137] VSANStorageProfileData in vsphere volume ""
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.493330 2728 vsphere.go:1007] Starting to create a vSphere volume with volumeOptions: &{CapacityKB:3145728 Tags:map[kubernetes.io/created-for/pvc/namespace:test-storage kubernetes.io/created-for/pvc/name:test-storage kubernetes.io/created-for/pv/name:pvc-0aa90544-4eb0-11e8-a35a-005056943169] Name:kubernetes-dynamic-pvc-0aa90544-4eb0-11e8-a35a-005056943169 DiskFormat:thin Datastore:iscsi-hdd VSANStorageProfileData: StoragePolicyName: StoragePolicyID: SCSIControllerType:}
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: E0503 10:58:19.505559 2728 vsphere_util.go:199] Kubernetes node nodeVmDetail details is empty. nodeVmDetails : []
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: E0503 10:58:19.505581 2728 vsphere.go:1059] Failed to get shared datastore: Kubernetes node nodeVmDetail details is empty. nodeVmDetails : []
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.505596 2728 vsphere.go:1111] The canonical volume path for the newly created vSphere volume is ""
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.505612 2728 pv_controller.go:1425] failed to provision volume for claim "test-storage/test-storage" with StorageClass "slow": Kubernetes node nodeVmDetail details is empty. nodeVmDetails : []
Mai 03 10:58:19 cp-master-01 origin-master-controllers[2728]: I0503 10:58:19.505943 2728 event.go:218] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"test-storage", Name:"test-storage", UID:"0aa90544-4eb0-11e8-a35a-005056943169", APIVersion:"v1", ResourceVersion:"1875650", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' Failed to provision volume with StorageClass "slow": Kubernetes node nodeVmDetail details is empty. nodeVmDetails : []
```
Expected result: PV and PVC creation should succeed.
@openshift/sig-storage
@jsafrane assigning you directly since you were involved with the attached BZ; let's just make sure the fix is in master for Origin.
Just to let you know that I am facing a similar problem. I installed the vSphere provider with Ansible; I am not sure that is the proper way to do it, though.
```
[OSEv3:vars]
...
openshift_cloudprovider_kind='vsphere'
openshift_cloudprovider_vsphere_username='[email protected]'
openshift_cloudprovider_vsphere_password='S3cr3t!'
openshift_cloudprovider_vsphere_host='vcsa-1.lss1.domain.tld'
openshift_cloudprovider_vsphere_datacenter='Datacenter'
openshift_cloudprovider_vsphere_datastore='datastore2'
```
```
oc v3.9.0+ba7faec-1
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://console.oshift.lss1.domain.tld:8443
openshift v3.9.0+ba7faec-1
kubernetes v1.9.1+a0ce1bc657
```
```
May 8 15:52:04 master origin-master-controllers: I0508 15:52:04.375030 22540 vsphere.go:1007] Starting to create a vSphere volume with volumeOptions: &{CapacityKB:1024 Tags:map[kubernetes.io/created-for/pv/name:pvc-fd6b880f-52c6-11e8-a0bc-005056b9ed4a kubernetes.io/created-for/pvc/namespace:my-project-olc kubernetes.io/created-for/pvc/name:my-storage] Name:kubernetes-dynamic-pvc-fd6b880f-52c6-11e8-a0bc-005056b9ed4a DiskFormat: Datastore:datastore2 VSANStorageProfileData: StoragePolicyName: StoragePolicyID: SCSIControllerType:}
May 8 15:52:04 master origin-master-controllers: E0508 15:52:04.383825 22540 vsphere_util.go:199] Kubernetes node nodeVmDetail details is empty. nodeVmDetails : []
May 8 15:52:04 master origin-master-controllers: E0508 15:52:04.383877 22540 vsphere.go:1059] Failed to get shared datastore: Kubernetes node nodeVmDetail details is empty. nodeVmDetails : []
```
Opened https://github.com/openshift/origin/pull/19648 to remove the need for client access from the VMware cloud provider altogether.
@gnufied Why do you think that PR #19648 fixes this problem? Do I have a problem with the node-to-vSphere connection?
@ReadmeCritic The linked BZ was caused by the vSphere cloud provider being unable to fetch node info from the API server. It is possible that this BZ is different, so I am going to try to isolate that.
Seeing the exact same issue as @Reamer on both OCP (3.9.27) and Origin (v3.9.0+ba7faec-1) deployments. I'm working on testing OCP 3.9.30, since it's supposed to be fixed there, but I haven't made any progress on diagnosing/fixing the issue with Origin installs.
Confirmed the same issue exists on OCP 3.9.30
So I was able to work around the issue by forcing an older hardware version (11) for my VM:

Change `virtualHW.version = "13"` to `virtualHW.version = "11"`. After this, `cat /sys/class/dmi/id/product_uuid` matches `cat /sys/class/dmi/id/product_serial`. This seems related to https://github.com/kubernetes/kubernetes/pull/59602 and should be fixed in k8s 1.9.4, but not in the 1.9.1 shipping with OCP 3.9.27 or 3.9.30.
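The UUID/serial comparison from the workaround above can be sketched as a small shell helper. The `VMware-...` serial format and the normalization step (strip the prefix, spaces, and dashes) are illustrative assumptions, not the cloud provider's exact logic:

```shell
# On the VM itself, print both DMI identity files (requires readable sysfs):
#   cat /sys/class/dmi/id/product_uuid
#   cat /sys/class/dmi/id/product_serial

# Hypothetical normalization of a VMware-style product_serial so it can be
# compared against product_uuid: drop the "VMware-" prefix, spaces, and dashes.
normalize_serial() {
  echo "$1" | sed -e 's/^VMware-//' -e 's/[ -]//g'
}

# Made-up example serial string:
normalize_serial "VMware-42 0e 11 22 33 44 55 66-77 88 99 aa bb cc dd ee"
# prints 420e112233445566778899aabbccddee
```

If the normalized serial does not line up with `product_uuid`, that mismatch would be consistent with the hardware-version symptom described above.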
@liveaverage Thanks for the description of your workaround. I'll try it.
@liveaverage Thanks for your workaround. It seems to work correctly.
I updated to OKD 3.10 and it works with the newest VM hardware version 14. Thanks for your help.
```
oc v3.10.0+0c4577e-1
kubernetes v1.10.0+b81c8f8
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://s-cp-lb-01.cloud.example.de:443
openshift v3.10.0+7eee6f8-2
kubernetes v1.10.0+b81c8f8
```