Is this a request for help?: no
Is this a BUG REPORT or FEATURE REQUEST? (choose one): bug report
Version of Helm and Kubernetes:
Helm 2.13.0
Kubernetes 1.12.5-gke.5 (Google Kubernetes Engine)
Which chart:
stable/grafana
What happened:
Upgrade fails due to Multi-Attach error
What you expected to happen:
Successful upgrade of chart
How to reproduce it (as minimally and precisely as possible):
Change a value in the values file and run upgrade command
Anything else we need to know:
Both pods are running. The old one doesn't terminate, and the new one can't start because of a Multi-Attach error:
grafana-6464546845-b7c56 1/1 Running 0 44m
grafana-667dd55b9f-dhsmj 0/1 Init:0/2 0 5m44s
Deleting one or both pods doesn't help; they both restart and run into the same error:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 5m41s default-scheduler Successfully assigned grafana/grafana-667dd55b9f-dhsmj to gke-primary-default-35377a27-skws
Warning FailedAttachVolume 5m41s attachdetach-controller Multi-Attach error for volume "pvc-07f60d3b-43ba-11e9-82b4-4201ac100006" Volume is already used by pod(s) grafana-6464546845-b7c56
Warning FailedMount 80s (x2 over 3m38s) kubelet, gke-primary-default-35377a27-skws Unable to mount volumes for pod "grafana-667dd55b9f-dhsmj_grafana(46d186ed-43c7-11e9-82b4-4201ac100006)": timeout expired waiting for volumes to attach or mount for pod "grafana"/"grafana-667dd55b9f-dhsmj". list of unmounted volumes=[storage]. list of unattached volumes=[config dashboards-default storage grafana-token-962dt]
Workaround: set deploymentStrategy to Recreate
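For chart versions of that era, where the parameter was still a plain string, a minimal values override might look like this (sketch, not verified against every chart version):

```yaml
# values.yaml override (assumption: pre-#14863 chart, string-typed parameter).
# Recreate terminates the old pod before the new one starts, so the
# ReadWriteOnce PersistentVolume is detached before it is re-attached.
deploymentStrategy: Recreate
```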
Exact same symptom here:
The old pod's status remained Running after helm upgrade, and the newly created pod eventually timed out because the requested PV was never released.
timeout expired waiting for volumes to attach or mount for pod
+1 Using deploymentStrategy: Recreate works around the problem. Thanks @arthurk !
Another solution may be to run Grafana with >1 replicas, and adjust the rolling update settings accordingly. wdyt?
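For what it's worth, a multi-replica setup would only work if the storage class supports ReadWriteMany; GKE's default persistent disks are ReadWriteOnce, so both replicas could never mount the same volume. A hypothetical values sketch (map-typed deploymentStrategy, assuming an RWX-capable storage class is available):

```yaml
# Hypothetical sketch -- requires a ReadWriteMany storage class
# (e.g. Filestore/NFS on GKE); default GCE PDs are ReadWriteOnce.
replicas: 2
deploymentStrategy:
  type: RollingUpdate
  rollingUpdate:
    maxUnavailable: 1
    maxSurge: 0
persistence:
  enabled: true
  accessModes:
    - ReadWriteMany
```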
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
To me it is common sense to set deploymentStrategy to Recreate if I am using PersistentVolumeClaims in RWO mode with a Deployment.
See https://kubernetes.io/docs/tasks/run-application/run-single-instance-stateful-application/ .
Is Grafana designed to be run multiple times in a cluster?
If so, the helm chart could be changed from a Deployment to a StatefulSet, though that might involve more work to connect and sync those instances.
See https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/
This often breaks our CD pipeline, as helm upgrade fails while waiting for Grafana to finish upgrading, leaving Helm's release state out of sync afterwards.
For the record, since #14863 the deploymentStrategy parameter is a map type.
So the modified workaround in the config will look like this:
deploymentStrategy:
  type: Recreate
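If it helps, the same map-typed setting can also be passed on the command line; the release name "grafana" below is just an example:

```shell
# Hypothetical invocation against an existing release named "grafana"
helm upgrade grafana stable/grafana --set deploymentStrategy.type=Recreate
```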
This issue is being automatically closed due to inactivity.
This must be reopened:
Events:
Type     Reason              Age    From                     Message
----     ------              ----   ----                     -------
Normal   Scheduled           3m39s  default-scheduler        Successfully assigned monitoring/loki-stack-grafana-74cbb68cdb-vmp9l to ip-10-140-99-28.us-east-2.compute.internal
Warning  FailedAttachVolume  3m39s  attachdetach-controller  Multi-Attach error for volume "pvc-8eb03441-ced7-44bb-92b8-bfdc76407bd6" Volume is already used by pod(s) loki-stack-grafana-96c4d988f-5fwdz
Warning  FailedMount         96s    kubelet, ip-10-140-99-28.us-east-2.compute.internal  Unable to attach or mount volumes: unmounted volumes=[storage], unattached volumes=[loki-stack-grafana-token-rnn5g config sc-datasources-volume storage]: timed out waiting for the condition