The _Deploy the updated application_ section states the following:

> To provide maximum uptime, multiple instances of the application pod must be running.

The section then goes on to state the following:

> If you don't have multiple front-end pods, scale the _azure-vote-front_ deployment as follows...
From the doc, it seems pretty clear that you must have more than one pod in a deployment running in order to handle a rolling update. If you don't have more than one pod running, you need to manually execute the scale command to scale the deployment up.
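For reference, the scale step the doc describes is roughly the following (the deployment name and replica count come from the tutorial; adjust them for your own app):

```
# Manually scale the front-end deployment up so a rolling update has headroom
kubectl scale deployment azure-vote-front --replicas=3

# Verify the additional pods come online
kubectl get pods
```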
However, my team has noticed different behavior while using AKS: if a deployment only has one pod running and an update is triggered, then AKS automatically scales the deployment up to two pods temporarily in order to execute a rolling update.
Is this behavior intended? I think it's a useful feature but, to my knowledge, it is not called out in the AKS documentation. If it is intended, then I believe this doc should be updated to reflect that behavior.
Thanks for reaching out. If you still have issues just let us know.
Reopening this issue after confirming that it occurs with the command given in the doc.
@MicrosoftDocs/aks-pm @jluk @mlearned is this expected behavior? I can add it to the doc if needed.
More correctly, the current AKS upgrade procedure does not actually scale up the deployments - what's being observed is the cordon, drain and deletion of each node.
AKS fires a cordon and drain to reschedule the pods to a new node - it blocks for a maximum of 20 minutes waiting for all work to be rescheduled. Since a cordon/drain and rescheduling of the work can take an indeterminate amount of time (how big are your container images? are PVCs in play?), after the 20-minute timer expires, AKS continues to decommission that node and bring its replacement online.
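In other words, for each node the upgrade does roughly the equivalent of the following (the node name here is just an example):

```
# Mark the node unschedulable so no new pods land on it
kubectl cordon aks-nodepool1-12345678-0

# Evict the existing pods so the scheduler places them on another node
kubectl drain aks-nodepool1-12345678-0 --ignore-daemonsets
```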
The doc is accurate: Kubernetes best practices state that you should have more than one pod for every deployment for high availability - as noted, workload rescheduling time is highly dependent on the type of workload. This means that while you have only one pod running, the cordon and drain will trigger a reschedule, and during that period your app is offline.
Azure-vote is simple enough that you wouldn't see the variable workload rescheduling time, and two pods would be online at once.
Thanks for your reply. I'm not sure if we're on the same page, however.
The behavior that I'm referring to is observed during a container image update. Example: I've got a deployment whose pods have an older image version and I want to update the pods to use a newer image version. This workflow is described here.
If I'm understanding your reply, you're referring to an actual cluster upgrade, as described here.
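Concretely, the update that triggers this is roughly the following (the deployment, container, and image names are just placeholders):

```
# Point the deployment at a new image tag; this kicks off a rolling update
kubectl set image deployment my-deployment my-container=myregistry.azurecr.io/my-image:v2

# Watch the old pod get replaced by a new one
kubectl rollout status deployment my-deployment
```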
This is a slightly simplified version of what I'm seeing if I execute the `kubectl get pods` command on the appropriate namespace:

Initially...

```
NAME       STATUS
my-pod-1   Running
```

After executing the `set` command with a new image version...

```
NAME       STATUS
my-pod-1   Running
my-pod-2   ContainerCreating
```

A little while later...

```
NAME       STATUS
my-pod-1   Terminating
my-pod-2   Running
```

Finally...

```
NAME       STATUS
my-pod-2   Running
```
That is the default Kubernetes behavior when doing an image upgrade: "RollingUpdate".
This is user-configurable via the `strategy` property:
https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy
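For example, you could pin down (or change) the surge behavior described above with something along these lines (the deployment name and values are just placeholders):

```
# Allow at most one extra pod and zero unavailable pods during a rolling update
kubectl patch deployment my-deployment \
  -p '{"spec":{"strategy":{"type":"RollingUpdate","rollingUpdate":{"maxSurge":1,"maxUnavailable":0}}}}'
```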
Thanks for all that! I will go ahead and close this out. If you have further questions just let us know.
Thank you all for your help, and sorry if I was asking a stupid question. I'm still pretty new to Kubernetes & AKS. 😁
All questions are good, especially while learning :)
We're here to answer and assist