Bug description
According to https://istio.io/docs/concepts/performance-and-scalability/ the telemetry pod for 1.1.4 should only require 0.6 vCPU, which is less than Pilot's 1 vCPU. However, the telemetry pod will not schedule on the default n1-standard-1. I tried adding more n1-standard-1 (1 vCPU) nodes and deleting the telemetry pod, but it comes back Pending with the same events.
kubectl describe pod shows these events:
Warning FailedScheduling 94s (x4 over 3m59s) default-scheduler 0/5 nodes are available: 1 Insufficient memory, 5 Insufficient cpu.
Warning FailedScheduling 57s (x3 over 57s) default-scheduler 0/6 nodes are available: 1 Insufficient memory, 6 Insufficient cpu.
Warning FailedScheduling 38s (x18 over 54s) default-scheduler 0/7 nodes are available: 1 Insufficient memory, 7 Insufficient cpu.
Normal NotTriggerScaleUp 6s (x22 over 3m56s) cluster-autoscaler pod didn't trigger scale-up (it wouldn't fit if a new node is added): 1 Insufficient cpu
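For reference, one way to confirm what the scheduler is actually being asked for is to inspect the requests set on the telemetry deployment (assuming the default install, where the deployment is named istio-telemetry):
kubectl get deployment istio-telemetry -n istio-system \
  -o jsonpath='{.spec.template.spec.containers[0].resources.requests}'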
CREATE CLUSTER:
gcloud container clusters create [cluster-name] \
--enable-autoupgrade \
--enable-autoscaling \
--enable-network-policy \
--cluster-version latest \
--min-nodes=3 --max-nodes=10 --num-nodes 4 \
--zone us-central1-a \
--project [project]
gcloud container clusters get-credentials [cluster-name] \
--zone us-central1-a \
--project [project]
kubectl create clusterrolebinding cluster-admin-binding \
--clusterrole=cluster-admin \
--user=$(gcloud config get-value core/account)
INSTALL ISTIO (DEFAULT):
kubectl create namespace istio-system
helm template $HOME/istio-1.1.4/install/kubernetes/helm/istio-init --name istio-init --namespace istio-system | kubectl apply -f -
kubectl get crds | grep 'istio.io\|certmanager.k8s.io' | wc -l
53
helm template $HOME/istio-1.1.4/install/kubernetes/helm/istio --name istio --namespace istio-system | kubectl apply -f -
kubectl label namespace default istio-injection=enabled
TEST:
kubectl get pods -n istio-system
kubectl describe pod [istio telemetry pod] -n istio-system
Istio Version
1.1.4
How was Istio installed?
helm template, no tiller
Environment where bug was observed (cloud vendor, OS, etc)
Google Kubernetes Engine with 4+ n1-standard-1 (1 vCPU) nodes
Affected product area (please put an X in all that apply)
[x] Configuration Infrastructure
[x] Docs
[x] Installation
[ ] Networking
[x] Performance and Scalability
[ ] Policies and Telemetry
[ ] Security
[ ] Test and Release
[x] User Experience
I started a new cluster with three n1-standard-2 (2 vCPU) nodes and the telemetry pod does schedule, but this is not the desired solution and contradicts the performance and scalability documentation.
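For reference, the workaround cluster can be created with the same flags as the original create command, changing only the machine type (a sketch; assumes the other flags above are unchanged):
gcloud container clusters create [cluster-name] \
  --machine-type n1-standard-2 \
  --enable-autoupgrade \
  --enable-autoscaling \
  --enable-network-policy \
  --cluster-version latest \
  --min-nodes=3 --max-nodes=10 --num-nodes 3 \
  --zone us-central1-a \
  --project [project]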
Most helpful comment
The document you linked describes the actual usage you would expect from istio-telemetry at 1000 QPS, NOT the Kubernetes resource requests we set (details).
In the default installation, which it looks like you followed, we request 1000m of CPU for telemetry, which is why it is not getting scheduled.
You have a few options; one is to lower the telemetry CPU request, e.g.:
--set mixer.telemetry.resources.requests.cpu=100m
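That flag can be passed to the same helm template command used in the install step above; a minimal sketch (100m is the example value from the comment, so pick a request that matches your expected load):
helm template $HOME/istio-1.1.4/install/kubernetes/helm/istio \
  --name istio --namespace istio-system \
  --set mixer.telemetry.resources.requests.cpu=100m | kubectl apply -f -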