Hello Loki team,
First and foremost, thank you so much for Loki! It's awesome.
Is your feature request related to a problem? Please describe.
As a cluster administrator, when installing loki-stack,
I want to be able to set tolerations for promtail
so that I can forward logs from all nodes, including tainted ones, to Loki.
Describe the solution you'd like
I would like to be able to set tolerations for promtail pods when installing with helm like so:
helm upgrade --install loki-stack loki/loki-stack --namespace default --set "promtail.tolerations[0].operator=Exists,promtail.tolerations[0].effect=NoSchedule,promtail.tolerations[0].key=storage-node" --set grafana.enabled=true
The configurable properties would look like:
loki:
  enabled: true
promtail:
  enabled: true
  tolerations: []
fluent-bit:
  enabled: false
grafana:
  enabled: false
  sidecar:
    datasources:
      enabled: true
  image:
    tag: 6.7.0
prometheus:
  enabled: false
Describe alternatives you've considered
A quick workaround to get promtail onto tainted nodes alongside loki-stack is to install promtail as a separate release and set the tolerations there:
# Install loki-stack w/o promtail
helm upgrade --install loki-stack loki/loki-stack --namespace default --set promtail.enabled=false --set grafana.enabled=true
# Install promtail and set tolerations
helm upgrade --install promtail loki/promtail --set "loki.serviceName=loki-stack" --set "tolerations[0].operator=Exists,tolerations[0].effect=NoSchedule,tolerations[0].key=storage-node"
Additional context
I have two node pools, nodepool1 and npstorage (which is tainted). Promtail is only scheduled on the untainted nodes in nodepool1. My npstorage pool is dedicated to Rook and Ceph, but I need to inspect the logs from those nodes as well.
My cluster nodes:
kubectl get nodes
NAME                                STATUS   ROLES   AGE   VERSION
aks-nodepool1-12920461-vmss000000   Ready    agent   19h   v1.15.11
aks-nodepool1-12920461-vmss000001   Ready    agent   19h   v1.15.11
aks-npstorage-12920461-vmss000000   Ready    agent   16h   v1.15.11
aks-npstorage-12920461-vmss000001   Ready    agent   16h   v1.15.11
aks-npstorage-12920461-vmss000002   Ready    agent   16h   v1.15.11
Each npstorage node is tainted like this:
taints:
- effect: NoSchedule
  key: storage-node
  value: "true"
Currently, when I install loki-stack with helm, pods are only added to nodepool1 (untainted) nodes:
kubectl get pods -o wide
NAME                                 READY   STATUS    RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
loki-stack-0                         1/1     Running   0          79m   10.244.0.36   aks-nodepool1-12920461-vmss000000   <none>           <none>
loki-stack-grafana-9dcdfd59c-m7bnf   1/1     Running   0          14m   10.244.1.26   aks-nodepool1-12920461-vmss000001   <none>           <none>
loki-stack-promtail-96gks            1/1     Running   0          79m   10.244.1.25   aks-nodepool1-12920461-vmss000001   <none>           <none>
loki-stack-promtail-jwhd4            1/1     Running   0          79m   10.244.0.37   aks-nodepool1-12920461-vmss000000   <none>           <none>
Let me know if there is any other detail I can add to help.
Sounds fair.
If I understand correctly, this should already be possible by prefixing the config keys with the subchart name, i.e., promtail. I'll include some examples that should be correct, but I haven't tested them, so please double-check and diff the release before running them.
Command example:
helm upgrade --install loki-stack loki/loki-stack --namespace default --set grafana.enabled=true --set "promtail.tolerations[0].operator=Exists,promtail.tolerations[0].effect=NoSchedule,promtail.tolerations[0].key=storage-node"
Config file example:
grafana:
  enabled: true
promtail:
  tolerations:
    - effect: NoSchedule
      operator: Exists
      key: storage-node
@BradLugo Thanks! That worked for me
One thing to keep in mind: the promtail chart has its own default toleration, and it is overwritten when promtail.tolerations is set, because Helm replaces list values rather than merging them.
# default
tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
    operator: Exists
link: https://github.com/grafana/loki/blob/v2.0.0/production/helm/promtail/values.yaml#L119-L124
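If you want to keep that default and add your own toleration, both entries need to be listed, since the list you set takes the place of the chart's. A minimal sketch, assuming a values file named promtail-tolerations.yaml (the file name is just an example) and the storage-node taint from this issue:

```shell
# Write a values file (hypothetical name) that repeats the promtail chart's
# default master toleration and adds one for the storage-node taint.
cat > promtail-tolerations.yaml <<'EOF'
promtail:
  tolerations:
    # Chart default; repeated here because setting the list overwrites it.
    - key: node-role.kubernetes.io/master
      operator: Exists
      effect: NoSchedule
    # Extra toleration for the tainted storage pool.
    - key: storage-node
      operator: Exists
      effect: NoSchedule
EOF
```

The file can then be passed with -f promtail-tolerations.yaml to the helm upgrade --install loki-stack loki/loki-stack command instead of the promtail.tolerations --set flags. I haven't run this against a cluster, so render or diff the release first.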
Since you likely want promtail running on every node, the toleration can be reduced to a single entry without a key, which tolerates every NoSchedule taint (the master taint included):
promtail:
  tolerations:
    - effect: NoSchedule
      operator: Exists
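For anyone wondering why the form without a key is enough: as I understand the Kubernetes matching rules, a toleration with operator Exists and no key matches any taint key, and an effect on the toleration limits it to taints with that same effect. Below is a simplified shell model of that rule for a single taint/toleration pair. The tolerates function and its argument order are made up for illustration; they are not part of kubectl or Helm.

```shell
# Simplified model of matching one toleration against one taint.
# Usage: tolerates TAINT_KEY TAINT_VALUE TAINT_EFFECT TOL_KEY TOL_OPERATOR TOL_VALUE TOL_EFFECT
# Pass "" for a field that is left out of the manifest.
# Note: variables are global to keep the sketch portable across sh variants.
tolerates() {
  taint_key=$1 taint_value=$2 taint_effect=$3
  tol_key=$4 tol_operator=${5:-Equal} tol_value=$6 tol_effect=$7

  # A toleration without an effect matches taints of every effect.
  if [ -n "$tol_effect" ] && [ "$tol_effect" != "$taint_effect" ]; then
    return 1
  fi
  # A toleration without a key must use Exists and then matches every key.
  if [ -z "$tol_key" ]; then
    [ "$tol_operator" = Exists ]
    return $?
  fi
  # Otherwise the keys must be identical.
  [ "$tol_key" = "$taint_key" ] || return 1
  case $tol_operator in
    Exists) return 0 ;;
    Equal)  [ "$tol_value" = "$taint_value" ] ;;
    *)      return 1 ;;
  esac
}

# The reduced toleration (no key, Exists, NoSchedule) against taints from this thread:
tolerates storage-node true NoSchedule "" Exists "" NoSchedule \
  && echo "storage-node taint tolerated"
tolerates node-role.kubernetes.io/master "" NoSchedule "" Exists "" NoSchedule \
  && echo "master taint tolerated"
# A NoExecute taint (hypothetical key) is not covered by it:
tolerates example-key "" NoExecute "" Exists "" NoSchedule \
  || echo "NoExecute taint not tolerated"
```

So the reduced toleration covers both the storage-node and master taints, but not taints with a different effect such as NoExecute.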