Describe the bug
The syslog file on my master node is flooded with TRACE log messages from k3s.
I don't know how to change it back to WARN or INFO. I tried `k3s server -v 0`, but this doesn't seem to affect the logging verbosity.
To Reproduce
I upgraded from 0.6.1 to 0.8.1.
Expected behavior
The log level is the default (info or warn), or instructions are provided at https://rancher.com/docs/k3s/latest/en/configuration/ on how to make the logging less chatty.
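For completeness, this is where I would expect the verbosity flag to go with the systemd install (a sketch; the drop-in path and unit name are assumptions based on the standard install script):

```ini
# /etc/systemd/system/k3s.service.d/override.conf (assumed drop-in path)
[Service]
ExecStart=
ExecStart=/usr/local/bin/k3s server -v 0
```

followed by `systemctl daemon-reload && systemctl restart k3s`.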
Screenshots
A part of my log file:
Aug 24 22:25:17 domotica-04 k3s[32334]: E0824 22:25:17.596207 32334 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}
Aug 24 22:25:17 domotica-04 k3s[32334]: I0824 22:25:17.597873 32334 trace.go:81] Trace[2048830714]: "Patch /api/v1/nodes/domotica-07/status" (started: 2019-08-24 22:25:09.248662863 +0200 CEST m=+559.935042797) (total time: 8.349088647s):
Aug 24 22:25:17 domotica-04 k3s[32334]: Trace[2048830714]: [522.13879ms] [515.912678ms] About to check admission control
Aug 24 22:25:17 domotica-04 k3s[32334]: Trace[2048830714]: [8.349088647s] [7.826949857s] END
Aug 24 22:25:17 domotica-04 k3s[32334]: I0824 22:25:17.756231 32334 trace.go:81] Trace[162800240]: "SQL DB QueryContext query: SELECT id, name, value, old_value, old_revision, create_revision, revision, ttl, version, del FROM key_value WHERE name = ? ORDER BY revision DESC limit ? keys: [/registry/masterleases/192.168.1.204 1]" (started: 2019-08-24 22:25:16.19860096 +0200 CEST m=+566.884981769) (total time: 1.556530473s):
Aug 24 22:25:17 domotica-04 k3s[32334]: Trace[162800240]: [1.556530473s] [1.556530473s] END
Aug 24 22:25:17 domotica-04 k3s[32334]: I0824 22:25:17.883792 32334 trace.go:81] Trace[210670162]: "SQL DB QueryContext query: SELECT id, name, value, old_value, old_revision, create_revision, revision, ttl, version, del FROM key_value WHERE name = ? ORDER BY revision DESC limit ? keys: [/registry/minions/domotica-07 1]" (started: 2019-08-24 22:25:15.967138441 +0200 CEST m=+566.653518167) (total time: 1.916338725s):
Aug 24 22:25:17 domotica-04 k3s[32334]: Trace[210670162]: [1.916338725s] [1.916338725s] END
Aug 24 22:25:17 domotica-04 k3s[32334]: I0824 22:25:17.989664 32334 trace.go:81] Trace[1931520310]: "SQL DB QueryContext query: \nSELECT kv.id, kv.name, kv.value, kv.old_value, kv.old_revision, kv.create_revision, kv.revision, kv.ttl, kv.version, kv.del\nFROM key_value kv\n INNER JOIN\n (\n SELECT MAX(revision) revision, kvi.name\n FROM key_value kvi\n\t\t\n GROUP BY kvi.name\n ) AS r\n ON r.name = kv.name AND r.revision = kv.revision\nWHERE kv.name like ? ORDER BY kv.name ASC limit ?\n keys: [/registry/controllers% 1000000]" (started: 2019-08-24 22:25:17.050754189 +0200 CEST m=+567.737134706) (total time: 937.489635ms):
Aug 24 22:25:17 domotica-04 k3s[32334]: Trace[1931520310]: [937.489635ms] [937.489635ms] END
Aug 24 22:25:18 domotica-04 k3s[32334]: I0824 22:25:18.001996 32334 trace.go:81] Trace[9223029]: "SQL Commit" (started: 2019-08-24 22:25:13.900250862 +0200 CEST m=+564.586630379) (total time: 4.10128006s):
Aug 24 22:25:18 domotica-04 k3s[32334]: Trace[9223029]: [883.954788ms] [883.786955ms] condition key /registry/leases/kube-node-lease/domotica-05 good true
Aug 24 22:25:18 domotica-04 k3s[32334]: Trace[9223029]: [4.10111381s] [3.217159022s] op key /registry/leases/kube-node-lease/domotica-05 op 2
Aug 24 22:25:18 domotica-04 k3s[32334]: I0824 22:25:18.003170 32334 trace.go:81] Trace[1314769145]: "GuaranteedUpdate etcd3: coordination.Lease" (started: 2019-08-24 22:25:13.89485854 +0200 CEST m=+564.581238640) (total time: 4.107967046s):
Aug 24 22:25:18 domotica-04 k3s[32334]: Trace[1314769145]: [4.107316547s] [4.102048683s] Transaction committed
Aug 24 22:25:18 domotica-04 k3s[32334]: E0824 22:25:18.003538 32334 status.go:71] apiserver received an error that is not an metav1.Status: &errors.errorString{s:"context canceled"}
Aug 24 22:25:18 domotica-04 k3s[32334]: I0824 22:25:18.005602 32334 trace.go:81] Trace[1067664194]: "Update /apis/coordination.k8s.io/v1beta1/namespaces/kube-node-lease/leases/domotica-05" (started: 2019-08-24 22:25:13.892844711 +0200 CEST m=+564.579224436) (total time: 4.112388495s):
Aug 24 22:25:18 domotica-04 k3s[32334]: Trace[1067664194]: [4.110649165s] [4.10911346s] Object stored in database
Aug 24 22:25:18 domotica-04 k3s[32334]: I0824 22:25:18.007225 32334 trace.go:81] Trace[1052773052]: "SQL DB QueryContext query: SELECT id, name, value, old_value, old_revision, create_revision, revision, ttl, version, del FROM key_value WHERE name = ? ORDER BY revision DESC limit ? keys: [/registry/minions/domotica-07 1]" (started: 2019-08-24 22:25:15.967801315 +0200 CEST m=+566.654180915) (total time: 2.039033138s):
Aug 24 22:25:18 domotica-04 k3s[32334]: Trace[1052773052]: [2.039033138s] [2.039033138s] END
Aug 24 22:25:18 domotica-04 k3s[32334]: E0824 22:25:18.124080 32334 watch.go:233] unable to encode watch object *v1.WatchEvent: write tcp 192.168.1.204:6443->192.168.1.208:39678: write: broken pipe (&streaming.encoder{writer:(framer.lengthDelimitedFrameWriter)(0xc444d60), encoder:(versioning.codec)(0x9644050), buf:(bytes.Buffer)(0xe138e00)})
Aug 24 22:25:18 domotica-04 k3s[32334]: I0824 22:25:18.354409 32334 trace.go:81] Trace[1650250852]: "SQL DB QueryContext query: SELECT id, name, value, old_value, old_revision, create_revision, revision, ttl, version, del FROM key_value WHERE name = ? ORDER BY revision DESC limit ? keys: [/registry/minions/domotica-01 1]" (started: 2019-08-24 22:25:15.356237002 +0200 CEST m=+566.042616603) (total time: 2.997912811s):
Aug 24 22:25:18 domotica-04 k3s[32334]: Trace[1650250852]: [2.997912811s] [2.997912811s] END
Aug 24 22:25:18 domotica-04 k3s[32334]: I0824 22:25:18.358516 32334 trace.go:81] Trace[1224956611]: "SQL DB QueryContext query: SELECT id, name, value, old_value, old_revision, create_revision, revision, ttl, version, del FROM key_value WHERE name = ? ORDER BY revision DESC limit ? keys: [/registry/minions/domotica-02 1]" (started: 2019-08-24 22:25:15.959612165 +0200 CEST m=+566.645993099) (total time: 2.398633223s):
Aug 24 22:25:18 domotica-04 k3s[32334]: Trace[1224956611]: [2.398633223s] [2.398633223s] END
Additional context
The architecture this system is running on is ARMhf.
I haven't personally seen this behavior so I'm guessing something in your environment is triggering it to go trace. What are the environment variables set on the process?
Yes, I think so too. The environment variables are pretty default, though.
This is my service file:
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
After=network-online.target
[Service]
Type=notify
EnvironmentFile=/etc/systemd/system/k3s.service.env
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s \
server \
-v=0 \
--no-deploy=traefik \
--no-deploy=servicelb \
--kubelet-arg=address=0.0.0.0
KillMode=process
Delegate=yes
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
[Install]
WantedBy=multi-user.target
And this is my environment file:
KUBERNETES_SERVICE_HOST=domotica-04.artificialcreature.com
KUBERNETES_SERVICE_PORT=443
Nothing special here, I guess. I checked the global environment variables as well, but I couldn't find any hints there either.
I see the same behavior as well; I don't know how to reduce the verbosity, and -v 0 does not seem to help.
I'm on version v1.17.2+k3s1 on a CentOS 8 Vagrant box and start k3s with the following command:
K3S_TOKEN=superrandompassword /usr/local/bin/k3s server --cluster-init --node-ip 192.168.1.166 -v 0
Have you been able to workaround this, @Rick-Jongbloed ?
How did you install k3s to begin with? Are there any environment variables that might be influencing this behavior?
No, I haven't been able to work around this issue, unfortunately.
Can you provide a full listing of the environment variables for your k3s server process?
I will do this this weekend and update this post.
I'm facing the same behavior as well. I installed k3s with the following command:
curl -sfL https://get.k3s.io | sh -s - server --datastore-endpoint="mysql://user:password@tcp(ip:3306)/k3s" --log /var/log/k3s.log
In the file /var/log/k3s.log, I can see the following output:
Trace[1691121332]: [9.976763219s] [232.807µs] Transaction prepared
Trace[1691121332]: [9.984781277s] [8.018058ms] Transaction committed
Trace[1691121332]: [9.985086084s] [255.129µs] Transaction prepared
Trace[1691121332]: [9.993509663s] [8.423579ms] Transaction committed
Trace[1691121332]: [9.993818646s] [273.106µs] Transaction prepared
Trace[1691121332]: [9.998879099s] [5.060453ms] END
I0420 11:20:48.213953 1590 trace.go:116] Trace[1092196218]: "Update" url:/api/v1/namespaces/kube-system/endpoints/kube-scheduler,user-agent:k3s/v1.17.4+k3s1 (linux/amd64) kubernetes/3eee8ac/leader-election,client:127.0.0.1 (started: 2020-04-20 11:20:38.212566061 +0200 CEST m=+610.467951821) (total time: 10.001340905s):
Trace[1092196218]: [10.001340905s] [10.001264141s] END
W0420 11:20:48.285716 1590 garbagecollector.go:639] failed to discover some groups: map[metrics.k8s.io/v1beta1:the server is currently unable to handle the request]
E0420 11:20:48.643149 1590 node_lifecycle_controller.go:1152] Error updating node rancher1: Operation cannot be fulfilled on nodes "rancher1": the object has been modified; please apply your changes to the latest version and try again
E0420 11:20:48.673407 1590 node_lifecycle_controller.go:1152] Error updating node rancher2: Operation cannot be fulfilled on nodes "rancher2": the object has been modified; please apply your changes to the latest version and try again
E0420 11:20:51.297134 1590 pod_workers.go:191] Error syncing pod 92c0d88d-5729-454d-a166-92cb9850a0fd ("helm-install-traefik-snzq9_kube-system(92c0d88d-5729-454d-a166-92cb9850a0fd)"), skipping: failed to "StartContainer" for "helm" with CrashLoopBackOff: "back-off 5m0s restarting failed container=helm pod=helm-install-traefik-snzq9_kube-system(92c0d88d-5729-454d-a166-92cb9850a0fd)"
E0420 11:20:52.094317 1590 node_controller.go:245] Error getting node addresses for node "rancher1": error fetching node by provider ID: unimplemented, and error by node name: Failed to find node rancher1: node "rancher1" not found
E0420 11:20:52.094791 1590 node_controller.go:245] Error getting node addresses for node "rancher2": error fetching node by provider ID: unimplemented, and error by node name: Failed to find node rancher2: node "rancher2" not found
E0420 11:21:01.878530 1590 leaderelection.go:367] Failed to update lock: Put https://127.0.0.1:6444/api/v1/namespaces/kube-system/endpoints/kube-scheduler?timeout=10s: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
time="2020-04-20T11:21:01.879045722+02:00" level=error msg="error in txn: context canceled"
I0420 11:21:01.879671 1590 trace.go:116] Trace[1962836807]: "Update" url:/api/v1/namespaces/kube-system/endpoints/kube-scheduler,user-agent:k3s/v1.17.4+k3s1 (linux/amd64) kubernetes/3eee8ac/leader-election,client:127.0.0.1 (started: 2020-04-20 11:20:51.879228016 +0200 CEST m=+624.134613724) (total time: 10.000403113s):
Trace[1962836807]: [10.000403113s] [10.000310964s] END
I0420 11:21:01.881311 1590 trace.go:116] Trace[1185887788]: "GuaranteedUpdate etcd3" type:*core.Endpoints (started: 2020-04-20 11:20:51.879365396 +0200 CEST m=+624.134751290) (total time: 9.999308159s):
Trace[1185887788]: [319.246µs] [295.958µs] Transaction prepared
Trace[1185887788]: [6.154816ms] [5.83557ms] Transaction committed
Trace[1185887788]: [6.369838ms] [191.198µs] Transaction prepared
Trace[1185887788]: [13.983552ms] [7.613714ms] Transaction committed
Trace[1185887788]: [14.191984ms] [187.93µs] Transaction prepared
Trace[1185887788]: [19.192077ms] [5.000093ms] Transaction committed
Trace[1185887788]: [19.402218ms] [187.732µs] Transaction prepared
My environment variables are:
ps -ef | grep k3s
root 1306 1 0 11:10 ? 00:00:00 /usr/local/bin/k3s server --datastore-endpoint=mysql://user:password@tcp(ip:3306)/k3s --log /var/log/k3s.log
root 1590 1306 40 11:10 ? 00:02:00 k3s server --datastore-endpoint=mysql://user:password@tcp(ip:3306)/k3s --log /var/log/k3s.log
root 1819 1590 0 11:10 ? 00:00:02 containerd -c /var/lib/rancher/k3s/agent/etc/containerd/config.toml -a /run/k3s/containerd/containerd.sock --state /run/k3s/containerd --root /var/lib/rancher/k3s/agent/containerd
root 2250 1 0 11:10 ? 00:00:00 /var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin/containerd-shim-runc-v2 -namespace k8s.io -id 274e4fb9a437f5ebb52cf196dc806e4830d958caae8ce976ed94d08a356c2830 -address /run/k3s/containerd/containerd.sock
root 2251 1 0 11:10 ? 00:00:00 /var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin/containerd-shim-runc-v2 -namespace k8s.io -id eb1944d2b7b83f48d6798bc8884843d8d16ce75961e244eb8aca149cee81bf27 -address /run/k3s/containerd/containerd.sock
cat /proc/1306/environ | tr '\0' '\n'
LANG=en_US.UTF-8
LANGUAGE=en_US:en
PATH=/var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin/aux
NOTIFY_SOCKET=/run/systemd/notify
INVOCATION_ID=673a633c4ba24b148886feeaf01e060f
JOURNAL_STREAM=9:22596
RES_OPTIONS=
K3S_DATA_DIR=/var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430
cat /proc/1590/environ | tr '\0' '\n'
LANG=en_US.UTF-8
LANGUAGE=en_US:en
PATH=/var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin:/var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430/bin/aux
NOTIFY_SOCKET=/run/systemd/notify
INVOCATION_ID=673a633c4ba24b148886feeaf01e060f
JOURNAL_STREAM=9:22596
RES_OPTIONS=
K3S_DATA_DIR=/var/lib/rancher/k3s/data/6a3098e6644f5f0dbfe14e5efa99bb8fdf60d63cae89fdffd71b7de11a1f1430
_K3S_LOG_REEXEC_=true
Thanks for your help.
Same problem for me on v1.18.9+k3s1.
Edit: -v=0 had no effect.
systemd config
[Unit]
Description=Lightweight Kubernetes
Documentation=https://k3s.io
Wants=network-online.target
[Install]
WantedBy=multi-user.target
[Service]
Type=notify
EnvironmentFile=/etc/systemd/system/k3s.service.env
KillMode=process
Delegate=yes
# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
TimeoutStartSec=0
Restart=always
RestartSec=5s
ExecStartPre=-/sbin/modprobe br_netfilter
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/local/bin/k3s \
server \
'--disable=traefik' \
'--node-label' \
'role=ingress-controller' \
'-v=0'
cat /var/log/messages | grep k3s
Oct 28 10:39:20 xx k3s[3519308]: I1028 10:39:20.319883 3519308 trace.go:116] Trace[964924759]: "List" url:/api/v1/events,user-agent:rancher/v0.0.0 (linux/amd64) kubernetes/$Format cluster c-b8p8t,client:10.42.0.144 (started: 2020-10-28 10:39:19.527878812 +0000 UTC m=+93.459566966) (total time: 791.950661ms):
Oct 28 10:39:20 xx k3s[3519308]: Trace[964924759]: [783.333787ms] [783.291805ms] Listing from storage done
Oct 28 10:39:20 xx k3s[3519308]: I1028 10:39:20.902945 3519308 trace.go:116] Trace[358696718]: "List etcd3" key:/events,resourceVersion:,limit:500,continue:eyJ2IjoibWV0YS5rOHMuaW8vdjEiLCJydiI6MTcxNTA2NjMsInN0YXJ0IjoiZGVmYXVsdC9wcm9kLWhjbG91ZC1pbm4tbWFzdGVyMC1mc24xLjE2MmQyMmI3Y2Y1NzI5ZGFcdTAwMDAifQ (started: 2020-10-28 10:39:20.363423701 +0000 UTC m=+94.295111757) (total time: 539.491307ms):
Oct 28 10:39:20 xx k3s[3519308]: Trace[358696718]: [539.491307ms] [539.491307ms] END
Oct 28 10:39:20 xx k3s[3519308]: I1028 10:39:20.912733 3519308 trace.go:116] Trace[344039666]: "List" url:/api/v1/events,user-agent:rancher/v0.0.0 (linux/amd64) kubernetes/$Format cluster c-b8p8t,client:10.42.0.144 (started: 2020-10-28 10:39:20.363391906 +0000 UTC m=+94.295079948) (total time: 549.299957ms):
Oct 28 10:39:20 xx k3s[3519308]: Trace[344039666]: [539.576034ms] [539.551281ms] Listing from storage done
You will see these TRACE level messages when datastore operations take an excessive amount of time. Generally this indicates that your node is CPU or IO bound and is not performing to the apiserver's expectations.
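A quick way to spot-check whether storage is the bottleneck is a synchronous-write test on the filesystem backing the datastore (a sketch; the path is an example, run it wherever /var/lib/rancher actually lives):

```shell
# Write 1000 sectors with a sync after each write (oflag=dsync); dd reports
# the elapsed time and throughput on stderr when it finishes. Healthy local
# disks complete this in well under a second; multi-second runs point at the
# kind of IO stall that makes datastore operations exceed the trace threshold.
dd if=/dev/zero of=./fsync-test.img bs=512 count=1000 oflag=dsync
rm -f ./fsync-test.img
```

SD cards on ARM boards and network-backed Vagrant disks are common offenders here.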
Good to know the cause and that this indicates (potentially) a problem, thanks.
Would it not make sense to rather output a single line such as
"WARNING: datastore operation -List etcd3...- took longer than expected 501ms (threshold 500ms)"
This gives the operator a chance to turn up the logging level (e.g. to Trace).
EDIT: @brandond what I'm trying to say is that -v=0 should only show INFO and above, never TRACE or DEBUG. The situation you are pointing out should IMHO be handled differently, with a WARN-level message.
This is all upstream Kubernetes apiserver code printing these messages, not k3s code. Feel free to open an issue there.
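As a stopgap, since these lines come from the upstream apiserver and ignore `-v`, they can be filtered out after the fact. A sketch (the pattern is an assumption based on the log samples above):

```shell
# Filter the apiserver trace lines (the trace.go emitter line plus its
# Trace[...] continuation lines) out of a log stream; everything else
# passes through. Shown here against an inline sample, but the same grep
# works on /var/log/syslog or the output of `journalctl -u k3s`.
printf '%s\n' \
  'k3s[1]: I0824 trace.go:81] Trace[42]: "Patch /api/v1/nodes" ...' \
  'k3s[1]: Trace[42]: [8.3s] [7.8s] END' \
  'k3s[1]: E0824 status.go:71] apiserver received an error' \
  | grep -vE 'trace\.go:[0-9]+\]|Trace\[[0-9]+\]'
# -> only the status.go error line survives
```

With rsyslog, something like the property-based filter `:msg, contains, "trace.go" stop` in a rule file should achieve the same at the collection layer (untested here).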
Thanks. Looks like this (or similar problems) is already on the radar:
https://github.com/kubernetes/kubernetes/issues/95568
https://github.com/kubernetes/kubernetes/issues/87126
Closing due to age.