Not sure how to phrase this better, but basically: has anyone managed to set up fluentd with k3s for log shipping?
I'll answer my own question in case others attempt the same. It is possible with https://github.com/fluent/fluentd-kubernetes-daemonset, but there were a number of gotchas.
The fluentd logs started filling up with infinite backslashes ("\\\\\\...."). Issues https://github.com/fluent/fluentd-kubernetes-daemonset/issues/186 and https://github.com/openshift/origin-aggregated-logging/issues/1423#issuecomment-430675192 suggested that fluentd was processing its own logs, which resulted in recursive behaviour. To solve this, I had to add exclude_path ["/var/log/containers/fluentd*"] to the source section of its config file.
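A rough sketch of how the tail source in kubernetes.conf looks with that exclusion added; only the exclude_path line is the actual change, the path/pos_file/tag values are just the daemonset defaults and may differ between image versions:

<source>
  @type tail
  path /var/log/containers/*.log
  # the added line: don't tail fluentd's own container logs,
  # otherwise it re-ingests them and the backslashes snowball
  exclude_path ["/var/log/containers/fluentd*"]
  pos_file /var/log/fluentd-containers.log.pos
  tag kubernetes.*
  read_from_head true
  <parse>
    @type json
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>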
The second gotcha was that unlike Docker, containerd doesn't produce logs in JSON, but in its own format, e.g. 2018-06-26T01:37:58.737599779Z stderr F <log message goes here>. To handle this, I replaced the @type json parser with
<parse>
  @type regexp
  expression /^(?<time>.+) (?<stream>stdout|stderr)( (?<logtag>.))? (?<log>.*)$/
  time_format %Y-%m-%dT%H:%M:%S.%NZ
</parse>
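To sanity-check the expression, a made-up CRI-style line (not from my cluster) should parse like this; the logtag is F for a full line and P for a partial one that containerd has split:

# raw line from a file under /var/log/containers/
2018-06-26T01:37:58.737599779Z stderr F Connection established
# fields extracted by the regexp (time becomes the event timestamp)
{"stream":"stderr","logtag":"F","log":"Connection established"}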
Additionally, I had Elasticsearch in the same docker-compose.yml as k3s. I could reach it from the k3s container via its service name (e.g. ping elasticsearch worked), but from the fluentd pod I could only reach it via its Docker-internal IP address (which can change).
To fix this I combined two things: running the agent with k3s agent --resolv-conf /etc/resolv.conf so that k3s inherits the /etc/resolv.conf of its Docker container, and adding hostNetwork: true to the fluentd pod spec so that the pod in turn inherits the /etc/resolv.conf of k3s.
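In YAML terms the daemonset side of that is just an excerpt like the following; everything else stays as shipped by fluentd-kubernetes-daemonset:

# fluentd DaemonSet pod template excerpt: hostNetwork makes the pod use the
# node's network namespace, so name resolution goes through the resolv.conf
# that k3s inherited from its Docker container
spec:
  template:
    spec:
      hostNetwork: true
      containers:
        - name: fluentd
          # image, env (e.g. FLUENT_ELASTICSEARCH_HOST) and volume mounts
          # unchanged from the upstream daemonset manifest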
Thanks a lot @e-nikolov! You saved my ass.
I think we can document this better in the future.