I've requested this before through help tickets (156973 and 158135), but I think posting on GitHub to get public +1s will help move this along.
The agent running in Kubernetes could gather a lot of useful data from the metrics it takes in via DogStatsD, UDS, Autodiscovery, or another method that I'm not aware of yet, but it doesn't. Metrics from many pods in the same deployment come in and clobber each other. This is a never-ending source of misery for my company, and it's making it hard for us to port applications to Kubernetes.
What I ask for is that pod and container metadata, such as the cluster name, node name, pod name, container name, etc., as well as labels and a documented set of annotations, be collected by the agent and associated with metrics coming from each pod. I reckon that, as far as node and pod metadata goes, it should be pretty simple: just collect the list of nodes and pods periodically and, when a metric comes in from wherever, associate the source with the correct node/pod. Container metadata might be difficult and I understand if you can't provide that, but node/pod metadata is absolutely crucial.
We need this feature so that we can differentiate timeseries coming from different pods and relate them to metrics about nodes. For example, if requests per minute start decreasing from pods on one particular node, we can correlate that with resource utilization on the node. Of course, we also need the cluster name, since pod names aren't unique across clusters. In some cases we can use our entrypoint scripts and env vars to get some of this data added to the list of tags we send over statsd, but this doesn't cover every case and, honestly, it's a lot of boilerplate for something the agent should be doing.
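To illustrate the env-var workaround mentioned above, here is a minimal sketch in Python. It assumes the pod spec exposes `CLUSTER_NAME`, `NODE_NAME`, and `POD_NAME` environment variables (hypothetical names, for example populated through the Kubernetes downward API or an entrypoint script) and appends them as tags to a DogStatsD datagram. The helper names are illustrative and are not part of any Datadog library.

```python
import os


def pod_tags(environ=os.environ):
    """Build DogStatsD tags from env vars (the variable names are assumptions)."""
    mapping = {
        "CLUSTER_NAME": "cluster_name",
        "NODE_NAME": "node_name",
        "POD_NAME": "pod_name",
    }
    # Skip variables that are unset or empty instead of emitting blank tags.
    return [f"{tag}:{environ[var]}" for var, tag in mapping.items() if environ.get(var)]


def format_metric(name, value, metric_type, tags):
    """Render one DogStatsD datagram: name:value|type|#tag1,tag2."""
    line = f"{name}:{value}|{metric_type}"
    if tags:
        line += "|#" + ",".join(tags)
    return line
```

Every application has to repeat this wiring, which is the boilerplate the request is trying to avoid.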
Hello @2rs2ts,
Does "metrics coming from each pod" refer to DogStatsD custom metrics? You can expect the following tags to be added to your custom metrics:
DD_TAGS environment variable or the yaml configuration option
Since Kubernetes does not expose a cluster name, or cluster-level tags or labels, users usually set them as host tags, either configured in the Agent daemonset or assigned to the nodes.
In 6.3.3, in a Kubernetes cluster, the following container tags will be automatically collected by origin detection (the same as are added to Autodiscovery checks):
kube_namespace, kube_deployment, kube_daemonset, kube_stateful_set, kube_container_name
docker_image, image_name, image_tag
We have changes in coming releases to address custom metrics emitted by containers, and we plan to add container_id, container_name, and pod_name in future releases. In the meantime, if any of the currently supported tags are not working, we'll be happy to help you get them successfully applied to metrics. I recommend support tickets as a medium though, as we'll need to exchange flares and configuration details.
@xvello Thanks. Unfortunately, there is a memory leak in 6.3.x, so we are stuck not being able to upgrade. Fortunately, I think our team is going to get on a call with an engineer about that. When we can upgrade, I'll test those things and give feedback on whether they're sufficient or not. I appreciate you spelling that all out for me. I wish it were documented on the site, though.
Hi @2rs2ts,
I listed 6.3.3 out of a habit of listing the latest version, although all the features I listed were already in 6.2.1. We should be all set to enable them already.
As for the documentation, we revamped https://docs.datadoghq.com/ early this year to increase discoverability of the features, but we still have room for improvement. If you have the time, more detailed feedback on which pages you expected to link to these features would be very valuable.
@xvello actually we are already using 6.2.1. Sounds like the tags we need are for UDS/AD only, but not DogStatsD. When you said "We have changes in coming releases to address custom metrics emitted by containers" I assume you meant that includes statsd metrics?
@xvello Upon upgrading to 6.4.2 we don't see those tags with the UDP-based StatsD protocol so I suppose the answer to my previous question is no, you didn't mean statsd metrics.
Do you intend to add these tags with the UDP-based metrics? And if so, when?
Hello @2rs2ts
We investigated adding origin detection to UDP traffic, but could not find a consistent way to reliably link a packet to its source container. We cannot rely on the source IP, because there are many situations where multiple containers may share a single IP. For example, in Kubernetes, all containers in a pod share the same network interface. hostNetwork mode can complicate this further.
This is why we are focusing our efforts on Unix sockets, which allow dogstatsd to reliably detect the origin container and tag submitted metrics accordingly. We do not currently have a plan to bring origin detection to UDP metrics, but are open to reconsidering if we find a reliable solution for linking a container to an incoming UDP packet.
@xvello Is there a reason why you cannot provide a reduced feature set for UDP origin detection? For example, maybe you can't detect which container is sending the metrics, but you can tell which pod is. And you can just not support metrics submitted from pods with host networking (which is something people should not be using that often anyway.)
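The reduced feature set proposed here can be sketched roughly as follows. This is only an illustration of the idea, not how the agent works: the pod list is a hypothetical list of dicts with `name`, `ip`, and an optional `host_network` flag, and a source IP is resolved to a pod only when the match is unambiguous. Container-level detail is deliberately out of scope, since containers in a pod share one IP.

```python
from collections import defaultdict


def index_pods_by_ip(pods):
    """Group pod names by IP, ignoring pods that use host networking."""
    by_ip = defaultdict(list)
    for pod in pods:
        if pod.get("host_network"):
            # Host-network pods share the node IP, so they cannot be told apart.
            continue
        by_ip[pod["ip"]].append(pod["name"])
    return by_ip


def resolve_pod(source_ip, by_ip):
    """Return the pod name only when the IP maps to exactly one pod."""
    candidates = by_ip.get(source_ip, [])
    return candidates[0] if len(candidates) == 1 else None
```

A packet from an unknown or ambiguous IP resolves to `None`, in which case the metric would simply be left untagged.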
pod_name doesn't appear for custom metrics when collected with k8s (statsd interface)
Hi @2rs2ts,
I renamed this feature request, and registered it in our backlog. Unfortunately, I cannot give you a time estimate for the moment.
@ludwikbukowski container_id, container_name and pod_name are not added as tags for Autodiscovery and Origin detection, for now.
I recommend you open a support ticket so we can investigate alternatives and inform you when we make progress on this.
Thanks @xvello, it's fine if you can't give an ETA right now. I'm glad we were able to get this hashed out and triaged! Again, thank you!
Just a brief explanation of why I'm missing this feature (just FYI, as feedback).
I run many pods(X) that run on many nodes(Y, Y
Only tagging the metric with some pod identifier prevents this from happening.
And just to make it clear: the docker/kubernetes metrics are stamped with pod_name. Only the custom metrics are not.
^ that is exactly my issue and it's something that I honestly expected I'd get out of the box
@xvello we use kube-router and for us the source IP would reliably identify the source. It would be great to be able to turn on tagging in dogstatsd based on source IP for users where this would be effective. In our case we don't have UDS support in the library we're currently using, so we're stuck with UDP for right now.
Does this work with origin detection? https://github.com/DataDog/datadog-agent/tree/master/Dockerfiles/dogstatsd/socat-proxy
@killcity indeed, using socat as a UDP -> UDS proxy is supported. It is documented here and this image has been created to illustrate this methodology.
Of course, if socat runs as a sidecar container, the container tags will match it, but all pod tags will be consistent.
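From the application's side nothing changes with this setup: it keeps sending plain UDP datagrams, and the sidecar forwards them to the agent's Unix socket. A minimal sketch, assuming the socat sidecar listens on `127.0.0.1:8125` inside the pod (adjust to your own deployment); `send_udp_metric` is an illustrative helper, not a Datadog API.

```python
import socket


def send_udp_metric(datagram, host="127.0.0.1", port=8125):
    """Send one DogStatsD datagram over UDP and return the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(datagram.encode("utf-8"), (host, port))
```

For example, `send_udp_metric("myapp.requests:1|c")` emits a counter increment. UDP is fire-and-forget, so the call succeeds even if nothing is listening.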
I'm running socat as a sidecar and origin detection is indeed working. The problem I'm running into is visibility down to specific pods. I'd like to see per-pod metrics instead of having them all lumped under one tag. Is this possible? I noticed the docs said that pod name and container name were not included. I think this is a mistake.
Hey all,
Updating this to let you know that this feature is planned for Q1 2019. We will avoid relying on src IP --> pod resolving if possible, as this won't work reliably in every network configuration. We will share more info soon.
@killcity it will be possible to enable pod-level tagging starting with agent version 6.9 which should go out this week. See the updated config template.
@hkaj can you please explain what "pod-level tagging" means? Your link does not explain it.
@killcity could you share your sidecar implementation? I'm running the same setup but can't get the process up because the socket isn't mounted in time. Are you introducing a wait?
@ashwin-subramanian Here's a simple example:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    run: myapp
    env: prod
  name: myapp
  namespace: myapp
spec:
  serviceName: "myapp"
  replicas: 1
  selector:
    matchLabels:
      run: myapp
  template:
    metadata:
      labels:
        run: myapp
    spec:
      securityContext:
        fsGroup: 1000
      volumes:
      - name: dsdsocket
        hostPath:
          path: /var/run/datadog/
      - name: data
        emptyDir: {}
      containers:
      - name: myapp
        env:
        - name: DATADOG_HOST
          value: "127.0.0.1"
        - name: DD_AGENT_HOST
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: DD_AGENT_PORT
          value: "8126"
        - name: DD_AGENT_SERVICE_HOST
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        - name: DD_AGENT_SERVICE_PORT
          value: "8126"
        image: myrepo/myapp:latest
        volumeMounts:
        - name: data
          mountPath: /data
          readOnly: false
        imagePullPolicy: Always
        securityContext:
          privileged: true
        readinessProbe:
          httpGet:
            path: /
            port: 9090
          initialDelaySeconds: 360
          periodSeconds: 5
          timeoutSeconds: 10
        livenessProbe:
          httpGet:
            path: /
            port: 9090
          initialDelaySeconds: 360
          periodSeconds: 5
          timeoutSeconds: 10
        resources:
          requests:
            cpu: "15"
            memory: "32000Mi"
          limits:
            cpu: "15"
            memory: "32000Mi"
      - name: socat
        image: datadog/dogstatsd-socat-proxy:beta
        imagePullPolicy: Always
        ports:
        - containerPort: 8125
          name: dogstatsdport
          protocol: UDP
        volumeMounts:
        - name: dsdsocket
          mountPath: /socket
        resources:
          requests:
            cpu: "100m"
            memory: "100Mi"
          limits:
            cpu: "100m"
            memory: "100Mi"
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 20Gi
      storageClassName: mystorageclass
Hi all, quick (belated) update: this was released in agent 6.10. It supports k8s only for now (we have plans for swarm support, but would welcome external contributions if anyone needs this urgently). Please refer to the documentation for details about how to use it, and which client libraries support it: https://docs.datadoghq.com/agent/kubernetes/dogstatsd/#origin-detection-over-udp
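As far as I understand the linked documentation, this works by having the client attach the pod's UID to each metric as an internal tag, which the agent then resolves to pod tags. The sketch below assumes the UID is exposed to the container in a `DD_ENTITY_ID` environment variable and that the tag key is `dd.internal.entity_id`. Both names are assumptions to verify against the docs and your client library, and supported clients are expected to do this for you.

```python
import os

ENTITY_ID_ENV = "DD_ENTITY_ID"           # assumed env var carrying the pod UID
ENTITY_ID_TAG = "dd.internal.entity_id"  # assumed internal tag key


def with_entity_id(tags, environ=os.environ):
    """Return a copy of tags, plus the entity ID tag when the env var is set."""
    entity_id = environ.get(ENTITY_ID_ENV)
    if not entity_id:
        return list(tags)
    return list(tags) + [f"{ENTITY_ID_TAG}:{entity_id}"]
```

Because the identifier travels inside the datagram, this approach does not depend on the packet's source IP.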
That's great news!