Output of the info page (if this is a bug)
Unable to provide the info; the pods will not start.
Describe what happened:
I am having a problem collecting logs from GKE containers.
I followed the steps at https://app.datadoghq.com/logs/onboarding/container, after which the datadog-agent pods are crashlooping and I see the following in their logs:
NAME                  READY   STATUS              RESTARTS   AGE
datadog-agent-4kgjf   0/1     RunContainerError   0          3s
datadog-agent-d5wbs   0/1     RunContainerError   0          3s
datadog-agent-khh4b   0/1     RunContainerError   1          3s
vasiliy@Vasiliys-Pro:~/Code/js/CM/empire/datadog/deployment% kubectl logs -f datadog-agent-khh4b
failed to open log file "/var/log/pods/default_datadog-agent-khh4b_2643e896-66e4-11e9-a6de-42010a8e0020/datadog-agent/1.log": open /var/log/pods/default_datadog-agent-khh4b_2643e896-66e4-11e9-a6de-42010a8e0020/datadog-agent/1.log: no such file or directory
Describe what you expected:
datadog-agent pods starting fine
Steps to reproduce the issue:
Additional environment details (Operating System, Cloud provider, etc):
Latest GKE DaemonSet:
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: datadog-agent
spec:
  template:
    metadata:
      labels:
        app: datadog-agent
      name: datadog-agent
    spec:
      serviceAccountName: datadog-agent
      containers:
      - image: datadog/agent:latest
        imagePullPolicy: Always
        name: datadog-agent
        ports:
          - containerPort: 8125
            # Custom metrics via DogStatsD - uncomment this section to enable custom metrics collection
            hostPort: 38125
            name: dogstatsdport
            protocol: UDP
          - containerPort: 8126
            # Trace Collection (APM) - uncomment this section to enable APM
            hostPort: 38126
            name: traceport
            protocol: TCP
        env:
          - name: DD_API_KEY
            value: "xxxxxxxxxx"
          - name: DD_COLLECT_KUBERNETES_EVENTS
            value: "true"
          - name: DD_LEADER_ELECTION
            value: "true"
          - name: KUBERNETES
            value: "yes"
          - name: DD_KUBERNETES_KUBELET_HOST
            valueFrom:
              fieldRef:
                fieldPath: status.hostIP
          - name: DD_APM_ENABLED
            value: "true"
          - name: DD_DOGSTATSD_NON_LOCAL_TRAFFIC
            value: "true"
          - name: DD_LOGS_ENABLED
            value: "true"
          - name: DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL
            value: "true"
          - name: DD_AC_EXCLUDE
            value: "name:datadog-agent"
        resources:
          requests:
            memory: "256Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
        volumeMounts:
          - name: dockersocket
            mountPath: /var/run/docker.sock
          - name: procdir
            mountPath: /host/proc
            readOnly: true
          - name: cgroups
            mountPath: /host/sys/fs/cgroup
            readOnly: true
          - name: pointerdir
            mountPath: /opt/datadog-agent/run
        livenessProbe:
          exec:
            command:
            - ./probe.sh
          initialDelaySeconds: 15
          periodSeconds: 5
      volumes:
        - hostPath:
            path: /var/run/docker.sock
          name: dockersocket
        - hostPath:
            path: /proc
          name: procdir
        - hostPath:
            path: /sys/fs/cgroup
          name: cgroups
        - hostPath:
            path: /opt/datadog-agent/run
          name: pointerdir
Support has been unresponsive for days... 👎
@vasiliyb I think this is tied to this issue: https://github.com/rancher/charts/issues/24#issuecomment-415692699
Could you try using a different mountdir than /opt/datadog-agent/run as specified in the issue?
cc @DylanLovesCoffee
@vasiliyb Could you add a volume and a volumeMount for /var/log/pods to your Datadog DaemonSet and update the ticket with the result?
I fixed the error 'failed to open log file "/var/log/pods/ (...)' with the following configuration:
# In spec.template.spec.containers[0].volumeMounts.
- name: logpath
  mountPath: /var/log/pods
- name: dockercontainers
  mountPath: /var/lib/docker/containers
  readOnly: true
# In spec.template.spec.volumes.
- hostPath:
    path: /var/log/pods
  name: logpath
- hostPath:
    path: /var/lib/docker/containers
  name: dockercontainers
The problem is that the files in /var/log/pods/*/*/*.log are actually symbolic links that point to /var/lib/docker/containers/*/*.log, so mounting only /var/log/pods is not enough.
I fixed the read-only file system problem that occurs when attempting to mount the host path /opt/datadog-agent/run by replacing /opt/datadog-agent/run with /var/lib/datadog-agent/run in spec.template.spec.volumes[].hostPath.path (ref: https://github.com/DataDog/datadog-agent/issues/3370#issuecomment-487438501, https://github.com/rancher/charts/issues/24#issuecomment-415692699).
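For reference, here is a minimal sketch of that second change against the DaemonSet posted at the top of this issue. It assumes the volume keeps the name pointerdir and that the container-side mountPath stays unchanged; only the host path is swapped.

```yaml
# In spec.template.spec.containers[0].volumeMounts (assumed unchanged).
- name: pointerdir
  mountPath: /opt/datadog-agent/run
# In spec.template.spec.volumes (only the host path changes).
- hostPath:
    # Was /opt/datadog-agent/run, which hit the read-only file system error.
    path: /var/lib/datadog-agent/run
  name: pointerdir
```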
@jtrh thanks! worked perfectly.
Hi @vasiliyb
Thanks for submitting your issue and providing a solution.
We recently updated the Datadog DaemonSet to include /var/lib/docker/containers by default and also updated the documentation, see:
Feel free to close the issue, if you think your problem is solved.
Cheers
How do I get logs from a different file path, for files that are written under /opt/?