Output of the info page (if this is a bug)
Getting the status from the agent.
==============
Agent (v6.0.3)
==============
Status date: 2018-03-15 17:14:23.501699 UTC
Pid: 8828
Python Version: 2.7.13
Logs:
Check Runners: 4
Log Level: info
Paths
=====
Config File: /etc/datadog-agent/datadog.yaml
conf.d: /etc/datadog-agent/conf.d
checks.d: /etc/datadog-agent/checks.d
Clocks
======
NTP offset: 0.002778145 s
System UTC time: 2018-03-15 17:14:23.501699 UTC
Host Info
=========
bootTime: 2018-03-15 16:20:25.000000 UTC
kernelVersion: 4.9.81-35.56.amzn1.x86_64
os: linux
platform: amazon
platformFamily: rhel
platformVersion: 2017.09
procs: 211
uptime: 87
virtualizationRole: guest
virtualizationSystem: xen
Hostnames
=========
ec2-hostname: ip-172-31-15-66.us-west-2.compute.internal
hostname: i-0691f3728b71dd647
instance-id: i-0691f3728b71dd647
socket-fqdn: ip-172-31-15-66.us-west-2.compute.internal.
socket-hostname: ip-172-31-15-66
=========
Collector
=========
Running Checks
==============
cpu
---
Total Runs: 210
Metrics: 6, Total Metrics: 1254
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
disk
----
Total Runs: 210
Metrics: 52, Total Metrics: 11298
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
docker
------
Total Runs: 210
Metrics: 0, Total Metrics: 0
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
Error: UNKNOWN ERROR
No traceback
Warning: Error initialising check: [permanent failure in dockerutil: retry number exceeded]
file_handle
-----------
Total Runs: 210
Metrics: 1, Total Metrics: 210
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
io
--
Total Runs: 210
Metrics: 91, Total Metrics: 21370
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
load
----
Total Runs: 210
Metrics: 6, Total Metrics: 1260
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
memory
------
Total Runs: 210
Metrics: 14, Total Metrics: 2940
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
network
-------
Total Runs: 210
Metrics: 26, Total Metrics: 6366
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
ntp
---
Total Runs: 210
Metrics: 1, Total Metrics: 199
Events: 0, Total Events: 0
Service Checks: 1, Total Service Checks: 210
uptime
------
Total Runs: 210
Metrics: 1, Total Metrics: 210
Events: 0, Total Events: 0
Service Checks: 0, Total Service Checks: 0
========
JMXFetch
========
Initialized checks
==================
no checks
Failed checks
=============
no checks
=========
Forwarder
=========
CheckRunsV1: 210
IntakeV1: 17
RetryQueueSize: 0
Success: 437
TimeseriesV1: 210
API Keys status
===============
https://6-0-3-app.agent.datadoghq.com,*************************3d0bf: API Key valid
==========
Logs Agent
==========
Logs Agent is not running
=========
DogStatsD
=========
Checks Metric Sample: 49096
Event: 1
Events Flushed: 1
Number Of Flushes: 210
Series Flushed: 45898
Service Check: 2520
Service Checks Flushed: 2718
Dogstatsd Metric Sample: 2039
Describe what happened:
==> /var/log/datadog/agent.log <==
2018-03-15 17:15:07 UTC | WARN | (checkbase.go:60 in Warnf) | Error initialising check: [permanent failure in dockerutil: retry number exceeded]
2018-03-15 17:15:07 UTC | ERROR | (runner.go:276 in work) | Error running check docker: permanent failure in dockerutil: retry number exceeded
2018-03-15 17:15:07 UTC | WARN | (datadog_agent.go:135 in LogMessage) | (disk.py:104) | Unable to get disk metrics for /var/lib/docker/devicemapper/mnt/ba004f4db542edfe90eade24759318016564726655cd022b478e18425df78148: [Errno 13] Permission denied: '/var/lib/docker/devicemapper/mnt/ba004f4db542edfe90eade24759318016564726655cd022b478e18425df78148'
2018-03-15 17:15:07 UTC | WARN | (datadog_agent.go:135 in LogMessage) | (disk.py:104) | Unable to get disk metrics for /var/run/docker/netns/default: [Errno 13] Permission denied: '/var/run/docker/netns/default'
2018-03-15 17:15:07 UTC | WARN | (datadog_agent.go:135 in LogMessage) | (disk.py:104) | Unable to get disk metrics for /var/lib/docker/containers/4361404cc1b8fc34d9acf5c4b1e44edf857c125c6139986ddad48aebaa9f4b52/shm: [Errno 13] Permission denied: '/var/lib/docker/containers/4361404cc1b8fc34d9acf5c4b1e44edf857c125c6139986ddad48aebaa9f4b52/shm'
2018-03-15 17:15:22 UTC | WARN | (checkbase.go:60 in Warnf) | Error initialising check: [permanent failure in dockerutil: retry number exceeded]
2018-03-15 17:15:22 UTC | ERROR | (runner.go:276 in work) | Error running check docker: permanent failure in dockerutil: retry number exceeded
2018-03-15 17:15:22 UTC | WARN | (datadog_agent.go:135 in LogMessage) | (disk.py:104) | Unable to get disk metrics for /var/lib/docker/devicemapper/mnt/ba004f4db542edfe90eade24759318016564726655cd022b478e18425df78148: [Errno 13] Permission denied: '/var/lib/docker/devicemapper/mnt/ba004f4db542edfe90eade24759318016564726655cd022b478e18425df78148'
2018-03-15 17:15:22 UTC | WARN | (datadog_agent.go:135 in LogMessage) | (disk.py:104) | Unable to get disk metrics for /var/run/docker/netns/default: [Errno 13] Permission denied: '/var/run/docker/netns/default'
2018-03-15 17:15:22 UTC | WARN | (datadog_agent.go:135 in LogMessage) | (disk.py:104) | Unable to get disk metrics for /var/lib/docker/containers/4361404cc1b8fc34d9acf5c4b1e44edf857c125c6139986ddad48aebaa9f4b52/shm: [Errno 13] Permission denied: '/var/lib/docker/containers/4361404cc1b8fc34d9acf5c4b1e44edf857c125c6139986ddad48aebaa9f4b52/shm'
==> /var/log/datadog/process-agent.log <==
2018-03-15 17:12:01 ERROR (common.go:64) - unable to get the container list from ecs
2018-03-15 17:12:01 ERROR (container.go:91) - failed to get container list from ecs - json: cannot unmarshal string into Go value of type ecs.TaskMetadata
2018-03-15 17:12:11 ERROR (common.go:42) - decoding task metadata failed - json: cannot unmarshal string into Go value of type ecs.TaskMetadata
2018-03-15 17:12:11 ERROR (common.go:50) - unable to retrieve task metadata
2018-03-15 17:12:11 ERROR (common.go:64) - unable to get the container list from ecs
2018-03-15 17:12:11 ERROR (container.go:91) - failed to get container list from ecs - json: cannot unmarshal string into Go value of type ecs.TaskMetadata
2018-03-15 17:12:21 ERROR (common.go:42) - decoding task metadata failed - json: cannot unmarshal string into Go value of type ecs.TaskMetadata
2018-03-15 17:12:21 ERROR (common.go:50) - unable to retrieve task metadata
2018-03-15 17:12:21 ERROR (common.go:64) - unable to get the container list from ecs
2018-03-15 17:12:21 ERROR (container.go:91) - failed to get container list from ecs - json: cannot unmarshal string into Go value of type ecs.TaskMetadata
==> /var/log/datadog/process-errors.log <==
2018-03-15 16:21:51 INFO (main_common.go:84) - pid '8684' written to pid file '/opt/datadog-agent/run/process-agent.pid'
2018-03-15 16:21:51 INFO (tagger.go:77) - starting the tagging system
2018-03-15 16:21:51 INFO (tagger.go:148) - ecs tag collector successfully started
2018-03-15 16:21:51 ERROR (common.go:42) - decoding task metadata failed - json: cannot unmarshal string into Go value of type ecs.TaskMetadata
2018-03-15 16:21:51 ERROR (common.go:50) - unable to retrieve task metadata
2018-03-15 16:21:51 ERROR (common.go:64) - unable to get the container list from ecs
2018-03-15 16:21:51 ERROR (container.go:91) - unable to connect to docker - temporary failure in dockerutil, will retry later: try delay not elapsed yet
2018-03-15 16:21:51 ERROR (container.go:91) - failed to get container list from ecs - json: cannot unmarshal string into Go value of type ecs.TaskMetadata
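The process-agent excerpts above repeat the same few errors every ten seconds, so it can help to collapse them into a count per distinct message. The following is a minimal sketch only, assuming the line format seen in these excerpts (`<date> <time> <LEVEL> (<file>:<line>) - <message>`); the `summarize` helper and the `LINE_RE` pattern are hypothetical names introduced here, not part of the agent.

```python
import re
from collections import Counter

# Assumed line format, taken from the process-agent excerpts above:
# "<date> <time> <LEVEL> (<file>:<line>) - <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) \((?P<src>[^)]+)\) - (?P<msg>.*)$"
)


def summarize(lines, level="ERROR"):
    """Count lines of the given level, grouped by (source location, message)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == level:
            counts[(match.group("src"), match.group("msg"))] += 1
    return counts


# A few lines copied from the excerpts above.
sample = [
    "2018-03-15 17:12:01 ERROR (common.go:64) - unable to get the container list from ecs",
    "2018-03-15 17:12:11 ERROR (common.go:50) - unable to retrieve task metadata",
    "2018-03-15 17:12:11 ERROR (common.go:64) - unable to get the container list from ecs",
    "2018-03-15 16:21:51 INFO (tagger.go:77) - starting the tagging system",
]

for (src, msg), n in summarize(sample).most_common():
    print(f"{n:4d}  {src}  {msg}")
```

To run it against a real file, pass an open file handle (for example `open("/var/log/datadog/process-agent.log")`) instead of `sample`. Lines that do not match the assumed format are skipped rather than reported.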
Describe what you expected:
Should be collecting ECS and Docker data correctly
Should be collecting logs from all docker containers
Steps to reproduce the issue:
Install the agent, configure as follows:
# datadog.yaml
dd_url: https://app.datadoghq.com
tags:
- instance_id:UNDEFINED_INSTANCE_ID
- environment:UNDEFINED_ENVIRONMENT
histogram_percentiles: ["0.90","0.95","0.99"]
forwarder_num_workers: 2
collect_ec2_tags: true
check_runners: 0
enable_gohai: true
use_dogstatsd: yes
dogstatsd_port: 8125
dogstatsd_non_local_traffic: no
logs_enabled: false
listeners:
  - name: auto
docker_labels_as_tags:
  com.amazonaws.ecs.cluster: cluster
  com.amazonaws.ecs.task-definition-family: task_family
  com.amazonaws.ecs.task-definition-version: task_version
  environment: environment
  git-sha: sha
process_config:
  enabled: "true"
apm_config:
  enabled: true
# datadog-ecs.yaml
## Provides autodetected defaults, for kubernetes environments,
## please see datadog.yaml.example for all supported options
# Autodiscovery
listeners:
  - name: ecs
config_providers:
  ## The ecs provider handles templates embedded in container labels, see
  ## https://docs.datadoghq.com/guides/autodiscovery/#template-source-docker-label-annotations
  - name: ecs
    polling: true
# conf.d/docker.d/conf.yaml
init_config:
instances:
  -
    collect_events: false
    collect_container_size: true
    collect_images_stats: true
    collect_image_size: true
    collect_disk_stats: true
    collect_exit_codes: true
logs:
  - type: docker
    service: docker-alpha
    source: docker-alpha
    tags: alpha
Additional environment details (Operating System, Cloud provider, etc):
Moving to support ticket.
I'm having this same issue @maycmlee
@mandeepbal I'm sorry to hear that. Could you open a support ticket by emailing [email protected]? Thanks!
Hi May, this seems to be an issue with the open source agent. Why should I have to open a support ticket?
Hi @mandeepbal, if you open a ticket then we can take a closer look into your issue and specific setup to see where the problem is coming from. Thanks.
getting these errors:
process-agent[3185]: 2019-09-26 19:17:56 UTC | PROCESS | CRITICAL | (collector.go:91 in runCheck) | Unable to run check 'container': permanent failure in detector: No collector available
Is there a fix for this issue?
Hi @toli-belo
Could you open a new issue with your specific issue please?
Also, you can contact our support team and open a ticket: [email protected] so they can further look into your issue.
Thanks!
@toli-belo did you find the solution for this? I have the same error in logs.
Thanks
I have the same error in my log. Is there any update?
Oct 1 19:52:39 ip-172-31-31-162 process-agent[1154]: 2019-10-01 19:52:39 CEST | PROCESS | CRITICAL | (collector.go:91 in runCheck) | Unable to run check 'container': permanent failure in detector: No collector available
I have the same issue in my log files.
2019-10-09 11:17:51 BST | PROCESS | CRITICAL | (collector.go:91 in runCheck) | Unable to run check 'container': permanent failure in detector: No collector available
The same.
grep -c "PROCESS | CRITICAL | (collector.go:91 in runCheck) | Unable to run check 'container': permanent failure in detector: No collector available" /var/log/messages
7877
CentOS 7, datadog-agent-6.14.1-1.x86_64
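The `grep -c` above gives a single total. To see whether the error is still occurring or stopped at some point, the same match can be broken down per day. This is a rough sketch, assuming each matching line contains a `YYYY-MM-DD HH:MM:SS` timestamp as in the lines quoted in this thread; `count_per_day`, `NEEDLE`, and `DATE_RE` are hypothetical names introduced here.

```python
import re
from collections import Counter

# The message text reported in this thread.
NEEDLE = (
    "Unable to run check 'container': "
    "permanent failure in detector: No collector available"
)
# Assumption: the agent's own timestamp appears somewhere in the line,
# even when syslog adds its own prefix in front of it.
DATE_RE = re.compile(r"(\d{4}-\d{2}-\d{2}) \d{2}:\d{2}:\d{2}")


def count_per_day(lines):
    """Count occurrences of NEEDLE per calendar day found in each line."""
    per_day = Counter()
    for line in lines:
        if NEEDLE not in line:
            continue
        match = DATE_RE.search(line)
        per_day[match.group(1) if match else "unknown"] += 1
    return per_day


# Lines copied from the comments above.
sample = [
    "process-agent[3185]: 2019-09-26 19:17:56 UTC | PROCESS | CRITICAL | "
    "(collector.go:91 in runCheck) | " + NEEDLE,
    "Oct 1 19:52:39 ip-172-31-31-162 process-agent[1154]: 2019-10-01 19:52:39 CEST | "
    "PROCESS | CRITICAL | (collector.go:91 in runCheck) | " + NEEDLE,
    "2019-10-09 11:17:51 BST | PROCESS | CRITICAL | (collector.go:91 in runCheck) | "
    + NEEDLE,
]

for day, n in sorted(count_per_day(sample).items()):
    print(day, n)
```

Against a real log, something like `with open("/var/log/messages", errors="replace") as f: print(count_per_day(f))` should work, given read permission on the file.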