The container metadata file is not being written for containers started by the ECS Agent.
After upgrading to 1.15.0 and enabling container metadata by running the agent with ECS_ENABLE_CONTAINER_METADATA=true, there are no metadata files being written to disk.
The directories are created on the host at /var/lib/ecs/data/metadata/task_id/container_name/, however no files exist in these directories.
ECS_CONTAINER_METADATA_FILE is set inside the container and the directory is available at /opt/ecs/metadata/random_ID/, however no file exists.
I expected to find an ecs-container-metadata.json file in the directories.
There is no file
{
"Cluster": "foo",
"ContainerInstanceArn": "arn:aws:ecs:us-east-2:ACCOUNT_ID:container-instance/807843bb-fa48-4359-80c9-4b2576ec8902",
"Version": "Amazon ECS Agent - v1.15.0 (d2dd240)"
}
Containers: 5
Running: 5
Paused: 0
Stopped: 0
Images: 4
Server Version: 1.12.6
Storage Driver: overlay
Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: host bridge overlay null
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp selinux
Kernel Version: 4.9.16-coreos-r1
Operating System: Container Linux by CoreOS 1298.7.0 (Ladybug)
OSType: linux
Architecture: x86_64
CPUs: 1
Total Memory: 1.956 GiB
Name: ip-10-10-0-96.us-east-2.compute.internal
ID: xxxx
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
127.0.0.0/8
Since it looks like you're on Container Linux, can you provide the docker run command you're using to start the ECS agent?
It would also be helpful if you could provide logs for the ECS agent. If you're not comfortable including those publicly, please email them to me at skarp at amazon.com. Thanks!
Logs sent via email. Here's my docker run command (from a Terraform-templated cloud-config):
/usr/bin/docker run --name ecs-agent \
--network=host \
--volume=/var/run/docker.sock:/var/run/docker.sock \
--volume=/var/log/ecs:/log \
--volume=/var/ecs-data:/data \
--volume=/sys/fs/cgroup:/sys/fs/cgroup:ro \
--volume=/run/docker/execdriver/native:/var/lib/docker/execdriver/native:ro \
--publish=0.0.0.0:51678:51678 \
--env=ECS_LOGFILE=/log/ecs-agent.log \
--env=ECS_LOGLEVEL=$${ECS_LOGLEVEL} \
--env=ECS_DATADIR=/data \
--env=ECS_CLUSTER=$${ECS_CLUSTER} \
--env=ECS_AVAILABLE_LOGGING_DRIVERS='["awslogs"]' \
--env=ECS_ENABLE_TASK_IAM_ROLE=true \
--env=ECS_INSTANCE_ATTRIBUTES=$${ECS_INSTANCE_ATTRIBUTES} \
--env=ECS_ENABLE_CONTAINER_METADATA=true \
--log-driver=awslogs \
--log-opt awslogs-region=${aws_region} \
--log-opt awslogs-group=${ecs_log_group_name} \
amazon/amazon-ecs-agent:$${ECS_VERSION}
Thanks. Since you're mounting the agent data directory on the host as /var/ecs-data, you'll need to set ECS_HOST_DATA_DIR=/var/ecs-data. By default, the agent assumes that the data directory is present on the host at /var/lib/ecs/data and that's what it's mounting into your containers.
There are a few other parameters you should be able to remove:
--publish=0.0.0.0:51678:51678 - Since you're running the agent with --network=host (which is what we recommend), you do not need to manually publish this port.
--volume=/run/docker/execdriver/native:/var/lib/docker/execdriver/native:ro - The agent used to use this to help with metrics collection, but we switched to relying on docker stats around when Docker 1.12 came out. This mount is no longer necessary.
Please let me know if this fixes it!
I've set ECS_HOST_DATA_DIR=/var/ecs-data but I have the same issue. Here's an empty metadata directory for one of my running containers:
ls -lah /var/ecs-data/data/metadata/2386acbb-afce-4a15-b3e5-5de4148db5c1/nautilus
total 16K
drwxr-xr-x. 2 root root 4.0K Nov 3 18:13 .
drwxr-xr-x. 4 root root 4.0K Nov 3 18:13 ..
Anything else I can check?
It looks like something is adding an extra data/ into the path. I reproduced what you're experiencing and you'll find the correct files written out into /var/ecs-data/metadata. I'm looking to see where this is going wrong in the code.
The code as-written is really expecting the end of the host path to be equivalent to the container path for the data directory (e.g., it would expect /var/ecs-data:/ecs-data). This is a bug, but to get around it for now you can change the container path of your mount to /ecs-data and then ECS_DATADIR=/ecs-data and ECS_HOST_DATA_DIR=/var.
So, like this:
--volume=/var/ecs-data:/ecs-data \
--env=ECS_DATADIR=/ecs-data \
--env=ECS_HOST_DATA_DIR=/var \
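The listings in this thread are consistent with the bind-mount source being built as ECS_HOST_DATA_DIR followed by ECS_DATADIR, while the files themselves are written under the host side of the data volume. The sketch below only compares those two paths for a given setup. The rule is inferred from the output shown in this thread, not taken from the agent's source, and the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers. The path rule is an assumption inferred from the
# listings in this thread, not copied from the agent's source code.

# Where the metadata files actually land: the host side of the data volume.
written_path() {
    # $1 = host side of --volume=<host>:<container> for the data directory
    printf '%s/metadata\n' "$1"
}

# Where the task containers' bind mount appears to point.
mounted_path() {
    # $1 = ECS_HOST_DATA_DIR, $2 = ECS_DATADIR
    printf '%s%s/metadata\n' "$1" "$2"
}

# Report whether the two locations agree for a given configuration.
check() {
    # $1 = host volume path, $2 = ECS_HOST_DATA_DIR, $3 = ECS_DATADIR
    if [ "$(written_path "$1")" = "$(mounted_path "$2" "$3")" ]; then
        echo "match"
    else
        echo "mismatch"
    fi
}

check /var/ecs-data /var/ecs-data /data   # original setup: prints "mismatch"
check /var/ecs-data /var /ecs-data        # workaround: prints "match"
```

Under this assumption the default layout also lines up: a volume at /var/lib/ecs/data with ECS_DATADIR=/data and a host data directory of /var/lib/ecs gives the same path on both sides.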
@samuelkarp Thanks, that got me around the problem!
@samuelkarp Upon further inspection I'm seeing unexpected port mappings in the metadata file. Let me know if you'd like me to open a separate issue for this.
Here I'm running a container that has been assigned port mapping 0.0.0.0:32769->80/tcp
docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
50e4958241ad xxx.dkr.ecr.us-east-2.amazonaws.com/xxx/streamline:latest "/usr/local/bin/main" 10 minutes ago Up 10 minutes 0.0.0.0:32769->80/tcp ecs-streamline-6-streamline-de99e9d0cabdd3f88301
But the metadata file has unexpected PortMappings. I expected to see HostPort=32769, but it's set to 0. I'm assuming this is derived from the task definition, but I expected this value to be the dynamically assigned port.
{
"Cluster": "xxx",
"ContainerInstanceARN": "arn:aws:ecs:us-east-2:xxx:container-instance/dcc9fa0a-fba3-4a82-9159-210f7e693695",
"TaskARN": "arn:aws:ecs:us-east-2:xxx:task/1af3eed7-97ac-4b80-95cf-66747138c5dd",
"ContainerID": "50e4958241ad07299dbed24c4668f21ef6cf2f8500ad2f70cf86170e638ce802",
"ContainerName": "streamline",
"DockerContainerName": "/ecs-streamline-6-streamline-de99e9d0cabdd3f88301",
"ImageID": "sha256:54e3d53df4f8e9920373efa0fc678e73c89b721985d9af8bb2add64b0770920f",
"ImageName": "xxx.dkr.ecr.us-east-2.amazonaws.com/xxx/streamline:latest",
"PortMappings": [
{
"ContainerPort": 80,
"HostPort": 0,
"BindIp": "",
"Protocol": "tcp"
}
],
"Networks": [
{
"NetworkMode": "default",
"IPv4Addresses": [
"172.17.0.3"
]
}
],
"MetadataFileStatus": "READY"
}
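When comparing fields like these against docker ps, it can help to first confirm that the file reports MetadataFileStatus as READY. The following is a minimal sketch for doing that from inside a task container. The helper name is hypothetical, and the sed expression assumes the key and its value sit on one line, as in the output above.

```shell
#!/bin/sh
# Hypothetical helper: print the MetadataFileStatus value from a metadata file.
# Assumes the key and its value are on a single line, as in the output above.
metadata_status() {
    sed -n 's/.*"MetadataFileStatus"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Inside a task container, ECS_CONTAINER_METADATA_FILE points at the file.
if [ -n "${ECS_CONTAINER_METADATA_FILE:-}" ] && [ -f "$ECS_CONTAINER_METADATA_FILE" ]; then
    metadata_status "$ECS_CONTAINER_METADATA_FILE"
fi
```

If the variable is unset or the file is missing, the script prints nothing, which matches the symptom described at the top of this thread.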
@bndw I think you're now running into https://github.com/aws/amazon-ecs-agent/issues/1052, which will be fixed by https://github.com/aws/amazon-ecs-agent/pull/1053.
Closing, since #1053 is included in 1.15.1 onwards.
After updating to 1.15.2 I no longer see container metadata files on the host, nor do I see the ECS_CONTAINER_METADATA_FILE variables inside my containers. Has this changed?
@bndw The metadata feature has not changed with the 1.15.2 release. As a sanity check - is ECS_ENABLE_CONTAINER_METADATA enabled in the agent's ecs.config?
@samuelkarp,
I don't think that this bug has actually been fixed. We're running ECS Agent 1.17.0 and seeing the behavior you described in https://github.com/aws/amazon-ecs-agent/issues/1051#issuecomment-342039846.
Our ECS-Agent is started up like this:
--net host -m 0b --detach=true -e ECS_CLUSTER=xxx \
-e ECS_RESERVED_MEMORY=1024 \
-e ECS_DATADIR=/data \
-e ECS_HOST_DATA_DIR=/mnt/ecs \
-e DOCKER_HOST=unix:///var/run/docker.sock \
-e ECS_LOGLEVEL=warn \
-e ECS_LOGFILE=/log/ecs.log \
-e ECS_ENGINE_AUTH_TYPE=dockercfg \
-e ECS_CONTAINER_STOP_TIMEOUT=60s \
-e ECS_ENABLE_TASK_IAM_ROLE=true \
-e ECS_ENABLE_CONTAINER_METADATA=true \
-e ECS_ENABLE_TASK_ENI=true \
-e ECS_ENABLE_TASK_IAM_ROLE_NETWORK_HOST=true \
-v /mnt/ecs:/data \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /mnt/ecs/log:/log \
-v /var/run/docker/execdriver/native:/var/lib/docker/execdriver/native:ro \
-v /sys/fs/cgroup:/sys/fs/cgroup:ro \
-v /proc:/host/proc:ro \
-v /sbin:/sbin:ro \
-v /lib:/lib:ro \
-v /lib32:/lib32:ro \
-v /lib64:/lib64:ro \
-v /lib:/lib:ro \
-v /bin:/bin:ro \
--restart=always --init --cap-add=NET_ADMIN --cap-add=SYS_ADMIN \
--name ecs-agent \
amazon/amazon-ecs-agent:v1.17.0 \
Looking at an ECS Task, we can see that the /data path is put into the metadata file location incorrectly:
[root@xxx:/mnt/ecs]# docker inspect d6801185f4c1 | grep -i meta
"/mnt/ecs//data/metadata/dc526595-c7ac-481a-97b6-b56f38fcc5ba/xxx:/opt/ecs/metadata/3a06a773-7b41-4f45-80d4-ae52605ffda3"
"Source": "/mnt/ecs/data/metadata/dc526595-c7ac-481a-97b6-b56f38fcc5ba/xxx",
"Destination": "/opt/ecs/metadata/3a06a773-7b41-4f45-80d4-ae52605ffda3",
"ECS_CONTAINER_METADATA_FILE=/opt/ecs/metadata/3a06a773-7b41-4f45-80d4-ae52605ffda3/ecs-container-metadata.json",
Here you can see that the actual file is in /mnt/ecs/metadata/<task>/..., but the bind mount added in /data to the path:
[root@xxx:/mnt/ecs:130]# find /mnt/ecs -name dc526595-c7ac-481a-97b6-b56f38fcc5ba
/mnt/ecs/metadata/dc526595-c7ac-481a-97b6-b56f38fcc5ba
/mnt/ecs/data/metadata/dc526595-c7ac-481a-97b6-b56f38fcc5ba
[root@xxx:/mnt/ecs]# find /mnt/ecs/metadata/dc526595-c7ac-481a-97b6-b56f38fcc5ba
/mnt/ecs/metadata/dc526595-c7ac-481a-97b6-b56f38fcc5ba
/mnt/ecs/metadata/dc526595-c7ac-481a-97b6-b56f38fcc5ba/xxx
/mnt/ecs/metadata/dc526595-c7ac-481a-97b6-b56f38fcc5ba/xxx/ecs-container-metadata.json
[root@xxx:/mnt/ecs]# find /mnt/ecs/data/metadata/dc526595-c7ac-481a-97b6-b56f38fcc5ba
/mnt/ecs/data/metadata/dc526595-c7ac-481a-97b6-b56f38fcc5ba
/mnt/ecs/data/metadata/dc526595-c7ac-481a-97b6-b56f38fcc5ba/xxx
[root@xxx:/mnt/ecs]#
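The same check can be run across every task at once. This is a rough sketch that lists each task/container directory under a metadata root that has no ecs-container-metadata.json in it. The function name is hypothetical, and the task/container directory layout is assumed from the find output above.

```shell
#!/bin/sh
# Hypothetical helper: list <task>/<container> directories under a metadata
# root that do not contain ecs-container-metadata.json.
# The two-level layout is an assumption based on the find output above.
missing_metadata() {
    for dir in "$1"/*/*/; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -d "$dir" ] || continue
        [ -f "${dir}ecs-container-metadata.json" ] || printf '%s\n' "${dir%/}"
    done
}

# Example: the directory the bind mounts point at in the setup above.
missing_metadata /mnt/ecs/data/metadata
```

Any directory it prints is one that a container has mounted but that the agent has not written a file into.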