Tell us about your request
Extend the overrides parameter of RunTask and StartTask with logConfiguration.
"overrides": {
"containerOverrides": [{
"command": ["string"],
"cpu": number,
"environment": [{
"name": "string",
"value": "string"
}],
"memory": number,
"memoryReservation": number,
"name": "string",
"logConfiguration": {
"logDriver": "configurable",
"options": {
"awslogs-group": "configurable",
"awslogs-region": "configurable",
"awslogs-stream-prefix": "configurable"
}
}
}],
"executionRoleArn": "string",
"taskRoleArn": "string"
},
Which service(s) is this request for?
ECS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
When running one-off tasks from a copy of the task definition of an existing running service, the logs end up in CloudWatch in the same log group as the live service. This makes it hard to look up the logs in CloudWatch.
I'm also using RunTask for CloudWatch cron events, which trigger RunTask through a Lambda. Having a separate CloudWatch log group would make it much easier to keep the logs of those cron tasks apart from the service logs.
Are you currently working around this issue?
Searching for keywords in CloudWatch.
Additional context
Great work on the roadmap!
Any update on this? Is this possible today?
👍 for this suggestion
This would be extremely useful
This is partly possible with FireLens.
FireLens takes the options in your logConfiguration and sends them directly to Fluentd/Fluent Bit. Because Fluent Bit and Fluentd both support environment variables in their configuration, you can use environment variables to enable overrides.
Here are the key sections of an example task definition:
{
"family": "firelens-overrides",
"containerDefinitions": [
{
"essential": true,
"image": "906394416424.dkr.ecr.ap-south-1.amazonaws.com/aws-for-fluent-bit:latest",
"name": "log_router",
"firelensConfiguration": {
"type": "fluentbit"
},
"environment": [
{ "name": "NAME", "value": "cloudwatch" },
{ "name": "REGION", "value": "ap-south-1" },
{ "name": "LOG_STREAM", "value": "test" },
{ "name": "LOG_GROUP", "value": "env_var_interpolation_example" }
]
},
{
"essential": true,
"image": "1111111111111.dkr.ecr.ap-south-1.amazonaws.com/app-image:latest",
"name": "app",
"logConfiguration": {
"logDriver":"awsfirelens",
"options": {
"Name": "${NAME}",
"region": "${REGION}",
"log_group_name": "${LOG_GROUP}",
"auto_create_group": "true",
"log_stream_name": "${LOG_STREAM}"
}
}
}
]
}
As you can see, the log group, region, and log stream are all set via environment variables on the log router container. Environment variables can be overridden in RunTask, so you can re-use this task definition multiple times and change some of the log parameters each time you run it.
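As a rough illustration of the override step, here is a minimal Python sketch that builds the `overrides` value for RunTask. It assumes the container and variable names from the task definition above (`log_router`, `LOG_GROUP`, `LOG_STREAM`); the helper name `build_log_overrides`, the example values, and the cluster name in the commented usage are hypothetical.

```python
from typing import Dict, List


def build_log_overrides(log_group: str, log_stream: str,
                        router_name: str = "log_router") -> Dict[str, List[dict]]:
    """Build an `overrides` value for RunTask (hypothetical helper).

    Only the environment of the log router container is overridden. The
    ${LOG_GROUP} and ${LOG_STREAM} placeholders in the task definition
    then resolve to the values passed here.
    """
    return {
        "containerOverrides": [
            {
                "name": router_name,
                "environment": [
                    {"name": "LOG_GROUP", "value": log_group},
                    {"name": "LOG_STREAM", "value": log_stream},
                ],
            }
        ]
    }


# Example values are placeholders, not taken from the thread.
overrides = build_log_overrides("cron-tasks", "nightly-report")
print(overrides)

# Possible usage with boto3 (cluster name is an assumption):
#   import boto3
#   boto3.client("ecs").run_task(
#       cluster="my-cluster",
#       taskDefinition="firelens-overrides",
#       overrides=overrides,
#   )
```

Variables that are not listed in the override, such as NAME and REGION here, should keep the values from the task definition.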
+1 for this one :)
Thank you @PettitWesley for this solution.
It works well for ECS tasks with launch type EC2, but no logs are created in CloudWatch when launching via Fargate. It seems to be due to the "networkMode": "awsvpc" setting that Fargate requires,
because when I launch a task with launch type EC2 and "networkMode": "awsvpc", it does not work either...
Do you have any idea? I checked, and it has nothing to do with IAM permissions.
The post from @etiennecaldichoury is very helpful (thanks!) and I'm eager to implement this on one of my projects. However, my use case also requires Fargate, and I see that might be a blocker.
Anyone else on this thread have info on whether this is compatible with Fargate or any special steps to enable?
From this AWS FireLens docs page, I compiled the following excerpts:
Fargate is supported:
FireLens for Amazon ECS is supported for tasks using both the Fargate and EC2 launch types.
Don't specify TCP forward input:
In your custom configuration file, for tasks using the bridge or awsvpc network mode, you should not set a Fluentd or Fluent Bit forward input over TCP because FireLens will add it to the input configuration.
Task Execution IAM role required if using ECR or Secrets Manager:
If your task uses the Fargate launch type and you are pulling container images from Amazon ECR or referencing sensitive data from AWS Secrets Manager in your log configuration, then you must include the task execution IAM role.
FireLens config files on S3 not supported with Fargate:
For tasks using the Fargate launch type, the only supported config-file-type value is file.
@etiennecaldichoury - Of these restrictions, the only possibility I can see that might apply to your case would be the third item regarding the need for a task execution role - which governs permissions/access on setting up the containers _before_ the policy specified by TaskRole takes over. Can you confirm and let us know if this might be the problem?
Thanks!
Agree with @aaronsteers- this approach should work on both EC2 and Fargate; nothing about it is EC2 specific AFAIK.
Hi @aaronsteers @PettitWesley
Thanks a lot for your very quick reply! Here is some more information (sorry for the long post :) )
Concerning your points n°2 and 4: I don't use any configuration file; configuration is done via the task definition below (I have removed/hidden critical information).
{
"containerDefinitions": [
{
"essential": true,
"image": "906394416424.dkr.ecr.eu-central-1.amazonaws.com/aws-for-fluent-bit:latest",
"name": "log_router",
"firelensConfiguration": {
"type": "fluentbit"
},
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "<awslogs-group>",
"awslogs-region": "eu-central-1",
"awslogs-stream-prefix": "log_router"
}
},
"environment": [
{
"name": "LOG_PREFIX",
"value": "ecs"
}
]
},
{
"logConfiguration": {
"logDriver": "awsfirelens",
"options": {
"Name": "cloudwatch",
"region": "eu-central-1",
"log_group_name": "<log_group_name>",
"log_stream_prefix": "${LOG_PREFIX}/",
"log_key": "log"
}
},
"portMappings": [
{
"protocol": "tcp",
"containerPort": 5000
}
],
"image": "<id>.dkr.ecr.eu-central-1.amazonaws.com/<container>:<tag>",
"essential": true,
"name": "application"
}
],
"family": "<family>",
"executionRoleArn": "arn:aws:iam::<id>:role/ecsTaskExecutionRole",
"cpu": "256",
"memory": "1024",
"networkMode": "awsvpc",
"requiresCompatibilities": [
"FARGATE"
]
}
Concerning n°3: IAM permissions should be fine, because when I switch my container back to the « awslogs » driver (instead of « awsfirelens »), it writes logs correctly to CloudWatch! Moreover, my EC2 and Fargate task definitions share the same task execution IAM role, and one works while the other does not.
It seems the issue comes from network mode "awsvpc". Please find the "log_router" container logs below.
With EC2 task ("application" container log is written to Cloudwatch)
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
AWS for Fluent Bit Container Image Version 2.2.0
tput: No value for $TERM and no -T specified
Fluent Bit v1.3.9
Copyright (C) Treasure Data
[2020/03/10 07:11:29] [ info] [storage] version=1.0.1, initializing...
[2020/03/10 07:11:29] [ info] [storage] in-memory
[2020/03/10 07:11:29] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/03/10 07:11:29] [ info] [engine] started (pid=1)
[2020/03/10 07:11:29] [ info] [in_fw] listening on unix:///var/run/fluent.sock
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_group = 'pys-flask-dev-log'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_stream_prefix = 'ecs/'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_stream = ''\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter region = 'eu-central-1'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_key = 'log'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter role_arn = ''\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter auto_create_group = 'false'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter endpoint = ''\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter credentials_endpoint = \n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_format = ''\n"
[2020/03/10 07:11:29] [ info] [in_fw] binding 0.0.0.0:24224
[2020/03/10 07:11:29] [ info] [in_tcp] binding 127.0.0.1:8877
[2020/03/10 07:11:29] [ info] [sp] stream processor started
[engine] caught signal (SIGTERM)
[2020/03/10 07:11:48] [ info] [input] pausing forward.0
[2020/03/10 07:11:48] [ info] [input] pausing forward.1
[2020/03/10 07:11:48] [ info] [input] pausing tcp.2
[2020/03/10 07:12:08] [ warn] [engine] service will stop in 5 seconds
[2020/03/10 07:12:12] [ info] [engine] service stopped
[2020/03/10 07:12:12] [ info] [input] pausing forward.0
[2020/03/10 07:12:12] [ info] [input] pausing forward.1
[2020/03/10 07:12:12] [ info] [input] pausing tcp.2
With EC2 task and networkMode "awsvpc" ("application" container log is NOT written to Cloudwatch)
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
AWS for Fluent Bit Container Image Version 2.2.0
tput: No value for $TERM and no -T specified
Fluent Bit v1.3.9
Copyright (C) Treasure Data
[2020/03/10 07:11:29] [ info] [storage] version=1.0.1, initializing...
[2020/03/10 07:11:29] [ info] [storage] in-memory
[2020/03/10 07:11:29] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/03/10 07:11:29] [ info] [engine] started (pid=1)
[2020/03/10 07:11:29] [ info] [in_fw] listening on unix:///var/run/fluent.sock
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_group = 'pys-flask-dev-log'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_stream_prefix = 'ecs/'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_stream = ''\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter region = 'eu-central-1'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_key = 'log'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter role_arn = ''\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter auto_create_group = 'false'\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter endpoint = ''\n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter credentials_endpoint = \n"
time="2020-03-10T07:11:29Z" level=info msg="[cloudwatch 0] plugin parameter log_format = ''\n"
[2020/03/10 07:11:29] [ info] [in_fw] binding 0.0.0.0:24224
[2020/03/10 07:11:29] [ info] [in_tcp] binding 127.0.0.1:8877
[2020/03/10 07:11:29] [ info] [sp] stream processor started
[engine] caught signal (SIGTERM)
[2020/03/10 07:11:48] [ info] [input] pausing forward.0
[2020/03/10 07:11:48] [ info] [input] pausing forward.1
[2020/03/10 07:11:48] [ info] [input] pausing tcp.2
[2020/03/10 07:12:08] [ warn] [engine] service will stop in 5 seconds
[2020/03/10 07:12:12] [ info] [engine] service stopped
[2020/03/10 07:12:12] [ info] [input] pausing forward.0
[2020/03/10 07:12:12] [ info] [input] pausing forward.1
[2020/03/10 07:12:12] [ info] [input] pausing tcp.2
The only difference seems to be:
[in_fw] binding 0.0.0.0:24224 -> [in_fw] binding 127.0.0.1:24224
Seems like it is the HOST injected by ECS that changes? Seems normal.
With Fargate task ("application" container log is also NOT written to Cloudwatch)
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
AWS for Fluent Bit Container Image Version 2.2.0
tput: No value for $TERM and no -T specified
Fluent Bit v1.3.9
Copyright (C) Treasure Data
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter log_group = 'pys-flask-dev-log'\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter log_stream_prefix = 'ecs/'\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter log_stream = ''\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter region = 'eu-central-1'\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter log_key = 'log'\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter role_arn = ''\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter auto_create_group = 'false'\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter endpoint = ''\n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter credentials_endpoint = \n"
time="2020-03-10T07:20:37Z" level=info msg="[cloudwatch 0] plugin parameter log_format = ''\n"
[2020/03/10 07:20:37] [ info] [storage] version=1.0.1, initializing...
[2020/03/10 07:20:37] [ info] [storage] in-memory
[2020/03/10 07:20:37] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2020/03/10 07:20:37] [ info] [engine] started (pid=1)
[2020/03/10 07:20:37] [ info] [in_fw] listening on unix:///var/run/fluent.sock
[2020/03/10 07:20:37] [ info] [in_fw] binding 127.0.0.1:24224
[2020/03/10 07:20:37] [ info] [in_tcp] binding 127.0.0.1:8877
[2020/03/10 07:20:37] [ info] [sp] stream processor started
[engine] caught signal (SIGTERM)
[2020/03/10 07:21:10] [ info] [input] pausing forward.0
[2020/03/10 07:21:10] [ info] [input] pausing forward.1
[2020/03/10 07:21:10] [ info] [input] pausing tcp.2
time="2020-03-10T07:21:11Z" level=error msg="[cloudwatch 0] NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\n"
[2020/03/10 07:21:11] [ warn] [engine] service will stop in 5 seconds
[2020/03/10 07:21:11] [ warn] [engine] failed to flush chunk '1-1583824870.364332508.flb', retry in 11 seconds: task_id=0, input=forward.0 > output=cloudwatch.1
[2020/03/10 07:21:15] [ info] [engine] service stopped
[2020/03/10 07:21:15] [ info] [input] pausing forward.0
[2020/03/10 07:21:15] [ info] [input] pausing forward.1
[2020/03/10 07:21:15] [ info] [input] pausing tcp.2
Got this error message:
time="2020-03-10T07:21:11Z" level=error msg="[cloudwatch 0] NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors\n"
I really don't understand what is happening.
@etiennecaldichoury The old awslogs driver uses the Task Execution Role (which is for stuff managed by us) and FireLens uses the Task Role (which is for the containers in your task- and the Firelens side-car is in your task).
I don't see a Task Role in your Task Definition; I suspect that's the issue.
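To make that check concrete, here is a small Python sketch that flags a task definition which routes logs through FireLens but has no Task Role. The function name `firelens_missing_task_role` and the trimmed-down task definitions are illustrative assumptions; only the `awsfirelens` log driver and the `taskRoleArn` / `executionRoleArn` field names come from this thread.

```python
def firelens_missing_task_role(task_definition: dict) -> bool:
    """Return True if a container uses awsfirelens but no taskRoleArn is set.

    Hypothetical helper: it only inspects the task definition document and
    does not verify what the role is actually allowed to do.
    """
    uses_firelens = any(
        container.get("logConfiguration", {}).get("logDriver") == "awsfirelens"
        for container in task_definition.get("containerDefinitions", [])
    )
    return uses_firelens and not task_definition.get("taskRoleArn")


# Trimmed-down task definitions, for illustration only.
without_role = {
    "executionRoleArn": "arn:aws:iam::<id>:role/ecsTaskExecutionRole",
    "containerDefinitions": [
        {"name": "log_router", "logConfiguration": {"logDriver": "awslogs"}},
        {"name": "application", "logConfiguration": {"logDriver": "awsfirelens"}},
    ],
}
with_role = dict(without_role, taskRoleArn="arn:aws:iam::<id>:role/<task-role>")

print(firelens_missing_task_role(without_role))  # True
print(firelens_missing_task_role(with_role))     # False
```

The Task Role would also need permissions to write to CloudWatch Logs; the exact policy is not covered here.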
@PettitWesley thanks a lot I'll try today
@PettitWesley working perfectly...! Now I know "Task Execution" and "Task" roles are different things :D
Thanks a lot again
Closing this issue with the FireLens solution provided. Let us know if that solution is not valid by reopening the issue.
@srrengar Even though the FireLens solution works, it would be more convenient to override logConfiguration variables at the task level. That would make it a lot easier to have Scheduled Tasks route log traffic to a separate log group or prefix in CloudWatch Logs, if necessary. The main reason I didn't like the FireLens solution is that the log message in CloudWatch Logs is not the clean, simple log output: it is a full JSON string for every stdout message.
{
"container_id": "xxxxxx",
"container_name": "/ecs-xxxx-1-xxxx-d2d2f1a6b5ca8abd1900",
"ec2_instance_id": "i-xxxx",
"ecs_cluster": "xxxx",
"ecs_task_arn": "arn:aws:ecs:us-east-1:xxxx:task/xxxx",
"ecs_task_definition": "xxxx:1",
"log": "________Log message here________",
"source": "stdout"
}
@lafraia You can make FireLens give you the simple log output. Add one extra option to your logConfiguration: log_key set to log. Then it will send only the value of the log key.
https://github.com/aws/amazon-cloudwatch-logs-for-fluent-bit#plugin-options
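As a small sketch of that change, the Python snippet below adds the `log_key` option to an existing `logConfiguration` dictionary before the task definition is registered. The helper name `with_log_key` and the sample values are assumptions for illustration; the option name `log_key` and the value `log` are the ones mentioned above.

```python
def with_log_key(log_configuration: dict, key: str = "log") -> dict:
    """Return a copy of a logConfiguration with the log_key option set.

    Hypothetical helper: the input dictionary is left unmodified.
    """
    options = dict(log_configuration.get("options", {}))
    options["log_key"] = key
    return {**log_configuration, "options": options}


# Sample configuration with placeholder values.
original = {
    "logDriver": "awsfirelens",
    "options": {
        "Name": "cloudwatch",
        "region": "us-east-1",
        "log_group_name": "<log_group_name>",
    },
}
updated = with_log_key(original)
print(updated["options"]["log_key"])  # log
```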