I created an EC2 instance, intending it to join my non-default cluster. However, it joined the default cluster instead, so I deregistered it. Changing /etc/ecs/ecs.config did not make it re-register with the new cluster.
I followed the instructions on http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-agent-install.html to start/restart the container, but it exits immediately with this error:
2015-07-20T21:08:01Z [INFO] Starting Agent: Amazon ECS Agent - v1.3.0 (097e4af)
2015-07-20T21:08:01Z [INFO] Loading configuration
2015-07-20T21:08:01Z [INFO] Checkpointing is enabled. Attempting to load state
2015-07-20T21:08:01Z [INFO] Loading state! module="statemanager"
2015-07-20T21:08:01Z [CRITICAL] Data mismatch; saved cluster 'default' does not match configured cluster 'dev-consul'. Perhaps you want to delete the configured checkpoint file?
Also, how can I prevent the EC2 instance from auto joining the default cluster in the future?
When you use the ECS Optimized AMI, the agent will automatically launch with the instance. Userdata does execute before the Agent is launched, however, so you can use that to configure the cluster before it runs as shown in step 10 of the container instance launch docs.
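A minimal sketch of the user-data approach described above, assuming the cluster name dev-consul from the log earlier in this thread. set_ecs_cluster is a hypothetical helper name, not part of the ECS tooling; it writes the ECS_CLUSTER setting into /etc/ecs/ecs.config, which the agent reads when it starts. The optional second argument exists only so the function can be tried outside an instance.

```shell
#!/bin/bash
# Sketch only: set the cluster the ECS agent should join.
# Assumption: "dev-consul" (used in the call shown below) is the cluster
# name from the log in this thread; substitute your own.

set_ecs_cluster() {
    local cluster="$1"
    # Default agent config path on the ECS-optimized AMI.
    local config_file="${2:-/etc/ecs/ecs.config}"

    mkdir -p "$(dirname "$config_file")"

    # Drop any earlier ECS_CLUSTER line so the file holds exactly one value.
    if [ -f "$config_file" ]; then
        grep -v '^ECS_CLUSTER=' "$config_file" > "$config_file.tmp" || true
        mv "$config_file.tmp" "$config_file"
    fi

    echo "ECS_CLUSTER=${cluster}" >> "$config_file"
}
```

In user data, the script would end with a call such as `set_ecs_cluster dev-consul`. User data runs as root on first boot, so no sudo is needed there.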
Once it has joined a cluster, you can cause it to join a new cluster by stopping it, deleting its checkpoint file (located at /var/lib/ecs/data/* by default), and launching it again with the new configuration. It's important to note that all tasks currently managed by it should be stopped prior to doing this as the Agent will be unable to track past tasks.
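The stop, delete, relaunch sequence above could look roughly like the following. clear_agent_checkpoint is a hypothetical helper name; the default directory is the one mentioned in this thread. How you stop and start the agent depends on how it was launched, so those commands appear only as comments and are assumptions about your setup.

```shell
#!/bin/bash
# Sketch only: remove the agent's saved state so it can register
# with a different cluster. Stop all tasks and the agent first.

clear_agent_checkpoint() {
    # Default checkpoint location named in this thread; pass another
    # path if the agent was started with a different data directory.
    local data_dir="${1:-/var/lib/ecs/data}"

    if [ ! -d "$data_dir" ]; then
        echo "no data directory at $data_dir, nothing to clear" >&2
        return 0
    fi

    # Delete everything inside the directory but keep the directory itself.
    find "$data_dir" -mindepth 1 -delete
}

# Intended order on an instance, run as root (commands are assumptions):
#   stop ecs                  # or: systemctl stop ecs / docker stop ecs-agent
#   clear_agent_checkpoint
#   start ecs                 # or: systemctl start ecs
```

Update ECS_CLUSTER in /etc/ecs/ecs.config before starting the agent again, otherwise it will re-register with the old cluster.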
Thanks for clarifying, exactly what I was looking for!
@euank After doing that, my instances are associated with 2 clusters at the same time.
@benweet For better or worse, that's as designed. However, tasks shouldn't be placed on the 'stale' ContainerInstance, because its agentConnected attribute is false.
You can use the deregister-container-instance API to deregister the stale ContainerInstance.
The Agent will not make that call itself currently and the ECS service will not handle deregistration until the associated EC2 instance terminates.
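With the AWS CLI, the deregistration call mentioned above might be wrapped like this. deregister_stale_instance is a hypothetical helper name, and the cluster name and container instance ID in the usage line are placeholders. It assumes the AWS CLI is installed and has credentials allowed to call ecs:DeregisterContainerInstance.

```shell
#!/bin/bash
# Sketch only: remove a stale ContainerInstance record from its old cluster.

deregister_stale_instance() {
    local cluster="$1"    # cluster that still lists the stale instance
    local instance="$2"   # container instance ID or full ARN

    aws ecs deregister-container-instance \
        --cluster "$cluster" \
        --container-instance "$instance"
}
```

For example, `deregister_stale_instance default <container-instance-id>`. The IDs registered in a cluster can be listed with `aws ecs list-container-instances --cluster default`.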
Hi,
Does the same apply to a non-Amazon Linux AMI? I installed Docker on a CentOS 7 machine and created an AMI from it. Now, when I use that AMI to launch my ECS cluster and start the amazon-ecs-agent Docker container from a user-data script, I get the same error.
@rupamroy Instructions for running the ECS agent on other Linux instances are here. @euank's comments are accurate, though the paths to the files may be different depending on how you're starting the agent. If you continue running into problems, please share the commands you use to run the agent (though please remove all credentials!) and we can help you.
What is the correct procedure for setting the target cluster when launching an ECS-optimized instance? Currently it always auto-registers to the default cluster.
Instructions for attaching an ECS-optimized instance to a cluster other than default are available here.
This really should be added to the Troubleshooting docs (http://docs.aws.amazon.com/AmazonECS/latest/developerguide/troubleshooting.html) and to http://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-agent-config.html, because it breaks if you merely launch the instance with the ECS AMI.
2017-05-19T21:42:46Z [INFO] Checkpointing is enabled. Attempting to load state
2017-05-19T21:42:46Z [INFO] Loading state! module="statemanager"
2017-05-19T21:42:46Z [CRITICAL] Data mismatch; saved cluster 'default' does not match configured cluster 'xxx'. Perhaps you want to delete the configured checkpoint file?
As the message suggests, remove the checkpoint file (sudo rm /var/lib/ecs/data/agent.db) and restart the ecs-agent.