This issue, related to stateful services, complements issue #64 regarding ECS/EBS native integration. The proposal is to introduce a new type of ECS Service in which each task is allocated a unique identifier that persists even if the task dies and is replaced.
Background
ECS users have expressed the need to deploy stateful workloads on ECS. For example, in #64 , customers would like to have native integration between ECS and EBS so that ECS Tasks are automatically and dynamically attached to an EBS volume.
Stateless containerized applications have traditionally been deployed as services with 'fungible' tasks, meaning that the different instantiations of a single application are interchangeable. To deploy these workloads, ECS Services allow customers to run and maintain a specified number of instantiations of the same Task simultaneously.
Certain workloads, however, require each Task within a service to play a specific role. This is particularly true for some stateful workloads, in which specific Tasks play a special role such as ‘primary’ or ‘leader’.
Potential Feature Proposal
I am opening this issue to gather use-cases and +1s on a potential feature that would introduce ECS Services for stateful workloads, in which each Task gets assigned an identifier. In this potential scenario, if the Task dies and a replacement Task is started by ECS, the same identifier, volume, and Service Discovery / CloudMap name will be allocated to the new task.
As we research this potential feature, if this will be helpful to you please +1 and provide some details on any use-cases you may have.
This would create single-AZ-pinned tasks due to the EBS volume's residency, no? Multi-AZ failover would have to be architected independently of this at task provisioning time, even with host instances in multiple AZs?
@christopherhein some people seem to want pets
I have a use case for Task-associated EBS volumes on ECS, via AWS Batch. I hope it's relevant to your question.
I have Batch jobs that download some data (between 50GB and 1TB), process it, and upload the results. Each job requires sufficient storage space to download the data, and in some cases, the same amount again during processing.
This means I have to:
If instead I could specify that a task requires X GBs of attached EBS storage, and that was exposed via Batch's job definitions and overrides – exactly the same as CPU and RAM today – then I'd have an EBS volume of precisely the right size per job. Perfect!
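For context, a rough sketch of what the manual workaround looks like today (not the proposed feature), assuming the job lands on an EC2-backed container instance with IAM permissions for EC2 and access to instance metadata; the size and device name below are illustrative:

```python
import boto3
import requests

METADATA = "http://169.254.169.254/latest/meta-data"
instance_id = requests.get(f"{METADATA}/instance-id", timeout=2).text
az = requests.get(f"{METADATA}/placement/availability-zone", timeout=2).text

ec2 = boto3.client("ec2", region_name=az[:-1])  # e.g. "us-east-1a" -> "us-east-1"

# The size (GB) would ideally come from the job definition/override, like vCPU and memory do.
volume = ec2.create_volume(AvailabilityZone=az, Size=500, VolumeType="gp3")
ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])

# Attach to the instance this job landed on; the device name is illustrative.
ec2.attach_volume(VolumeId=volume["VolumeId"], InstanceId=instance_id, Device="/dev/xvdf")
ec2.get_waiter("volume_in_use").wait(VolumeIds=[volume["VolumeId"]])
# ...format/mount the device, run the job, then detach and delete the volume afterwards.
```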
(Thanks again for trying this approach to open feedback!)
+1
We run a custom database on ECS. Each customer is allocated a dedicated database. Depending on the customer's data size, the memory allocated to the task is different. One task + service represents a customer's database. For now, we pin tasks to the container instance where the customer's data is stored. The ability to associate an EBS volume with a task would help us bin-pack container instances and give us the agility we need to move from one instance to another.
+1
I was considering opening a related issue for assigning an ordinal (or identifier including an ordinal) for each task in a service. A use case is for clients within a task application to be able to identify themselves across restarts. Other platforms such as Cloud Foundry and Kubernetes expose this through environment variables and hostnames, respectively.
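For illustration, this is roughly how applications read that identity on those platforms today (Cloud Foundry's `CF_INSTANCE_INDEX` environment variable and the StatefulSet-style pod hostname); an ECS equivalent would presumably inject something similar:

```python
import os
import socket

def instance_ordinal() -> int:
    if "CF_INSTANCE_INDEX" in os.environ:                  # Cloud Foundry
        return int(os.environ["CF_INSTANCE_INDEX"])
    return int(socket.gethostname().rsplit("-", 1)[-1])    # StatefulSet hostname like "kafka-2"

print(f"I am replica {instance_ordinal()} and keep that identity across restarts")
```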
+1 this would allow running Kafka on ECS very easily. Even better, please combine that with Fargate so you get serverless and ops-less stateful containers
+1 We have an ECS cluster for microservices. Right now we are looking for an AWS-native solution to containerize our offline Spark SQL jobs. While EKS looks good for it, it would be nicer to use one technology to handle them all, so that we could focus development resources on one solution.
:+1:
We've got a few scenarios where our services depend on large-ish (100s-1000s of GB) read-only datasets. These can be pulled at task startup from s3 but this essentially nukes our ability to auto-scale (because we have to wait for the data to land before servicing requests).
A feature in which we could spin up a task to prime the pump (i.e. preload an EBS volume) so that future tasks could start and just mount the same EBS volume read-only would allow us to properly auto-scale. Failing that, a mechanism in which we pre-bake a set number of EBS volumes so that we have capacity for an auto-scale event would be less slick but just as effective, if it's easier to implement.
In any case, thanks for thinking about this!
@kevinkreiser do you mind my asking why the EBS volume needs to be pre-provisioned before you launch your Task? Would it work if you could provide a snapshot-id and ECS would create an EBS volume for that Task based on the snapshot you provided?
@simplesteph interesting that you mention Kafka in the context of Fargate. Do you see any challenges or concerns on Fargate with stateful apps like Kafka given that you don't have access to an actual virtual machine? For example, you currently can't run privileged containers on Fargate.
@Akramio yes, providing a snapshot id would be perfectly fine so long as that mechanism is relatively quick, even for snapshots that are several hundred gigabytes in size. Are you saying that this is already a possibility, or is this the most likely path of implementing the feature?
@kevinkreiser we're still exploring the best way to implement this, but it seems like not having to pre-create an EBS volume is easier (assuming EBS volumes can be created 'on the fly' from a snapshot fast enough).
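For anyone following along, a minimal sketch of what "create the volume from a snapshot at task start" amounts to with today's APIs, using illustrative IDs and assuming EC2-backed tasks:

```python
import boto3

ec2 = boto3.client("ec2")

# Create the task's volume from a pre-loaded snapshot, in the AZ the task was placed in.
volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",            # AZ of the container instance
    SnapshotId="snap-0123456789abcdef0",      # illustrative snapshot of the primed dataset
    VolumeType="gp3",
)
ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])
ec2.attach_volume(
    VolumeId=volume["VolumeId"],
    InstanceId="i-0123456789abcdef0",         # illustrative container instance
    Device="/dev/xvdg",
)
```

One caveat: blocks on a volume created from a snapshot are loaded lazily from S3, so the volume is available quickly but first reads of a large dataset can be slow unless Fast Snapshot Restore is enabled.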
Was just wondering if anyone had any ideas for a workaround for this? I want to deploy three tasks in a service...zookeeper-1, zookeeper-2, and zookeeper-3, attached to vol1, vol2, and vol3 respectively. Then, on a deploy, I would like zookeeper-1 to go down and the new instance of zookeeper-1 to attach to the _same_ vol1.
I did imagine running three services as a workaround, but couldn't find any way to do a rolling deploy of a group of services using CloudFormation or Terraform.
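One workaround sketch (not an official ECS feature): run zookeeper-1/2/3 as three separate services, tag vol1/vol2/vol3 accordingly, and have each task's entrypoint attach "its" volume to whatever instance it landed on. The tag names, device name, and `ZK_NODE_ID` variable below are illustrative, and the volume must live in the same AZ as the instance (the pinning concern raised earlier in this thread):

```python
import os

import boto3
import requests

NODE_ID = os.environ["ZK_NODE_ID"]  # "1", "2" or "3"; set per service in its task definition
METADATA = "http://169.254.169.254/latest/meta-data"
instance_id = requests.get(f"{METADATA}/instance-id", timeout=2).text

ec2 = boto3.client("ec2")

# Find the available volume tagged for this zookeeper node...
volumes = ec2.describe_volumes(Filters=[
    {"Name": "tag:zookeeper-node", "Values": [NODE_ID]},
    {"Name": "status", "Values": ["available"]},
])["Volumes"]

# ...and attach it to whichever instance this task was placed on.
ec2.attach_volume(VolumeId=volumes[0]["VolumeId"], InstanceId=instance_id, Device="/dev/xvdz")
ec2.get_waiter("volume_in_use").wait(VolumeIds=[volumes[0]["VolumeId"]])
```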
We have two use cases currently:
This would be really nice to have.
We have also encountered a use case in setting up stateful Prometheus instances.
EBS volumes that are mapped to the service and re-associated with the new EC2 instance upon recreation would be ideal.
+1 Our customer wants a 5x10 solution for its custom-made shopping cart application, with data on a SQL database and local volumes. ECS Fargate tasks scheduled for 5x10 with persistent, re-attachable EBS volumes would fit great and would even give the customer a cost reduction.
+1 we would like to containerize our Elasticsearch cluster and stop messing around with Ansible playbooks. We could achieve this with EKS, but it would be amazing to maintain only our ECS infrastructure.
We are planning a stateful application that hosts an LDAP service and stores the LDAP directory locally on each Task (we plan to use EBS, since the LDAP directory requires block-level storage, and to do block-level replication whenever changes are detected in the directory data on one of the running Tasks). To fulfill this requirement we want to make sure that if a task is terminated, the existing EBS volume hosting the data can reattach itself automatically and dynamically to the new task. How can we accomplish this?
Our first use case for this is running a pair of Prometheus monitoring hosts, each with a Thanos sidecar. The monitor host retains a sliding window of the last 2 hours of data, before it's compacted and shipped off to S3 by Thanos. In order to have 2 instances for HA, and not lose the last 2 hours of data every time we deploy, we currently have to run these as EC2 instances and have scripts to ensure host1 gets EBS volume1 reattached and host2 gets volume2 reattached (very similar to the Zookeeper case above).
We would love to run these as ECS Fargate tasks and just have it ensure that (a) there's only ever one instance using a given volume, (b) each instance sticks with its own volume across restarts, and (c) ideally we could do rolling deploys for zero downtime.
Our 2nd use case is running a singleton Thanos Compactor service, which requires a fairly large (100 GB) volume and currently forces us to go with EC2 to get an EBS volume that's large enough. If we could instead mount a large enough ephemeral volume, we could run this in Fargate as well.
We would need this to run a neo4j cluster on ECS. "core" cluster members elect a leader.
If ECS delivers this, it should solve the problem Kubernetes suffers from with StatefulSets, which require a defined headless service providing stable DNS entries for each expected pod in the StatefulSet. As a result, Kubernetes StatefulSets are unable to freely autoscale.
I'm less clear about the need for properly stateful workloads, but having EBS volumes that can persist between runs of a task (either by literally keeping the volume around or by snapshotting and restoring) would enable Fargate to be used in a number of cases where it can't currently. Most other problems can be worked around (e.g. by third-party service discovery mechanisms).
Use-case 1:
We want to run developer environments in fargate, using something like Theia or cdr-server (both basically provide something similar to visual studio code spaces). When the developer has finished working, you'd obviously want to shut down the container (for cost reasons), but you clearly need to keep their files around. The files don't necessarily need to stay on the volume, they could be moved off to S3 or EFS, but this needs to be handled in fargate (not the container image) to ensure they get persisted correctly in the event that the container dies prematurely (e.g. due to out of memory killer).
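As a stopgap, a minimal sketch of the "move the files off before the task goes away" idea, assuming an illustrative bucket name and workspace path: trap SIGTERM (sent when ECS stops the task) and upload the workspace to S3. It does not cover hard kills like the OOM killer, which is exactly why the platform would ideally own this.

```python
import os
import signal
import sys

import boto3

s3 = boto3.client("s3")
WORKSPACE = "/home/dev/workspace"          # illustrative workspace path
BUCKET = "example-dev-workspaces"          # illustrative bucket

def persist_and_exit(signum, frame):
    # Walk the workspace and copy every file to S3 before the task stops.
    for root, _dirs, files in os.walk(WORKSPACE):
        for name in files:
            path = os.path.join(root, name)
            key = os.path.relpath(path, WORKSPACE)
            s3.upload_file(path, BUCKET, f"{os.environ.get('DEV_USER', 'unknown')}/{key}")
    sys.exit(0)

signal.signal(signal.SIGTERM, persist_and_exit)
signal.pause()  # stand-in for the actual dev-server process (Theia, code-server, ...)
```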
Use-case 2:
I currently have an EC2 instance running a Nexus instance in a Docker container, storing about 3 TB of artifacts (on a bind-mounted EBS volume). If the container dies on the EC2 instance for some reason, Docker will start it again, all the files will still be on disk, and most likely everything will just work. Even if the EC2 instance disappears for some reason, I can just fire up another instance, attach the data volume, pull/run my container, and everything is good again. However, in Fargate all the data would be lost (even if you could go above the 20 GB storage limit), so you'd want to ensure it's stored on an external medium so it's available when a replacement task is fired up. Snapshotting or transferring to EFS is probably not desirable given the size of the volume, so in this case you'd probably want the EBS volume to just sit there waiting to be attached to the next task.
Any service that has "high-availability" via replication (such as Raft) and needs to persist to disk is going to require the ability for a volume to be "detached" when a container goes down and "reattached" when it comes back up. The recent EFS support could help with this. However, I still need a stable identifier for the container to implement a solution.
The ideal solution would allow me to automatically scale up ECS containers. They would get their own persistent volume (or identifier so I can use EFS) from a pool. If the pool is empty, one should be created. That way when I scale up, the container is attached to a volume from the pool. When they scale down or restart, the volume goes back into the pool.
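A minimal sketch of that pool idea with today's APIs, using illustrative tag names: claim a free volume in the task's AZ, or create one if the pool is empty. (This naive version has a race if two tasks claim at the same time; a real implementation would need locking or conditional tagging.)

```python
import boto3

ec2 = boto3.client("ec2")
POOL_FILTER = {"Name": "tag:volume-pool", "Values": ["my-service"]}   # illustrative pool tag

def claim_volume(az: str) -> str:
    """Return a volume id from the pool for this AZ, creating a new one if the pool is empty."""
    free = ec2.describe_volumes(Filters=[
        POOL_FILTER,
        {"Name": "tag:pool-state", "Values": ["free"]},
        {"Name": "availability-zone", "Values": [az]},
        {"Name": "status", "Values": ["available"]},
    ])["Volumes"]
    if free:
        vol_id = free[0]["VolumeId"]
    else:
        vol_id = ec2.create_volume(AvailabilityZone=az, Size=100, VolumeType="gp3")["VolumeId"]
        ec2.create_tags(Resources=[vol_id], Tags=[{"Key": "volume-pool", "Value": "my-service"}])
    ec2.create_tags(Resources=[vol_id], Tags=[{"Key": "pool-state", "Value": "in-use"}])
    return vol_id

def release_volume(vol_id: str) -> None:
    """Scale-in hook: mark the volume free again so the next task can claim it."""
    ec2.create_tags(Resources=[vol_id], Tags=[{"Key": "pool-state", "Value": "free"}])
```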
We want to run Redis stream consumers as an ECS service (Fargate). We need each worker process (container) to have a unique but persistent identity (consumer name) in order to handle worker crashes/restarts gracefully. Kubernetes StatefulSets serve this purpose well, but there appears to be no alternative or workaround in ECS and Fargate.
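To make the requirement concrete, a minimal redis-py sketch of why the stable consumer name matters: a worker that restarts with the same `CONSUMER_NAME` can drain the messages still pending under that name, whereas a fresh random name leaves the dead consumer's pending messages orphaned. The endpoint, stream, and group names are illustrative, and `CONSUMER_NAME` stands in for the stable task identifier requested here.

```python
import os

import redis

r = redis.Redis(host="example-redis", port=6379)            # illustrative endpoint
STREAM, GROUP = "orders", "workers"                          # illustrative stream/group names
CONSUMER_NAME = os.environ.get("CONSUMER_NAME", "worker-0")  # must survive task restarts

try:
    r.xgroup_create(STREAM, GROUP, id="$", mkstream=True)
except redis.ResponseError:
    pass  # group already exists

def handle(msg_id, fields):
    # ... process the message ...
    r.xack(STREAM, GROUP, msg_id)

# On startup, finish anything delivered to this consumer before a crash; the pending list
# is keyed by CONSUMER_NAME, which is why the identity has to be stable.
for _stream, messages in r.xreadgroup(GROUP, CONSUMER_NAME, {STREAM: "0"}, count=100):
    for msg_id, fields in messages:
        handle(msg_id, fields)

# Then consume new messages as usual.
while True:
    for _stream, messages in r.xreadgroup(GROUP, CONSUMER_NAME, {STREAM: ">"}, count=10, block=5000):
        for msg_id, fields in messages:
            handle(msg_id, fields)
```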