FaaS: Proof of concept: reliable asynchronous processing

Created on 7 Jul 2017 · 11 Comments · Source: openfaas/faas

Asynchronous processing should be possible for long-running functions.

Must have:

  • Work can be accepted through a new route or a header

    • Via route: /async/function/<function_name>
    • Via header: X-etc: async
  • Work is accepted immediately and a 202 Accepted is returned. This should be handed off to a queue.

  • One or more (scalable) asynchronous workers read from a queue and call functions

    • Should dequeue item atomically
    • Upon failure another worker should pick up the item
    • For initial version - HTTP should be used by worker to call function just like the gateway does. Timeout will depend on the configuration of the function.
  • Prometheus metrics to be logged for work queued/processed/outstanding
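
The "must have" items above can be sketched roughly as follows. This is a minimal in-memory illustration, not the project's implementation: `accept`, `work_once`, `WorkItem` and `Metrics` are hypothetical names, `X-etc` is the placeholder header from the list above, `queue.Queue` stands in for a real broker, the `invoke` callable stands in for the HTTP call to the function, and plain counters stand in for Prometheus metrics. The retry limit is also an assumption.

```python
import queue
from dataclasses import dataclass

ASYNC_PREFIX = "/async/function/"  # route taken from the proposal above


@dataclass
class WorkItem:
    function_name: str
    body: bytes
    attempts: int = 0


class Metrics:
    """Plain counters standing in for Prometheus metrics (assumption)."""

    def __init__(self):
        self.queued = 0
        self.processed = 0
        self.failed = 0

    @property
    def outstanding(self):
        return self.queued - self.processed - self.failed


def accept(path, headers, body, work_queue, metrics):
    """Gateway side: enqueue async work and answer 202 without calling the function."""
    is_async = path.startswith(ASYNC_PREFIX) or headers.get("X-etc") == "async"
    if not is_async:
        return 400  # the synchronous path is out of scope for this sketch
    function_name = path.rstrip("/").rsplit("/", 1)[-1]
    work_queue.put(WorkItem(function_name, body))
    metrics.queued += 1
    return 202


def work_once(work_queue, invoke, metrics, max_attempts=3):
    """Worker side: take one item, call the function, re-queue it on failure."""
    try:
        item = work_queue.get_nowait()  # each item is handed to exactly one caller
    except queue.Empty:
        return False
    try:
        invoke(item.function_name, item.body)  # in practice an HTTP call to the function
        metrics.processed += 1
    except Exception:
        item.attempts += 1
        if item.attempts < max_attempts:
            work_queue.put(item)  # another worker can pick it up later
        else:
            metrics.failed += 1  # give up after max_attempts
    return True
```

A worker process would call `work_once` in a loop. With a real broker, the re-queue step would normally be replaced by the broker's own acknowledgement and redelivery mechanism.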

Could have:

  • Watchdog configuration to state whether async/sync is supported
  • Validation in gateway for invocation method
  • Retry logic on failure

Nice to have:

  • Additional logging beyond docker service logs
  • Callback URL could be specified via header or query-string - this could be called by the framework upon completion
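
As a rough illustration of the callback idea, the sketch below resolves a callback URL from either a header or the query-string. The header name `X-Callback-Url` and the query parameter `callback` are assumptions made for this example; the proposal does not name them.

```python
from urllib.parse import parse_qs, urlsplit


def callback_url(request_url, headers):
    """Return the callback URL from the header if set, else from the query-string, else None."""
    header_value = headers.get("X-Callback-Url")  # hypothetical header name
    if header_value:
        return header_value
    values = parse_qs(urlsplit(request_url).query).get("callback")  # hypothetical parameter
    return values[0] if values else None
```

Here `headers` is a plain dict, so the lookup is case-sensitive; a real HTTP framework would normally provide case-insensitive header access.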

Notes:

Have looked into Kafka - design looks overly complex for task at hand.
NATS queuing is not resilient - but NATS Streaming may be suitable.


All 11 comments

If you need a beta tester for asynchronous processing, I am in!

Thanks @Tofull - I've started a quick proof of concept with NATS Streaming.

Have you started using FaaS or creating functions already? Do you have async workloads ready for testing?

Amazing! You rock! :)

I used FaaS to deploy some functions that my machine learning experts made with magic.

FaaS works great for prediction service as it is a synchronous task.

As some processing functions need time (training our models), we would use async processing and we already have a workflow ready for testing.

That sounds like a great use-case. Is there anything you can share on a blog or on Twitter?

Once our work becomes more stable, we will be able to talk about it and mention FaaS as the solution we chose, in a tweet or in some of the presentations we will give (because we are working with French industries in the aerospace :artificial_satellite: and :earth_americas: earth observation fields).
FaaS with async should completely meet our needs here at the Institute of Technology Saint Exupéry. :smiley:

It would probably help to map out the use cases for async processing first, as I can think of a couple of different ones that usually require different guarantees and metrics. Does anyone know which use cases the users of this library would favor? Knowing that would also make it easier to choose the right queueing option. Kafka, for example, might seem overly complex (and it is complex), but it has its uses; I generally wouldn't choose it for simple response queues like the ones here. NATS is a nice general-purpose option, but I fear it doesn't cover every case. There is also the possibility of a mixed solution: simple pub/sub with a separate log database. Usually you need recent items, which are in memory, but this way you don't lose old items and can support cases like "my car was offline for 15 minutes because of a bad internet connection, but it can still get its response". That gives quite a lot of flexibility and isn't that hard to implement.
The problem with using queuing systems like NATS is that you end up with a ton of queues piling up. While this is completely acceptable in your infrastructure for workers and services, where the number is quite limited, it isn't really feasible for response queues. MQTT suffers from similar problems when load gets high. I have seen a couple of implementations that offload MQTT queues to databases, though.

As you have mentioned - the various queue implementations available have their own pros/cons. Ideally it should be easy to swap between different "queue" providers or implementations.

This initial branch / work is based around a NATS Streaming queue, which does have persistence and resilience.

You can see the progress here:

https://github.com/alexellis/faas/tree/async_nats

Guide to testing the branch:

https://gist.github.com/alexellis/62dad83b11890962ba49042afe258bb1
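
The idea of swappable "queue" providers mentioned above could look something like the sketch below. It is only an illustration under assumed names: `QueueProvider`, `InMemoryProvider` and `drain` are hypothetical and do not come from the branch. A NATS Streaming or Kafka provider would implement the same two methods.

```python
from collections import deque
from typing import Callable, Optional, Protocol


class QueueProvider(Protocol):
    """Minimal contract a queue backend would need to satisfy (hypothetical)."""

    def publish(self, payload: bytes) -> None: ...

    def next(self) -> Optional[bytes]: ...


class InMemoryProvider:
    """Toy provider for local testing; offers no persistence or resilience."""

    def __init__(self) -> None:
        self._items = deque()

    def publish(self, payload: bytes) -> None:
        self._items.append(payload)

    def next(self) -> Optional[bytes]:
        return self._items.popleft() if self._items else None


def drain(provider: QueueProvider, handler: Callable[[bytes], None]) -> int:
    """Pass every queued payload to handler and return how many were handled."""
    count = 0
    while (payload := provider.next()) is not None:
        handler(payload)
        count += 1
    return count
```

Because the worker code depends only on `QueueProvider`, changing the backend would not require changes to the code that calls `drain`.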

Ah, I must have missed that idea - that sounds like the best possible outcome, yes :)

Hey @Tofull do you have a draft or published blog yet?

Please see changes in #131

Work merged into master and released in 0.6.3 https://github.com/alexellis/faas/releases/tag/0.6.2

If anyone wants to start on a Kafka implementation that would be great - otherwise let's spend time using the async code and doing edge-case testing.
