Bull: Recommended approach for concurrency

Created on 29 Aug 2019  ·  7 Comments  ·  Source: OptimalBits/bull

Description

Hi all. Looking for a recommended approach that meets the following requirements:

  • Handle many job types (50 for the sake of this example)
  • Avoid more than 1 job running on a single worker instance at a given time (jobs vary in complexity, and workers are potentially CPU-bound)
  • Scale up horizontally by adding workers if the message queue fills up; that's the approach to concurrency I'd like to take.

Desired driving equivalent: 1 road with 1 lane

I've tried two approaches:

Single queue with named jobs for each job type

The problem here is that concurrency stacks across all job types (see https://github.com/OptimalBits/bull/issues/1113), so concurrency ends up being 50, and continues to increase for every new job type added, bogging down the worker.

Driving equivalent: 1 road with 50 lanes

Queue for each job type

While this prevents multiple jobs of the same type from running simultaneously, if many jobs of varying types (some more computationally expensive than others) are submitted at the same time, the worker gets bogged down in that scenario too, which ends up behaving quite similarly to the above solution.

Driving equivalent: 50 roads with 1 lane


The only approach I've yet to try would consist of a single queue and a single process function that contains a big switch-case to run the correct job function. Can anyone comment on a better approach they've used?

question

All 7 comments

Hi! Your approach is totally fine: you need one queue for each job type and a switch-case to select the handler. You can also take advantage of named processors (https://github.com/OptimalBits/bull/blob/develop/REFERENCE.md#queueprocess); it doesn't increase the concurrency setting, but your variant with a switch block is more transparent.

Hi Stan :)

The named processors approach was increasing the concurrency (concurrency++ for each unique named job). Same issue as noted in https://github.com/OptimalBits/bull/issues/1113 and also in the docs:

However, if you define multiple named process functions in one Queue, the defined concurrency for each process function stacks up for the Queue.

So for a single queue with 50 named jobs, each with concurrency set to 1, total concurrency ends up being 50, making that approach not feasible.
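To make the arithmetic concrete, here is a minimal sketch of the stacking behaviour described in the docs quote above. It is only a model of that description, not Bull's internals; `totalConcurrency` is a hypothetical helper introduced for illustration.

```javascript
// Model of the documented behaviour: the effective concurrency of a queue
// is the sum of the concurrency values of its named process functions.
// `totalConcurrency` is a hypothetical helper, not part of Bull.
function totalConcurrency(perProcessorConcurrency) {
  return perProcessorConcurrency.reduce((sum, c) => sum + c, 0);
}

// 50 named processors, each registered with concurrency 1.
const fiftyNamedProcessors = new Array(50).fill(1);

console.log(totalConcurrency(fiftyNamedProcessors)); // 50
```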

And a queue for each job type also doesn't work given what I've described above, where if many jobs of different types are submitted at the same time, they will run in parallel since the queues are independent. This is not my desired behaviour since with 50+ queues, a worker could theoretically end up processing 50 jobs concurrently (1 for each job type).

So it seems the best approach then is a single queue _without_ named processors, with a single call to process, and just a big switch-case to select the handler.

Ross, I thought there was a special check if you add named processors with default concurrency (1), but it looks like you're right 😱

Not sure if that's a bug or a design limitation. #1113 seems to indicate it's a design limitation with Bull 3.x. Not sure if you see it being fixed in 3.x or not, since it may be considered a breaking change.

Bull 4.x concurrency being promoted to a queue-level option is something I'm looking forward to.

The design of named processors is not perfect indeed. I was also confused by this feature some time ago (#1334).

@rosslavery I think a switch-case, or a mapping object that maps the job types to their process functions, is a perfectly fine solution.
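A minimal sketch of the mapping-object variant might look like the following. It assumes producers store the job type in `job.data.type`; that field, the handler names, and the helpers `resolveHandler`, `dispatch` and `attach` are all hypothetical. The only Bull call used is `queue.process(concurrency, processor)`.

```javascript
// Hypothetical handlers; the names and payload shape are illustrative only.
const handlers = {
  resizeImage: async (job) => `resized ${job.data.payload}`,
  sendEmail: async (job) => `emailed ${job.data.payload}`,
};

// Look up the handler for a job type, failing loudly on unknown types.
function resolveHandler(table, type) {
  if (!Object.prototype.hasOwnProperty.call(table, type)) {
    throw new Error(`No handler registered for job type "${type}"`);
  }
  return table[type];
}

// The single processor for the whole queue.
// Assumption: the job type is carried in job.data.type.
async function dispatch(job) {
  return resolveHandler(handlers, job.data.type)(job);
}

// Register exactly one unnamed processor with concurrency 1, so a worker
// runs at most one job at a time regardless of how many job types exist.
// `queue` is expected to be a Bull queue instance.
function attach(queue) {
  queue.process(1, dispatch);
}
```

Adding a new job type then only means adding an entry to `handlers`; the queue's concurrency stays at 1 because `process` is called once.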

For future Googlers running Bull 3.X -- the approach I took was similar to the idea in https://github.com/OptimalBits/bull/issues/1113#issuecomment-440706459 . Used named jobs but set a concurrency of 1 for the first job type, and concurrency of 0 for the remaining job types, resulting in a total concurrency of 1 for the queue.
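A rough sketch of that workaround, for illustration only: it registers every named processor through `queue.process(name, concurrency, processor)`, giving the first one a concurrency of 1 and the rest 0. Whether Bull accepts a concurrency of 0 is taken from the comment above and has not been verified here across 3.x releases; `registerWithSingleSlot` and the shape of `namedHandlers` are hypothetical.

```javascript
// Hypothetical helper: `namedHandlers` maps job names to processor functions.
// The first name gets concurrency 1, all others 0, so the stacked total is 1.
// Assumption (from the comment above): Bull 3.x accepts 0 here.
function registerWithSingleSlot(queue, namedHandlers) {
  Object.keys(namedHandlers).forEach((name, index) => {
    const concurrency = index === 0 ? 1 : 0;
    queue.process(name, concurrency, namedHandlers[name]);
  });
}
```

Compared with a single dispatching processor, this keeps named jobs (and their names in the UI or logs) at the cost of relying on the stacking behaviour discussed in #1113.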
