Bull: How to process multiple jobs in a batch?

Created on 9 Aug 2016 · 11 Comments · Source: OptimalBits/bull

Assume the queue holds many jobs that each insert data into a database. Processing those jobs one by one is not very efficient, so I would like to grab multiple jobs and insert their data in one go using the database's bulk insert command. How can I do that?

Is there any way to fetch all jobs (or a given number of jobs) from the queue, process them, and then mark them as done? Thanks.

All 11 comments

Sorry, but I do not understand the question. Can you rephrase it or provide some code to clarify?

@manast The question is as clear as it gets. He wants to pick up jobs in batches so that he can insert them into the database at once using a bulk insert.

@twang2218 What did you end up doing for this? I'm looking at doing almost exactly the same — I want to pull off n jobs from a queue, and do a batch write (in my case to a REST API) for the data in all n of those jobs.

Can't you use a parent queue to monitor a child queue? Another option would be to add all the n tasks into one job?
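To illustrate the second option (putting n tasks into one job), here is a minimal sketch. It assumes the producer already holds all the records in one place, which may not apply to every setup. `chunk` is a hypothetical helper, not part of Bull; each resulting batch could then be passed as the data of a single `queue.add()` call.

```javascript
// Hypothetical helper (not part of Bull): split an array of records
// into batches of at most `size` items each.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const rows = [{ id: 1 }, { id: 2 }, { id: 3 }, { id: 4 }, { id: 5 }];
const batches = chunk(rows, 2);
console.log(batches.length); // 3

// Assumed wiring: one job per batch, so the worker can bulk insert job.data.rows.
// for (const batch of batches) queue.add({ rows: batch });
```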

@grantcarthew The tasks are coming into our service from all over the place (mobile users, web threads, etc.) so I can't really put them all into one job… until it's time to batch them out to this other service (where we're attempting to respect request rate limits). So yeah, a parent-child queue situation is where I think I've landed.

I think I'm just having a hard time getting my head around how and when .process() is called on the child queue wrt the parent queue. Should it be done inside the parent's process handler? Can it be called over and over, along with pause/resuming periodically?

I don't know the Bull queue well, sorry. I just watch this queue for help and ideas for my own queue: https://github.com/grantcarthew/node-rethinkdb-job-queue

@woodardj The process function can only be called once; you will get an exception otherwise. The reason is that the call just defines the process function for the queue, and only one can be defined at any given time.
Not sure I understand your requirement, but is it that you want to be able to call queue.add() with an array of jobs so that they are placed in the queue faster than by calling add for every job?

@manast The opposite, sort of — I don't need to work every job as it's added, but I want to grab n jobs off at a time (or one at a time, and I'll just do that n times).

Put another way: Multiple threads/sources are calling .add on a queue; instead of using .process to register a handler for each one, I want to pop jobs off the queue manually.

I see a few undocumented methods in the code, some outside the "private" section, but I'm not sure which ones would be most correct, and most safe from deprecation. I've managed to get it working 90% by using an array in memory, but that loses all of the bull features I love on my other queue (atomicity, persistence, etc.)
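One possible way to batch without giving up persistence, sketched under assumptions rather than taken from this thread: register a single handler with a concurrency of n, and let each invocation hand its job data to a small in-process batcher whose promise settles only after the bulk write has finished. The jobs stay active in Bull until then, so a failed write fails every job in that batch. `Batcher`, `maxSize`, `maxWaitMs` and `db.bulkInsert` are hypothetical names; only `queue.process(concurrency, handler)` is Bull's own call.

```javascript
// Hypothetical helper (not part of Bull): collects items and flushes them together.
class Batcher {
  constructor(flushFn, { maxSize = 50, maxWaitMs = 100 } = {}) {
    this.flushFn = flushFn;     // async (items) => void, e.g. a bulk insert
    this.maxSize = maxSize;     // flush as soon as this many items are waiting
    this.maxWaitMs = maxWaitMs; // or after this long, whichever comes first
    this.pending = [];          // entries of { item, resolve, reject }
    this.timer = null;
  }

  // Returns a promise that settles once the batch containing `item` was flushed.
  add(item) {
    return new Promise((resolve, reject) => {
      this.pending.push({ item, resolve, reject });
      if (this.pending.length >= this.maxSize) {
        this.flush();
      } else if (!this.timer) {
        this.timer = setTimeout(() => this.flush(), this.maxWaitMs);
      }
    });
  }

  async flush() {
    clearTimeout(this.timer);
    this.timer = null;
    const batch = this.pending;
    this.pending = [];
    if (batch.length === 0) return;
    try {
      await this.flushFn(batch.map((entry) => entry.item));
      batch.forEach((entry) => entry.resolve());
    } catch (err) {
      // One failed bulk write rejects every job in the batch.
      batch.forEach((entry) => entry.reject(err));
    }
  }
}

// Assumed wiring with Bull (db.bulkInsert is a placeholder for your own code):
// const batcher = new Batcher((rows) => db.bulkInsert(rows), { maxSize: 50 });
// queue.process(50, (job) => batcher.add(job.data));
```

The handler concurrency should be at least `maxSize`, otherwise a batch can only fill up to the concurrency limit and will always wait for the timer. `maxWaitMs` should also stay well below the queue's lock duration so that waiting jobs are not treated as stalled.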

Just for reference, the same idea is discussed in the (currently open) issue #751 :)

He wants to run many processes concurrently. How can that be done?

Bull.js has a concurrency setting that runs jobs in parallel, but it is not efficient unless you load the processor from a separate file. In that case each concurrent processor runs in its own child process; otherwise they all share a single thread and every concurrent job becomes much slower.
