I suggest extending the current dependency feature to allow a job to depend on multiple jobs, not just one, so that it can be run only when all those jobs it depends on have succeeded.
Let me add that this opens up a world of possibilities in terms of the computation that can be expressed, including map-reduce.
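To sketch what I mean, a map-reduce fan-in might look roughly like this (purely illustrative; the argument name and list support are exactly what I'm proposing, not what RQ does today):

```python
from redis import Redis
from rq import Queue

def count_words(chunk):
    return len(chunk.split())

def merge_counts():
    # In practice this would fetch the map jobs' results by their ids.
    return 'merged'

q = Queue(connection=Redis())
chunks = ['some text', 'more text', 'yet more text']

# Fan out: one "map" job per input chunk.
map_jobs = [q.enqueue(count_words, chunk) for chunk in chunks]

# Fan in: the "reduce" job should only run after every map job has succeeded,
# which is exactly what a multi-job dependency would express.
q.enqueue(merge_counts, depends_on=map_jobs)
```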
I have one idea for how multi-job dependency could be accomplished, but I haven't thought very rigorously about the edge cases, especially about possible race conditions.
Thanks for the suggestion, it makes a ton of sense. I also like your implementation suggestion. There are obviously details to figure out here, but I think we should change the feature to support multi-job dependencies before releasing the feature to the public in 0.4.0. @selwin, as the spiritual father of the dependency feature, what do you think?
Sounds like a good idea to me.
I'd like to suggest a few changes to @jchia's implementation plan:
- Use a set instead of a list for dependency_ids.
- Keep track of remaining_dependency_ids on Job. Being able to figure out a job's remaining dependencies will be really useful for monitoring purposes. A job is enqueued when remaining_dependency_ids is empty.
In terms of API, I have been thinking that the after keyword may not be that descriptive after all. Since this is not yet released to the public, I'm thinking of changing the after argument to depends_on:
q.enqueue(foo, bar=baz, depends_on=job1) # Depends on a single job
q.enqueue(foo, bar=baz, depends_on=[job1, job2]) # Depends on multiple jobs
Thoughts?
I like depends_on more than after. Also, I like the two forms of specifying either a job or an iterable of jobs.
Thinking of details here a bit more, I think we can do the following:
Suppose we have:
foo = q.enqueue(foo)
bar = q.enqueue(bar, depends_on=foo)
q.enqueue(qux, depends_on=[foo, bar])
This could record Redis keys like this, roughly: (job ids simplified to 1, 2, 3 for brevity's sake)
job:1['func_name'] = 'foo'
job:1:reverse_dependencies = { 2, 3 }
job:2['func_name'] = 'bar'
job:2:dependencies = { 1 }
job:3['func_name'] = 'qux'
job:3:dependencies = { 1, 2 }
Thinking about @selwin's second remark: maybe that's not too useful after all. You can simply look at the job:X:dependencies set to find which are the job's (remaining) dependencies. They will always be the remaining job dependencies. When a dependency is done, it'll update the dependency list by removing that job from it. Keeping references to jobs that are done, or have been deleted, won't make sense, since the job key does not exist anymore, so details about that job are already lost by then.
I think we need to have a job:X:reverse_dependencies set to find references to all jobs that need to have job X removed from their job:Y:dependencies set. SMOVE isn't needed after all, I think. This needs to be an atomic operation, so after a job X finishes:
- Walk the job:X:reverse_dependencies set. For each job Y in there:
- Find the corresponding job:Y:dependencies set and remove X from it.
- If job:Y:dependencies is now empty, enqueue Y.
job.cancel() now becomes more complicated as it needs to cancel any dependent jobs recursively, too.
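Roughly, in redis-py terms, that walk could look something like this (just a sketch with the key names from the example above; the rpush stands in for the real enqueue, and making the whole thing atomic is left out here):

```python
import redis

conn = redis.Redis(decode_responses=True)

def resolve_dependencies_of(finished_id):
    # Walk job:X:reverse_dependencies; for every dependent job Y, drop X from
    # job:Y:dependencies and enqueue Y once that set has drained.
    for dependent_id in conn.smembers('job:%s:reverse_dependencies' % finished_id):
        deps_key = 'job:%s:dependencies' % dependent_id
        conn.srem(deps_key, finished_id)
        if conn.scard(deps_key) == 0:
            conn.rpush('queue:default', dependent_id)  # stand-in for the real enqueue
```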
Sounds good?
To me, this looks like we're working under the assumption that jobs are always deleted after successful execution (they are not). Keeping job ids around won't take much space, so I think it's better if we do.
In terms of reverse dependencies, it's currently called a "waitlist" (take a look at job.register_dependency()) and it is stored as a separate Redis key, not attached to the actual job. However, I agree that reverse dependencies is a better name, so I don't mind renaming it (or changing the way it's currently implemented).
Currently job.cancel() doesn't cancel any dependent jobs; they simply won't get executed since their dependencies are never completed (be it deleted, or simply removed from the queue without being deleted). I suggest leaving it as it is for now.
Win~
The additional dependency info suggested by @selwin will be useful and convenient for monitoring. However, it seems that all that info is already available in Redis and can be obtained with some additional queries, specifically on the jobs being depended on, to check their success status. Of course, this means it will be more expensive and troublesome to generate the monitoring information about which depended-on jobs are still unfinished.
We get to a point where we need to decide on a trade-off between monitoring performance/convenience and execution performance. I don't have any performance numbers, especially for large problems, so I have no strong opinion on which way to go.
In my original representation, generating the monitoring info for a job that depends on n other jobs requires O(n) additional Redis queries, but remaining_dependency_ids is not needed. This could be problematic for large n.
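Concretely, the O(n) check I mean would be something like this (a sketch only; I'm not sure of the exact status API):

```python
from rq.job import Job

def unfinished_dependencies(dependency_ids, connection):
    # One round trip per depended-on job: O(n) extra queries for a job
    # with n dependencies.
    pending = []
    for dep_id in dependency_ids:
        job = Job.fetch(dep_id, connection=connection)
        if not job.is_finished:
            pending.append(dep_id)
    return pending
```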
In @selwin's representation, the additional cost is from the mutation of remaining_dependency_ids each time an upstream job succeeds. I don't know how Redis set operations scale for large sets, but I'm guessing it reasonably shouldn't be more expensive than the O(n) monitoring queries for the original representation.
So, if monitoring happens often enough, without concrete performance numbers, I'm slightly in favor of @selwin's representation over the original representation.
Maybe we also need to think of the implications if jobs are allowed to run and succeed more than once. If succeeding more than once is a possibility that we allow now or in the future, the dependency logic needs additional consideration.
To me, this looks like we're working under the assumption that jobs are always deleted after successful execution (they are not). Keeping job ids around won't take much space, so I think it's better if we do.
Agreed, my assumption was obviously wrong here. We _could_ keep a set like orig_dependencies and never touch it beyond the initial job creation, but I'm kind of -0 on that.
In terms of reverse dependencies, it's currently called a "waitlist" (take a look at job.register_dependency()) and it is stored as a separate Redis key, not attached to the actual job. However, I agree that reverse dependencies is a better name, so I don't mind renaming it (or changing the way it's currently implemented).
Nice.
Currently job.cancel() doesn't cancel any dependent jobs; they simply won't get executed since their dependencies are never completed (be it deleted, or simply removed from the queue without being deleted). I suggest leaving it as it is for now.
OK. My only concern here is that eventually RQ is going to leak keys into Redis that are never cleaned up. We have jobs that never expire and are never consumed by workers. Thoughts?
The additional dependency info suggested by @selwin will be useful and convenient for monitoring. However, it seems that all that info is already available in Redis and can be obtained with some additional queries, specifically on the jobs being depended on, to check their success status. Of course, this means it will be more expensive and troublesome to generate the monitoring information about which depended-on jobs are still unfinished.
Agreed. Having a static orig_dependencies key could solve that, as commented above. It's a pretty simple solution, and if both of you consider the benefit worthwhile, it might be useful to add, even though I'm -0 on it. (I may yet have to be enlightened.)
We get to a point where we need to decide on a trade-off between monitoring performance/convenience and execution performance. I don't have any performance numbers, especially for large problems, so I have no strong opinion on which way to go.
I do have a strong opinion. I think one of RQ's core project values is simplicity of implementation, and it has always been more important than performance. (See also my FOSDEM '13 talk on the topic.) I'm not saying performance isn't important; it just isn't the _most_ important thing. RQ contributors have shared that desire to keep the implementation simple, predictable, and maintainable.
In @selwin's representation, the additional cost is from the mutation of remaining_dependency_ids each time an upstream job succeeds. I don't know how Redis set operations scale for large sets, but I'm guessing it reasonably shouldn't be more expensive than the O(n) monitoring queries for the original representation.
Agreed. I think we should roll with @selwin's suggestion here.
So, if monitoring happens often enough, without concrete performance numbers, I'm slightly in favor of @selwin's representation over the original representation.
Me too.
Maybe we also need to think of the implications if jobs are allowed to run and succeed more than once. If succeeding more than once is a possibility that we allow now or in the future, the dependency logic needs additional consideration.
I don't see why we should support that. To me a job is a simple function execution. A single function can't succeed twice, so neither should a job. If you want to execute a function twice, I suppose you should schedule two invocations.
I don't see why we should support that. To me a job is a simple function execution. A single function can't succeed twice, so neither should a job. If you want to execute a function twice, I suppose you should schedule two invocations.
Some task queue systems have the concept of periodic recurring jobs. If python-rq ends up supporting recurring jobs by reusing the same job, for whatever reason, then this interaction with dependencies becomes an issue.
I haven't thought too deeply into this, but I think there are two options. I'm leaning towards option 1, to keep the implementation simple: if someone needs to periodically enqueue two jobs with dependencies (A & B), they'll have to write a single periodic job (X) that enqueues A & B.
Both. Implementation-wise, I'd prefer not to implement too many checks during enqueue and simply document this limitation (as long as RQ doesn't error out, obviously).
Thoughts?
RQ doesn't provide any recurring job now, does it? If that's the case, then there's nothing to check now.
In the future, when there is an actual need to check, I suppose that for a job A that is not supposed to be depended on by another job, we can make the waitlist key a scalar, so that an attempt to make another job depend on job A fails when trying to SADD to the scalar waitlist key. This way, no extra cost is incurred in the successful case. (I haven't actually tried to SADD to a scalar, but I think it should fail.)
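For what it's worth, something like this should confirm it (illustrative key names):

```python
import redis

conn = redis.Redis()
conn.set('job:A:waitlist', 'no-dependents-allowed')  # scalar instead of a set
try:
    conn.sadd('job:A:waitlist', 'B')  # try to register B as a dependent of A
except redis.ResponseError as exc:
    # Redis answers WRONGTYPE, so the dependency registration fails loudly.
    print('cannot depend on job A:', exc)
```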
RQ doesn't provide any recurring job now, does it? If that's the case, then there's nothing to check now.
I maintain one in a separate project here: https://github.com/ui/rq-scheduler
The plan is to bring it into RQ proper; check out the "scheduler-integration" branch on this repo. So implementation-wise, we already have something concrete and used in production (via the separate rq-scheduler package).
This thread also has some relevant discussions about the scheduler implementation: https://github.com/nvie/rq/issues/202
How about prerequisites instead of dependency_ids, and remaining_prerequisites instead of remaining_dependency_ids? I'm going to take a stab at implementing this feature.
Since we already use the terms "dependents" and "depends_on", I think it's better if we stick with "dependency_ids" to keep the terminology consistent. I won't have much time to implement this in the next few days, so feel free to work on it :)
I concur that it would be nice if we could just use the prefix 'depend' for all these names, but unfortunately, the word 'dependency' really refers to the relationship between the dependent and the 'dependee', not the objects themselves that are in that relationship. I have not found a single definition where 'dependency' actually means 'dependee'. Unfortunately, 'dependee' is not a proper English word. The closest thing I can think of is 'prerequisite', and the meaning is quite clear.
However, if you insist, I can easily change it back to 'dependency'. I just think that 'dependency' does not accurately describe the object that is depended on. On the other hand, if you agree with me, perhaps we should also change the argument name depends_on to prerequisites. I don't know how much backward compatibility that would sacrifice, but I suppose not much, since depends_on is itself rather new.
This issue was opened 2 years ago. Any ETA for this feature? :)
This pull request looks dead. I really need multi-job dependency, too, and I've been using my own branch based on the pull request.
Is there just no intention to add this feature?
I'm just throwing in my 2 cents here, but doesn't adding multi-job dependencies add a lot of potential complexity and perhaps cross over into a different problem domain, such as workflows?
I'm not sure what others' use cases are, but I usually do simplified workflows by chaining jobs together, i.e. one job creates another and returns that next job's ID.
Thanks,
Jason Mills
I think rediscussing the actual need for this is unnecessary. It has been stated multiple times over the last few years. Many people are waiting for this and have custom branches just for this feature. I sure know that I am.
Please let us continue working on this and finally finish it.
In my use case, my jobs don't have a linear relationship. Some jobs use the output of multiple jobs as input, and I express that as dependence on multiple jobs. If I express everything as a sequence of jobs with single-job dependency, then just because one job fails, everything downstream of it cannot start.
RQ with multi-job dependency is the only existing Python tool I know of that can express this sort of job scheduling requirement.
I'll have a go at this after the Christmas holidays if nobody else is working on it?
As far as I know, no one's currently working on it :)
Hey there, I took @Kisioj's branch, made a few changes on top of it, and rebased against current master: https://github.com/jlopex/rq/tree/javi/multi-dependencies
Will this be a fair start? I haven't tested it much, but I'll go a little bit deeper tomorrow.
I think this PR was almost ready when I stopped working on it, so it's a good place to start from.
However, I'm also not against someone else rewriting the implementation. Who knows, it could be better? ;) Some parts of the RQ codebase have also changed quite a bit.
TLDR; it's up to you ;)
@selwin it feels like it's based on a subset of your changes. I'll take a deeper look tomorrow and let you know about it. :+1:
I'm back again. I did a few test runs and all basic features seem to work as expected; plus, I was able to run it with multiple queues and multiple dependencies :+1:.
What I haven't seen is the cancel/delete implementation. @selwin, was this developed on your branch? If you allow me, I can try to port/add it.
I think this will be a big change and could break easily, so I think we should invest some time to write tests to validate the feature.
Yes, it will be a big change indeed.
I don't think I implemented job.cancel(). This is the multi-dependencies branch I was working on: https://github.com/nvie/rq/pull/279
Best,
Selwin
@jlopex. Do you have any news on this? I thought you were almost done? Is the current implementation usable?
Thanks for your work on this issue.
@olingerc you can take a look at the branch; it works. I did a few tests and basic multi-dependency seems to work.
https://github.com/jlopex/rq/tree/javi/multi-dependencies
I've been very busy with other unrelated stuff and haven't had a chance to finish the changes I had in mind:
1) Add more tests, as this change can break a lot of things.
2) Improve the implementation / pipeline all possible actions.
3) Implement cancel/delete for jobs with multiple dependencies.
@jlopex feel free to open a PR when you're done. RQ's internals have changed quite a bit since the first time I attempted the multi-dependencies feature, so this would very likely require a significant effort to push through and a few sets of eyes to ensure its stability.
I'm also very interested in this feature. I currently use luigi for data pipelines, but I think rq is easier to reason about and more intuitive for newer developers. The addition of multiple dependencies to rq would make it easier to drop luigi.
@jlopex
Minor note on the implementation: from what I see, the primary responsibility of queue.enqueue() and queue.enqueue_call() is to create job instances, whereas queue.enqueue_job() does the actual "work" of adding the job to a queue. In that sense, the work of adding dependencies to the queue might fit better in the queue.enqueue_job() method.
For people like me who manually create jobs (easier job customization) and rely on the queue.enqueue_job() method, we'd then have access to the multiple-dependencies functionality. It's easy enough for me to modify the code, but this might help someone else in the future.
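For illustration, the manual-creation path I mean looks roughly like this (a rough sketch of the desired usage only; I'm not sure the signatures match exactly across RQ versions, and depends_on taking a list is the behaviour proposed in this thread, not something released RQ does yet):

```python
from redis import Redis
from rq import Queue
from rq.job import Job

def combine(a, b):
    return a + b

redis_conn = Redis()
q = Queue('default', connection=redis_conn)

# Create the job by hand for extra customization, then hand it to the queue.
parents = [q.enqueue(combine, 1, 2), q.enqueue(combine, 3, 4)]
job = Job.create(combine, args=(5, 6), connection=redis_conn,
                 depends_on=parents)
q.enqueue_job(job)
```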
If I get some extra time, I'll take a look at the other pending issues and I can potentially assist with a pull request.
Edit: On second look, it looks like the existing dependency implementation was already implemented in queue.enqueue_call().
On second look, it looks like the existing dependency implementation was already implemented in queue.enqueue_call()
One job can have multiple dependents but only one dependency.
@jlopex Thank you, the multiple dependencies implementation is working well on my end.
All, thanks for the great work on rq.
@jlopex @nvie I took @jlopex's multi-dependencies branch and did a large cleanup. I was a little frustrated with the state of the code, so I did a large refactor to ensure we don't lose jobs during movements between states. Basically, almost everything should be implemented as a transaction; otherwise there is no way to know whether a Ctrl-Z is going to leave jobs in an invalid state. This should also handle race conditions due to additional dependencies being added while we are looking at the parents.
I went a little crazy and was considering just forking because of the size of my changes, but I thought I should see if you guys were at all interested in re-architecting to ensure we don't have race conditions. Anyway, here is the PR (just into my own master, since it is still WIP): https://github.com/MicahChambers/rq/pull/1/
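For illustration, the kind of transaction I mean is the standard redis-py WATCH/MULTI pattern (a standalone sketch with made-up key names, not the actual code in the PR):

```python
import redis

def move_job(conn, job_id, src_key, dst_key):
    # WATCH/MULTI: if anyone touches the source set while we are deciding,
    # EXEC raises WatchError and we retry, so the job is never half-moved.
    with conn.pipeline() as pipe:
        while True:
            try:
                pipe.watch(src_key)
                if not pipe.sismember(src_key, job_id):
                    pipe.unwatch()
                    return False  # someone else already moved it
                pipe.multi()
                pipe.srem(src_key, job_id)
                pipe.sadd(dst_key, job_id)
                pipe.execute()
                return True
            except redis.WatchError:
                continue
```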
@MicahChambers Hello.
As far as I can see, the transaction decorator and RQConnection are the main changes you introduced.
I'm currently porting rq onto the asyncio stack. I ran into similar problems, since the pool, connection and pipeline are explicit parts of the aioredis library. I chose a different approach for the pipeline refactoring.
You may be interested.
@MicahChambers @proofit404 Nice that some people could put some extra love into that branch. I'm pretty busy with non-open-source stuff, but I can take a look at your changes; maybe with a little bit of extra work everybody can benefit from your effort.
Cool. Yeah, I feel like RQ kind of needs some love. A lot of issues aren't getting addressed and it really does need to be refactored. @proofit404 Cool, I'll take a look at your PR. I'm not familiar with aioredis, so I'll check it out. I'm going to try to do some stress testing this weekend. My understanding was that @nvie thought the code had too many race conditions, which I think he is right about. I'm hoping that by pipelining everything we can make all the operations atomic.
Hey @selwin, yesterday we had time to work a little bit more on the multi-dependencies support, especially on the waterfall cancellation of all related jobs.
After hacking a little bit on the code, we don't have a clear understanding of what the "semantic" differences between the "cancel" and "delete" job actions are.
In the current implementation, when a user "cancels" a job, the job is removed from the Queue structure and is then ignored by RQ's logic (nothing references the job key anymore), but the job key itself will stay in Redis forever. Only when a user "deletes" a job do both the RQ bookkeeping and the Redis job entry get removed. Is this done on purpose? If so, do you know why?
Maybe we should merge both the cancel and delete actions in order to simplify things and keep the Redis db as tidy as possible?
Another option I see is to set an expire value (TTL) on the job when it is cancelled, so the job stays in Redis but will expire, and once that happens Redis will take care of deleting the key.
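Something along these lines, presumably (job.key is the job's Redis key; the 24-hour TTL is just an example):

```python
def cancel_with_expiry(connection, job, ttl=86400):
    # Remove the job from its queue, then let Redis expire the job hash
    # after `ttl` seconds instead of keeping it around forever.
    job.cancel()
    connection.expire(job.key, ttl)
```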
For example, your Python process may hold a job key. It can cancel and then re-enqueue this job without holding the whole job spec. If you delete the job, you can't restore it easily. If you want some jobs removed automatically after some period of time, use the ttl option of the enqueue procedure.
@proofit404 I see your point, so RQ keeps the Job entries "on hold", outside of any queue.
So the developer then has to "reassign" the Job to a valid Queue in order to "reactivate" it. I've never done this before, but I understand other developers may need this.
The risk of this behaviour is that some developers may never "reassign" the Job to any Queue, and then it will remain in Redis until it is flushed, wasting resources unnecessarily.
The risk of this behaviour is that some developers may never "reassign" the Job to any Queue, and then it will remain in Redis until it is flushed, wasting resources unnecessarily.
In this case they must use the ttl argument to be sure the job will expire after some logical period.
I've been using (and recommending) rq for over 4 years now. It has never let me down :-).
Is somebody still working on this issue? Or does somebody have a working branch?
I apologize for nagging, but I'd really, really like to ditch celery, and multi-dependencies are required for that. If it is possible to get the multi-dependencies branch (or some other working base) merged with a reasonable amount of work, I'm totally willing to do the work. Is there any chance of getting this done without pipelining everything? I have had only a short look at the code base, so I'm not intimately familiar with potential break points.
+1
Sorry, I know plus 1s are not really helpful, but I really need this feature as well. I would also be ready and available to help!
Ok, I took a look at @jlopex's multi-dependency branch and I was able to merge it into the current master. My fork is called mymaster-multi.
Unfortunately it has some additional things that I need for my project. As soon as I get some time (and I have used it in production for a bit) I'll create a clean PR.
I have to stress that jlopex did all the work (thanks for that!) so maybe he wants to prepare the PR himself?
Hey @olingerc, feel free to work on this branch as if it were yours. A year ago we pushed a little bit on it, but unfortunately we got busy with a lot of non-open-source work.
At that point we did not feel comfortable making a PR; a lot of testing was missing to validate those changes.
I would love to see all of this on master soon.
@jlopex Thanks a lot!
Here is your branch merged with current master: https://github.com/olingerc/rq/tree/multi-deps-olingerc
The current implementation works on my system, and I use some really complex dependency graphs. I now need to work (a lot) on the tests and think about the cancel/delete implementation.
P.S. I removed the on_cancel callback since it does not really fit into this topic.
I've worked quite a lot on the PR and think that it is almost ready. However, I've hit a roadblock that I need some input to get past. The thing is, when the queue enqueues the dependents of a finished job, it all works fine when there is only one worker. As soon as there are more, it can happen that two workers go over the same list of dependents at the same time. In very rare cases (around 1 in 100 on my system) both workers would enqueue the same job, which of course is not good. I thought using pipe.watch would avoid this problem, but as the code is right now, it definitely happens.
In other words, just to make it clearer: a job depends on two independent jobs which themselves do not depend on anything. Each of those jobs gets finished at the same time by a worker (rare, but it happens) and both workers now want to enqueue the dependent third job. Since they do not know anything about each other, the third job gets enqueued twice!
I thought about adding a check to see whether the job is already in the queue before adding it, but since this is all concurrent it does not seem to work. How can I lock this so that only one worker can access it?
Could somebody please help me find a solution? The offending code block is here, plus the 3 lines before it.
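One idea for making that decision atomic (a sketch of one possible approach, not the code in the PR; enqueue() is a stand-in for the real enqueue call): do the SREM and the emptiness check inside a single MULTI/EXEC, so exactly one worker observes the dependency set becoming empty and only that worker enqueues the dependent.

```python
import redis

conn = redis.Redis(decode_responses=True)

def enqueue(job_id):
    # Stand-in for RQ's real enqueue: push the job id onto its queue.
    conn.rpush('rq:queue:default', job_id)

def maybe_enqueue_dependent(dependent_id, finished_id):
    deps_key = 'rq:job:%s:dependencies' % dependent_id  # illustrative key name
    # Redis serializes MULTI/EXEC blocks, so only the transaction whose SREM
    # empties the set sees SCARD == 0; every other worker sees a non-zero count.
    pipe = conn.pipeline(transaction=True)
    pipe.srem(deps_key, finished_id)
    pipe.scard(deps_key)
    _, remaining = pipe.execute()
    if remaining == 0:
        enqueue(dependent_id)
```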
@olingerc I also thought pipe.watch would take care of it. I'll take a look at it; probably we need to make a check within the transaction to avoid these duplicated job enqueue calls.
@jlopex. Thanks for the help! Today I went back to using your implementation of enqueue_dependents but unfortunately I still get the same situation. I'll try to come up with a test for this specific corner case.
I've added a test case to my branch which fails because of this error. In essence, a job (which depends on two upstream jobs running in parallel) writes a file with a random name into a folder. If there are two files in the folder, it means that the job was enqueued twice.
I give it 100 rounds, but it usually fails at around round 10. This also nicely shows that it is a difficult problem to reproduce, since sometimes it takes 20 or 30 passes for the test to fail.
NOTE: This means that the current state of the PR fails "by design" in order for us to test solutions.
Okay, I think I found a solution. Please check the updated PR.
Unfortunately this made an old issue resurface: #739. Will check that now.
Update: Unfortunately it still happens, but now it is much rarer. I have to run the tests 10 times before the error occurs. Tricky business...
P.S: I managed to solve the issue with #739
@selwin @jlopex I've tried several ways of approaching this but have so far failed to find a solution. I'd love some input and wild ideas :-)
@olingerc I'm new to the project but I'm interested in helping unblock this.
I thought about adding a check to see whether the job is already in the queue before adding it, but since this is all concurrent it does not seem to work. How can I lock this so that only one worker can access it?
Might it be possible for dependent jobs to safeguard themselves against this issue? If you accept that a dependent job might get queued twice and be running concurrently in two different workers, the job could write its job_id and the worker_id somewhere external (e.g. Redis or a database). If a job finds a job_id that matches its own but the worker_id differs, then it would know it is a duplicate invocation and commit suicide somehow. This check would obviously need to happen before the execution of the body of the job starts. Would having a duplicate die be possible while allowing the first invocation of the job to continue?
I'm sure my understanding of the problem is terribly naïve, so please tolerate my general ignorance here.
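A minimal sketch of that kind of guard, using Redis itself as the external store (hypothetical key name; SET with NX is atomic, so only the first claim succeeds):

```python
import redis

def claim_execution(conn, job_id, worker_id, ttl=3600):
    # Returns True for the first worker that claims job_id; a duplicate
    # invocation gets False and can bail out before running the job body.
    return bool(conn.set('job:%s:claimed_by' % job_id, worker_id, nx=True, ex=ttl))
```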
@danielbrownridge Nice to get some input on this. What you are suggesting is actually what I am doing in my application. The problem is that it is "outside" of rq. These duplicate job enqueues should be avoided by rq itself so we should try to focus on the race condition and solve it. Maybe you could try to use redis to test your solution in rq itself?
I think @nvie is prepared to help, but we should first help get #427 merged. My focus has shifted to another project at the moment, but I will surely come back to this in a couple of months.
ah, after reading the last few comments this would solve my duplicate job problem as well.
Getting back to this: if I understand correctly, @olingerc's branch is on a valid path, but it is only blocked by #427?
I'm picking this up again. Trying to get a grip on #427 first.
@olingerc that would be helpful, thanks!
https://github.com/rq/rq/pull/916 has been created to replace https://github.com/rq/rq/pull/427.
Working first on the delete_dependents part was a good idea.
Is this live already?
I tried the depends_on argument with a tuple and (as suggested by the comment above) a list, but it does not seem to work.
I want to send a notification from my backend to the frontend once a stack of interdependent jobs is finished. How would you recommend implementing it?
Hi all, I'm new to RQ.
Any news about multi-dependency?
I am still very much interested, but I currently don't have the time to review this. Also, there were other things to do first, like reviewing the registries. I hope that other people can take this up?
Seems that dramatiq has a way of expressing that.
I haven't tested it yet, but I sure as hell want to get away from Celery's buggy Canvas after today.
EDIT: of course, I was wrong
Much needed feature. Now that #427 and #916 are resolved, what is the remaining blocker?
Any update on this? I am running rq in three projects, all of which would profit a lot from this feature. For the last two years I have kept stopping myself from migrating to other frameworks like ray, thinking: hey, this idea has been around since 2013, it should arrive any minute. Please don't get me wrong, I understand that rq is a free-time project and I have the greatest respect for your work.
Could you please give me an update on the situation, or any ETA, so I can carefully evaluate the migration process?
@neuronflow not sure of the current state, but we did reach a point where we were struggling with some jobs being duplicated, and since then, it's true, we haven't pushed further on this feature. Maybe you can try to backport the development branch we did and see if it works now with the latest changes.
I am interested in this improvement.
In the meantime, I am using this workaround.
Looking forward to this very helpful feature! Hope this can land soon.
I would also love this feature. In the meantime, I created a manager to complete jobs with multiple or tree-like dependencies.
https://github.com/crispyDyne/rq-manager
Might be a bit of a hack, but it works for me!
Hmm. Seems this has been dropped again. Shame. I am still available for testing if that helps.
This PR was merged https://github.com/rq/rq/pull/1259 a few months ago, and it was the last significant piece needed for multi-dependency. The only thing missing now is modifying queue.enqueue() to accept queue.enqueue(my_func, depends_on=[job_1, job_2]).
Hi, I too would love to see multi-dependencies make it into a release.
Is it really just down to allowing lists of dependencies to be passed to the enqueue functions? And maybe adding some unit tests? (Has the duplicated jobs issue been solved?)
In an effort to help get the ball rolling again, I have made a pull request (#1397) that allows depends_on to take lists and tuples (as well as single jobs or job ids). Being completely new to this project, I really don't know if this is along the lines of what you're looking for or not, but hopefully it's helpful.