put <pri> <delay> <ttr> <bytes> <id>\r\n
<data>\r\n
This would make it possible to implement failovers.
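For illustration only, here is a minimal sketch of how a client might serialize the proposed command. The standard `put` line has four fields (`<pri> <delay> <ttr> <bytes>`). The trailing `<id>` is the extension requested in this issue and is not accepted by any released beanstalkd, so `build_put` and its `job_id` parameter are hypothetical names.

```python
def build_put(pri: int, delay: int, ttr: int, data: bytes, job_id=None) -> bytes:
    """Serialize a beanstalkd put command, optionally with the proposed id."""
    # Standard form: put <pri> <delay> <ttr> <bytes>, followed by the body.
    fields = ["put", str(pri), str(delay), str(ttr), str(len(data))]
    if job_id is not None:
        # Hypothetical extension from this issue; not part of the real protocol.
        fields.append(str(job_id))
    header = " ".join(fields).encode("ascii")
    # Both the command line and the body are terminated by CRLF.
    return header + b"\r\n" + data + b"\r\n"
```

Calling `build_put(1024, 0, 60, b"hello")` yields the standard wire format, and passing `job_id=42` appends the extra field.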
I can't see the reason for specifying a custom job ID within a queue.
And I really do see a reason and really hope this will be implemented!
Here is an example:
Please implement this!! :+1:
You have a number of repeatable jobs that you want to execute once every day, but you want them handled in the queue. Today you first have to delete the whole tube, which is ugly (by iterating over and deleting the queue items). With this solution it's a lot easier and more flexible!
This can be done easily by checking whether a daily job has already been executed, either before putting it into beanstalkd or after reserving it.
And, why not /etc/cron.daily/?
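A rough sketch of the check-before-put idea, assuming the producer keeps a marker per job name and day. `DailyJobGuard` is a hypothetical helper, and the in-memory set stands in for whatever shared store (a db, memcached, ...) a real deployment would use.

```python
import datetime


class DailyJobGuard:
    """Remembers which (job name, day) pairs have already been enqueued."""

    def __init__(self):
        # Stand-in for a shared store. A real one would need an atomic
        # "set if absent" operation; this set is only safe in one process.
        self._seen = set()

    def should_enqueue(self, name: str, day: datetime.date) -> bool:
        """Return True the first time (name, day) is seen, False afterwards."""
        key = f"{name}:{day.isoformat()}"
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

The producer would call `should_enqueue("report", datetime.date.today())` and only issue the `put` when it returns True. The same check can be run by the worker right after reserving a job instead.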
Being able to specify custom job IDs would let me implement a sort of HA and failover at the library level:
Unless I'm missing something and current protocol allows for better solutions.
PS. Adding a job with an ID that already exists should trigger an error.
How do you maintain your job IDs without a central point of failure? In general, in a distributed system you cannot have a job which runs exactly once (http://bravenewgeek.com/you-cannot-have-exactly-once-delivery/). Beanstalkd provides at-most-once delivery and we do not want to change that. You can have at-least-once delivery with more than one beanstalkd and some distributed locking or similar (some people use memcached or a db for that purpose).
Maintaining job ids is out of the scope of this ticket. What I need is to be able to specify my own job ID.
This would make it possible to implement failovers.
@rzajac Could you elaborate a little on this? I'm not exactly sure about your use-case.
@JensRantil I explained it a little bit here: https://github.com/kr/beanstalkd/issues/264#issuecomment-104544535
This feature request breaks BC for protocol v1.x
@rzajac Ah, sorry. Missed that. Thanks!
I'm going to be the devil's advocate here and shoot down some of the use cases :-)
@aight8 wrote:
You have a number of repeatable jobs that you want to execute once every day, but you want them handled in the queue. Today you first have to delete the whole tube, which is ugly (by iterating over and deleting the queue items). With this solution it's a lot easier and more flexible!
There are various approaches to regular cronjobs:
/etc/cron.daily putting your daily job on the queue.
You can check a specific job's state
Valid point. Workaround is to store the job id in another datastore.
and so on...
Not an argument. Carry on. ;)
@rzajac wrote:
Being able to specify custom job IDs would let me implement a sort of HA and failover at the library level:
I really don't think this is a good idea. Doing double writes independently to two queues is bound to eventually make them diverge and end up with different state. There are all sorts of race conditions. For example, one TTL times out on one queue and not on the other. Another problem is that you currently can't reserve a specific job. You can delete a specific job, but then you can't be sure that no other consumer has reserved it, etc.
The _real_ solution here would be to use something like ZooKeeper's ZAB or, probably even better, the Raft algorithm. All writes would go through a master, and a majority would need to acknowledge each state change. This would obviously introduce complexity, new failure modes and additional latency to every operation.
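To make the "majority would need to acknowledge" rule concrete, here is a tiny sketch of the quorum arithmetic only. It is not a consensus implementation, and `majority` and `is_committed` are hypothetical names.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that is more than half of the cluster."""
    return cluster_size // 2 + 1


def is_committed(acks: int, cluster_size: int) -> bool:
    """A state change counts as committed once a majority has acknowledged it."""
    return acks >= majority(cluster_size)
```

With three nodes a write needs two acknowledgements, so the cluster tolerates one failure. With five nodes it needs three and tolerates two.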
@rzajac @JensRantil I've also run into this.
The way I do it right now is put the external-id in Redis like @JensRantil suggested and save a mapping to the Beanstalkd generated id. Then, I use it later to cancel, query the job etc. In a way, having Beanstalkd take a custom Id would eliminate the need for an extra piece of infra.
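A minimal sketch of that workaround, assuming a key-value store next to beanstalkd. `ExternalIdIndex` and its methods are hypothetical names, the plain dict stands in for Redis, and `put` is any callable that enqueues a body and returns the ID beanstalkd assigned.

```python
from typing import Callable, Dict, Optional


class ExternalIdIndex:
    """Maps caller-chosen external IDs to IDs assigned by beanstalkd."""

    def __init__(self, store: Optional[Dict[str, int]] = None):
        # A plain dict stands in for Redis here (assumption for the sketch).
        self._store = {} if store is None else store

    def put_once(self, external_id: str, body: bytes,
                 put: Callable[[bytes], int]) -> int:
        """Enqueue body unless external_id is already mapped; return the job ID."""
        existing = self._store.get(external_id)
        if existing is not None:
            return existing  # already queued, so skip the duplicate put
        job_id = put(body)
        self._store[external_id] = job_id
        return job_id

    def lookup(self, external_id: str) -> Optional[int]:
        """Return the beanstalkd ID for external_id, or None if unknown."""
        return self._store.get(external_id)

    def forget(self, external_id: str) -> None:
        """Drop the mapping, e.g. after the job has been deleted."""
        self._store.pop(external_id, None)
```

The lookup and the put are two separate steps here, so two producers racing on the same external ID could still both enqueue. A real store would need an atomic "set if absent" to close that gap.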
This will also help with self-throttling the job on the client side :) Simply checking whether the job is already there allows us to avoid sending another one, or to just increase the delay time.
Okay, I'm going to close this issue as a no-go. Reasons are as follows:
Please open a new issue describing your _use-case_ if you believe it can't be worked around using the above approaches.