Sidekiq: Force retry a Sidekiq job

Created on 14 May 2014 · 17 comments · Source: mperham/sidekiq

I have a Sidekiq job that may at times fail to complete successfully because the external API it consumes won't reply in a timely fashion. It's not an exception (in Ruby terms) but a pretty standard mishap in life :).

Thus I want to make Sidekiq put the job into the retry queue _without_ having to throw a "proper" ruby exception (which would have a number of side effects, most prominently it'd start bugging me for a fix via my error handler service).

How should I go about it?

All 17 comments

When we have an error case where we don't want to throw an exception, we just reschedule the job in the future. It doesn't go into the retry set, but it accomplishes the goal.


Thanks, it sure is viable and simple. Still, I don't really like it, mostly because the API I'm calling may fail for other reasons that I'm not getting notified of, and thus rescheduling would go on forever. Then I'd need to build much of the plumbing (retry counting and such) that is already built into Sidekiq. (I may propose factoring the reschedule logic out into its own method so that it can be invoked independently.)

I'll experiment both with your solution and raising an improper true ruby exception and see which works out better.

I stumbled upon sidekiq-retries (https://github.com/govdelivery/sidekiq-retries) that looks very promising. This thread seems to have stalled, so I'm closing it.

I posted a reasonable workaround here: http://stackoverflow.com/questions/19682594/sidekiq-airbrake-only-post-exception-when-retries-extinguished

That said, it would be nice to be able to return RETRY_LATER or something for a cleaner exit.
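The Stack Overflow workaround above boils down to letting the job raise normally, so Sidekiq's retry/backoff machinery runs, while notifying the error service only from the retries-exhausted hook. A minimal sketch, where `FlakyApi` and `ErrorService` are illustrative stand-ins (a real app would `require 'sidekiq'`; the tiny stub here just lets the sketch load standalone):

```ruby
module SidekiqWorkerStub
  # Minimal stand-in for Sidekiq::Worker so this sketch loads without the gem;
  # a real app would `include Sidekiq::Worker` instead.
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    attr_reader :exhausted_hook

    def sidekiq_options(opts); end

    def sidekiq_retries_exhausted(&block)
      @exhausted_hook = block
    end
  end
end

class FlakyApiWorker
  include SidekiqWorkerStub # include Sidekiq::Worker in a real app
  sidekiq_options retry: 5

  # Sidekiq runs this block after the final retry fails; msg is the job hash.
  sidekiq_retries_exhausted do |msg, ex|
    ErrorService.notify(ex, job: msg['class'], args: msg['args'])
  end

  def perform(id)
    FlakyApi.fetch(id) # any raise here goes through Sidekiq's retry set
  end
end
```

The job still shows up in the retry set with exponential backoff, but the error tracker only hears about failures that survived every retry.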

@jonhyman Just so I understand correctly, when you say "reschedule" do you mean:

# app/controller/some_controller.rb
class SomeController < ApplicationController
  def index
    # ...
    SomeWorker.perform_async(remote_resource: RemoteClient.get(params[:id]))
    # ...
  end
end
# app/workers/some_worker.rb
class SomeWorker
  include Sidekiq::Worker

  def perform(remote_resource:)
    return self.class.perform_in(5.minutes, remote_resource: remote_resource) unless remote_resource.ready?
    # do the work once remote resource is ready
  end
end

I somewhat wish I could get the exponential backoff that retries afford with a Ruby exception, without the actual raising of exceptions that, as @fastcatch said, causes unwanted side-effects.

@f3ndot yeah, we basically do just this, but only for exceptions that we know can happen and don't want to be alerted about. That is, we wouldn't want this to retry indefinitely, so we only do it for exceptions that won't recur forever. For example, we do it with Net::OpenTimeout in jobs that upload to S3, or AWS::CloudWatch::Errors::Throttling in monitoring jobs.

def perform(*args)
  do_work
rescue KnownException => e
  logger.info { "#{self.class} caught #{e.inspect}, retrying" }
  # perform_in is a class method; 2.minutes needs ActiveSupport (use 120 in plain Ruby)
  self.class.perform_in(2.minutes, *args)
end

This is something that I would love to be able to do. We're reaching out to websites, so we want to use the existing retry functionality which triggers when there's an HTTP exception. But the HTTP exceptions are expected, so we want to be able to trigger a retry without the exception. The exceptions hurt our New Relic error rate and dirty Rollbar.

For the record: if I remember correctly, I ended up defining a new exception and monkey-patching Sidekiq to handle this exception differently (i.e. catch and force a retry, but do not re-raise).

It was a long time ago and I have moved on since, but I can dig up the code for you if you need it. (It'd take some time, though. And it's more than likely that Sidekiq has changed enough that it cannot be used without modification.)

I'd prefer not to manage blacklists for New Relic and Rollbar, especially since that could silence real errors. I think I'm going to do something like @fastcatch, but with Sidekiq middleware: recast the known exceptions and catch them in the middleware. I'll let y'all know if this works.
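The middleware idea can be sketched in plain Ruby, since Sidekiq server middleware is just an object with a `call(worker, job, queue)` method that yields to run the job. All names here are illustrative, with `Timeout::Error` standing in for a known transient error:

```ruby
require 'timeout' # Timeout::Error as a stand-in "known" transient error

# Sketch: server middleware wraps every job, so it can catch known transient
# errors, reschedule the job itself, and swallow the exception before it
# reaches an error reporter such as Rollbar or New Relic.
class KnownErrorRetryMiddleware
  KNOWN = [Timeout::Error].freeze

  # Sidekiq invokes middleware with the worker instance, the raw job hash,
  # and the queue name; `yield` executes the job itself.
  def call(worker, job, queue)
    yield
  rescue *KNOWN
    # Re-enqueue with a fixed delay instead of letting the error propagate.
    worker.class.perform_in(120, *job['args'])
  end
end

# Wiring, assuming Sidekiq is loaded:
# Sidekiq.configure_server do |config|
#   config.server_middleware { |chain| chain.add KnownErrorRetryMiddleware }
# end
```

Note that, like the `perform_in` workarounds above, this enqueues a fresh job rather than using Sidekiq's retry set, so there is no exponential backoff or retry counting unless you add it yourself.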

@jonhyman thanks for the "reschedule" workaround suggestion.

If I understand correctly, using perform_in schedules a brand-new job, so the retry counter starts from zero on the new job.

In my case, I would like to also increment the retry counter when "rescheduling" the job, so the sidekiq_retries_exhausted hook would be called when max retries is reached.

Do you guys have an idea about how to do it, without needing to raise an error?
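One way to approximate this without raising is to thread a reschedule counter through the job arguments and fire your own "exhausted" handler when the cap is hit. In this sketch, `resource_ready?`, `process`, and `on_reschedules_exhausted` are hypothetical hooks you would implement, not Sidekiq APIs:

```ruby
# Sketch: carry a reschedule counter in the job arguments so manual
# rescheduling is capped, mirroring what sidekiq_retries_exhausted does
# for real retries. The resource_ready?, process, and
# on_reschedules_exhausted methods are hypothetical hooks.
class PollingWorker
  # include Sidekiq::Worker   # in a real app
  MAX_RESCHEDULES = 5

  def perform(resource_id, reschedules = 0)
    if resource_ready?(resource_id)
      process(resource_id)
    elsif reschedules >= MAX_RESCHEDULES
      on_reschedules_exhausted(resource_id) # our stand-in for the exhausted hook
    else
      # 300 seconds; with ActiveSupport you could write 5.minutes
      self.class.perform_in(300, resource_id, reschedules + 1)
    end
  end
end
```

Unlike the real retry counter, this count survives only because it rides along in the arguments, so it is invisible to the Sidekiq retry UI.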

@lucasdavila
hey, did you find a solution to this?

Basically I want the same thing: since perform_in schedules a new job, the retry counter starts from zero on the new job. I also want to end the current job cleanly so it doesn't throw any exceptions.

Same here. My case is that I want to catch a known exception and then send the job back to the retry queue (my function relies on a third-party service that has intermittent problems) without it showing up in Sentry. So yes, it's an error and I want it retried, but no, I don't want it going to Sentry.

As it stands, I can't catch the error, because then the job won't go to the retry queue, so my Sentry is littered with these known 500s and I have to clean it out every week or so.

@rbucks configure Raven to not send the exception.
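With the raven-ruby client this is a one-line configuration, sketched here with a hypothetical exception class name:

```ruby
# Sketch, assuming the sentry-raven gem: excluded_exceptions drops matching
# errors client-side, so the job can still raise (and Sidekiq can still
# retry with backoff) without the exception ever reaching Sentry.
Raven.configure do |config|
  config.excluded_exceptions += ['RemoteApi::IntermittentError']
end
```

The trade-off, as noted above, is that the exclusion list silences that exception class everywhere, not just in this worker.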

I just ran into this problem too and temporarily fixed it with a counter, so a worker that randomly fails on a specific API call can't loop forever:

def perform(*args, counter = 0)
  ...
  perform_in(10.seconds, *args, counter + 1) if error && counter < 3
end

Hi @user7788, while trying your approach, I am getting this error:

NoMethodError: undefined method `perform_in' for Worker

perform_in is defined on the worker class, not the instance, so call it via self.class:

self.class.perform_in(10.seconds)

