Sidekiq: heroku worker autoscaling

Created on 27 Jun 2012 · 34 comments · Source: mperham/sidekiq

Hi

Given Heroku deployment scenarios, have you given any thought to integrating or including the ability to autoscale the sidekiq worker, as in fire it up and tear it down depending on the concurrency load?

I see that HireFire is available as open source or hosted on Heroku as a plugin, and is currently designed to support Delayed Job and Resque.

I also saw this... http://verboselogging.com/2010/07/30/auto-scale-your-resque-workers-on-heroku

This might be a nice feature to support, at least for activity that is not scheduled far in the future.

When a request is queued, the client could check whether sidekiq is running and, if not, fire it up; and when sidekiq identifies that all requests have been processed, it could shut the worker down.

Just a thought...

Thanks!
-Simon

PS: I think I'm getting the hang of how to use sidekiq properly...and have changed some of my approach accordingly.

All 34 comments

Auto scaling is needed for resque because its workers are so heavyweight / resource intensive. Sidekiq workers are cheap so just create as many as you think you might need. You might have 4 resque workers but 50 sidekiq workers, for example.

On Heroku, one use case is scaling between 0 and 1 sidekiq processes, which doesn't relate to how heavy one process is. If, for example, your users are all in a single time zone working office hours, you could save some money by shutting down during quiet hours. Maybe this is something for a middleware rather than sidekiq proper, though.
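
The office-hours idea above can be sketched as a pure decision function that a cron task could feed into a scale command. Everything here is illustrative: the hour range and the 0/1 targets are assumptions, and the actual API call to Heroku is left out.

```ruby
# Sketch: decide how many sidekiq dynos should be running based on the clock.
# OFFICE_HOURS and the 0/1 targets are assumptions for illustration.
OFFICE_HOURS = (8...18) # 08:00 - 17:59 local time

def desired_worker_count(time = Time.now)
  OFFICE_HOURS.cover?(time.hour) ? 1 : 0
end
```

A scheduled task would compare this against the current dyno count and only hit the Heroku API when they differ.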

I think that's a reasonable use case for a cron job running heroku scale. Sidekiq certainly can't start itself. :-)

It could from client middleware.

Yeah, that's a cool hack. Have Rails start sidekiq if it's not already when pushing a message!

Interesting conversation, thanks for keeping it going...

BTW: I think worker is an overloaded term when it comes to heroku (Procfiles) vs. sidekiq. From a deployment perspective, especially in the Heroku environment, it would be good to be able to shut down the Heroku dyno that is running sidekiq when there is nothing in the queue it's managing. Why spend $35 a month if it's only needed for a fraction of that, given per-second billing? Some of us are on a tight budget...

I like the sidekiq middleware concept. But does that not sit on the sidekiq processor side of things rather than on the Rails side?

I assume the hack/hook would need to be in the worker when perform_async is called to start sidekiq up, and in a middleware post-process hook to shut it down?

As a related aside, I need to look at the redis connection pooling more too, but found the comments in the Wiki a little confusing. If I had multiple workers I'd need to define this once; can one also have Rails use the same pool and max out total redis connections at 10 (or the max supported by the Redis To Go add-on plan)?

@acds sidekiq has both client and server middleware chains. The client chain can be used to start sidekiq and the server chain to stop.

For the connection pool you can call Sidekiq.redis do |con| end in your own code to use connections from the pool but that conversation is off topic for this issue.

@betelgeuse per client and server middleware, I can take a crack at this. But some initial hints would be great:

1) How do I detect if sidekiq is running from the client side?
2) How do I detect that all the queues are empty on the server side?

Thanks

You add client middleware in Sidekiq.configure_client:

Sidekiq.configure_client do |config|
  config.client_middleware do |chain|
    chain.add MyMiddlewareClass
  end
end

See the Middleware wiki page for details.
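
For reference, a client middleware class is just an object with a call method that yields to continue the chain. Here is a minimal sketch of the start-on-push idea; the StartWorkerMiddleware name and the injected starter hook are hypothetical, and the three-argument call signature matches the client middleware interface of this era.

```ruby
# Sketch: client middleware that triggers a "start the worker dyno" hook
# before letting the push proceed. The starter callable is an assumption;
# in practice it would call the Heroku API (e.g. scale worker=1).
class StartWorkerMiddleware
  def initialize(starter)
    @starter = starter
  end

  def call(worker_class, msg, queue)
    @starter.call # idempotent scale-up, safe to repeat on every push
    yield         # continue the middleware chain / enqueue the job
  end
end
```

You would register it with chain.add StartWorkerMiddleware, passing in whatever starter object wraps your Heroku credentials.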

def empty?(name)
  Sidekiq.redis { |conn| conn.llen("queue:#{name}") == 0 }
end

For checking if sidekiq is running one option could be the resque:workers key.

@mperham Thanks...

I got the first bit from Wiki as indicated...

I think I need to enumerate empty? for all queues to trigger a shutdown. I just need to identify the queues...

On the client I see

Client::registered_queues

but not on the server side. This could also be defined on the server side though, and it would have the same result?

    def self.registered_queues
      Sidekiq.redis { |x| x.smembers('queues') }
    end

I guess for startup one can always just start the workers, as running

heroku ps:scale worker=1

multiple times has no effect, it seems.

BTW: I assume firing up multiple sidekiq processes is not advisable, to solve the different-queues-with-different-concurrency problem?

There's no reason you can't run multiple sidekiq processes. One per core is a reasonable plan.

Sidekiq.redis { |x| x.smembers('queues') } will give you the list of known queues.

Your trouble will be identifying if a sidekiq process is running. kill -9 means you can't reliably determine this without a heartbeat. You'll probably need to add server-side middleware that implements a heartbeat in redis (e.g. update a timestamp in redis every time you start to process a message). The client middleware can examine the heartbeats and determine if a sidekiq process is up or not with some confidence.
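
The heartbeat idea can be sketched with two small functions: the server side stamps a timestamp on every job, and the client side treats the process as alive only while the stamp is fresh. The key name and TTL are assumptions, and a plain hash stands in for redis here; real code would use SETEX or similar.

```ruby
# Sketch of the heartbeat: server middleware stamps a key per job;
# the client treats a stale stamp as "process is probably down".
HEARTBEAT_KEY = 'sidekiq:heartbeat' # hypothetical key name
HEARTBEAT_TTL = 60                  # seconds of silence before assuming death

def beat(store, now = Time.now)
  store[HEARTBEAT_KEY] = now.to_i # server middleware calls this per message
end

def worker_alive?(store, now = Time.now)
  last = store[HEARTBEAT_KEY]
  !last.nil? && (now.to_i - last) < HEARTBEAT_TTL
end
```

Note the caveat from above still applies: after a kill -9 the stamp simply goes stale, so this gives confidence, not certainty.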

We can get a lot by simply asking Heroku how many workers it is running. For server side middleware, we know there is at least one process (I run therefore I am). My challenge is figuring out whether that process has busy workers, when some of the jobs can take a long time - the job-started timestamp is insufficient. It appears that the worker lists (and therefore counts) are private right now.

any progress hacking it?

I have a basic system working, but I had to set up a separate single-threaded worker for the long-running jobs. So I guess it's a question of making time to extract the code into a gem.

For my needs, a gist would work just as well! Did you hack in auto-scaling just for Heroku, or also for local dev (testing)? If you didn't do local scaling, I'd be happy to take that on.

I generally run the background process in a separate Guarded Foreman, so I haven't done anything with local. I did make the scaler a separate object, so it should not be too hard to add.

Quick and dirty code dump: https://gist.github.com/3797439
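
The scaler-as-a-separate-object design mentioned above can be sketched as follows. The class name and the client interface (workers/workers=) are hypothetical; the real gist talks to the Heroku API, but injecting the client is what makes swapping in a no-op for local dev straightforward.

```ruby
# Sketch: a scaler object with the platform client injected, so tests and
# local dev can substitute a fake. The client interface is an assumption.
class HerokuScaler
  def initialize(client)
    @client = client
  end

  def scale_to(n)
    @client.workers = n unless @client.workers == n # skip redundant API calls
  end
end
```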

Thanks, Justin. That will give me a running start. I'll share what I've got when I'm done -- I expect that I'll be scaling to 'n' workers, instead of just one, with environment settings to control the scaling calculations, which might be an interesting feature to some.


Just wanted to let you guys know that this would be extremely useful, at least for me as well. I was considering migrating to Resque just to use HireFire or similar solutions, but I like Sidekiq better. I think it's silly to spend so much money on a Heroku worker, especially when the payload is small, which is the case for my startup (basically sending emails and minimal image processing for a still tiny number of users). I'll give Justin's code a try. Thanks!

@fixr Sidekiq gives you N workers, where N is dependent on memory. Resque gives you one. From that perspective, you've already won huge over Resque even without auto-scaling. Another $35/month buys you another dyno with another N workers. It doesn't sound like you would need even a second Sidekiq instance.

@mperham I think it's more a matter of paying for a worker to run full-time when you don't need to. No one is arguing against Sidekiq's performance advantage over Resque. :) Just for clarification, are you against adding autoscaling to Sidekiq or is it just a low priority?

Auto-scaling sounds like an _awesome_ new Sidekiq Pro feature!


I avoid features that my experience tells me will turn into a quagmire of bugs and support issues. Unique jobs is one of them. Auto-scaling is another.

Understand that you are trying to monitor and change a distributed system. This is an easy problem to solve when every thing is working correctly but VERY HARD to handle all the edge and error cases. Read about Zookeeper and distributed consensus algorithms like Paxos if you want more info.

I'm happy to defer auto-scaling to HireFire for that reason.

Agreed. And things like ZooKeeper only help you to a certain degree by giving you the basic primitives to build coordinated and distributed services. For an example, check out http://github.com/ryanlecompte/redis_failover :)

@mperham Thanks for clarifying!

Thank you all for your feedback. This might be a dumb question, but... is there a way to launch a Sidekiq worker attached to/from a Heroku _web_ instance (or a Rails instance for that matter)?

If you add your heroku username/password/appname to the app config variables, you can execute API scale commands for 'worker', or whatever process you define.

@JustinLove Looking at your gist ... it's good stuff. Just wondering how you solve the problem of knowing when all workers are idle (e.g. with multiple worker threads), or is your gist based on the assumption of one worker thread? Jobs are removed from the queue as they're started and I don't see an easy way of knowing they finished or knowing if any workers are still busy.

@JustinLove I modified your gist to work with the 'heroku-api' gem: https://gist.github.com/3836804. I'm still testing the whole thing, seems to be working fine. The previous gem seems deprecated as a ruby client and didn't work for me on Cedar. What's nice about the new API is that you don't need to set username/pass configs, but only your user API key :).

I incorporated heroku-api and packaged it up as a separate gem. I've certainly overlooked something in the way of documentation, so please let me know if you run into any trouble.

https://github.com/JustinLove/autoscaler

Awesome, nice job guys!

Eight years later... I was looking for a bit more governance around the quiet/shutdown process and wanted a reporting UI, so spun this foundation off into a newer plugin configuration: https://github.com/gmac/sidekiq-heroku-autoscale. Thanks for paving the way here, @JustinLove.
