Forem: Epic: Moving Jobs From DelayedJob to Sidekiq

Created on 30 Dec 2019  ยท  42Comments  ยท  Source: forem/forem

Why We Are Moving To Sidekiq

Because DelayedJob relies on your database to process jobs, when you get a large volume of jobs it can be very slow. For this reason we are switching to Sidekiq to help ensure that as our job volume increases we have the ability to keep up with the new load. Also in DelayedJob, because it didnt handle lots of jobs well we have been ignoring any job failures in production which is not great when you are trying to ensure reliability. On Sidekiq we won't have to worry about allowing jobs to fail bc Sidekiq is built to handle a lot of jobs very quickly thanks to the help of Redis.

Sidekiq also has A LOT of very nice features and plugins that we can use as our job/worker flow becomes more complex. If you want to know more about why Sidekiq is a great background job tool checkout this blog post I wrote about moving to it from Resque.

Why not hook Sidekiq directly into ActiveJob?
Yes, Sidekiq can hook into ActiveJob, however, there are some downsides. Notably, it is 2-20x slower. Another big downside is that ActiveJob is not compatible with Sidekiq's commercial features which we may want to make use of in the future. Using ActiveJob to send emails is about the only thing I plan to tie Sidekiq into bc those jobs are simple. Any other job it would be advantageous if we could have full control over it by completely moving it to Sidekiq and keeping the interface as simple as possible.

Strategy For Moving Jobs

Same as we did for Redis keys, we are going to be moving jobs over one by one. When a job moves from DJ(DelayedJob) to Sidekiq we are going to be renaming it by ending its name with Worker
For example: BustMultipleCachesJob => BustMultipleCachesWorker
These new workers will be stored in a separate folder away form the jobs which will help us quickly evaluate what still needs to be done.

How You Can Help!

There are a lot of jobs that need to be moved which is why I would LOVE help from the community doing this! Grab a job, move it to Sidekiq and then open a PR with this issue tagged.

When Moving a Job

When you are moving a job there are 2 ways you can do it
1. Create a new Worker class and delete the job all in the same PR: This is 100% fine for those jobs that do not run a lot. For example, if we run a job from the scheduler once a day, that is one we could roll over in a single PR.
2. Create a new Worker, open a PR. Once that PR is merged, open a second PR to remove the old Job: This is what we need to do for jobs that are executed a lot. Think indexing or cache busting jobs. By merging the PRs separately we can ensure we don't strand any jobs that might be inflight when the new worker is created.

What queue should I put it in?
Look at how often the job is executed and how important that job is. For example, jobs kicked off from the UI that are updating data we want do very quickly so the user gets the data they want. A scheduled job handing out badges, that is not as important and can be delayed a little bit so it can go on a low_priority queue. The Sidekiq queues we have are

:queues:
  - ["default", 1]
  - ["low_priority", 10]
  - ["medium_priority", 100]
  - ["high_priority", 1000]
  - ["scheduler", 1000]
  - ["mailers", 1000]

The numbers on the right indicate the queue importance. The higher the number the more important the queue is and the fast those jobs will be executed. The numbers are actually a ratio. In the event that every queue is full we will execute 1000 high_priority jobs for every 100 medium_priority, every 10 low_priority etc.

When you are moving jobs PLEASE feel free to take a look at the job itself and improve on either performance or how it is being executed. For example:

  • Add TESTS for the new worker even if the old job didn't have them. Also add this new shared_context example spec for ensuring the correct queue is set for each job. Other test helpers have also been added that you can use thanks to @rhymes
  • Add early return statements when data is missing to avoid unnecessary work
  • Is the job silently failing? There may be times we don't want jobs to silently fail anymore
  • Are the arguments objects? We DO NOT want to be passing any object arguments, simple strings is our goal so take this time to change those to strings, ids, whatever.

Example PRs

Single PR Job Move: https://github.com/thepracticaldev/dev.to/pull/5269
Multiple PR Job Move: PR 1 - https://github.com/thepracticaldev/dev.to/pull/5302

Jobs TODO

Let me know what you want to work on!!!

Articles:

  • [x] bust_multiple_caches_job
  • [x] detect_human_language_job.rb
  • [x] score_calc_job.rb
  • [x] update_analytics_job.rb
  • [x] update_main_image_background_hex_job.rb

    Audit

  • [x] save_to_persistent_storage_job.rb

    Badge Achievements

  • [x] send_email_notification_job.rb

    ChatChannels

  • [x] index_job.rb

    Classified

  • [x] bust_cache_job.rb

    Comments

  • [x] bust_cache_job.rb

  • [x] calculate_score_job.rb
  • [x] create_first_reaction_job.rb
  • [x] create_id_code_job.rb
  • [x] send_email_notification_job.rb
  • [x] touch_user_job.rb

    Events

  • [x] bust_cache_job.rb

    Follows

  • [x] create_chat_channel_job.rb

  • [x] send_email_notification_job.rb
  • [x] touch_follower_job.rb

    Mentions

  • [x] create_all_job.rb

  • [x] send_email_notifications_job.rb

    Notifications

  • [x] mention_job.rb

  • [x] new_badge_achievement_job.rb
  • [x] new_reaction_job.rb
  • [x] remove_all_job.rb
  • [x] welcome_notification_job.rb
  • [x] milestone_job.rb
  • [x] new_comment_job.rb
  • [x] notifiable_action_job.rb
  • [x] tag_adjustment_notification_job.rb
  • [x] moderation_notification_job.rb
  • [x] new_follower_job.rb
  • [x] remove_all_by_action_job.rb
  • [x] update_job.rb

    Orgs

  • [x] bust_cache_job

    Pages

  • [x] bust_cache_job

    Podcast Eps

  • [x] bust_cache_job

  • [x] create_job
  • [x] update_reactable_job

    Podcasts

  • [x] bust_cache_job

  • [x] get_episodes_job

    Pro Memberships

  • [x] popular_history_job

    Reactions

  • [x] bust_homepage_cache_job.rb

  • [x] bust_reactable_cache_job.rb
  • [x] create_job.rb
  • [x] update_reactable_job.rb

    Search

  • [x] index_job

  • [x] remove_from_index_job

    Streams

  • [x] twitch_webhook..._job

    Tags

  • [x] bust_cache_job

    Users

  • [x] bust_cache_job.rb

  • [x] follow_job.rb
  • [x] self_delete_job.rb
  • [x] touch_job.rb
  • [x] estimate_default_language_job.rb
  • [x] resave_articles_job.rb
  • [x] subscribe_to_mailchimp_newsletter_job.rb

    Webhook

  • [x] destroy_job

  • [x] dispatch_event_job

    Other

  • [x] html_variant_trial_create_job.rb

  • [x] html_variant_success_create_job.rb
  • [x] export_content_job.rb
  • [x] slack_bot_ping_job.rb
  • [x] rss_reader_fetch_user_job.rb
Epic sre good first issue ready for dev

Most helpful comment

And with the merging of #6071, this is entirely complete - delayed_job is gone. A big shout out and thank you to everyone that helped out! It has been a pleasure to see the community contributing so strongly ๐Ÿ’ฏ.

All 42 comments

Great epic :))

Hey @mstruve ๐Ÿ‘‹

Great epic description. I migrated the first one in the list (#5309)

just one question. Is there any reason why not using ActiveJob with Sidekiq?

@mstruve

I'll take up :

Articles

  • [ ] detect_human_language_job.rb
  • [ ] score_calc_job.rb

This looks interesting, i was looking to start my year off with contributing to Open Source projects and this might be a good opportunity for me. However i am pretty new to all this and if i am stuck somewhere i hope you guys could help.

@duduribeiro

just one question. Is there any reason why not using ActiveJob with Sidekiq?

Benchmarks show that ActiveJob is 2-20x times slower pushing jobs to Redis and has roughly 3x the processing overhead (with Rails 5.1.4 and Sidekiq 5.1.1). See issue https://github.com/mperham/sidekiq/issues/3782 for discussion.

from https://github.com/mperham/sidekiq/wiki/Active-Job#performance

It seems this has to do with the whole serialization logic in ActiveJob that is more complex than in Sidekiq

I'd like to contribute, however, the local setup gives me quite some troubles. Should I open a new issue for each individually, all at once, or none (could be just my machine)? All the problems:

  • cannot load such file -- dry-equalizer => resolve via: gem install dry-equalizer (why is it not in the Gemfile?)
  • cannot load such file -- dry-configurable => resolve via: gem install dry-configurable (why is it not in the Gemfile?)
  • uninitialized constant ReverseMarkdown::Converters::Base (NameError) => not sure, why it does not pick up the reverse markdown gem's base file -> I wrapped the manual loading inside a defined?(ReverseMarkdown::Converters::Base) guard clause for now
  • uninitialized constant Article::Storext (NameError) => I had this issue after upgrading storext from 2.x to 3.x in a previous project for which we did not find a solution and thus had to lock that gem to 2.x (gem 'storext', '~> 2.2', '< 3.0'), which is preventing you to upgrade to rails 6, thus that is not a viable solution. Do you have an idea?

My machine runs macOS 10.14, I went through the setup guide, use rbenv, etc. I have many ruby/rails projects running on Mac machine, so Postgres/Ruby/etc. is all installed properly and running. Thanks for any advice ๐Ÿ™‚

=> after those mentioned "fixes", bin/setup finally succeeds and the app is running locally. However, spec/services/rss_reader_spec.rb is failing. See my changes here: https://github.com/swiknaba/dev.to/pull/1

Hi @swiknaba, please open a new issue for setup help! Thank you!

started with

Follows

send_email_notification_job.rb

Have a Pr ready for Notifications::Mention Job refactor ๐ŸŽ‰ https://github.com/thepracticaldev/dev.to/pull/5312

PR for Page::BustCacheJob refactor at #5328 #5338

With PR #5326 we added some helpers that can be of help in moving tests over to Sidekiq. They closely mimick those builtin in Rails for ActiveJob so you might be familiar with them:

They are:

There's also sidekiq_enqueued_jobs which either returns all jobs or all the jobs, I'll update it shortly to support returning jobs by worker and not just by queue.

Let me know if you have any questions or need other helpers to be "ported" from https://api.rubyonrails.org/classes/ActiveJob/TestHelper.html

Hi there! We migrated Articles::UpdateMainImageBackgroundHexJob in this PR: https://github.com/thepracticaldev/dev.to/pull/5345. We wait for your review! :)

Hi @mstruve,
Great epic! It's time to help the community so I'd like to take :

    index_job.rb
Classified
  bust_cache_job.rb

@wildeng You got it! Thank you!!!

Hi, I can pick up:

calculate_score_job.rb

I have the app up and running (just need to figure out how to run rspec using the docker set up) so will hopefully get a PR out soon.

@VegaFromLyra you got it! Thank you!

@duduribeiro

just one question. Is there any reason why not using ActiveJob with Sidekiq?

Benchmarks show that ActiveJob is 2-20x times slower pushing jobs to Redis and has roughly 3x the processing overhead (with Rails 5.1.4 and Sidekiq 5.1.1). See issue mperham/sidekiq#3782 for discussion.

from https://github.com/mperham/sidekiq/wiki/Active-Job#performance

It seems this has to do with the whole serialization logic in ActiveJob that is more complex than in Sidekiq

awesome @rhymes. I didn't know about that benchmark.. Thanks for the explanation!

@mstruve I can pick up the remaining two jobs related to Comments next:

  bust_cache_job.rb
  send_email_notification_job.rb

@mstruve once the ones I have are done I can pick up others, you can even randomly assign them to me :smile:

@wildeng you have been such a huge help I think I will let you pick out whichever jobs your heart desires! ๐Ÿค—

Hello @mstruve ,

I would like to take care of:

  • html_variant_trial_create_job.rb
  • html_variant_success_create_job.rb

@wildeng you have been such a huge help I think I will let you pick out whichever jobs your heart desires! hugs

Thanks a lot :) I could then pickup what remains in Follows and all the Mentions ones

@wildeng and @cyrillefr you got it!

๐Ÿ‘‹ Was there any consideration about AWS SQS vs Sidekiq?

I'm not saying that because of Shoryuken because it could be with Lambdas (fully Serverless), using SQS as an Event Source for Lambdas.

I understand that the Redis performance is unbeatable, but I'm wondering if Serverless, "unlimited" scalability, and all that stuff could compensate that.

Please don't forget to declare the files you're working on BEFORE starting to work on the PR otherwise people might end up working on the same set of jobs.

Thank you!

@phstc not really at the moment, as Sidekiq and Redis has good performance in themselves.

The needs for infinite scalability are not there yet, but as the website grows and our job queues grow with it we might consider scaling and distributing the queue even more in the future. Shoryuken is an interesting tool!

๐Ÿ‘‹ I will tackle some more of the notification jobs :)
welcome_notification (Pr is here: https://github.com/thepracticaldev/dev.to/pull/5465)
milesstone_jobs

Hi @rhymes

It is also because of the Serverless nature as well. You don't need to maintain SQS. But conversion from Sidekiq to Shoryuken is pretty much just changing an include - in case this becomes an option in the future.

Another option is SQS as event Source for Lambdas, so fully Serverless.

I will be happy to give guidance if you guys are interested in these options in the future. DEV is awesome ๐Ÿ‘

@mstruve @rhymes I can pick up Podcast Episodes next but in code I see the following:

create_job.rb
update_media_url_job.rb

which is not in sync with the list above? bust_cache_job for this has already been migrated but I can pick up the above two

Thanks for the catch @VegaFromLyra!

I could take next in line: Notifications -> new_comment_job.rb

@rhymes @mstruve Can I take those?

Reactions

  • create_job.rb
  • update_reactable_job.rb

@rhymes @mstruve Requesting those:

Reactions

  • bust_homepage_cache_job.rb
  • bust_reactable_cache_job.rb

I can take the next two of the Notifications ones

  • notifiable_action_job.rb (https://github.com/thepracticaldev/dev.to/pull/5588)
  • tag_adjustment_notification_job.rb

I'll take slack_bot_ping_job.rb

I'll take rss_reader_fetch_user_job.rb ๐Ÿ™‹โ€โ™‚๏ธ

Submit a PR - https://github.com/thepracticaldev/dev.to/pull/5673 - to improve tags caching/purging, so you might want to hold on working on Tags::BustCacheJob until that's merged

Requesting those:

Notification

  • moderation_notification_job.rb
  • remove_all_by_action_job.rb
  • update_job.rb

I'll take export_content_job.rb ๐Ÿ™‹โ€โ™‚๏ธ

I'll take dispatch_event_job.rb ๐Ÿ™‹โ€โ™‚๏ธ

I can take these two next:

 - PopulateHistoryJob
Users
 - FollowJob (this was marked as done above but looks like it's up for grabs?)

@VegaFromLyra the Users::FollowJob has a new worker, it just hasnt been removed yet. Someone likely took it without letting us know but we marked it off after their first PR.

And with the merging of #6071, this is entirely complete - delayed_job is gone. A big shout out and thank you to everyone that helped out! It has been a pleasure to see the community contributing so strongly ๐Ÿ’ฏ.

Was this page helpful?
0 / 5 - 0 ratings