I'm having an issue where a cap deploy (and the deploy:restart task it executes) is causing Sidekiq to restart in an old release. In fact, when I manually issue a cap sidekiq:stop and cap sidekiq:start, Sidekiq ends up starting in the _oldest_ release rather than the current one. Here's what's happening:
I end up seeing a background job failure reported by New Relic, with a stacktrace pointing to:
/home/goodbrews/api/releases/20130530024501/app/workers/webhook_worker.rb
Note that release number. However, after SSHing into the server, I can see that the current release is newer:
[email protected]:~/api$ ls -lah
total 16K
drwxrwxr-x 4 goodbrews goodbrews 4.0K May 31 16:49 .
drwxr-xr-x 8 goodbrews goodbrews 4.0K May 29 17:28 ..
lrwxrwxrwx 1 goodbrews goodbrews 43 May 31 16:49 current -> /home/goodbrews/api/releases/20130531164922
drwxrwxr-x 7 goodbrews goodbrews 4.0K May 31 16:49 releases
drwxrwxr-x 9 goodbrews goodbrews 4.0K May 25 02:58 shared
[email protected]:~/api$ ls -lah releases
total 28K
drwxrwxr-x 7 goodbrews goodbrews 4.0K May 31 16:49 .
drwxrwxr-x 4 goodbrews goodbrews 4.0K May 31 16:49 ..
drwxrwxr-x 11 goodbrews goodbrews 4.0K May 30 02:45 20130530024501
drwxrwxr-x 11 goodbrews goodbrews 4.0K May 31 14:51 20130531145135
drwxrwxr-x 11 goodbrews goodbrews 4.0K May 31 15:07 20130531150733
drwxrwxr-x 11 goodbrews goodbrews 4.0K May 31 15:30 20130531153013
drwxrwxr-x 11 goodbrews goodbrews 4.0K May 31 16:49 20130531164922
As you can see, Sidekiq jobs are apparently running against old code. This is after I deployed a fix for my worker in the hope of resolving the error, yet Sidekiq restarted in what is actually the _oldest_ retained release. Issuing a cap sidekiq:stop and cap sidekiq:start manually made no difference; errors are still streaming in, pointing to the oldest release.
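One way to confirm that a stale process is the culprit: on Linux, /proc/&lt;pid&gt;/cwd is a symlink to the directory a process was started in, which for a Capistrano-booted Sidekiq is the release it loaded its code from. A minimal sketch (the helper name is hypothetical, and this relies on /proc, so it is Linux-only):

```ruby
# Hypothetical diagnostic: does a running process's working directory match
# the release that the `current` symlink points at? Linux-only (uses /proc).
def stale_release?(deploy_to, pid)
  current_release = File.readlink(File.join(deploy_to, 'current'))
  booted_release  = File.readlink("/proc/#{pid}/cwd")
  booted_release != current_release
end

# e.g. stale_release?('/home/goodbrews/api', sidekiq_pid)
```

If this returns true for the PID of the Sidekiq process handling jobs, that process is an orphan from an earlier release.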
Here's my deploy.rb file for reference:
require 'bundler/capistrano'
require 'capistrano-rbenv'
require 'sidekiq/capistrano'
require 'puma/capistrano'
require 'new_relic/recipes'
set :application, 'api'
server '192.81.210.163', :web, :app, :db, :primary => true
set :repository, '[email protected]:goodbrews/api.git'
set :branch, 'master'
set :scm, :git
set :user, 'goodbrews'
set :use_sudo, false
set :deploy_to, "/home/goodbrews/#{application}"
set :deploy_via, :remote_cache
set :rbenv_ruby_version, '2.0.0-p195'
ssh_options[:forward_agent] = true
default_run_options[:pty] = true
ssh_options[:keys] = [File.join(ENV['HOME'], '.ssh', 'goodbrews')]
namespace :deploy do
  task :setup_config, roles: :app do
    run "mkdir -p #{shared_path}/config/initializers"
    run "mkdir -p #{shared_path}/sockets"
    sudo "ln -nfs #{current_path}/config/nginx.conf /etc/nginx/sites-enabled/#{application}"
    put File.read("config/auth.yml"), "#{shared_path}/config/auth.yml"
    put File.read("config/database.yml"), "#{shared_path}/config/database.yml"
    put File.read("config/newrelic.yml"), "#{shared_path}/config/newrelic.yml"
    put File.read("config/redis.yml"), "#{shared_path}/config/redis.yml"
    puts "Now edit the config files in #{shared_path}."
  end
  after "deploy:setup", "deploy:setup_config"

  task :symlink_config, roles: :app do
    run "ln -nfs #{shared_path}/config/auth.yml #{release_path}/config/auth.yml"
    run "ln -nfs #{shared_path}/config/database.yml #{release_path}/config/database.yml"
    run "ln -nfs #{shared_path}/config/newrelic.yml #{release_path}/config/newrelic.yml"
    run "ln -nfs #{shared_path}/config/redis.yml #{release_path}/config/redis.yml"
  end
  after "deploy:finalize_update", "deploy:symlink_config"
end
after 'deploy:update_code', 'deploy:migrate'
after 'deploy:update', 'newrelic:notice_deployment'
after 'deploy', 'deploy:cleanup'
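For reference, the stock sidekiq/capistrano stop task works by signalling the PID recorded in a pidfile under shared_path. The sketch below is a simplified assumption about that mechanism, not the gem's actual source, but it shows the failure mode: if the pidfile is missing or records a PID that is no longer (or never was) the live worker, the signal goes nowhere and the orphaned worker keeps running old code.

```ruby
# Simplified sketch (an assumption about the recipe, not its actual code):
# stop the process named in a pidfile with SIGTERM for a graceful shutdown.
def stop_from_pidfile(pidfile)
  pid = Integer(File.read(pidfile).strip)
  Process.kill('TERM', pid)
  true
rescue Errno::ENOENT, Errno::ESRCH
  # No pidfile, or the recorded PID is not running: nothing was stopped,
  # so any orphaned worker keeps processing jobs with its old code.
  false
end
```

If stop silently fails like this, a subsequent sidekiq:start boots a second worker in the new release while the orphan lives on in an old one, which matches the symptoms above.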
I should perhaps also note that the jobs being run currently are retries. Is it possible that retries are, for some reason, run against the code they were initially queued from?
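For context on how retries relate to code versions: a Sidekiq job, including a retry, is stored in Redis as a JSON payload naming the worker class and its arguments; it carries no code. Whichever worker process dequeues it supplies the code, so a retry runs against whatever release that process booted with. A minimal illustration (it does not talk to Redis; the fields shown are a simplified subset of a real payload):

```ruby
require 'json'

# A retry payload names the worker class; the code itself comes from
# whichever worker process picks the job up.
payload = JSON.generate(
  'class'       => 'WebhookWorker',
  'args'        => [42],
  'retry_count' => 1
)

job = JSON.parse(payload)
worker_class = job['class']  # resolved to a constant inside the worker
```

So a retry pointing at an old release implies an old worker process picked it up, not that the payload "remembers" the code it was queued from.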
Is it possible you have an old sidekiq process still running and that's where the errors are coming from?
On 31 May 2013, at 11:00, David Celis [email protected] wrote:
I should perhaps also note that the jobs being run currently are retries. Is it possible that retries are, for some reason, run against the code they were initially queued from?
That's actually what I found out was going on. I was waiting for a retry to trigger to make sure it wouldn't error out again, and it seems to be fine now. Sorry!