Docker-gitlab: Unicorn does not come up (error 502) after hard restart of Docker server

Created on 27 Jul 2017 · 27 Comments · Source: sameersbn/docker-gitlab

Steps to reproduce

Run GitLab using the guide
Power cycle the server running Docker

Actual result

GitLab will never come up fully, showing error 502.

The docker container logs will have this:

2017-07-26 23:20:38,558 INFO spawned: 'unicorn' with pid 612
2017-07-26 23:20:39,160 INFO exited: unicorn (exit status 1; not expected)
...
2017-07-26 23:20:46,864 INFO spawned: 'unicorn' with pid 647
2017-07-26 23:20:47,312 INFO exited: unicorn (exit status 1; not expected)
2017-07-26 23:20:48,313 INFO gave up: unicorn entered FATAL state, too many start retries too quickly

unicorn_stderr.log will have this:

...
/home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/lib/unicorn/http_server.rb:195:in `pid=': Already running on PID:601 (or pid=/home/git/gitlab/tmp/pids/unicorn.pid is stale) (ArgumentError)
        from /home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/lib/unicorn/http_server.rb:127:in `start'
        from /home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/bin/unicorn_rails:209:in `<top (required)>'
        from /home/git/gitlab/vendor/bundle/ruby/2.3.0/bin/unicorn_rails:22:in `load'
        from /home/git/gitlab/vendor/bundle/ruby/2.3.0/bin/unicorn_rails:22:in `<main>'

Workaround

The only way to bring GitLab up is to docker exec into the container, manually delete the stale pid file, and restart the container:

docker exec -it gitlab rm /home/git/gitlab/tmp/pids/unicorn.pid && docker restart gitlab

Expected result

GitLab comes up without manual intervention.

Source

IlyaSemenov

👍33 ❤2

All 27 comments

Any progress here? I'm trying to run the latest GitLab in a Docker swarm and getting stuck on this. Is the pidfile still located there in the latest version? What version of GitLab were you trying, @IlyaSemenov?

asbjornenge on 3 Oct 2017

Hello,
I have exactly the same issue.
Sometimes GitLab doesn't start successfully (after a server reboot).
I'm interested in a resolution of this issue.

lucpolak on 11 Oct 2017

@lucpolak I finally got it working just using a more beefy server. I was trying to run on a g1-small on GCP, but upgrading to a n-standard-2 did the trick 👍

asbjornenge on 11 Oct 2017

Hey @asbjornenge, my server is pretty good. The VM is hosted on ESXi with a 4-core Intel CPU and 32 GB RAM.
It is provided by OVH.
The VM runs Ubuntu with Docker installed and 4 GB of RAM allocated.

I have another VM with the same config with gitlab-ce installed without Docker, and it all works fine ;-(

lucpolak on 12 Oct 2017

We had the same issue on a DigitalOcean 4-core VPS with 8GB RAM (~30 regular users and a lot of CI pipelines).

What helped was reducing the number of unicorn workers from 8 to 6 (using the UNICORN_WORKERS variable).

arthurkrupa on 21 Dec 2017

Exact same issue on a Synology NAS. Reducing the workers did not solve the issue.
It should be mentioned that this setup used to work fine. The issues started about 1-2 months ago, maybe with 10.2.x or 10.3.x.

Mario-Eis on 5 Mar 2018

Is there a workaround like automatically removing the pid file at startup?
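
One possible shape for such a startup cleanup, as a minimal sketch and not something the image ships: the helper name remove_stale_pid_file is hypothetical, the default path is the one quoted in this thread, and a Linux /proc is assumed. It looks at the command line behind the recorded PID instead of only testing that some process has that number, on the assumption (not confirmed in this thread) that an unrelated process may have been given the same PID after the reboot.

```shell
#!/bin/sh
# Hypothetical helper, not part of the sameersbn/docker-gitlab image.
# Removes a pid file unless the recorded PID belongs to a process whose
# command line contains the expected name (Linux /proc is assumed).
remove_stale_pid_file() {
    pid_file="$1"
    expected_name="$2"

    # No pid file: nothing to clean up.
    [ -f "$pid_file" ] || return 0

    pid="$(cat "$pid_file")"

    # A bare "is this PID alive?" test is not enough after a reboot,
    # because an unrelated process may now own the same PID.
    if [ -n "$pid" ] && [ -r "/proc/$pid/cmdline" ] &&
        tr '\0' ' ' < "/proc/$pid/cmdline" | grep -q -- "$expected_name"; then
        echo "PID $pid still looks like $expected_name; keeping $pid_file"
        return 1
    fi

    echo "Removing stale pid file $pid_file"
    rm -f -- "$pid_file"
}

# Path taken from this thread; override PID_FILE for other images.
remove_stale_pid_file "${PID_FILE:-/home/git/gitlab/tmp/pids/unicorn.pid}" unicorn
```

To be useful it would have to run inside the container before supervisord spawns unicorn, for example from a custom entrypoint.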

Mario-Eis on 16 Apr 2018

I ran into the same problem.

INFO exited: unicorn (exit status 1; not expected)
2018-04-22 06:08:53,643 INFO spawned: 'unicorn' with pid 587
2018-04-22 06:08:54,647 INFO success: unicorn entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

HengCC on 22 Apr 2018

sameersbn/gitlab:10.6.4

I am seeing the same behavior at the moment and can hardly work out how to resolve it. I am in the process of deploying GitLab in our on-prem Kubernetes cluster. Some googling shows that some people have had success increasing the memory of the running instance. I raised the pod spec to allow up to 4G RAM, but that has also been futile. Here is what I see in the log before Kubernetes restarts the container in an effort to repair it. In essence, it hangs here:

2018-04-22 09:01:04,820 CRIT Supervisor running as root (no user in config file)
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/cron.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/gitaly.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/gitlab-workhorse.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/mail_room.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/nginx.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/sidekiq.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/sshd.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/unicorn.conf" during parsing
2018-04-22 09:01:04,824 INFO RPC interface 'supervisor' initialized
2018-04-22 09:01:04,825 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2018-04-22 09:01:04,825 INFO supervisord started with pid 1
2018-04-22 09:01:05,827 INFO spawned: 'gitaly' with pid 592
2018-04-22 09:01:05,829 INFO spawned: 'sidekiq' with pid 593
2018-04-22 09:01:05,831 INFO spawned: 'unicorn' with pid 594
2018-04-22 09:01:05,833 INFO spawned: 'gitlab-workhorse' with pid 595
2018-04-22 09:01:05,835 INFO spawned: 'cron' with pid 600
2018-04-22 09:01:05,853 INFO spawned: 'nginx' with pid 601
2018-04-22 09:01:05,855 INFO spawned: 'sshd' with pid 603
2018-04-22 09:01:07,564 INFO success: gitaly entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: sidekiq entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: unicorn entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: gitlab-workhorse entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: cron entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: sshd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:06:03,655 WARN received SIGTERM indicating exit request
2018-04-22 09:06:03,656 INFO waiting for sshd, gitlab-workhorse, sidekiq, cron, nginx, gitaly, unicorn to die
2018-04-22 09:06:03,657 INFO stopped: sshd (exit status 0)
2018-04-22 09:06:03,662 INFO stopped: nginx (exit status 0)
2018-04-22 09:06:03,663 INFO stopped: cron (terminated by SIGTERM)
2018-04-22 09:06:03,665 INFO stopped: gitlab-workhorse (terminated by SIGTERM)
2018-04-22 09:06:05,094 INFO stopped: unicorn (exit status 0)
2018-04-22 09:06:07,097 INFO waiting for sidekiq, gitaly to die
2018-04-22 09:06:07,669 INFO stopped: sidekiq (exit status 0)
2018-04-22 09:06:07,676 INFO stopped: gitaly (exit status 1)

See these two lines where it dies and look at how long it took for it to stop:

2018-04-22 09:01:07,564 INFO success: sshd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:06:03,655 WARN received SIGTERM indicating exit request

It just sits at this point until the container is restarted by Kubernetes. I have also increased initialDelaySeconds to 300, a relatively high number, to see if that resolves it, but no luck.

bsakweson on 22 Apr 2018

docker exec -it gitlab rm /home/git/gitlab/tmp/pids/unicorn.pid && docker restart gitlab

This solved my issue, thanks :-)

LM1LC3N7 on 21 May 2018

👍3

docker exec -it gitlab rm /home/git/gitlab/tmp/pids/unicorn.pid && docker restart gitlab

This solved my issue, thanks :-)

-----me too, thanks!!!

fover0932 on 2 Jun 2018

👍4 😕1

On Sat, 2 Jun 2018 at 11:09 AM, fover0932 notifications@github.com wrote:

docker exec -it gitlab rm /home/git/gitlab/tmp/pids/unicorn.pid && docker restart gitlab

This solved my issue, thanks :-)

-----me too, thanks!!!

compurator on 2 Jun 2018

😕1

I'm seeing this issue just now on my GitLab installation on a Synology NAS.

I installed GitLab via Package Center, i.e., I'm using the package provided by Synology, which is based on an old version (sameersbn/gitlab:9.4.4).

Fixed the issue by removing the stale PID file. Thanks!

herrmanthegerman on 5 Jul 2018

Any actual solution rather than a mitigation? Is it just that my docker container doesn't have enough memory assigned or is there something misconfigured?

sharkymcdongles on 6 Dec 2018

In my case, the problem was that I was using incorrectly signed SSL certificates, and in particular the wrong dhparam.pem file.
Nginx didn't recognize them and failed.
The bad part is that it didn't show up anywhere in the logs.
@bsakweson @sharkymcdongles did you try with self-signed certificates as a short test?

StefanCristian on 14 Jan 2019

Our file system filled up and we had to restart GitLab. After the restart we also got an error 502.

I did:

# gitlab-ctl status
run: alertmanager: (pid 551) 1449s; run: log: (pid 545) 1449s
run: gitaly: (pid 593) 1449s; run: log: (pid 589) 1449s
run: gitlab-monitor: (pid 597) 1449s; run: log: (pid 592) 1449s
run: gitlab-pages: (pid 558) 1449s; run: log: (pid 556) 1449s
run: gitlab-workhorse: (pid 553) 1449s; run: log: (pid 548) 1449s
run: logrotate: (pid 596) 1449s; run: log: (pid 591) 1449s
run: nginx: (pid 579) 1449s; run: log: (pid 578) 1449s
run: node-exporter: (pid 552) 1449s; run: log: (pid 547) 1449s
run: postgres-exporter: (pid 563) 1449s; run: log: (pid 560) 1449s
run: postgresql: (pid 561) 1449s; run: log: (pid 557) 1449s
run: prometheus: (pid 594) 1449s; run: log: (pid 590) 1449s
run: redis: (pid 549) 1449s; run: log: (pid 543) 1449s
run: redis-exporter: (pid 550) 1449s; run: log: (pid 544) 1449s
run: registry: (pid 542) 1449s; run: log: (pid 540) 1449s
run: sidekiq: (pid 541) 1449s; run: log: (pid 539) 1449s
run: sshd: (pid 20) 1480s; run: log: (pid 19) 1480s
run: unicorn: (pid 33646) 1s; run: log: (pid 559) 1449s

All services were up except unicorn which kept on restarting.

I've checked the log files of unicorn and it stated:

ArgumentError: Already running on PID:777 (or pid=/opt/gitlab/var/unicorn/unicorn.pid is stale)

So, as already mentioned above, a simple rm /opt/gitlab/var/unicorn/unicorn.pid was enough. Because GitLab (omnibus installation) kept restarting unicorn, I did not have to restart anything. After a second, unicorn was up and running and GitLab was healthy again! :-)

jcberthon on 15 Jan 2019

Removing the PID and restarting also solved my issue.
Caused by reboot of my Synology NAS.

@solidnerd @sameersbn
Can we fix this permanently by adding a cleanup step to the entrypoint?

Example:

#!/bin/bash

#Define cleanup procedure
cleanup() {
    echo "Container stopped, performing cleanup..."
}

#Trap SIGTERM
trap 'cleanup' SIGTERM

#Execute a command
"${@}" &

#Wait
wait $!

#Cleanup
cleanup
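
A variant of the example above that actually removes the pid file could look like the following. This is only a sketch under assumptions: the function name run_with_pid_cleanup and the PID_FILE variable are hypothetical, the default path is the one reported in this thread, and the image's real entrypoint may be structured differently. Because a SIGTERM trap never runs when the power is cut, the sketch also cleans up before starting the wrapped command.

```shell
#!/bin/bash
# Hypothetical wrapper, not the image's actual entrypoint.
# PID_FILE defaults to the path reported in this thread.
PID_FILE="${PID_FILE:-/home/git/gitlab/tmp/pids/unicorn.pid}"

run_with_pid_cleanup() {
    pid_file="$1"
    shift

    # A fresh container start means no old unicorn can be alive,
    # so any pid file found here is left over from the previous run.
    rm -f -- "$pid_file"

    # Forward termination requests so the command can shut down cleanly.
    trap 'kill -TERM "$child" 2>/dev/null' TERM INT

    # Start the wrapped command in the background and remember its PID.
    "$@" &
    child=$!

    # wait returns early when a trapped signal arrives, so keep waiting
    # until the child has really exited.
    wait "$child"
    rc=$?
    while kill -0 "$child" 2>/dev/null; do
        wait "$child"
        rc=$?
    done
    trap - TERM INT

    # Clean up again on the way out (only reached on a graceful stop).
    rm -f -- "$pid_file"
    return "$rc"
}

# Only act when a command was passed, e.g. as the container's CMD.
if [ "$#" -gt 0 ]; then
    run_with_pid_cleanup "$PID_FILE" "$@"
    exit "$?"
fi
```

The exit status of the wrapped command is passed through, so a supervisor or orchestrator still sees the real result.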

GJRTimmer on 12 Feb 2019

👍3

Could it be because docker kills the gitlab container before unicorn had enough time to shutdown?
Maybe we could try setting the --stop-timeout Docker setting to a higher value.
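
To try this idea on an existing container, a small helper like the one below might do. The function name stop_gracefully and the 120-second default are assumptions chosen for illustration; docker stop -t is the standard option for the grace period before the container is killed (10 seconds by default).

```shell
#!/bin/sh
# Hypothetical helper: stop a container with a longer grace period.
# docker stop waits this many seconds before killing the container.
stop_gracefully() {
    container="$1"
    timeout="${2:-120}"
    docker stop -t "$timeout" "$container"
}
```

Calling stop_gracefully gitlab would give the container two minutes to exit. Passing --stop-timeout to docker run sets the same grace period when the container is created. Neither would help with a power cut, where no signal is delivered at all.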

JMLX42 on 21 Jun 2019

Same here, this command worked for me, but the PID path is different in my case: /opt/gitlab/var/unicorn/unicorn.pid

docker exec -it gitlab rm /opt/gitlab/var/unicorn/unicorn.pid && docker restart gitlab

I've put that in my cron file and it works!
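
For readers wondering what such a cron job could look like, here is a rough sketch. The function name fix_gitlab_pid is hypothetical, and the container name and pid path are the ones from the command above. Note that the -it flags are dropped, because cron provides no terminal and docker exec -t fails without one.

```shell
#!/bin/sh
# Hypothetical helper meant to be called from a script that cron runs.
fix_gitlab_pid() {
    container="${1:-gitlab}"
    pid_file="${2:-/opt/gitlab/var/unicorn/unicorn.pid}"

    # No -it here: cron has no TTY attached.
    # rm without -f fails when there is no pid file, so the restart
    # only happens when a file was actually removed.
    docker exec "$container" rm "$pid_file" && docker restart "$container"
}
```

A script that defines this function and then calls fix_gitlab_pid could be scheduled with an @reboot crontab entry on cron implementations that support it. Whether the cron available on a given NAS does is not something this thread establishes.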

mgscreativa on 8 Jan 2020

The 502 problem happens when I stop and then start GitLab from the Synology package manager UI, which I don't consider to be a "hard restart". As such, it's a problem every Docker GitLab deployment on Synology is going to hit very quickly.

Users will have to be lucky enough to find this thread and learn to run a docker command (the rm and restart mentioned above, which works for me) to fix the problem. And running a docker command on Synology is not straightforward via the UI. You have to do the following:

[Screenshot: gitlab synology restart fix]

This command has to be issued every time the NAS restarts, unless you use the cron job fix mentioned above, which I'm not sure how to set up on a Synology. @mgscreativa can you please elaborate?

I think this issue is pretty serious and needs a proper fix.

abulka on 29 Jan 2020

Hi @abulka sorry I don't have Synology hardware!

mgscreativa on 29 Jan 2020

Same here on an EC2 instance booting from a RancherOS AMI. So this is not specific to Synology. This occurred after sudo reboot. The workaround of running docker exec -it gitlab rm /opt/gitlab/var/unicorn/unicorn.pid && docker restart gitlab worked.

12.4.5 (539f5fc0384)

hannes-ucsc on 31 Jan 2020

This issue has been automatically marked as stale because it has not had any activity for the last 60 days. It will be closed if no further activity occurs during the next 7 days. Thank you for your contributions.