Docker-gitlab: Unicorn does not come up (error 502) after hard restart of Docker server

Created on 27 Jul 2017 · 27 Comments · Source: sameersbn/docker-gitlab

Steps to reproduce

  1. Run GitLab using the guide
  2. Power cycle the server running Docker

Actual result

GitLab will never come up fully, showing error 502.

The docker container logs will have this:

2017-07-26 23:20:38,558 INFO spawned: 'unicorn' with pid 612
2017-07-26 23:20:39,160 INFO exited: unicorn (exit status 1; not expected)
...
2017-07-26 23:20:46,864 INFO spawned: 'unicorn' with pid 647
2017-07-26 23:20:47,312 INFO exited: unicorn (exit status 1; not expected)
2017-07-26 23:20:48,313 INFO gave up: unicorn entered FATAL state, too many start retries too quickly

unicorn_stderr.log will have this:

...
/home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/lib/unicorn/http_server.rb:195:in `pid=': Already running on PID:601 (or pid=/home/git/gitlab/tmp/pids/unicorn.pid is stale) (ArgumentError)
        from /home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/lib/unicorn/http_server.rb:127:in `start'
        from /home/git/gitlab/vendor/bundle/ruby/2.3.0/gems/unicorn-5.1.0/bin/unicorn_rails:209:in `<top (required)>'
        from /home/git/gitlab/vendor/bundle/ruby/2.3.0/bin/unicorn_rails:22:in `load'
        from /home/git/gitlab/vendor/bundle/ruby/2.3.0/bin/unicorn_rails:22:in `<main>'

Workaround

The only way to bring GitLab up is to docker exec into the container, manually delete the stale pid file, and restart the container:

docker exec -it gitlab rm /home/git/gitlab/tmp/pids/unicorn.pid && docker restart gitlab

Expected result

GitLab comes up without manual intervention.

All 27 comments

Any progress here? I'm trying to run gitlab latest in a docker swarm and getting stuck on this. Is the pidfile still located there for latest version? What version of gitlab were you trying @IlyaSemenov ?

Hello,
I have exactly the same issue.
Sometimes GitLab doesn't start successfully (after a server reboot).
I'm interested in a resolution of this issue.

@lucpolak I finally got it working just by using a beefier server. I was trying to run on a g1-small on GCP, but upgrading to an n-standard-2 did the trick 👍

Hey @asbjornenge, my server is pretty good. The VM is hosted on ESXi with a 4-core Intel CPU and 32 GB RAM.
It is provided by OVH.
The VM runs Ubuntu with Docker installed and has 4 GB RAM allocated.

I have another VM with the same config and gitlab-ce installed without Docker, and everything works fine there ;-(

We had the same issue on a DigitalOcean 4-core VPS with 8GB RAM (~30 regular users and a lot of CI pipelines).

What helped was reducing the number of unicorn workers from 8 to 6 (using the UNICORN_WORKERS variable).

Exact same issue on a Synology NAS. Reducing the workers did not solve the issue.
Maybe it should be mentioned that it used to work fine. The issues started about 1-2 months ago, maybe with 10.2.x or 10.3.x.

Is there a workaround like automatically removing the pid file at startup?
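
One possible answer, as a minimal sketch rather than the image's actual entrypoint: delete the pid file unconditionally at container startup, before the process supervisor launches anything. The function name `clear_unicorn_pid` is made up for illustration, and the default path is the one from the error message in this issue. The sketch assumes it runs before supervisord starts, so no unicorn process can legitimately exist yet.

```shell
#!/bin/bash
# Hypothetical startup hook, not part of the sameersbn/gitlab image.
# Assumption: this runs before supervisord starts, so no unicorn can be
# alive yet and any pid file found here is left over from the previous run.
clear_unicorn_pid() {
    local pid_file="${1:-/home/git/gitlab/tmp/pids/unicorn.pid}"
    if [ -e "$pid_file" ]; then
        echo "Removing leftover pid file: $pid_file"
        rm -f -- "$pid_file"
    fi
}
```

An entrypoint wrapper could call `clear_unicorn_pid` and then `exec "$@"`. The removal is deliberately unconditional: after a reboot, the PID recorded in the file may have been reused by an unrelated process inside the fresh container, so a liveness check such as `kill -0` could wrongly report unicorn as running. That would be consistent with the "Already running on PID:601" error above, though I have not verified it against the image.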

Ran into the same problem.
INFO exited: unicorn (exit status 1; not expected)
2018-04-22 06:08:53,643 INFO spawned: 'unicorn' with pid 587
2018-04-22 06:08:54,647 INFO success: unicorn entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

sameersbn/gitlab:10.6.4

I am seeing this same behavior at the moment and can hardly understand how to go about resolving it. I am in the process of deploying GitLab in our on-prem Kubernetes cluster. Some googling shows that some people have had success beefing up memory for the running instance. I beefed up the pod spec to allow up to 4 GB RAM, but that has also been futile. Here is what I am seeing in the log before Kubernetes restarts the container in an effort to repair it. In essence, it hangs here:

2018-04-22 09:01:04,820 CRIT Supervisor running as root (no user in config file)
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/cron.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/gitaly.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/gitlab-workhorse.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/mail_room.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/nginx.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/sidekiq.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/sshd.conf" during parsing
2018-04-22 09:01:04,820 WARN Included extra file "/etc/supervisor/conf.d/unicorn.conf" during parsing
2018-04-22 09:01:04,824 INFO RPC interface 'supervisor' initialized
2018-04-22 09:01:04,825 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2018-04-22 09:01:04,825 INFO supervisord started with pid 1
2018-04-22 09:01:05,827 INFO spawned: 'gitaly' with pid 592
2018-04-22 09:01:05,829 INFO spawned: 'sidekiq' with pid 593
2018-04-22 09:01:05,831 INFO spawned: 'unicorn' with pid 594
2018-04-22 09:01:05,833 INFO spawned: 'gitlab-workhorse' with pid 595
2018-04-22 09:01:05,835 INFO spawned: 'cron' with pid 600
2018-04-22 09:01:05,853 INFO spawned: 'nginx' with pid 601
2018-04-22 09:01:05,855 INFO spawned: 'sshd' with pid 603
2018-04-22 09:01:07,564 INFO success: gitaly entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: sidekiq entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: unicorn entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: gitlab-workhorse entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: cron entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:01:07,564 INFO success: sshd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:06:03,655 WARN received SIGTERM indicating exit request
2018-04-22 09:06:03,656 INFO waiting for sshd, gitlab-workhorse, sidekiq, cron, nginx, gitaly, unicorn to die
2018-04-22 09:06:03,657 INFO stopped: sshd (exit status 0)
2018-04-22 09:06:03,662 INFO stopped: nginx (exit status 0)
2018-04-22 09:06:03,663 INFO stopped: cron (terminated by SIGTERM)
2018-04-22 09:06:03,665 INFO stopped: gitlab-workhorse (terminated by SIGTERM)
2018-04-22 09:06:05,094 INFO stopped: unicorn (exit status 0)
2018-04-22 09:06:07,097 INFO waiting for sidekiq, gitaly to die
2018-04-22 09:06:07,669 INFO stopped: sidekiq (exit status 0)
2018-04-22 09:06:07,676 INFO stopped: gitaly (exit status 1)

See these two lines where it dies and look at how long it took for it to stop:

2018-04-22 09:01:07,564 INFO success: sshd entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-04-22 09:06:03,655 WARN received SIGTERM indicating exit request

It just sits at this point until the container is restarted by Kubernetes. I have also increased initialDelaySeconds to 300, a relatively high value, to see if that resolves it, but no luck.

docker exec -it gitlab rm /home/git/gitlab/tmp/pids/unicorn.pid && docker restart gitlab

This solved my issue, thanks :-)

docker exec -it gitlab rm /home/git/gitlab/tmp/pids/unicorn.pid && docker restart gitlab

This solved my issue, thanks :-)

-----me too, thanks!!!

I'm seeing this issue just now on my GitLab installation on a Synology NAS.

I installed GitLab via Package Center, i.e, I'm using the package provided by Synology which is based on an old version (sameersbn/gitlab:9.4.4).

Fixed the issue by removing the stale PID file. Thanks!

Any actual solution rather than a mitigation? Is it just that my docker container doesn't have enough memory assigned or is there something misconfigured?

In my case, the problem was that I was using wrongly signed SSL certificates, and especially the wrong dhparam.pem file.
Nginx didn't recognize them and failed because of that.
The bad side of the story is that it didn't show up in the logs anywhere.
@bsakweson @sharkymcdongles did you try with self-signed certificates for a short test?

We experienced a full file system and had to restart GitLab. After the restart we also got an error 502.

I did:

# gitlab-ctl status
run: alertmanager: (pid 551) 1449s; run: log: (pid 545) 1449s
run: gitaly: (pid 593) 1449s; run: log: (pid 589) 1449s
run: gitlab-monitor: (pid 597) 1449s; run: log: (pid 592) 1449s
run: gitlab-pages: (pid 558) 1449s; run: log: (pid 556) 1449s
run: gitlab-workhorse: (pid 553) 1449s; run: log: (pid 548) 1449s
run: logrotate: (pid 596) 1449s; run: log: (pid 591) 1449s
run: nginx: (pid 579) 1449s; run: log: (pid 578) 1449s
run: node-exporter: (pid 552) 1449s; run: log: (pid 547) 1449s
run: postgres-exporter: (pid 563) 1449s; run: log: (pid 560) 1449s
run: postgresql: (pid 561) 1449s; run: log: (pid 557) 1449s
run: prometheus: (pid 594) 1449s; run: log: (pid 590) 1449s
run: redis: (pid 549) 1449s; run: log: (pid 543) 1449s
run: redis-exporter: (pid 550) 1449s; run: log: (pid 544) 1449s
run: registry: (pid 542) 1449s; run: log: (pid 540) 1449s
run: sidekiq: (pid 541) 1449s; run: log: (pid 539) 1449s
run: sshd: (pid 20) 1480s; run: log: (pid 19) 1480s
run: unicorn: (pid 33646) 1s; run: log: (pid 559) 1449s

All services were up except unicorn which kept on restarting.

I've checked the log files of unicorn and it stated:

ArgumentError: Already running on PID:777 (or pid=/opt/gitlab/var/unicorn/unicorn.pid is stale)

So, as already mentioned above, a simple rm /opt/gitlab/var/unicorn/unicorn.pid was enough. Because GitLab (omnibus installation) kept restarting unicorn on its own, I did not have to restart anything. After a second, unicorn was up and running and GitLab was healthy again! :-)

Removing the PID and restarting also solved my issue.
Caused by reboot of my Synology NAS.

@solidnerd @sameersbn
Can we fix this permanently by adding a cleanup in the entrypoint ?

Example:

#!/bin/bash

#Define cleanup procedure
cleanup() {
    echo "Container stopped, performing cleanup..."
}

#Trap SIGTERM
trap 'cleanup' SIGTERM

#Execute a command
"${@}" &

#Wait
wait $!

#Cleanup
cleanup

Could it be because Docker kills the GitLab container before unicorn has had enough time to shut down?
Maybe we could try setting the --stop-timeout Docker setting to a higher value.
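
To make that suggestion concrete, here is a rough sketch of stopping the container with a longer grace period. The helper name `stop_gitlab`, the `DOCKER` override variable, and the 60-second default are assumptions for illustration; `docker stop -t` is the standard way to set the number of seconds Docker waits before it kills the container (10 by default).

```shell
#!/bin/bash
# Hypothetical helper; DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

stop_gitlab() {
    local container="${1:-gitlab}"   # container name used in this thread
    local grace="${2:-60}"           # seconds to wait before Docker kills it
    "$DOCKER" stop -t "$grace" "$container"
}
```

For a container created with `docker run`, passing `--stop-timeout 60` at creation time should have a similar effect. Neither option can help with a real power cycle, where no process gets a chance to shut down at all.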

Same here. This command worked for me, but the PID path is different in my case: /opt/gitlab/var/unicorn/unicorn.pid

docker exec -it gitlab rm /opt/gitlab/var/unicorn/unicorn.pid && docker restart gitlab

I've put that in my cron file and it works!
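
For anyone wondering what such a cron fix might look like, below is a minimal sketch; it is not necessarily the setup described in the comment above. The function name `fix_unicorn_pid`, the `DOCKER` override variable, and the script path in the crontab comment are made up for illustration. The container name and pid path are the ones quoted in this thread. The `-it` flags are dropped because cron jobs have no terminal attached.

```shell
#!/bin/bash
# Hypothetical helper; DOCKER can be overridden (e.g. DOCKER=echo) for a dry run.
DOCKER="${DOCKER:-docker}"

fix_unicorn_pid() {
    local container="${1:-gitlab}"
    local pid_file="${2:-/opt/gitlab/var/unicorn/unicorn.pid}"
    # rm -f succeeds even when the pid file is already gone,
    # so the restart still runs in that case.
    "$DOCKER" exec "$container" rm -f "$pid_file" && "$DOCKER" restart "$container"
}

# Example crontab entry (assumes a cron that supports @reboot and a script at
# the hypothetical path /usr/local/bin/fix-unicorn-pid.sh calling the function):
#   @reboot sleep 60 && /usr/local/bin/fix-unicorn-pid.sh
```

The `sleep 60` is only a guess at how long Docker needs to bring the container up after boot. Note that this restarts the container on every boot, even when unicorn came up cleanly.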

The 502 problem happens when I stop and then start GitLab from the Synology package manager UI, which I don't consider to be a "hard restart". As such, it's a problem every Docker GitLab deployment on Synology is going to run into very quickly.

Users will have to be lucky enough to find this thread and learn to run a docker command (the rm and restart mentioned above, which works for me) to fix the problem. And running a docker command on a Synology is not straightforward via the UI. You have to do the following:

[Screenshot: gitlab synology restart fix]

This command has to be issued every time the NAS restarts, unless they use the cron job fix mentioned above, which I'm not sure how to set up on a Synology. @mgscreativa can you please elaborate?

I think this issue is pretty serious and needs a proper fix.

Hi @abulka sorry I don't have Synology hardware!

Same here on an EC2 instance booting from a RancherOS AMI. So this is not specific to Synology. This occurred after sudo reboot. The workaround of running docker exec -it gitlab rm /opt/gitlab/var/unicorn/unicorn.pid && docker restart gitlab worked.

12.4.5 (539f5fc0384)

This issue has been automatically marked as stale because it has not had any activity for the last 60 days. It will be closed if no further activity occurs during the next 7 days. Thank you for your contributions.

Sorry about this. Does this issue still exist with the newer releases?

Ah, I see it's present in 12.4.5 too. Will make a fix soon.

@sameersbn I think by now we need to do this with puma instead of unicorn.
