I'm running a single Sidekiq process on a container and I need a way to do a health check. Is there some tcp endpoint to query?
I'm not sure such an endpoint exists. Sidekiq has an optional web interface that could be scraped, I suppose.
Where I work, we expose various aspects of the application's health via pinglish. One item is the number of enqueued Sidekiq jobs. We alert if that number gets too far from zero. That indicates Sidekiq's health indirectly, but it is more in line with the business impact of an unhealthy Sidekiq.
Our infrastructure team also uses Shinken to monitor each host for the appropriate number of Sidekiq worker processes, but I'm not as familiar with that part.
What does a health check look like, do you have any documentation? Sidekiq isn't a network service - it doesn't open any listening sockets.
We're using Marathon to manage Docker containers, and it has a health check ping such as:
[
  {
    "path": "/health_check",
    "protocol": "HTTP"
  }
]
But launching Sidekiq as a service container has presented problems with monitoring. I'm able to get the check to pass with something like the following, but I would like to have better monitoring in place.
[
  {
    "protocol": "COMMAND",
    "portIndex": 0,
    "command": {
      "value": "ps ax | grep -v grep | grep sidekiq > /dev/null"
    }
  }
]
That type of health check makes sense for a web service, where you want to ensure the port is open and requests are routing. It makes less sense for Sidekiq: the workers could be locked up while the health check keeps returning 200 all day. Your ps command is just as useful. As @mikegee pointed out, what you really want is to monitor your queue sizes and ensure they aren't backing up.
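A rough sketch of that kind of queue-size check, in case it helps. The names `QueueStat`, `backed_up_queues`, `queue_check_status` and the default threshold of 100 are made up for illustration; the code only assumes each queue object responds to `name` and `size`.

```ruby
# Hypothetical stand-in for a queue record; any object with #name and #size works.
QueueStat = Struct.new(:name, :size)

# Returns the names of queues whose backlog is above max_size.
# max_size is an arbitrary illustrative threshold, not a recommended value.
def backed_up_queues(queues, max_size: 100)
  queues.select { |queue| queue.size > max_size }.map(&:name)
end

# Maps the result to an exit status for a COMMAND-style check:
# 0 when no queue is backed up, 1 otherwise.
def queue_check_status(queues, max_size: 100)
  backed_up_queues(queues, max_size: max_size).empty? ? 0 : 1
end
```

With Sidekiq loaded, passing `Sidekiq::Queue.all` as `queues` should work, since those objects expose `name` and `size`. A small script could then call `exit queue_check_status(...)` so the scheduler sees a non-zero status when a queue backs up.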
@mperham I'm facing a similar problem with Rancher. Rancher offers health checks for a (web) service Docker container. If the container is no longer healthy (according to a TCP or HTTP check), Rancher spins up a new container on a different host.
The web interface input fields in Rancher look like this:
[screenshot of the Rancher health check form not preserved]
I know that Sidekiq is not a network service. But it would be really nice if Sidekiq could optionally bring up a web endpoint for health checks, and/or if I could tell Sidekiq to simply open a port.
One idea: the endpoint could return an HTTP 500 error if, for example, the Sidekiq process could not be started correctly. Otherwise, if all went OK and the process is running, it could return a 2xx/3xx HTTP status.
I totally agree with @phlegx. @mperham's comments are entirely correct unless you factor in the huge popularity of running everything in containers... and they sometimes die... and the scheduler needs to know :)
Perhaps a Sidekiq plug-in can do this? I imagine it could be as simple as returning http 200 unconditionally, where failure to connect implies the service is unhealthy.
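Something along these lines might be enough for the "always 200" idea. This is only a sketch using Ruby's standard `socket` library, not an existing Sidekiq feature; `start_alive_server` is a made-up name, and the port and bind address are whatever your scheduler expects.

```ruby
require "socket"

# Starts a background thread that answers every HTTP request with 200 OK.
# Returns the TCPServer so the caller can inspect the bound port or close it.
def start_alive_server(port, host: "0.0.0.0")
  server = TCPServer.new(host, port)
  Thread.new do
    begin
      loop do
        client = server.accept
        begin
          # Read and discard the request head, up to the blank line.
          while (line = client.gets)
            break if line.strip.empty?
          end
          body = "OK\n"
          client.write("HTTP/1.1 200 OK\r\n" \
                       "Content-Type: text/plain\r\n" \
                       "Content-Length: #{body.bytesize}\r\n" \
                       "Connection: close\r\n\r\n#{body}")
        rescue StandardError
          # Ignore a broken probe so one bad client cannot stop the loop.
        ensure
          client.close
        end
      end
    rescue IOError, SystemCallError
      # The listening socket was closed; let the thread finish quietly.
    end
  end
  server
end
```

One plausible place to call it is inside a `Sidekiq.configure_server` block, so the listener only starts in the worker process. As noted earlier in the thread, a failed connection then means the process is gone, but a 200 says nothing about whether the workers are actually making progress.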
Has anyone found a workaround for this use case yet?
I am running into the same issue: I have deployed Sidekiq in a Docker container and need it to respond on an /alive.txt endpoint so that Marathon considers it "alive" and doesn't kill the instance.
@Monte9 it looks like Marathon supports running arbitrary commands to check health. Perhaps check if Sidekiq is healthy by looking at the output of ps.
Thanks for the response @mikegee! I was able to find a workaround in my specific case.
Basically, I had a Docker container that was running Sidekiq. Before starting Sidekiq, I started a simple Python web server in the background that just returned 200 for the alive.txt endpoint.
cd alive && python3 -m http.server 3000 &> /dev/null &
That way, when Marathon hit my Sidekiq Docker container, it got the alive response as needed.
@mperham Just thinking out loud here: what about a service that connects to Redis and checks the heartbeat of all the existing Sidekiq processes (using the Sidekiq API)? Of course, it would then need to know the expected number of Sidekiq processes (containers), and it would assume that a process with a heartbeat is alive and can process jobs.
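To make the heartbeat idea a bit more concrete, here is a minimal sketch of the decision logic only. It assumes each process record is hash-like with a "beat" key holding the last heartbeat as a Unix timestamp; `alive_processes`, `fleet_healthy?` and the 60-second `max_age` are illustrative names and values, not part of Sidekiq.

```ruby
# Returns the records whose last heartbeat is at most max_age seconds old.
# Assumption: each record responds to ["beat"] with a Unix timestamp (or nil).
def alive_processes(processes, now: Time.now.to_f, max_age: 60)
  processes.select do |info|
    beat = info["beat"]
    beat && (now - beat.to_f) <= max_age
  end
end

# True when at least `expected` processes have a recent heartbeat.
# `expected` has to come from whoever knows how many containers should run.
def fleet_healthy?(processes, expected:, now: Time.now.to_f, max_age: 60)
  alive_processes(processes, now: now, max_age: max_age).size >= expected
end
```

With Sidekiq loaded, `Sidekiq::ProcessSet.new` looks like the natural source for `processes`. I believe its entries expose the heartbeat via `["beat"]`, but that is worth checking against your Sidekiq version. As said above, a recent heartbeat only shows the process is up, not that it is working through jobs.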
Never used it but possibly relevant: https://github.com/arturictus/sidekiq_alive