I have Laravel queue worker processes running inside a Docker container on AWS (EC2 Ubuntu 16.4 instance). The containers will run for 4 to 5 days, then unexpectedly exit with a 0 code. Viewing the logs shows no Exceptions thrown from the app. Because the container is run with a restart policy of on-failure and the process is exiting with 0, Docker will not restart the container.
There are no resource constraints currently on the containers and the EC2 instance is not reporting any high resource utilization.
Interestingly, I have a similar stack as this one but is running Laravel 5.2 and does not suffer from this issue. My assumption is this may be due to the changes made to the Queuing logic in 5.3.
I've removed or replaced any sensitive information with
******.
The docker-entrypoint.sh file, which Docker executes as ENTRYPOINT is:
: "${APP_ENV:=local}"
: "${QUEUE_TRIES:=0}"
: "${QUEUE_SLEEP:=3}"
: "${QUEUE_DELAY:=0}"
: "${MEMORY_LIMIT:=256}"
: "${QUEUE_TIMEOUT:=60}"
if [ -z "$QUEUE" ]; then
echo >&2 'error: A queue to listen on is not specified '
echo >&2 'You need to specify QUEUE'
exit 1
fi
exec php /var/www/html/artisan queue:work \
--queue=$QUEUE \
--env=$APP_ENV \
--tries=$QUEUE_TRIES \
--delay=$QUEUE_DELAY \
--sleep=$QUEUE_SLEEP \
--memory=$MEMORY_LIMIT \
--timeout=$QUEUE_TIMEOUT \
--no-interaction
Notable Environment settings.
"APP_ENV=production",
"QUEUE=*****",
"QUEUE_TRIES=1",
"MEMORY_LIMIT=256",
"QUEUE_DRIVER=sqs",
"QUEUE_TIMEOUT=600",
"APP_DEBUG=false",
"APP_LOG=errorlog"
Logs from the container and information about the status of the container.
$ docker logs ubuntu_pricing_transactions_worker_5
[2016-12-16 18:15:57] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-16 18:15:57] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-16 18:15:58] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-16 18:15:58] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-16 18:19:43] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-16 18:19:43] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-16 18:19:47] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-16 20:33:00] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-16 20:40:24] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-16 21:15:20] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-20 14:00:16] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-20 14:00:16] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-20 14:00:16] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-20 20:00:18] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-21 18:36:32] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-23 14:00:15] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-23 14:00:15] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-23 14:00:15] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-23 14:00:16] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-23 20:00:14] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-23 20:00:15] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-23 20:00:15] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-23 20:00:16] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-23 20:00:16] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-23 20:00:16] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:15] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:15] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:16] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:16] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:17] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:17] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:18] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:18] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:19] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:19] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:21] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:21] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:21] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:22] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-27 17:42:23] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-28 14:00:14] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-28 20:00:13] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-28 20:00:14] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-29 14:00:13] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-31 20:00:14] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2016-12-31 20:00:18] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-02 18:44:24] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-02 18:44:24] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-02 18:44:25] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-04 21:35:46] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-04 21:35:46] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-04 21:35:47] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-04 21:35:47] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-04 21:35:47] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-04 21:35:47] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-04 21:35:48] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-04 21:35:48] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-04 21:35:49] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
[2017-01-05 14:00:19] Processed: App\Jobs\Multifamily\Pricing\RunTransaction
$ docker inspect ubuntu_pricing_transactions_worker_5
[
{
"Id": "be19bcd78db06b66ffa6aa1032e35662b0d9fbf082854866b0c66c0acdda9d2c",
"Created": "2016-12-16T17:50:26.84270574Z",
"Path": "docker-entrypoint.sh",
"Args": [],
"State": {
"Status": "exited",
"Running": false,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 0,
"ExitCode": 0,
"Error": "",
"StartedAt": "2017-01-02T18:44:23.519963601Z",
"FinishedAt": "2017-01-06T15:03:00.663556969Z"
},
"Image": "sha256:d30dfbff04bc9d16e91efb6adb2fc1aaa461b07449690efd22c3c3c742e2c203",
"ResolvConfPath": "/var/lib/docker/containers/be19bcd78db06b66ffa6aa1032e35662b0d9fbf082854866b0c66c0acdda9d2c/resolv.conf",
"HostnamePath": "/var/lib/docker/containers/be19bcd78db06b66ffa6aa1032e35662b0d9fbf082854866b0c66c0acdda9d2c/hostname",
"HostsPath": "/var/lib/docker/containers/be19bcd78db06b66ffa6aa1032e35662b0d9fbf082854866b0c66c0acdda9d2c/hosts",
"LogPath": "/var/lib/docker/containers/be19bcd78db06b66ffa6aa1032e35662b0d9fbf082854866b0c66c0acdda9d2c/be19bcd78db06b66ffa6aa1032e35662b0d9fbf082854866b0c66c0acdda9d2c-json.log",
"Name": "/ubuntu_pricing_transactions_worker_5",
"RestartCount": 0,
"Driver": "aufs",
"MountLabel": "",
"ProcessLabel": "",
"AppArmorProfile": "",
"ExecIDs": null,
"HostConfig": {
"Binds": [],
"ContainerIDFile": "",
"LogConfig": {
"Type": "json-file",
"Config": {}
},
"NetworkMode": "ubuntu_default",
"PortBindings": {},
"RestartPolicy": {
"Name": "on-failure",
"MaximumRetryCount": 0
},
"AutoRemove": false,
"VolumeDriver": "",
"VolumesFrom": [],
"CapAdd": null,
"CapDrop": null,
"Dns": null,
"DnsOptions": null,
"DnsSearch": null,
"ExtraHosts": null,
"GroupAdd": null,
"IpcMode": "",
"Cgroup": "",
"Links": null,
"OomScoreAdj": 0,
"PidMode": "",
"Privileged": false,
"PublishAllPorts": false,
"ReadonlyRootfs": false,
"SecurityOpt": null,
"UTSMode": "",
"UsernsMode": "",
"ShmSize": 67108864,
"Runtime": "runc",
"ConsoleSize": [
0,
0
],
"Isolation": "",
"CpuShares": 0,
"Memory": 0,
"CgroupParent": "",
"BlkioWeight": 0,
"BlkioWeightDevice": null,
"BlkioDeviceReadBps": null,
"BlkioDeviceWriteBps": null,
"BlkioDeviceReadIOps": null,
"BlkioDeviceWriteIOps": null,
"CpuPeriod": 0,
"CpuQuota": 0,
"CpusetCpus": "",
"CpusetMems": "",
"Devices": null,
"DiskQuota": 0,
"KernelMemory": 0,
"MemoryReservation": 0,
"MemorySwap": 0,
"MemorySwappiness": -1,
"OomKillDisable": false,
"PidsLimit": 0,
"Ulimits": null,
"CpuCount": 0,
"CpuPercent": 0,
"IOMaximumIOps": 0,
"IOMaximumBandwidth": 0
},
"GraphDriver": {
"Name": "aufs",
"Data": null
},
"Mounts": [],
"Config": {
"Hostname": "be19bcd78db0",
"Domainname": "*****",
"User": "",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"ExposedPorts": {
"9000/tcp": {}
},
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": [],
"Cmd": null,
"Image": "*****",
"Volumes": null,
"WorkingDir": "/var/www/html",
"Entrypoint": [
"docker-entrypoint.sh"
],
"OnBuild": null,
"Labels": {
"com.docker.compose.config-hash": "210a06591101da05fb1455db832f30b4ce3d114049352ffb701693becdf320d0",
"com.docker.compose.container-number": "5",
"com.docker.compose.oneoff": "False",
"com.docker.compose.project": "ubuntu",
"com.docker.compose.service": "pricing_transactions_worker",
"com.docker.compose.version": "1.9.0"
}
},
"NetworkSettings": {
"Bridge": "",
"SandboxID": "a25749da93ae10d83677d9b7c82fcca1d337121d51eb3c312e9c69cf0b58fd0e",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": null,
"SandboxKey": "/var/run/docker/netns/a25749da93ae",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"MacAddress": "",
"Networks": {
"ubuntu_default": {
"IPAMConfig": null,
"Links": null,
"Aliases": [
"be19bcd78db0",
"pricing_transactions_worker"
],
"NetworkID": "c8a70836ff00d8317da76e241219e0e5281ff66f488fc73438fca73188e8eae9",
"EndpointID": "",
"Gateway": "",
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": ""
}
}
}
}
]
N/A
This is the expected behaviour. They "restart" when they've used enough memory that they need a reboot. By "restart", I mean that just quit, and they expect that someone else will restart them.
It's normal to use something like supervisor to manage that.
Thanks for the explanation @GrahamCampbell.
They "restart" when they've used enough memory that they need a reboot.
Is there any signaling from the process prior to it exiting that it needs to restart? I would argue a process exhausting its memory allocation – either a hard or soft memory limit – should not exit with a 0 status. The 0 exit is also hard to get around in the case of container-based deployments, as Docker relies on the exit code for its restart policies.
If I use an always restart policy with Docker, I can ensure the queue workers are always running, same as what Supervisor achieves (e.g. Process Management). However, then I do not have a means of "draining" the workers in order to deploy updated container instances without forcefully stoping them, which is not a solution in my case.
Currently in both my 5.3 and 5.2 environments, an automated deployment script does the following:
docker pull the container image – tagged as passing by the CI system.docker exec -ti previous-container-name php artisan queue:restart.docker inspect). Forcefully stop the container if a timeout period is reached.docker rm container-name to remove the previous container instance.docker run -d ... container-name to start a new container instance from the new image.This procedure effectively "drains" any queue worker before updating and restarting.
As it stands I'm not sure of a solution to this problem given the current behavior. Can we reopen this thread for further discussion? Again, I'd argue exiting with 0 is inappropriate.
Any ideas from the community would be appreciated.
After reviewing the source, I do believe this should be addressed. Please reopen the issue.
The daemon() method of the Illuminate\Queue\Worker class creates the daemon loop and if either $this->memoryExceeded($options->memory) or $this->queueShouldRestart($lastRestart)) returns true, the stop() method is called.
stop() is where the issue lies.
stop() simply triggers an event then calls die. As noted by the PHP documentation, die is functionally equivalent to exit(), which explicitly is exit(0).
There is currently no way of changing this behavior and there should be for reasons I mentioned in my previous comment. A few possible solutions are:
Change die to exit(1). This is the simple fix, however changes the current behavior which may cause unexpected results and is not backward compatible.
Add an --exit-code option to the queue:work command which allows setting the exit code. This allows configuration by users which require a different exit code – in most cases being 1. The default can be 0 to maintain current behavior (backward compatible). This requires a change to the stop() method signature in order to pass the exit code value. This method is public, however when searching through files in the src/Illuminate directory for stop(), it doesn't appear to be called from outside the class?
Add a --fail-on-exit flag to the queue:work command which sets the exit code to 1, otherwise the exit code is 0. This is similar to option 2, but assumes the most likely use case and removes decision for the user about what the exit code should be, since its typical convention for any other value than 0 to be considered an error in POSIX-compatible systems.
https://en.wikipedia.org/wiki/Exit_status
I favor option 2 as it as it allows for configuration of the exit code and is backward compatible. Feedback is welcome and I'm happy to provide a PR if we agree on a resolution.
I've realized in 5.2 this is the same behavior – exiting with 0.
Can we reopen this thread and further the discussion @GrahamCampbell @taylorotwell?
Again, I'm happy to contribute a PR if we decide on how to resolve this.
Just for future reference #17302 fixed this.
Great, thanks @marcvdm!
Most helpful comment
After reviewing the source, I do believe this should be addressed. Please reopen the issue.
The daemon() method of the
Illuminate\Queue\Workerclass creates the daemon loop and if either$this->memoryExceeded($options->memory)or$this->queueShouldRestart($lastRestart))returnstrue, thestop()method is called.stop() is where the issue lies.
stop()simply triggers an event then callsdie. As noted by the PHP documentation,dieis functionally equivalent toexit(), which explicitly isexit(0).There is currently no way of changing this behavior and there should be for reasons I mentioned in my previous comment. A few possible solutions are:
Change
dietoexit(1). This is the simple fix, however changes the current behavior which may cause unexpected results and is not backward compatible.Add an
--exit-codeoption to thequeue:workcommand which allows setting the exit code. This allows configuration by users which require a different exit code – in most cases being1. The default can be0to maintain current behavior (backward compatible). This requires a change to thestop()method signature in order to pass the exit code value. This method is public, however when searching through files in thesrc/Illuminatedirectory forstop(), it doesn't appear to be called from outside the class?Add a
--fail-on-exitflag to thequeue:workcommand which sets the exit code to1, otherwise the exit code is0. This is similar to option 2, but assumes the most likely use case and removes decision for the user about what the exit code should be, since its typical convention for any other value than0to be considered an error in POSIX-compatible systems.https://en.wikipedia.org/wiki/Exit_status
I favor option 2 as it as it allows for configuration of the exit code and is backward compatible. Feedback is welcome and I'm happy to provide a PR if we agree on a resolution.