Magento2: Message queue consumer never exiting when --max-messages specified.

Created on 6 Sep 2018 · 31 Comments · Source: magento/magento2

Preconditions

  1. Magento version 2.3.x & 2.4-develop

Steps to reproduce

  1. Create new consumer, subscribe it to empty topic
  2. Start consumer using bin/magento queue:consumers:start ConsumerName --max-messages 1
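
When reproducing this, it can help to bound the run so the hanging consumer does not linger. The following is a minimal sketch, assuming GNU coreutils `timeout` is available; `run_bounded` is a hypothetical helper name, not part of Magento.

```shell
#!/bin/bash
# run_bounded SECONDS COMMAND...: run COMMAND and send it SIGTERM after SECONDS.
# GNU timeout exits with status 124 when the time limit was hit.
run_bounded() {
    local limit="$1"
    shift
    timeout "$limit" "$@"
    local status=$?
    if [ "$status" -eq 124 ]; then
        echo "stopped after ${limit}s: $*" >&2
    fi
    return "$status"
}

# Usage with the command from step 2 (ConsumerName is the placeholder used above):
# run_bounded 60 bin/magento queue:consumers:start ConsumerName --max-messages 1
```

A status of 124 then indicates that the consumer did not exit by itself within the limit, which is the behaviour this report describes.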

Expected result

  1. Consumer tries to get 1 message from empty queue, exits.

Actual result

  1. Consumer keeps listening forever, never exits.

Additional information

This behaviour is caused by the following commit: https://github.com/magento-partners/magento2ee/commit/19721b54aae0f18632b58486a3af544d37e5a866#diff-9876e6dc23c7c514b028b967d58aa0b0

Unfortunately, that commit isn't public because the message queue was developed for EE. The class CallbackInvoker was changed so that the path taken when --max-messages is supplied enters an endless loop instead of exiting.

The only information supplied is this: "MAGETWO-57177: [Critical][OMS] Max messages makes the consumer die after existing messages are consumed", even though this is the expected behaviour according to the developer documentation.

Can anyone please explain the reason for this change? I'm opening the issue here because the message queue is being moved to Community Edition in 2.3.0.

Labels: Cron, Clear Description, Confirmed, Format is valid, Ready for Work, done, Reproduced on 2.3.x, Reproduced on 2.4.x, Dev.Experience

All 31 comments

Hi @deefco. Thank you for your report.
To help us process this issue please make sure that you provided the following information:

  • [x] Summary of the issue
  • [x] Information on your environment
  • [ ] Steps to reproduce
  • [ ] Expected and actual results

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento-engcom-team give me {$VERSION} instance

where {$VERSION} is version tags (starting from 2.2.0+) or develop branches (2.2-develop +).
For more details, please, review the Magento Contributor Assistant documentation.

@deefco, do you confirm that you were able to reproduce the issue on a vanilla Magento instance following the steps to reproduce?

  • [x] yes
  • [ ] no

@deefco, thank you for your report.
We've acknowledged the issue and added to our backlog.

Hi @okorshenko. Thank you for working on this issue.
In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

  • [ ] 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
    Details: If the issue has a valid description, the label Issue: Format is valid will be added to the issue automatically. Please edit the issue description if needed, until the label Issue: Format is valid appears.
  • [ ] 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue. If the report is valid, add Issue: Clear Description label to the issue by yourself.

  • [ ] 3. Add Component: XXXXX label(s) to the ticket, indicating the components it may be related to.

  • [ ] 4. Verify that the issue is reproducible on 2.3-develop branch

    Details:
    - Add the comment @magento-engcom-team give me 2.3-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.3-develop branch, please, add the label Reproduced on 2.3.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and _stop verification process here_!

  • [ ] 5. Verify that the issue is reproducible on 2.2-develop branch.

    Details:
    - Add the comment @magento-engcom-team give me 2.2-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.2-develop branch, please add the label Reproduced on 2.2.x

  • [ ] 6. Add label Issue: Confirmed once verification is complete.

  • [ ] 7. Make sure that automatic system confirms that report has been added to the backlog.

Hi @okorshenko. Thank you for working on this issue.
Looks like this issue is already verified and confirmed. But if you want to validate it one more time, please go through the following instructions:

  • [ ] 1. Add/Edit Component: XXXXX label(s) to the ticket, indicating the components it may be related to.
  • [ ] 2. Verify that the issue is reproducible on 2.3-develop branch

    Details:
    - Add the comment @magento-engcom-team give me 2.3-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.3-develop branch, please, add the label Reproduced on 2.3.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and _stop verification process here_!

  • [ ] 3. Verify that the issue is reproducible on 2.2-develop branch.

    Details:
    - Add the comment @magento-engcom-team give me 2.2-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.2-develop branch, please add the label Reproduced on 2.2.x

  • [ ] 4. If the issue is not relevant or is not reproducible any more, feel free to close it.

Is there any update on this? We are trying to run Magento crons using CronJobs in Kubernetes, but once workers are spawned (and literally _never_ die unless they work on the number of messages defined in max-messages), the Kubernetes CronJob will never exit.

We are also able to replicate this running docker exec and kubectl exec when running magento crons.

(On Magento 2.3.0 and 2.3.1)

Is there any update on this? We are seeing the same issues with Kubernetes. Instead of just exiting the consumer if there is no message, Magento waits for messages. So in theory you might have a consumer running forever on a dev environment even if you set max-messages to 1.
And if the consumer is running when the cron runs, it hangs there forever.

I never had this issue until I upgraded from 2.3.1 to the new 2.3.2. All of the PIDs on my server are bin/magento queue:consumers:start and end with --max-messages=10000.

Because they never exit, lfd (Login Failure Daemon) sends me emails on the hour saying that they're using excessive resources (i.e. process time). They simply live too long, and lfd keeps letting me know about it.

Is there a work-around? Would not like to disable lfd's notification of course.

Also only have this issue since upgrading to Magento 2.3.2, related issue: https://github.com/magento/magento2/issues/23540

Same issue here, on EE

Same here, upgraded from 2.3.1 to 2.3.2 and started receiving loads of "Excessive resources used" warnings from lfd, all from magento queue:consumers never exiting.
Reverted back to 2.3.1 for now

Hi folks

I've sent in some proposals for improving these message queue consumer processes so that they waste fewer resources: https://github.com/magento/community-features/issues/180
Feel free to leave a comment over there if you think this is a good idea or if somebody has a better idea let me know!

Same problem since 2.3.2.

More and more "consumers" get started, don't end, and finally overload the server:

[Screenshot 19-09-2019 10-42-31]

Any update on this @magento-engcom-team ?

@magento-engcom-team I'm also experiencing this issue and am wondering if this issue has been given any priority or consideration?

EE merchant here with the same issue. This may be a bit of a dirty workaround, but we wrote a quick bash script that runs every 5 minutes, looks for cron jobs that are consumers, and kills them. It has worked OK and kept the consumers running so far.

The script looks like this:

#!/bin/bash
# Kill every process whose command line contains "consumers".
ps -ef | grep consumers | grep -v grep | awk '{print $2}' | xargs kill

@jordanvector pkill -f queue:consumers might work better than 4 pipes ;)
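
Both one-liners above kill every matching process, including consumers that started a moment ago. A possible refinement is to target only long-running ones. This is a sketch under assumptions: `old_consumer_pids` is a hypothetical helper, the 300-second threshold is arbitrary, and the usage line relies on the `etimes` column of procps `ps` and on GNU `xargs -r`.

```shell
#!/bin/bash
# old_consumer_pids MIN_AGE: read "PID ELAPSED_SECONDS COMMAND..." lines on stdin
# and print the PID of every queue consumer that has been running for at least
# MIN_AGE seconds. The age filter also skips this awk process itself, which is
# only a moment old when ps takes its snapshot.
old_consumer_pids() {
    awk -v min_age="$1" '$2 >= min_age && /queue:consumers:start/ { print $1 }'
}

# Usage sketch:
# ps -eo pid=,etimes=,args= | old_consumer_pids 300 | xargs -r kill
```

Filtering on elapsed time leaves freshly started consumers alone, so a cron run that has just spawned them is not interrupted.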

Is there any update on this?

When we have the consumers enabled, our cron jobs stop working entirely. Anyone else have the same issue? (Magento 2.3.3)

Currently using
'cron_consumers_runner' => [ 'cron_run' => false ]
as a temporary "fix". We'll manually run the consumers when we need them.

We added a patch: once dequeue() has returned no message more than 10 times (roughly 10 seconds of polling), the job exits.
Additionally, we modified crontab to run consumers separately from Magento cron.
It helps until a better solution is invented :)

Index: vendor/magento/framework-message-queue/CallbackInvoker.php
<+>UTF-8
===================================================================
--- a/vendor/magento/framework-message-queue/CallbackInvoker.php    (date 1574846505000)
+++ b/vendor/magento/framework-message-queue/CallbackInvoker.php    (date 1574846505000)
@@ -21,9 +21,20 @@
      */
     public function invoke(QueueInterface $queue, $maxNumberOfMessages, $callback)
     {
+        $noMessages = 0;
         for ($i = $maxNumberOfMessages; $i > 0; $i--) {
             do {
                 $message = $queue->dequeue();
+                if ($message === null) {
+                    $noMessages++;
+                    if ($noMessages > 10) {
+                        exit;
+                    }
+                }
             } while ($message === null && (sleep(1) === 0));
             $callback($message);
         }

UPD: 2020-02-06
Since 2.3.4, consumers have the consumers-wait-for-messages flag; try using it when you add your consumer to crontab.
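
When consumers are started from crontab rather than from Magento cron, overlapping runs of the same consumer can pile up if one of them hangs. One way to guard against that is a lock file. This is a minimal sketch, assuming `flock` from util-linux is installed; `run_single` and the lock path in the usage line are hypothetical.

```shell
#!/bin/bash
# run_single LOCKFILE COMMAND...: run COMMAND only if no other process currently
# holds LOCKFILE. With -n, flock does not wait: it exits with status 1 when the
# lock is already taken, so a second copy is never started.
run_single() {
    local lock="$1"
    shift
    flock -n "$lock" "$@"
}

# Usage sketch (ConsumerName is a placeholder):
# run_single /tmp/ConsumerName.lock bin/magento queue:consumers:start ConsumerName --max-messages 1000
```

The lock is released automatically when the wrapped command exits, so no cleanup step is needed.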

@magento give me 2.4-develop instance

Hi @ravi-chandra3197. Thank you for your request. I'm working on Magento 2.4-develop instance for you

Hi @ravi-chandra3197, here is your Magento instance.
Admin access: https://i-17951-2-4-develop.instances.magento-community.engineering/admin
Login: admin Password: 123123q
Instance will be terminated in up to 3 hours.

Same issue here on 2.3.3. I thought it was an issue with the cron scheduler, but I now believe it's these queue consumers that never complete. They caused 2 separate development servers to crash. Not the best first impression of Magento 2, unfortunately.
[Screenshot 2019-12-24 at 12 40 48]

Hi @engcom-Charlie. Thank you for working on this issue.
Looks like this issue is already verified and confirmed. But if you want to validate it one more time, please go through the following instructions:

  • [ ] 1. Add/Edit Component: XXXXX label(s) to the ticket, indicating the components it may be related to.
  • [ ] 2. Verify that the issue is reproducible on 2.4-develop branch

    Details:
    - Add the comment @magento give me 2.4-develop instance to deploy test instance on Magento infrastructure.
    - If the issue is reproducible on 2.4-develop branch, please, add the label Reproduced on 2.4.x.
    - If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and _stop verification process here_!

  • [ ] 3. If the issue is not relevant or is not reproducible any more, feel free to close it.


:white_check_mark: Confirmed by @engcom-Charlie
Thank you for verifying the issue. Based on the provided information internal tickets MC-30139 were created

Issue Available: @engcom-Charlie, _You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself._

I just had this issue and did a little research. I had this entry in my app/etc/env.php:

'queue' => [
    'consumers_wait_for_messages' => 1
]

It is described in the DevDocs:

Specifies whether consumers should continue polling for messages if the number of processed messages is less than the max-messages value. The default value of 0 prevents stuck deployments caused by long delays in message queue processing. Set the value to 1 to allow consumers to wait for messages.

I switched the value to 0 and this solved the issue. So to be honest, I think this is not a bug, but a configuration issue.
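
The difference between the two values can be illustrated with a small shell simulation. This is not Magento's code: the queue is a bash array, and `dequeue` and `consume` are hypothetical names chosen for the sketch. It only mirrors the behaviour the DevDocs text describes.

```shell
#!/bin/bash
# Simulated queue: dequeue pops the first item of QUEUE into MESSAGE.
# It returns 1 and leaves MESSAGE empty when the queue has no items.
QUEUE=()
dequeue() {
    MESSAGE=""
    [ "${#QUEUE[@]}" -gt 0 ] || return 1
    MESSAGE="${QUEUE[0]}"
    QUEUE=("${QUEUE[@]:1}")
}

# consume MAX WAIT: process up to MAX messages and print how many were handled.
# WAIT=0: stop at the first empty poll.
# WAIT=1: sleep one second and poll again, indefinitely, until MAX is reached.
consume() {
    local max="$1" wait="$2" processed=0
    while [ "$processed" -lt "$max" ]; do
        until dequeue; do
            [ "$wait" -eq 1 ] || { echo "$processed"; return 0; }
            sleep 1
        done
        processed=$((processed + 1))
    done
    echo "$processed"
}
```

With two queued messages, `consume 5 0` handles both and returns, whereas `consume 5 1` would handle both and then keep polling for the remaining three, which matches the never-exiting consumers reported in this thread.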

It seems this project helps mitigate the situation until this is properly solved / documented.

https://github.com/magemojo/m2-ce-cron

My server load dropped from 12 to 4 after installing.

I wrote a post about this, but I wanted to note that I believe there is an issue in the documentation. If you look at this file: vendor/magento/framework-message-queue/CallbackInvoker.php

You will see that Magento 2 actually makes the default for this setting 1 (contrary to the documentation)

    /**
     * Checks if consumers should wait for message from the queue
     *
     * @return bool
     */
    private function isWaitingNextMessage(): bool
    {
        return $this->deploymentConfig->get('queue/consumers_wait_for_messages', 1) === 1;
    }

I had to force the setting to be 0 by adding the following to env.php:

'queue' => [
    'consumers_wait_for_messages' => 0
],

After doing that, my other cron processes started running again and I no longer saw the parent cron job as "stuck"

Full article: https://www.cadence-labs.com/2020/03/magento-2-stuck-or-long-running-cron-after-upgrade-to-2-3-x/

@cadencelabs-master Thank you! This is definitely the problem, and the new default causes serious issues for instances hosted in Docker.

At a minimum, the documentation for the message queues (https://devdocs.magento.com/guides/v2.3/config-guide/mq/manage-message-queues.html#start-message-queue-consumers) needs to be updated to reflect the default value actually being 1 in Magento 2.3.x.

While I'm glad to see the consumers-wait-for-messages flag, it doesn't make any sense to me that the default value for this would be 1 when the workers are launched as part of cron:run.

Since the behavior described looks like the desired behavior and we can achieve the expected result with the configuration flag consumers-wait-for-messages I will be closing this issue for now.

We've updated the 2.3.x and 2.4.x devdocs accordingly. See PR https://github.com/magento/devdocs/pull/8010.
