SuiteCRM: Robustness Big Issue 2: cron shouldn't run during servicing operations

Created on 13 Dec 2016 · 3 Comments · Source: salesagility/SuiteCRM

The Schedulers process, cron.php, needs to die() and not execute while SuiteCRM is undergoing self-modification, namely in these situations:

  1. Installations
  2. Upgrades
  3. Quick repair and rebuild
  4. Anything else? Module deployment from Studio?

This, along with the issue of running cron as root (Issue #2787), has been found, reproducibly, to be responsible for broken fresh installations and broken upgrades.

You can read about the tests here: https://github.com/salesagility/SuiteCRM/issues/1688

Why does it break installations?

Cron jobs run every minute, and many installations and upgrades take longer than that. A cron job that fires in the middle of one will generate cache files and, during an installation, will try to use database tables that haven't been created yet. It can cause all kinds of unforeseen, and untested-for, conflicts and errors.

The errors are random (they depend on how the specific cron job that happens to run interacts with the web app modification in progress) and extremely hard to track down and detect. Most people never know what hit them.

The installer screen actually suggests setting up cron right before the installation begins! It could at least save that suggestion for the end of the install...

[Image: preinstall screen]

How to implement fix?

There is already some mechanism in SugarCRM for similar things, see install.php:

$GLOBALS['installing'] = true;
define('SUGARCRM_IS_INSTALLING', $GLOBALS['installing']);

But this is not currently being used for cron jobs, and is very incomplete; it definitely does not cover all required situations.

Proposed improvements are:

  1. Have a generic "servicing" flag that covers all of the above-mentioned self-modifications of SuiteCRM. The flag is true while SuiteCRM is self-modifying and false otherwise. If you can't cover all cases in the beginning, start by covering a few. Each incremental improvement helps lower the odds of chaos.

  2. Make sure a simple, known process is available to reset this flag to false, in case some servicing process hangs and never terminates: the simplest option would be to reset it at the end of every Quick Repair and Rebuild.

  3. At the beginning of cron.php, check for that flag and exit if it is true (logging an appropriate message).

  4. Make this explicit in the Release Notes of the new version. People need to be properly warned.
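
For illustration, points 1 to 3 above could be sketched roughly like this in PHP. This is only a minimal sketch under stated assumptions, not how SuiteCRM does or should implement it: the flag is assumed to live in a file, and every name here (`servicing_begin`, `servicing_end`, `servicing_status`, `cron_may_run`, the flag file path) is hypothetical and does not exist in SuiteCRM.

```php
<?php
// Hypothetical helpers for a file-based "servicing" flag.
// None of these function names exist in SuiteCRM.

/** Set the flag at the start of an install, upgrade, repair, etc. */
function servicing_begin(string $flagFile, string $reason): void
{
    // Store why and when the flag was set, so a stale flag can be diagnosed.
    $payload = json_encode(['reason' => $reason, 'since' => time()]);
    file_put_contents($flagFile, $payload, LOCK_EX);
}

/** Clear the flag. A Quick Repair and Rebuild could call this at its end. */
function servicing_end(string $flagFile): void
{
    if (is_file($flagFile)) {
        unlink($flagFile);
    }
}

/** Return the flag's details while servicing is in progress, null otherwise. */
function servicing_status(string $flagFile): ?array
{
    if (!is_file($flagFile)) {
        return null;
    }
    $data = json_decode((string) file_get_contents($flagFile), true);
    // An unreadable flag still counts as "servicing": fail safe and skip the run.
    return is_array($data) ? $data : ['reason' => 'unknown', 'since' => 0];
}

/** Guard for the top of cron.php: true means the schedulers may run. */
function cron_may_run(string $flagFile): bool
{
    $status = servicing_status($flagFile);
    if ($status === null) {
        return true;
    }
    // Log an appropriate message before skipping this run.
    error_log('cron.php skipped: servicing in progress (' . $status['reason'] . ')');
    return false;
}

// Assumed usage at the very top of cron.php (the path is an assumption):
//   if (!cron_may_run('cache/servicing.flag')) { exit(0); }
```

A file is used here instead of a `$GLOBALS` entry because cron.php runs as a separate process and cannot see variables set by the web request doing the install or upgrade. A database-backed flag would not work during installation, since the tables may not exist yet.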

I'm sure @salesagility will agree that assigning a few days of a developer's time to solve this issue immediately is an excellent investment and will eliminate one of the biggest causes of inexplicably broken systems...

Important Bug

All 3 comments

And, just for motivation... all this I got from quick searches on GitHub and the forums. Most of it is recent. And there is plenty more where these came from...

Troubles very likely related to the current issue:

That's why I say it's good news that we can no longer say "we can't fix it, we don't know why it happens, it's just random...". Now we know.

@samus-aran while you're busy labelling, this is a good one to turn from Suggestion into Bug and, from my perspective as a forum helper, at least Medium Priority. This keeps showing up.

(So... another Saturday run?)

@pgorod Done! Sad to say I'm not one for exciting Saturdays haha.
