Semanticmediawiki: Error: 1305 SAVEPOINT wikimedia_rdbms_atomic1 does not exist caused by SMW\SQLStore\QueryEngine\HierarchyTempTableBuilder::buildTempTable

Created on 2 Oct 2019  路  9Comments  路  Source: SemanticMediaWiki/SemanticMediaWiki

Setup and configuration

  • SMW version: 3.1.0
  • MW version: 1.31.3
  • PHP version: 7.3.9-1+0~20190902.44+debian9~1.gbpf8534c (fpm-fcgi)
  • DB system (MySQL, Blazegraph, etc.) and version: 10.4.8-MariaDB-1:10.4.8+maria~buster

Issue

I have consistently been seeing errors of the type

Error: 1305 SAVEPOINT wikimedia_rdbms_atomic1 does not exist (localhost)

PHP message: DB Error: A database query error has occurred. Did you forget to run your application's database schema updater after upgrading?

The error happens when I save certain pages with SMW calls, and happens in the deferred updates that get queued from WikiPage::doModify(). Semantic MediaWiki seems to be creating and working with a templorary table called smw_new which it creates, inserts to, truncates, inserts to again and then drops. According to the MariaDB documentation and the MySQL documentation calling TRUNCATE TABLE on a temporary table still implicitly commits, and therefor flushes the SAVEPOINT, which then causes the issue with the missing savepoint when RELEASE SAVEPOINT gets called. This is a problem, because all of this happens inside a transaction created by MediaWiki Core. The TRUNCATE TABLE in question seems to be this one here: https://github.com/SemanticMediaWiki/SemanticMediaWiki/blob/54a9495e2804509941442471df3666906e140e83/src/SQLStore/QueryEngine/HierarchyTempTableBuilder.php#L183
Changing this TRUNCATE TABLE to DELETE FROM seems to fix the issue, as DELETE FROM does not come with an implicit COMMIT.

Stack trace

The stack trace points to the MediaWiki database classes, so I doubt it is helpful, but here it is (line numbers might be a bit off because I added some error logging to the file)

/includes/libs/rdbms/database/Database.php (1459) __construct
/includes/libs/rdbms/database/Database.php (1429) makeQueryException
/includes/libs/rdbms/database/Database.php (1202) reportQueryError
/includes/libs/rdbms/database/Database.php (3505) query
/includes/libs/rdbms/database/Database.php (3598) doReleaseSavepoint
/includes/libs/rdbms/database/Database.php (3687) endAtomic

Steps to reproduce

Unfortunately I don't have an easy use case for reproduction, the only one I had includes both Semantic MediaWiki, Scribunto and the Semantic Scribunto extension and is fairly convoluted. I can try to extract it if really needed, but it is a bit hard.

bug store-sql

Most helpful comment

For the sake of this, I made a benchmark script. It runs both things 5 times to avoid any warmup issues the database might have.

<?php

error_reporting(E_ALL);
ini_set('display_errors', 1);

$dbhost = '';
$dbuser = '';
$dbpass = '';
$dbname = '';

$rows = 10000;
$runs = 5;
$table = 'smw_new';
$deleteQueries = [
    'DELETE FROM',
    'TRUNCATE TABLE',
];

echo '<!DOCTYPE html>';
echo '<html lang="en">';
echo '<head><meta charset="utf-8"><title>DELETE BENCHMARK</title></head>';
echo '<body>';
echo '<h1>DELETE BENCHMARK</h1>';

echo '<p>Run for '.$rows.' rows, running '.$runs.' times to avoid warmup effects.</p>';

$sqlCreate = 'CREATE TEMPORARY TABLE '.$table.' ( id INT UNSIGNED KEY ) ENGINE=MEMORY';
$sqlDrop = 'DROP TEMPORARY TABLE '.$table;
$db = new mysqli($dbhost, $dbuser, $dbpass, $dbname);

for($run = 0; $run < $runs; $run++) {
    $db->query($sqlCreate);
    foreach($deleteQueries as $deleteQuery) {
        echo '<h2>Run '.$deleteQuery.' (Run '.($run+1).')</h2>';
        for($i = 0; $i < $rows; $i++) {
            $db->query('INSERT INTO '.$table.' VALUES('.$i.')');
        }
        $before = $db->query('SELECT DATE_FORMAT(NOW(6), "%Y-%m-%d %H:%i:%s.%f" ) as ts')->fetch_object()->ts;
        $db->query($deleteQuery.' '.$table);
        $after = $db->query('SELECT DATE_FORMAT(NOW(6), "%Y-%m-%d %H:%i:%s.%f" ) as ts')->fetch_object()->ts;
        echo '<p>Time before: '.$before.'</p>';
        echo '<p>Time after: '.$after.'</p>';
        $dab = new DateTime($before);
        $daa = new DateTime($after);
        echo '<p>Difference: '.$dab->diff($daa)->f.'s</p>';
    }
    $db->query($sqlDrop);
}

echo '</body>';
echo '</html>';

On my test server, the result looks something like this:

Run DELETE FROM (Run 5)
Time before: 2019-10-16 12:59:43.087599
Time after: 2019-10-16 12:59:43.087730
Difference: 0.000131s

Run TRUNCATE TABLE (Run 5)
Time before: 2019-10-16 12:59:43.633259
Time after: 2019-10-16 12:59:43.633365
Difference: 0.000106s

This is for a temporary table (same table as smw_new in the extension) with 10000 rows. The difference is in the 1/10000 of a second, which does seem neglectible. Especially given that SMW will rarely have 10000 rows in this table.

I want to reiterate what notr1ch sair, that there is no master/slave replication in our setup, so this is not related to that. Using TRUNCATE on temporary tables causes an implicit commit, which causes our problem.

All 9 comments

Thanks for the great bug report! I've created a pull request with your proposed change.

After merging the change @mwjames requested it to be reverted in https://github.com/SemanticMediaWiki/SemanticMediaWiki/pull/4320#issuecomment-540624350

After merging the change @mwjames requested it to be reverted in
https://github.com/SemanticMediaWiki/SemanticMediaWiki/pull/4320#issuecomment-540624350

Please see the comments I left and try answer (or investigate) them.

While the change seems trivial there are reported performance
differences between DELETE FROM and TRUNACTE [0] and I'd like them the
be reviewed before applying such change.

[0] https://stackoverflow.com/questions/3256242/pros-cons-of-truncate-vs-delete-from

On 10/11/19, Jeroen De Dauw notifications@github.com wrote:

After merging the change @mwjames requested it to be reverted in
https://github.com/SemanticMediaWiki/SemanticMediaWiki/pull/4320#issuecomment-540624350

--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
https://github.com/SemanticMediaWiki/SemanticMediaWiki/issues/4319#issuecomment-540632776

For the sake of this, I made a benchmark script. It runs both things 5 times to avoid any warmup issues the database might have.

<?php

error_reporting(E_ALL);
ini_set('display_errors', 1);

$dbhost = '';
$dbuser = '';
$dbpass = '';
$dbname = '';

$rows = 10000;
$runs = 5;
$table = 'smw_new';
$deleteQueries = [
    'DELETE FROM',
    'TRUNCATE TABLE',
];

echo '<!DOCTYPE html>';
echo '<html lang="en">';
echo '<head><meta charset="utf-8"><title>DELETE BENCHMARK</title></head>';
echo '<body>';
echo '<h1>DELETE BENCHMARK</h1>';

echo '<p>Run for '.$rows.' rows, running '.$runs.' times to avoid warmup effects.</p>';

$sqlCreate = 'CREATE TEMPORARY TABLE '.$table.' ( id INT UNSIGNED KEY ) ENGINE=MEMORY';
$sqlDrop = 'DROP TEMPORARY TABLE '.$table;
$db = new mysqli($dbhost, $dbuser, $dbpass, $dbname);

for($run = 0; $run < $runs; $run++) {
    $db->query($sqlCreate);
    foreach($deleteQueries as $deleteQuery) {
        echo '<h2>Run '.$deleteQuery.' (Run '.($run+1).')</h2>';
        for($i = 0; $i < $rows; $i++) {
            $db->query('INSERT INTO '.$table.' VALUES('.$i.')');
        }
        $before = $db->query('SELECT DATE_FORMAT(NOW(6), "%Y-%m-%d %H:%i:%s.%f" ) as ts')->fetch_object()->ts;
        $db->query($deleteQuery.' '.$table);
        $after = $db->query('SELECT DATE_FORMAT(NOW(6), "%Y-%m-%d %H:%i:%s.%f" ) as ts')->fetch_object()->ts;
        echo '<p>Time before: '.$before.'</p>';
        echo '<p>Time after: '.$after.'</p>';
        $dab = new DateTime($before);
        $daa = new DateTime($after);
        echo '<p>Difference: '.$dab->diff($daa)->f.'s</p>';
    }
    $db->query($sqlDrop);
}

echo '</body>';
echo '</html>';

On my test server, the result looks something like this:

Run DELETE FROM (Run 5)
Time before: 2019-10-16 12:59:43.087599
Time after: 2019-10-16 12:59:43.087730
Difference: 0.000131s

Run TRUNCATE TABLE (Run 5)
Time before: 2019-10-16 12:59:43.633259
Time after: 2019-10-16 12:59:43.633365
Difference: 0.000106s

This is for a temporary table (same table as smw_new in the extension) with 10000 rows. The difference is in the 1/10000 of a second, which does seem neglectible. Especially given that SMW will rarely have 10000 rows in this table.

I want to reiterate what notr1ch sair, that there is no master/slave replication in our setup, so this is not related to that. Using TRUNCATE on temporary tables causes an implicit commit, which causes our problem.

For the sake of this, I made a benchmark script.

Thanks.

Not for the sake of it but to confirm that there is __no__ significant
impact or evidence for a performance penalty (as documented) in the
application and execution of it.

On my test server, the result looks something like this:

Run DELETE FROM (Run 5)
Time before: 2019-10-16 12:59:43.087599
Time after: 2019-10-16 12:59:43.087730
Difference: 0.000131s

Run TRUNCATE TABLE (Run 5)
Time before: 2019-10-16 12:59:43.633259
Time after: 2019-10-16 12:59:43.633365
Difference: 0.000106s

Given above data, the results shows indeed an insignificant difference
therefore making the change from TRUNCATE to DELETE FROM can go
forward.

I want to reiterate what notr1ch sair, that there is no master/slave
replication in our setup, so this is not related to that. Using TRUNCATE
on temporary tables causes an implicit commit, which causes our problem.

As I said before, I'm unable to replicate the issue therefore I cannot
blindly rely on statements which cannot be tested or replicated hence
I depend on verifiable facts which you provided with the comparison
table.

Performance is key and I invested quite a couple of hours to tweak and
squeeze the system as it is by removing unnecessary lookup queries,
reviewing table indices, or introducing prefetch lookups to compensate
for fetch queries where the list of retrievable objects are
predicable. So, any change that can potentially harm performance needs
additional scrutiny and due diligence hence my insistence here.

PS: While traveling, I was revisiting the approach on how those
hierarchies are build and if time permits I will add a feature in
future which removes the need for either (TRUNCATE or DELETE FROM) and
reduces the amount of required temporary table operations by half.

On 10/16/19, Alex Winkler notifications@github.com wrote:

For the sake of this, I made a benchmark script.

<?php

error_reporting(E_ALL);
ini_set('display_errors', 1);

$dbhost = '';
$dbuser = '';
$dbpass = '';
$dbname = '';

$rows = 10000;
$runs = 5;
$table = 'smw_new';
$deleteQueries = [
  'DELETE FROM',
  'TRUNCATE TABLE',
];

echo '<!DOCTYPE html>';
echo '<html lang="en">';
echo '<head><meta charset="utf-8"><title>DELETE BENCHMARK</title></head>';
echo '<body>';
echo '<h1>DELETE BENCHMARK</h1>';

echo '<p>Run for '.$rows.' rows, running '.$runs.' times to avoid warmup
effects.</p>';

$sqlCreate = 'CREATE TEMPORARY TABLE '.$table.' ( id INT UNSIGNED KEY )
ENGINE=MEMORY';
$sqlDrop = 'DROP TEMPORARY TABLE '.$table;
$db = new mysqli($dbhost, $dbuser, $dbpass, $dbname);

for($run = 0; $run < $runs; $run++) {
  $db->query($sqlCreate);
  foreach($deleteQueries as $deleteQuery) {
      echo '<h2>Run '.$deleteQuery.' (Run '.($run+1).')</h2>';
      for($i = 0; $i < $rows; $i++) {
          $db->query('INSERT INTO '.$table.' VALUES('.$i.')');
      }
      $before = $db->query('SELECT DATE_FORMAT(NOW(6), "%Y-%m-%d %H:%i:%s.%f" )
as ts')->fetch_object()->ts;
      $after = $db->query('SELECT DATE_FORMAT(NOW(6), "%Y-%m-%d %H:%i:%s.%f" )
as ts')->fetch_object()->ts;
      echo '<p>Time before: '.$before.'</p>';
      echo '<p>Time after: '.$after.'</p>';
      $dab = new DateTime($before);
      $daa = new DateTime($after);
      echo '<p>Difference: '.$dab->diff($daa)->f.'s</p>';
  }
  $db->query($sqlDrop);
}

echo '</body>';
echo '</html>';

On my test server, the result looks something like this:

Run DELETE FROM (Run 5)
Time before: 2019-10-16 12:59:43.087599
Time after: 2019-10-16 12:59:43.087730
Difference: 0.000131s

Run TRUNCATE TABLE (Run 5)
Time before: 2019-10-16 12:59:43.633259
Time after: 2019-10-16 12:59:43.633365
Difference: 0.000106s

This is for a temporary table (same table as smw_new in the extension)
with 10000 rows. The difference is in the 1/10000 of a second, which does
seem neglectible. Especially given that SMW will rarely have 10000 rows in
this table.

I want to reiterate what notr1ch sair, that there is no master/slave
replication in our setup, so this is not related to that. Using TRUNCATE
on temporary tables causes an implicit commit, which causes our problem.

--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
https://github.com/SemanticMediaWiki/SemanticMediaWiki/issues/4319#issuecomment-542690448

PS: While traveling, I was revisiting the approach on how those
hierarchies are build and if time permits I will add a feature in
future which removes the need for either (TRUNCATE or DELETE FROM) and
reduces the amount of required temporary table operations by half.

That also works of course. I understand you can't blindly trust comments that you can't verify, fair enough. I'm glad that my statistics can help alleviate any worries you might have about this. :)

PS: Thank you for taking the time working on this!

That also works of course. I understand you can't blindly trust comments
that you can't verify, fair enough. I'm glad that my statistics can help
alleviate any worries you might have about this. :)

Well, that's the future but for now doing the proposed change is the
more conservative route of action.

On 10/16/19, Alex Winkler notifications@github.com wrote:

PS: While traveling, I was revisiting the approach on how those
hierarchies are build and if time permits I will add a feature in
future which removes the need for either (TRUNCATE or DELETE FROM) and
reduces the amount of required temporary table operations by half.

That also works of course. I understand you can't blindly trust comments
that you can't verify, fair enough. I'm glad that my statistics can help
alleviate any worries you might have about this. :)

--
You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub:
https://github.com/SemanticMediaWiki/SemanticMediaWiki/issues/4319#issuecomment-542729881

Well, that's the future but for now doing the proposed change is the more conservative route of action.

Of course. Thank you :)

Back-ported with 7a7e0a6ad9296b6ff0ea9cb0cab2f71824cbd3bc

Was this page helpful?
0 / 5 - 0 ratings