Semanticmediawiki: Warning about the table specification in SQL requests

Created on 28 Jan 2020  Â·  5Comments  Â·  Source: SemanticMediaWiki/SemanticMediaWiki

Setup and configuration

  • SMW version: 3.1.2 (e00b1755a)
  • MW version: 1.33.2
  • PHP version: 7.2.24
  • DB system (MySQL, Blazegraph, etc.) and version: MySQL 5.7.28

Issue

This is only a maintenance issue, probably for the mid-term. Everything works fine around this issue, at least for now.

There are warnings such as Wikimedia\Rdbms\Database::tableName: use of subqueries is not supported this way. in the logs. This is a result of the warning added in wikimedia/mediawiki@d395dfb039fc on 2018-02-14. In the given stack, it is caused by this type of parameter "table name" in a select query: it is expected it is only a simple table name, but here it is an escaped table name with a JOIN (myprefix_smw_object_ids AS t0 INNER JOIN myprefix_t1 AS t1 ON t0.smw_id=t1.id in my specific example).

A way to fix it would be to use a description of the JOIN with a PHP array in the parameter join_conds (last parameter in select()) such as [ 'joined_table' => [ 'INNER JOIN', [ 'y = z' ] ].

At first sight, for this specific stack trace, it should be modified in SMW\SQLStore\QueryEngine\QueryEngine and in SMW\SQLStore\QueryEngine\QuerySegmentListProcessor::table(), but there are other occurrences of a select query where the table name is in fact a table name with a JOIN.

I have no idea if MediaWiki will impose such specification about table names or will still accept raw SQL, but I guess this warning is probably here to discourage raw SQL. I will ask Aaron in the following days.

Stack trace

{
  "class": "RuntimeException",
  "message": "",
  "code": 0,
  "file": "/opt/mediawiki/1.33.2/includes/libs/rdbms/database/Database.php:2436",
  "trace": [
    "/opt/mediawiki/1.33.2/includes/libs/rdbms/database/Database.php:2554",
    "/opt/mediawiki/1.33.2/includes/libs/rdbms/database/Database.php:2665",
    "/opt/mediawiki/1.33.2/includes/libs/rdbms/database/Database.php:1826",
    "/opt/mediawiki/1.33.2/includes/libs/rdbms/database/Database.php:1782",
    "/opt/mediawiki/1.33.2/includes/libs/rdbms/database/DBConnRef.php:53",
    "/opt/mediawiki/1.33.2/includes/libs/rdbms/database/DBConnRef.php:297",
    "/opt/mediawiki/1.33.2/extensions/SemanticMediaWiki/src/MediaWiki/Connection/Database.php:288",
    "/opt/mediawiki/1.33.2/extensions/SemanticMediaWiki/src/SQLStore/QueryEngine/QueryEngine.php:417",
    "/opt/mediawiki/1.33.2/extensions/SemanticMediaWiki/src/SQLStore/QueryEngine/QueryEngine.php:224",
    "/opt/mediawiki/1.33.2/extensions/SemanticMediaWiki/src/SQLStore/SQLStore.php:374",
    "/opt/mediawiki/1.33.2/extensions/SemanticMediaWiki/src/SQLStore/SQLStore.php:362",
    "/opt/mediawiki/1.33.2/extensions/SemanticMediaWiki/includes/query/SMW_QueryProcessor.php:338",
    "/opt/mediawiki/1.33.2/extensions/SemanticMediaWiki/src/ParserFunctions/AskParserFunction.php:364",
    "/opt/mediawiki/1.33.2/extensions/SemanticMediaWiki/src/ParserFunctions/AskParserFunction.php:197",
    "/opt/mediawiki/1.33.2/extensions/SemanticMediaWiki/src/ParserFunctionFactory.php:402",
    "/opt/mediawiki/1.33.2/includes/parser/Parser.php:3528",
    "/opt/mediawiki/1.33.2/includes/parser/Parser.php:3235",
    "/opt/mediawiki/1.33.2/includes/parser/Preprocessor_DOM.php:1285",
    "/opt/mediawiki/1.33.2/includes/parser/Parser.php:3409",
    "/opt/mediawiki/1.33.2/includes/parser/Preprocessor_DOM.php:1285",
    "/opt/mediawiki/1.33.2/includes/parser/Parser.php:3049",
    "/opt/mediawiki/1.33.2/includes/parser/Parser.php:1359",
    "/opt/mediawiki/1.33.2/includes/parser/Parser.php:491",
    "/opt/mediawiki/1.33.2/includes/content/WikitextContent.php:369",
    "/opt/mediawiki/1.33.2/includes/content/AbstractContent.php:555",
    "/opt/mediawiki/1.33.2/includes/Revision/RenderedRevision.php:265",
    "/opt/mediawiki/1.33.2/includes/Revision/RenderedRevision.php:234",
    "/opt/mediawiki/1.33.2/includes/Revision/RevisionRenderer.php:193",
    "/opt/mediawiki/1.33.2/includes/Revision/RevisionRenderer.php:142",
    "{\"function\":\"MediaWiki\\\\Revision\\\\{closure}\",\"class\":\"MediaWiki\\\\Revision\\\\RevisionRenderer\",\"type\":\"->\",\"args\":[\"[object] (MediaWiki\\\\Revision\\\\RenderedRevision: {})\",[]]}",
    "/opt/mediawiki/1.33.2/includes/Revision/RenderedRevision.php:197",
    "/opt/mediawiki/1.33.2/includes/poolcounter/PoolWorkArticleView.php:194",
    "/opt/mediawiki/1.33.2/includes/poolcounter/PoolCounterWork.php:123",
    "/opt/mediawiki/1.33.2/includes/page/WikiPage.php:1254",
    "/opt/mediawiki/1.33.2/includes/diff/DifferenceEngine.php:922",
    "/opt/mediawiki/1.33.2/includes/diff/DifferenceEngine.php:883",
    "/opt/mediawiki/1.33.2/includes/diff/DifferenceEngine.php:711",
    "/opt/mediawiki/1.33.2/includes/page/Article.php:931",
    "/opt/mediawiki/1.33.2/includes/page/Article.php:615",
    "/opt/mediawiki/1.33.2/includes/actions/ViewAction.php:68",
    "/opt/mediawiki/1.33.2/includes/MediaWiki.php:499",
    "/opt/mediawiki/1.33.2/includes/MediaWiki.php:294",
    "/opt/mediawiki/1.33.2/includes/MediaWiki.php:865",
    "/opt/mediawiki/1.33.2/includes/MediaWiki.php:515",
    "/opt/mediawiki/1.33.2/index.php:42"
  ]
}

Steps to reproduce

  1. Use a log management system like Graylog + Monolog or it can be set $wgDebugLogFile = '/tmp/some-file.log'; in LocalSettings.php.
  2. Navigate on some page where there is a SMW query Ask.
bug

Most helpful comment

@AaronSchulz I ping you here because the log-warning discussed here was introduced by you in wikimedia/mediawiki@d395dfb039fc and is triggered by many Semantic MediaWiki calls. In this case the table name is a string with a JOIN with other tables.

Some questions below but before I explain the stack trace:

  1. First, SMW calls internal SMW database wrapper for a SELECT
  2. Second, SMW internal database wrapper calls MW DBConnRef->select()
  3. Third, DBConnRef->select() directly passes its arguments to Database->select()
  4. Fourth Database->select() passes its arguments to Database->selectSQLText()
  5. Fifth, Database->selectSQLText( tableName, … ) detects the table argument is not an array and not an empty string, and calls Database->tableNamesWithIndexClauseOrJOIN( [ tableName ], … )
  6. Sixth, Database->tableNamesWithIndexClauseOrJOIN() calls Database->tableNameWithAlias(), which directly calls Database->tableName()
  7. At the end, Database->tableName() detects SQL keywords (AS and JOIN), triggers a warning, and returns its original parameter.

Questions:

  1. Given you added a warning here, I guess passing a precomputed table-name-with-a-JOIN is now an undesirable behaviour, right? Just to be sure it is not a side-effect given you introduced it in a commit related to subqueries.
  2. If so, is there some recommendation/documentation/best practice to create a JOIN request, and more generally with a lot of special features like @mwjames says: “join construct including temp tables, index hints etc”?
    I found Manual:Database access with (only) one example with a JOIN, but many examples in MW codebase, particularly in the various special pages; I guess these examples could be used to improve documentation. I am voluntary to improve the page mentioned above using these examples (if it is the right page), could you review it after and if needed expand it with edge cases?
  3. If possibly this feature of passing a precomputed table-name-with-a-JOIN is going to become a deprecated feature, is there some planning?

All 5 comments

Thanks for the elaborate report!

There are warnings such as Wikimedia\Rdbms\Database::tableName: use of subqueries is not supported this way. in the logs. This is a result of the warning added in wikimedia/mediawiki@d395dfb on 2018-02-14. In the given stack, it is caused by this type of parameter "table name" in a select query: it is expected it is only a simple table name, but here it is an escaped table name with a JOIN (myprefix_smw_object_ids AS t0 INNER JOIN myprefix_t1 AS t1 ON t0.smw_id=t1.id in my specific example).

I appreciated the report, and yes, I've seen these warnings before, yet they bring absolute no benefits for extensions and only increase migration costs which I find increasingly irritating. It just so happened that last week I'd had to spend a couple of hours bisecting a particular WMF infused change to make MW 1.34 work again with SMW (and the way the change was introduced could have been done differently), wasting hours of my time to find that patch and introduce some spaghetti code in SMW to go around it is not something I'd like to continue to do.

If someone wants to have a look at these warning feel free but please don't expect me to make this any priority on my task list.

I have no idea if MediaWiki will impose such specification about table names or will still accept raw SQL, but I guess this warning is probably here to discourage raw SQL.

I hope not because there are areas in SMW that rely on raw SQL given that the MW Database interface in some cases is not reliable.

the MW Database interface in some cases is not reliable.

Are those issues documented?

Are those issues documented?

I don't know but during my time working on SMW and MW I found several situations.

So, it depends on the use case which I have seen in connection with some SELECT queries where the array notation via Database::select is just not reliable when working with a lot of join construct including temp tables, index hints etc,. Building your SQL query from ground up and then inject it via Database:query is much more cleaner and avoids any WTF moments.

@AaronSchulz I ping you here because the log-warning discussed here was introduced by you in wikimedia/mediawiki@d395dfb039fc and is triggered by many Semantic MediaWiki calls. In this case the table name is a string with a JOIN with other tables.

Some questions below but before I explain the stack trace:

  1. First, SMW calls internal SMW database wrapper for a SELECT
  2. Second, SMW internal database wrapper calls MW DBConnRef->select()
  3. Third, DBConnRef->select() directly passes its arguments to Database->select()
  4. Fourth Database->select() passes its arguments to Database->selectSQLText()
  5. Fifth, Database->selectSQLText( tableName, … ) detects the table argument is not an array and not an empty string, and calls Database->tableNamesWithIndexClauseOrJOIN( [ tableName ], … )
  6. Sixth, Database->tableNamesWithIndexClauseOrJOIN() calls Database->tableNameWithAlias(), which directly calls Database->tableName()
  7. At the end, Database->tableName() detects SQL keywords (AS and JOIN), triggers a warning, and returns its original parameter.

Questions:

  1. Given you added a warning here, I guess passing a precomputed table-name-with-a-JOIN is now an undesirable behaviour, right? Just to be sure it is not a side-effect given you introduced it in a commit related to subqueries.
  2. If so, is there some recommendation/documentation/best practice to create a JOIN request, and more generally with a lot of special features like @mwjames says: “join construct including temp tables, index hints etc”?
    I found Manual:Database access with (only) one example with a JOIN, but many examples in MW codebase, particularly in the various special pages; I guess these examples could be used to improve documentation. I am voluntary to improve the page mentioned above using these examples (if it is the right page), could you review it after and if needed expand it with edge cases?
  3. If possibly this feature of passing a precomputed table-name-with-a-JOIN is going to become a deprecated feature, is there some planning?
Was this page helpful?
0 / 5 - 0 ratings