Semanticmediawiki: &#39; vs. ' using {{PAGENAME}}

Created on 3 May 2020 · 6 Comments · Source: SemanticMediaWiki/SemanticMediaWiki

Setup and configuration

{
    "SMWElasticStore": {
        "mysql": "10.1.44-MariaDB-0ubuntu0.18.04.1",
        "es": "6.5.4"
    },
    "smw": "3.2.0-alpha",
    "mediawiki": "1.34.1",
    "php": "7.2.24-0ubuntu0.18.04.4"
}

Issue

I couldn't explain the difference in behaviour for [0] on the following use case:

{{#ask:
 [[Has title::{{PAGENAME}}]] 
}}

Doesn't match anything

vs.

{{#ask:
 [[Has title::MEMOIRE D'AMOUR PERDU]] 
}}

Returns expected results

Findings

The issue is that {{PAGENAME}} returns an HTML-encoded string: for MEMOIRE D'AMOUR PERDU it yields MEMOIRE D&#39;AMOUR PERDU. The query that relies on {{PAGENAME}} therefore contains [[Has title::MEMOIRE D&#39;AMOUR PERDU]] and matches nothing, because the annotation is stored as MEMOIRE D'AMOUR PERDU.
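The mismatch can be reproduced outside of the wiki with a minimal Python sketch. The helper names `simulate_pagename` and `values_match` are hypothetical and are not part of MediaWiki or Semantic MediaWiki; the sketch only assumes that the apostrophe is turned into `&#39;`, as described above, and that the store compares text values literally.

```python
import html


def simulate_pagename(title: str) -> str:
    """Hypothetical stand-in for {{PAGENAME}}: encode the apostrophe only."""
    return title.replace("'", "&#39;")


def values_match(query_value: str, stored_value: str) -> bool:
    """Assumed literal comparison, as for a text type property."""
    return query_value == stored_value


stored = "MEMOIRE D'AMOUR PERDU"          # annotation as typed by the user
from_magic_word = simulate_pagename(stored)

print(from_magic_word)                                        # encoded form
print(values_match(from_magic_word, stored))                  # literal compare fails
print(values_match(html.unescape(from_magic_word), stored))   # decoded compare succeeds
```

Decoding the entity before comparing makes the two strings equal again, which is why a manually typed title matches while the magic word output does not.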

I don't know whether {{PAGENAME}} has always encoded ' or not or whether it is only a recent phenomenon with MW 1.34 assuming that the [0] query worked in the past. Anyway, using parser replacements such as {{PAGENAME}} may result in unexpected behaviour.

Note

If the property holds a page type reference, the issue seems less severe than with a text type property. A page type value is validated internally using a Title transformation (to ensure the input can build a valid Title reference) to create an entity representation, while the text type uses the input as-is without any transformation.
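The difference between the two property types can be sketched as follows. Both functions are hypothetical illustrations, not the actual SMW or MediaWiki code: `as_page_value` merely assumes that a Title-like normalization decodes HTML entities and unifies underscores and whitespace, while `as_text_value` keeps the input untouched.

```python
import html


def as_text_value(raw: str) -> str:
    """Text type (assumed): the input is used as-is."""
    return raw


def as_page_value(raw: str) -> str:
    """Page type (assumed): normalize roughly the way a title would be.

    Decodes HTML entities, treats underscores as spaces and collapses
    repeated whitespace. This is an illustrative approximation only.
    """
    decoded = html.unescape(raw)
    return " ".join(decoded.replace("_", " ").split())


stored = "MEMOIRE D'AMOUR PERDU"
query = "MEMOIRE D&#39;AMOUR PERDU"

# Text type: the encoded and plain strings stay different.
print(as_text_value(query) == as_text_value(stored))
# Page type: both sides normalize to the same representation.
print(as_page_value(query) == as_page_value(stored))
```

Under these assumptions the normalization step hides the encoding difference for page type values, whereas text type values expose it directly.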

[0] https://sandbox.semantic-mediawiki.org/wiki/MEMOIRE_D%27AMOUR_PERDU

documentation

All 6 comments

I don't know whether {{PAGENAME}} has always encoded ' or not or whether it is only a recent phenomenon with MW 1.34

It is not recent, see: https://www.mediawiki.org/wiki/Manual:PAGENAMEE_encoding#PAGENAME
(the page was created on 24 March 2011).

Still, there is something in the water. I happen to know that this query worked in previous versions. Whether that was 1.33 or earlier I cannot tell.

Apparently there are others having similar issues with ' [0].

I don't think there is anything we can do besides raising the point and making users aware that using {{FULLPAGENAME}} or {{PAGENAME}} can create string representations that contain HTML-encoded parts and may not be comparable to what the repository has stored when a user simply wrote ' (not HTML-encoded) as the value input.

Still, there is something in the water. I happen to know that this query worked in previous versions.

The following test confirms the difference in behaviour.

{
    "description": "Test `{{PAGENAME}}` with encoded HTML (#4764)",
    "setup": [
        {
            "namespace": "SMW_NS_PROPERTY",
            "page": "Has Page",
            "contents": "[[Has type::Page]]"
        },
        {
            "namespace": "SMW_NS_PROPERTY",
            "page": "Has text",
            "contents": "[[Has type::text]]"
        },
        {
            "page": "P0468 ' 1",
            "contents": "[[Has text::P0468 ' 1]] query: {{#ask: [[Has text::{{PAGENAME}}]] |default=not matched }}"
        },
        {
            "page": "P0468 ' 2",
            "contents": "[[Has text::P0468 ' 2]] query: {{#ask: [[Has text::P0468 ' 2]] |default=not matched }}"
        },
        {
            "page": "P0468 ' 3",
            "contents": "[[Has text::{{PAGENAME}}]] query: {{#ask: [[Has text::{{PAGENAME}}]] |default=not matched }}"
        }
    ],
    "tests": [
        {
            "type": "parser",
            "about": "#0 (using `{{PAGENAME}}` as query content replacement)",
            "subject": "P0468 ' 1",
            "assert-output": {
                "to-contain": [
                    ">P0468 ' 1 query: not matched"
                ]
            }
        },
        {
            "type": "parser",
            "about": "#1 (manually query input)",
            "subject": "P0468 ' 2",
            "assert-output": {
                "to-contain": [
                    "P0468 ' 2 query: <a class=\"mw-selflink selflink\">P0468 ' 2</a>"
                ]
            }
        },
        {
            "type": "parser",
            "about": "#2",
            "subject": "P0468 ' 3",
            "assert-output": {
                "to-contain": [
                    "P0468 &#39; 3 query: not matched"
                ]
            }
        }
    ],
    "settings": {
        "wgContLang": "en",
        "wgLang": "en",
        "smwgPageSpecialProperties": [
            "_MDAT"
        ]
    },
    "meta": {
        "version": "2",
        "is-incomplete": false,
        "debug": false
    }
}

[0] https://phabricator.wikimedia.org/T251962

@kghbln I think positions have been clarified. We might also add the test to verify the intended behaviour, but aside from that nothing actionable is expected, and users who find themselves in the situation above can use this ticket as a reference.

In other words: What do I have to use if I would like to automatically add the page name to the query? Currently I have no clue. I tried all possible variants and apart from adding the title manually nothing worked.

The last time I changed the page, which was working at the time, was 2019-08-26, which proves that MW 1.33 did not have the issue. Only MW 1.34 shows the borked behaviour.

Seriously? I have to install the ParserFunctions extension and use an extra parser function {{#titleparts: {{PAGENAME}} |1 }} to retrieve a page name automatically in such a case???
