Contao: Search - No search results when the reference page is set to the news detail page

Created on 11 May 2020  ·  30 Comments  ·  Source: contao/contao

Affected version(s)
Contao 4.9.2

Description
I want my search to show only results from my blog. To do that, I selected my "News Detail" page as the reference page.

When I now run a search, no results are returned.

Reproduction in the online demo

  1. Modules > Search engine > Reference page > set to News Detail
  2. Search for a term that occurs in the news (e.g. release)
  3. No results found

Changing the reference page

  1. Remove the reference page
  2. Search for a term that occurs in the news (e.g. release)
  3. The result is displayed.
bug

All 30 comments

@Toflar I tested the problem yesterday with the latest dev-master. Are the changes already included there?

Update: it doesn't look like it; at least I couldn't find the changes in the code.
I'll quickly apply the changes manually and test again.

@contaoacademy dev-master doesn't have the changes yet. You should be able to test it with 4.9.x-dev.

You should be able to test it with 4.9.x-dev.

Thanks. The code is now included.

However, the search still doesn't work reliably with it.
I can't confirm that the problem is fixed.

Results only show up when I don't specify a reference page.

@ausi @Toflar Could you take another look at this and reopen the ticket?

Everything is correct; you need to rebuild the search index.

@contaoacademy Did you rebuild the search index and clear the search cache after applying the change?

Yes. Several times.
I'm happy to record a video for you.

Which pid do the entries in tl_search have?

I also tried it on the 4.9 branch; the news detail pages are not indexed correctly. It appears the crawler is not receiving the news detail URLs. For me, it only indexed news detail URLs that happen to be linked somewhere else on the page (e.g. in a news list).

@aschempp @qzminski is it possible that the urlmatcher does not return the matched pageModel in case of a news?

Here are the contents of tl_search and tl_search_index:
tl_search
tl_search_index

Here is the database query:

SELECT * FROM (SELECT tl_search_index.pid AS sid, GROUP_CONCAT(word) AS matches, COUNT(*) AS count, SUM(relevance) AS relevance FROM tl_search_index WHERE (word LIKE '%hund%') AND tl_search_index.pid IN(SELECT id FROM tl_search WHERE pid IN(36)) GROUP BY tl_search_index.pid ORDER BY relevance DESC) matches LEFT JOIN tl_search ON(matches.sid=tl_search.id);
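To see why the `pid IN(36)` filter in this query matters, here is a minimal sketch using Python's built-in sqlite3 module and a heavily simplified version of the two tables. The schema, the sample rows, and the helper name `search` are assumptions made for illustration; the real Contao tables have more columns and run on MySQL/MariaDB.

```python
import sqlite3

# Simplified, hypothetical versions of tl_search and tl_search_index.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tl_search (id INTEGER PRIMARY KEY, pid INTEGER, url TEXT);
    CREATE TABLE tl_search_index (pid INTEGER, word TEXT, relevance INTEGER);

    -- Entry 1 was indexed with the news detail page (36) as pid,
    -- entry 2 was indexed with a different page (13) as pid.
    INSERT INTO tl_search VALUES (1, 36, '/news/first'), (2, 13, '/news/second');
    INSERT INTO tl_search_index VALUES (1, 'hund', 5), (2, 'hund', 3);
""")


def search(word, reference_pages=None):
    """Return matching URLs, optionally limited to the given page IDs."""
    sql = (
        "SELECT s.url FROM tl_search_index i "
        "JOIN tl_search s ON s.id = i.pid "
        "WHERE i.word LIKE ?"
    )
    params = [f"%{word}%"]
    if reference_pages:
        placeholders = ",".join("?" for _ in reference_pages)
        sql += f" AND s.pid IN ({placeholders})"
        params.extend(reference_pages)
    sql += " GROUP BY s.id ORDER BY SUM(i.relevance) DESC"
    return [row[0] for row in conn.execute(sql, params)]


print(search("hund"))        # no reference page: both entries match
print(search("hund", [36]))  # reference page 36: only entries with pid 36
```

In this toy setup, an entry whose pid was stored incorrectly during indexing disappears as soon as a reference page is set, even though the word itself is in the index.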

@aschempp @qzminski is it possible that the urlmatcher does not return the matched pageModel in case of a news?

Pretty sure it does. CMF-Routing (currently) does not know anything about "news" or the like. It simply matches a page with additional {parameters} at the end, which are then converted to $_GET as per Contao 3.

I don't think it could be a subrequest either. But what @fritzmg said makes sense, not sure if that is the problem though?
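The matching described in this comment can be pictured with a toy model. This is not Contao's router code: the function name `match_page`, the alias table, and the `auto_item` key for a single trailing fragment are assumptions for illustration only.

```python
def match_page(path, pages):
    """Match a request path against known page aliases.

    `pages` maps a page alias to a page ID. Anything after the alias is
    treated as extra parameters, loosely modelled on the legacy behaviour.
    Returns (page_id, params), or None when no alias matches.
    """
    fragments = [f for f in path.strip("/").split("/") if f]
    if not fragments or fragments[0] not in pages:
        return None

    page_id = pages[fragments[0]]
    rest = fragments[1:]
    params = {}

    # Assumption: an odd number of trailing fragments means the first one
    # is a standalone item (for example a news alias).
    if len(rest) % 2 == 1:
        params["auto_item"] = rest.pop(0)

    # Remaining fragments are read as key/value pairs.
    for key, value in zip(rest[0::2], rest[1::2]):
        params[key] = value

    return page_id, params


pages = {"news": 36}  # hypothetical alias -> page ID
print(match_page("/news/lorem-ipsum", pages))
print(match_page("/unknown/lorem-ipsum", pages))
```

The point of the sketch is that the matcher only needs to know the page alias; the news item itself is just a trailing parameter, so a failed match means no page (and therefore no page ID) is found at all.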

Ah, I think the indexing is correct, so it's not a problem of the linked PR. As you can see, the pid is correctly assigned in tl_search.
But I guess something might be wrong with the query adjustments of @ausi in https://github.com/contao/contao/pull/1678.

I will take a look at it.

@contaoacademy Can you send me an SQL dump of your setup? (e.g. via Slack)

Ah, I think the indexing is correct, so it's not a problem of the linked PR. As you can see, the pid is correctly assigned in tl_search.

The search indexer skips pages where the pid could not be determined. I get this for most news detail pages, but not for all (for whatever reason):

"Contao\CoreBundle\Crawl\Escargot\Subscriber\SearchIndexSubscriber",http://c49.local/news/lorem-ipsum,http://c49.local/share/c49-en.xml,3,noindex,"Forwarded to the search indexer. Did not index because of the following reason: No page ID could be determined."

Pretty sure it does. CMF-Routing (currently) does not know anything about "news" or the like. It simply matches a page with additional {parameters} at the end, which are then converted to $_GET as per Contao 3.

For most news URLs

https://github.com/contao/contao/blob/41c7ba2827e81190dd97efe1732c84b77f4ca036/core-bundle/src/Search/Indexer/DefaultIndexer.php#L116

throws

"None of the routers in the chain matched url '/news/lorem-ipsum'"

and thus no page ID is retrieved.
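To get a list of all affected URLs from a crawl log, log lines like the one quoted above can be filtered with Python's csv module. This is a rough sketch: the column names are assumptions inferred from the single sample line, and the helper name `skipped_urls` is hypothetical.

```python
import csv
import io

# Column names are assumptions inferred from the sample log line.
COLUMNS = ["source", "url", "found_on", "level", "tag", "message"]
REASON = "No page ID could be determined."


def skipped_urls(log_text):
    """Return URLs that the indexer skipped for lack of a page ID."""
    reader = csv.DictReader(io.StringIO(log_text), fieldnames=COLUMNS)
    return [
        row["url"]
        for row in reader
        if row["message"] and REASON in row["message"]
    ]


# The sample line from the comment above, split for readability.
sample = (
    r'"Contao\CoreBundle\Crawl\Escargot\Subscriber\SearchIndexSubscriber",'
    r'http://c49.local/news/lorem-ipsum,http://c49.local/share/c49-en.xml,'
    r'3,noindex,"Forwarded to the search indexer. Did not index because of '
    r'the following reason: No page ID could be determined."'
)
print(skipped_urls(sample))
```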

@contaoacademy Can you send me an SQL dump of your setup? (e.g. via Slack)

Done

It is not related to the search query. Something with the indexing seems to be wrong. As you can see in https://github.com/contao/contao/issues/1730#issuecomment-627240088 some of the news get indexed with pid set to 13 and 13 is the id of the error 404 page.
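One way to check for this is to count the indexed entries per pid and look up the page type for each. The sketch below uses sqlite3 with made-up rows; the reduced schema and the `type` values shown for tl_page are assumptions for illustration, not a dump from the reporter's setup.

```python
import sqlite3

# Reduced, hypothetical versions of tl_page and tl_search.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tl_page (id INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE tl_search (id INTEGER PRIMARY KEY, pid INTEGER, url TEXT);

    INSERT INTO tl_page VALUES (13, 'error_404'), (36, 'regular');
    INSERT INTO tl_search VALUES
        (1, 36, '/news/first'),
        (2, 13, '/news/second'),
        (3, 13, '/news/third');
""")

# Count indexed entries per page and show the page type next to it.
rows = conn.execute("""
    SELECT s.pid, p.type, COUNT(*) AS entries
    FROM tl_search s
    LEFT JOIN tl_page p ON p.id = s.pid
    GROUP BY s.pid, p.type
    ORDER BY s.pid
""").fetchall()

for pid, page_type, entries in rows:
    print(pid, page_type, entries)
```

Entries that show up under an error page's pid are the ones a reference-page filter would hide.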

Okay, then something with the matcher doesn't work 🤷‍♂️

Tried to find the issue here together with @qzminski - no luck so far.
Something in that url matching process is really strange.

I did some tests. The error is new in Contao 4.9.2.
Contao 4.9.1 works correctly.

I found the files that cause the problem:
core-bundle/src/Resources/contao/pages/PageRegular.php
core-bundle/src/Search/Document.php

https://github.com/contao/contao/pull/1457/files

Update: Maybe some more files… but if I replace these two files in Contao 4.9.1 with the ones from Contao 4.9.2, the search doesn't work any more.

After all I think this isn't something new because you already tried to fix it with https://github.com/contao/contao/pull/1670.

If I can help with further tests, let me know.

I did some more tests with 4.9.x-dev

I found something interesting.

If I disable my 404 page and run the crawler twice, tl_search is filled correctly.

After the first crawl:
crawl-1

After the second crawl:
crawl-2

Here are the logs from both crawls:
logs.zip

@contaoacademy Can you please check out the latest 4.9.x-dev branch and try again? We have fixed the issue in #1738.

Thx. Looks great!
