Readthedocs.org: Search isn't indexing the index page of nested pages in htmldir builder

Created on 27 Feb 2019 · 12 Comments · Source: readthedocs/readthedocs.org

You can test with this project

https://anymail.readthedocs.io/en/latest/search/?q=Sending+email&check_keywords=yes&area=default

There aren't results from this page https://anymail.readthedocs.io/en/latest/sending/

This is kind of related to the doctype that we want to remove https://github.com/rtfd/readthedocs.org/issues/4638 and kind of the same as https://github.com/rtfd/readthedocs.org/issues/5254

In a more minimal example this is more visible: our search returns 0 results and falls back to Sphinx.

The explanation is that we try to index the file sending.fjson, but that file doesn't exist. What we want is to index sending/index.fjson.
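A minimal sketch of that mismatch, assuming the lookup drops the `.html` suffix and strips a trailing `/index` before appending `.fjson`. The function names are hypothetical and this is not the actual readthedocs.org code:

```python
import posixpath


def adjusted_json_path(html_path):
    """Hypothetical model of the lookup: strip a trailing '/index' from the basename."""
    basename = posixpath.splitext(html_path)[0]
    if basename.endswith("/index"):
        basename = basename[: -len("/index")]
    return basename + ".fjson"


def unadjusted_json_path(html_path):
    """Same lookup without the '/index' stripping step."""
    return posixpath.splitext(html_path)[0] + ".fjson"


# The path we end up looking for versus the path that exists for this page.
print(adjusted_json_path("sending/index.html"))
print(unadjusted_json_path("sending/index.html"))
```

With the stripping step the lookup asks for `sending.fjson`; without it, it asks for `sending/index.fjson`, which is the file we want for this page.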

Bug

All 12 comments

We shouldn't use doctype anymore :).

I think basename is sending (not sending/index or index), that's why.

How do you figure out if you should check for an HTMLDir path without doctype? I guess we could just try both, but that feels like it would lead to other bugs.
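For reference, the "try both" idea could look roughly like the sketch below. The helper names, the `json_root` argument, and the injectable `exists` check are all assumptions made for illustration, not existing readthedocs.org code:

```python
import os
import posixpath


def candidate_json_paths(html_path):
    """Return the possible .fjson paths for an HTML path, most literal first."""
    basename = posixpath.splitext(html_path)[0]
    candidates = [basename + ".fjson"]
    if basename.endswith("/index"):
        # HTMLDir-style page: the source may have been 'name.rst' instead of 'name/index.rst'.
        candidates.append(basename[: -len("/index")] + ".fjson")
    return candidates


def find_json_file(json_root, html_path, exists=os.path.exists):
    """Return the first candidate that exists under json_root, or None."""
    for relative in candidate_json_paths(html_path):
        full_path = posixpath.join(json_root, relative)
        if exists(full_path):
            return full_path
    return None
```

The order of the candidates is a guess. If both files exist for the same page, this silently picks the first one, which is the kind of hidden bug the question above worries about.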

That's what I want to figure out now. If there isn't a clean way, we have the doctype from the config module in the database (old projects don't have this, but we could create a migration).

Seems like this should work?

In [6]: for p in HTMLFile.objects.filter(project__slug='anymail', version__slug='latest', path='sending/index.html'):
   ...:     print(p.json_file_path)
   ...:
readthedocs.projects.models:1154[6212]: INFO Adjusted json file path: sending/index -> sending
/home/docs/checkouts/readthedocs.org/media/json/anymail/latest/sending.fjson

Oh, the filename is /home/docs/checkouts/readthedocs.org/media/json/anymail/latest/sending/index.fjson -- that's the issue.

Seems like the logic is broken, and without it it would work 😢

Seems like this logic works for everything except the index file... There's some weird edge case here:

In [1]:
   ...: import os
   ...: for p in HTMLFile.objects.filter(project__slug='anymail', version__slug='latest', path__startswith='sending'):
   ...:     print(p.json_file_path)
   ...:     print(os.path.exists(p.json_file_path))
   ...:
   ...:
readthedocs.projects.models:1154[9789]: INFO Adjusted json file path: sending/templates/index -> sending/templates
/home/docs/checkouts/readthedocs.org/media/json/anymail/latest/sending/templates.fjson
True
readthedocs.projects.models:1154[9789]: INFO Adjusted json file path: sending/tracking/index -> sending/tracking
/home/docs/checkouts/readthedocs.org/media/json/anymail/latest/sending/tracking.fjson
True
readthedocs.projects.models:1154[9789]: INFO Adjusted json file path: sending/django_email/index -> sending/django_email
/home/docs/checkouts/readthedocs.org/media/json/anymail/latest/sending/django_email.fjson
True
readthedocs.projects.models:1154[9789]: INFO Adjusted json file path: sending/signals/index -> sending/signals
/home/docs/checkouts/readthedocs.org/media/json/anymail/latest/sending/signals.fjson
True
readthedocs.projects.models:1154[9789]: INFO Adjusted json file path: sending/exceptions/index -> sending/exceptions
/home/docs/checkouts/readthedocs.org/media/json/anymail/latest/sending/exceptions.fjson
True
readthedocs.projects.models:1154[9789]: INFO Adjusted json file path: sending/anymail_additions/index -> sending/anymail_additions
/home/docs/checkouts/readthedocs.org/media/json/anymail/latest/sending/anymail_additions.fjson
True
readthedocs.projects.models:1154[9789]: INFO Adjusted json file path: sending/index -> sending
/home/docs/checkouts/readthedocs.org/media/json/anymail/latest/sending.fjson
False

Yeah, index is the only one affected.

It's because sending.rst or sending/index.rst both lead to the same file :/

There isn't a sending.rst file, just sending/index.rst https://github.com/anymail/django-anymail/blob/master/docs/sending/index.rst

Right, but both files could lead to the same output file, so there's no way to know which it came from :/
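To make the ambiguity concrete, here is a simplified model of the output naming. It assumes an HTMLDir-style build writes `name/index.html` for a document called `name`, while the JSON build keeps the document name as-is. Both functions are illustrative only, not Sphinx's actual code:

```python
def dirhtml_output(docname):
    """Simplified model: where an HTMLDir-style build writes a document."""
    if docname == "index" or docname.endswith("/index"):
        return docname + ".html"
    return docname + "/index.html"


def json_output(docname):
    """Simplified model: the JSON build keeps the document name unchanged."""
    return docname + ".fjson"


# 'sending' and 'sending/index' collide on the HTML side but not on the JSON side.
for docname in ("sending", "sending/index", "sending/templates"):
    print(docname, "->", dirhtml_output(docname), "|", json_output(docname))
```

Under this model, `sending/index.html` alone cannot tell us whether to read `sending.fjson` or `sending/index.fjson`. That is why stripping `/index` works for `sending/templates/index.html` (source `sending/templates.rst`) but fails for `sending/index.html` (source `sending/index.rst`).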
