Search appears to work for the beginning of a word, but not when the full word is queried.
Query: "where"
Expected preview options: any heading, paragraph, etc. containing "where"
site_name: 'DEDocketResearch'
theme: 'material'
extra_css: [extra.css]
extra_javascript: [extra.js]
nav:
- '': 'index.md'
- '1-1-DELADocketSurvey': '1-1-DELADocketSurvey/1-1-DELADocketSurvey.md'
- '1-2-DEJMTScrapeAndSurvey': '1-2-DEJMTScrapeAndSurvey/1-2-DEJMTScrapeAndSurvey.md'
- '1-3-LAJMTInitialComparison': '1-3-LAJMTInitialComparison/1-3-LAJMTInitialComparison.md'
I saw the related issues, but couldn't figure out if/how this was related exactly.
This project is awesome and @squidfunk is so responsive!
Have you seen #1097? It's definitely related to Lunr.js stemmer. Furthermore, which/what/where may be stopwords.
Stop words. Right. That makes sense: "wh" stems correctly (including results with "which"), but "which" as a whole is filtered out.
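To make the stop word effect concrete, here is a minimal sketch of an index-time stop word filter. It is a simplified model, not Lunr.js's actual code: the stop word list is a small illustrative subset, and `buildIndex` and `search` are hypothetical helpers that only do prefix matching. It assumes the filter runs while the index is built, so a filtered word never reaches the index at all.

```javascript
// Simplified model of an index-time stop word filter (not Lunr.js's real code).
// STOP_WORDS is an illustrative subset, not the full list.
const STOP_WORDS = new Set(["where", "which", "what"]);

// Build a term -> set-of-document-ids index, optionally dropping stop words.
function buildIndex(docs, { stopwords = true } = {}) {
  const index = new Map();
  for (const [id, text] of Object.entries(docs)) {
    const terms = text.toLowerCase().split(/\W+/).filter(Boolean);
    for (const term of terms) {
      if (stopwords && STOP_WORDS.has(term)) continue; // term never gets indexed
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(id);
    }
  }
  return index;
}

// Prefix search: a document matches if any indexed term starts with the query.
function search(index, query) {
  const q = query.toLowerCase();
  const hits = new Set();
  for (const [term, ids] of index) {
    if (term.startsWith(q)) ids.forEach((id) => hits.add(id));
  }
  return [...hits].sort();
}

const docs = {
  a: "Where the docket is filed",
  b: "The whole survey",
};

const filtered = buildIndex(docs);
const unfiltered = buildIndex(docs, { stopwords: false });

console.log(search(filtered, "wh"));      // matches "whole" only
console.log(search(filtered, "where"));   // no match: "where" was dropped
console.log(search(unfiltered, "where")); // matches once the filter is off
```

In this model a partial query still finds other words that share the prefix, while the full stop word finds nothing until the filter is disabled.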
I'm looking at this https://github.com/olivernn/lunr.js/issues/212 – Any guidance on doing this within the theme? Do you expect I will have to rebuild?
For future reference, I will try to lay out the process which Material currently uses for localization and in the end sketch out how to achieve what you're asking for:
The English localization file partials/language/en.html is the base from which all other languages _extend_, which means that if a localization file does not specify a value for a placeholder, it will always fall back to the respective English translation. This is particularly true for the placeholders that were introduced after some of the localizations were submitted, for example the skip.to.content placeholder for the equally-titled button. French, for example, will show "Skip to content", as it doesn't specify a translation for the placeholder. Now, this file contains three placeholders that are used to configure search behavior:
Lunr.js provides stemmers for some languages through lunr-languages (which is integrated with Material), but not for all of the languages (currently 36) supported by Material. Because I wanted to support search in those languages, I fiddled around with Lunr.js and found out that if I disable stemming and the stopword filter, those languages could be searched, too. The search experience may not be as smooth as it is with English, but it's better than nothing, like for example Hebrew:
This was the reason why I pulled search configuration into the localization files. As those values must be accessible from JavaScript, they are defined as meta tags within the head section:
This approach seems to work reasonably well. Some languages use stemmers from other languages, as they _seem_ to work well enough (Chinese and Korean use jp, Serbo-Croatian uses ro, etc.). Why _seem_? Because I don't speak those languages, but when integrating them I always try to search some of the localized terms to see whether Lunr.js catches them on a best effort basis.
So, answering your question, how could you disable stemming and stopword filtering? You just need to override partials/language/en.html and unset the three placeholders:
- search.language: if this is set, the respective lunr.<language>.js file containing the stemmer for the language is loaded and initialized. Otherwise, no stemmer is loaded.
- search.pipeline.stopwords: if this is set to false, the _stopword filter_ will be removed from the pipeline.
- search.pipeline.trimmer: if this is set to false, the _trimmer_ will be removed from the pipeline.

Why is this not possible via mkdocs.yml? Because up to now, nobody needed it. In theory, we could make it configurable by adjusting partials/language.html, which is the entry point for localization, but I think it won't be necessary. I want to keep configuration as lean as possible.
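As a rough illustration of how the two pipeline placeholders described above could map onto Lunr.js, here is a minimal sketch. It is not Material's actual implementation. The `settings` object and the `isDisabled` and `configurePipeline` helpers are hypothetical. It assumes the placeholder values may arrive as strings (since they are read from meta tags) and relies only on Lunr.js's `pipeline.remove`, `lunr.trimmer`, and `lunr.stopWordFilter`. The stand-in objects at the bottom exist only so the sketch runs without Lunr.js installed.

```javascript
// Sketch only: maps the two pipeline placeholders onto Lunr.js pipeline functions.
// `settings` is a hypothetical plain object holding the placeholder values.

// Treat boolean false and the string "false" (any case) as "disabled".
function isDisabled(value) {
  return value === false || /^false$/i.test(String(value));
}

// Remove the trimmer and/or stop word filter from a Lunr.js builder's pipeline.
function configurePipeline(lunr, builder, settings) {
  if (isDisabled(settings["search.pipeline.trimmer"])) {
    builder.pipeline.remove(lunr.trimmer);
  }
  if (isDisabled(settings["search.pipeline.stopwords"])) {
    builder.pipeline.remove(lunr.stopWordFilter);
  }
  return builder;
}

// Stand-ins for Lunr.js (hypothetical, for illustration only).
const fakeLunr = {
  trimmer: "trimmer",
  stopWordFilter: "stopWordFilter",
  stemmer: "stemmer",
};
const fakeBuilder = {
  pipeline: {
    stages: ["trimmer", "stopWordFilter", "stemmer"],
    remove(fn) {
      this.stages = this.stages.filter((stage) => stage !== fn);
    },
  },
};

configurePipeline(fakeLunr, fakeBuilder, { "search.pipeline.stopwords": "false" });
console.log(fakeBuilder.pipeline.stages);
```

With the real library, the same helper would presumably be called from inside the `lunr(function () { ... })` configuration callback, passing `this` as the builder.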
I hope that this shows that a lot of thought was put into how search works and how it can be localized and scaled to so many languages without much effort.
This does indeed show a lot of thought, and setting "search.pipeline.stopwords": false, does fix the issue.
Thanks again for your work on, and continued attention to, this project.