Mithril.js: Google search results for mithril.js.org show html vomit (see picture)

Created on 3 Apr 2018 · 14 Comments · Source: MithrilJS/mithril.js


[Image: google-result-mithril]

(Image should be entirely self-explanatory, title then some.)

Documentation Bug

All 14 comments

Thanks, it looks like we don't serve valid HTML.

https://validator.w3.org/check?uri=https%3A%2F%2Fmithril.js.org&charset=%28detect+automatically%29&doctype=Inline&group=0

Adding a doctype would be a good start...

Indeed, there are far fewer errors when parsing in HTML5 mode: https://validator.w3.org/nu/?doc=https%3A%2F%2Fmithril.js.org%2F

That's just the code example from the page, Google happened to pick it up as the page description. The way to affect that would be to add a <meta name="description" content="..."> tag on the page.
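As a rough illustration of the tag described above, here is a minimal JavaScript sketch of how a build step might inject a description into a page template. The function names (`escapeAttr`, `metaDescriptionTag`, `addMetaDescription`) and the sample text are hypothetical; nothing here is taken from the actual Mithril docs generator.

```javascript
// Hypothetical helpers, not part of Mithril's docs build.

// Escape characters that are unsafe inside a double-quoted HTML attribute.
function escapeAttr(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build the <meta name="description"> tag for a given description string.
function metaDescriptionTag(description) {
  // Collapse runs of whitespace so the attribute stays on one line.
  const clean = String(description).replace(/\s+/g, " ").trim();
  return '<meta name="description" content="' + escapeAttr(clean) + '">';
}

// Insert the tag just before </head>; the input is returned unchanged
// when it contains no </head>.
function addMetaDescription(html, description) {
  return html.replace(/<\/head>/i, (closingTag) =>
    metaDescriptionTag(description) + "\n" + closingTag);
}

// Placeholder description, for illustration only.
console.log(metaDescriptionTag("Placeholder description for the home page"));
```

Google may still choose its own snippet, so this only makes a better candidate available to it.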

Good catch @codeclown!

So, where is the repo for that file so we can add it? This bug is still present, just checked.

https://github.com/MithrilJS/mithril.js/blob/next/docs/layout.html

The trouble is that we may then get the same description for all pages (Google doesn't always honour meta tags). IIRC the home page is the only one that shows up in search results, so it may not matter much, but we'd need some systematic testing to be sure.

Currently, when searching for e.g. site:mithril.js.org keys, you'll get the start of the actual page body content. I don't know whether the meta description would affect this. If not, then just making it reasonably generic would probably work.

https://support.google.com/webmasters/answer/35624?hl=en

Make sure that every page on your site has a meta description.

Differentiate the descriptions for different pages.

Programmatically generate descriptions.

You can, alternatively, prevent snippets from being created and shown for your site in Search results. Use the nosnippet meta tag to prevent Google from displaying a snippet for your page in Search results.

^ Several good tips from Google here. We could programmatically generate the meta tags from the opening content of each documentation page. As an easier solution for a specific page, we could also turn off snippets.
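A minimal sketch of what "use the opening content" could look like, assuming the docs are plain Markdown files. The helper names (`firstParagraph`, `toPlainText`, `truncate`, `describe`) and the 155-character limit are arbitrary choices made for illustration; the approach is deliberately naive and depends on each page opening with an ordinary paragraph.

```javascript
// Hypothetical sketch; the real docs generator may work differently.

// Return the first block that looks like an ordinary paragraph.
function firstParagraph(markdown) {
  // Drop fenced code blocks first so their contents are never picked up.
  const withoutCode = markdown.replace(/`{3}[\s\S]*?`{3}/g, "");
  const blocks = withoutCode.split(/\n\s*\n/);
  for (const block of blocks) {
    const text = block.trim();
    // Skip headings, list items, tables, horizontal rules, and raw HTML.
    if (text === "" || /^(#|[-*+] |\d+\. |\||---|<)/.test(text)) continue;
    return text;
  }
  return "";
}

// Strip the most common inline Markdown so the result reads as plain text.
function toPlainText(markdown) {
  return markdown
    .replace(/!\[([^\]]*)\]\([^)]*\)/g, "$1") // images -> alt text
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1")  // links -> link text
    .replace(/[`*]/g, "")                     // code and emphasis markers
    .replace(/\s+/g, " ")
    .trim();
}

// Cut at a word boundary and append "..." when the text is too long.
function truncate(text, max) {
  if (text.length <= max) return text;
  const cut = text.slice(0, max - 3);
  const lastSpace = cut.lastIndexOf(" ");
  return (lastSpace > 0 ? cut.slice(0, lastSpace) : cut) + "...";
}

// 155 is an assumed limit, not a documented one.
function describe(markdown, max = 155) {
  return truncate(toPlainText(firstParagraph(markdown)), max);
}
```

The result would still need HTML escaping before being placed in a `content` attribute.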

I'm open to doing this if it's welcome.

Go for it, @finetype!

Finally sat down to work on this today.

It would be "easy" to add snippets programmatically, except for one problem: how to grab the right piece of text? The doc files aren't sufficiently standardized, unfortunately. Some ideal descriptions are under the first ---, some are under the first ##, most are under ###, and some are directly underneath the navigation table, above the ---. Any attempt to programmatically extract a snippet from these docs will result in something pretty brittle and unwieldy.

I propose one of three solutions:

  1. Add a "meta description" section to the bottom of all the files. (Proposal: I can just copy some meaningful chunk of text by hand into such a section and put it under a #### Meta Description header on each doc).
  2. We could only add a custom snippet (or remove snippets) for that main index.md file that is causing the problem specifically mentioned in this bug report--that'd probably be the 80/20 solution.
  3. Remove snippets from the docs altogether with <meta name="robots" content="nosnippet">. (This would fix the problem, would be nearly effortless, and would prevent this problem from happening on other docs as well, but then we don't get snippets at all, which are nice when they work.)

I'd prefer to go forward with (1), but would like some feedback before going that route.
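Proposal (1) could be sketched roughly as follows. Only the `#### Meta Description` header name comes from the proposal. The function name `extractMetaDescription`, the return shape, and the choice to strip the section from the rendered body are assumptions made for illustration.

```javascript
// Hypothetical sketch of proposal (1), not the actual implementation.

// Pull the text under a "#### Meta Description" header out of a doc file.
// Returns the description (or null) plus the Markdown with that section
// removed, so it is not rendered on the page itself.
function extractMetaDescription(markdown) {
  const lines = markdown.split(/\r?\n/);
  const start = lines.findIndex((line) =>
    /^####\s+Meta Description\s*$/i.test(line));
  if (start === -1) return { description: null, body: markdown };

  // The section runs until the next heading or the end of the file.
  let end = start + 1;
  while (end < lines.length && !/^#{1,6}\s/.test(lines[end])) end++;

  const description = lines
    .slice(start + 1, end)
    .join(" ")
    .replace(/\s+/g, " ")
    .trim();
  const body = lines
    .slice(0, start)
    .concat(lines.slice(end))
    .join("\n")
    .trimEnd() + "\n";

  return { description: description || null, body };
}
```

A page without the section yields `description: null`, which leaves room for a generic fallback description or for disabling snippets on that page.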

1 sounds good to me (or if that's a bit time consuming, could just start by quickly doing #3 and do #1 at a more relaxed pace.)



I'm for 3, being that it's the most obvious win.

Heh. I'll just wait for more responses, I guess? lol. Thanks for that tiebreaker, @pygy. ;P

I re-ran the HTML validator--a lot of its error messages were because it was evaluating the page as if it were HTML 4.01, but if you switch it to HTML5 mode, we get much nicer output: https://validator.w3.org/nu/?showsource=yes&doc=https%3A%2F%2Fmithril.js.org%2F

I have added a doctype and a lang="en" to the html tag, as well as alt text to the logo, but the extra "p" closing tags are interesting... My best guess is that those are being inserted erroneously by the marked library, but there is some odd custom handling around code blocks that may be causing it. If you look at the source we generate, the incorrect closing </p> tags are all after code blocks on that page.

Huh... maybe that's a bug in the validator (which is acknowledged as experimental). I see two <p> tags, one inside another. While that's odd, it seems correct to have two closing tags...

Looking at it now, I think the problem is in marked, though. Those inner p tags are written directly as p tags within the markdown, e.g. index.md, and the outer layer of p tags are probably added by marked around that.

Doesn't seem to cause any real issues, though, fwiw.
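One possible explanation for the validator output, offered as an assumption rather than a confirmed diagnosis of the Mithril docs: in HTML, a <p> element cannot contain another <p> or a <pre>. The parser closes the open paragraph implicitly when it meets such a start tag, so the outer </p> no longer has anything to close. The sketch below is a simplified model of that rule, not a real HTML parser, and `countStrayClosingP` is a hypothetical name.

```javascript
// Simplified model of implicit <p> closing; ignores comments, scripts,
// and other parsing details.

// Start tags that implicitly close an open <p> (partial list).
const CLOSES_P = new Set([
  "p", "pre", "div", "ul", "ol", "table", "blockquote",
  "h1", "h2", "h3", "h4", "h5", "h6", "hr",
]);

// Count </p> tags that have no open <p> left to close.
function countStrayClosingP(html) {
  const tagPattern = /<(\/?)([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g;
  let pOpen = false;
  let stray = 0;
  let match;
  while ((match = tagPattern.exec(html)) !== null) {
    const closing = match[1] === "/";
    const name = match[2].toLowerCase();
    if (closing && name === "p") {
      if (pOpen) pOpen = false;
      else stray++;
    } else if (!closing && CLOSES_P.has(name)) {
      // Any of these start tags ends the current paragraph;
      // only a new <p> opens another one.
      pOpen = name === "p";
    }
  }
  return stray;
}

console.log(countStrayClosingP("<p><p>inner</p></p>"));
console.log(countStrayClosingP("<p><pre><code>x</code></pre></p>"));
```

Under this model both the nested-paragraph case and the paragraph-wrapped code block leave one unmatched </p>, which would be consistent with the errors appearing after code blocks.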
