Audit group: Content best practices
Description: Document has a meta description
Failure description: Document does not have a meta description
Help text: Meta descriptions may be included in search results to concisely summarize page content. Read more in the Search Console Help page.
Success condition: the query selector `head > meta[name=description]` matches an element that has a non-empty `content` attribute.
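A minimal sketch of that check, assuming a DOM `Document` (the helper name is illustrative, not the actual Lighthouse implementation):

```ts
// Passes when <head> contains a <meta name="description"> whose
// content attribute is non-empty after trimming.
function hasMetaDescription(doc: Document): boolean {
  const meta = doc.querySelector<HTMLMetaElement>('head > meta[name=description]');
  return meta !== null && meta.content.trim().length > 0;
}
```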
Notes
If nosnippet exists, use the audit description field to explain that the meta description is unused. Other robots directives that affect the meta description, like noindex, are covered in separate audits. _(not used)_

Hi all 👋 I started working on this issue. Can I get assigned, please? 😉
FYI, there are a couple of corner cases regarding "nosnippet" that are not mentioned in the task description:
- directives (e.g. noarchive, nosnippet) set via `<meta name="robots">` on a page
- `name="robots"` can be replaced with a bot-specific `name="googlebot"`
- an `X-Robots-Tag: noindex, nofollow` HTTP header
- `X-Robots-Tag: googlebot: nofollow`
- Other notes: everything is case-insensitive, commas are required, spaces after commas are optional (see the parsing sketch below)
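A hedged parsing sketch under those rules (a hypothetical helper, not part of the audit code): directives are compared case-insensitively, tokens are separated by required commas, and spaces after commas are optional.

```ts
// Returns true if the directive appears in a robots value, e.g. the
// content attribute of <meta name="robots"> or an X-Robots-Tag header.
// Matching is case-insensitive; tokens are comma-separated and spaces
// after commas are optional.
function hasDirective(value: string, directive: string): boolean {
  return value
    .split(',')
    .map(token => token.trim().toLowerCase())
    .includes(directive.toLowerCase());
}

// hasDirective('NOINDEX,nofollow', 'noindex')                -> true
// hasDirective('noindex, noarchive, nosnippet', 'nosnippet') -> true
```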
Sources:
Per our discussion today, `name="googlebot"` and `googlebot:` are out of the picture for now since we are not focusing on search-engine-specific directives. This brings up a question: should we really support nosnippet since it's recognized only by Googlebot (according to this)?
Yeah let's remove the nosnippet condition. I'll update the audit description.
I checked the HTTP Archive data, and that particular directive is present in only ~150 pages (of the 500k measured).
👌
BTW, I'm not sure how much we can depend on HTTP Archive data regarding SEO; their crawler also follows the "robots" directives, so the results may be skewed. On the other hand, if their crawler does follow these directives, how did you get "noindex, nofollow" in the results of that query? 🤔
HTTP Archive runs as a totally separate crawl from the Internet Archive's Wayback Machine. We've talked about respecting robots directives but they're ignored for now.
For the record: we've decided to trim the description before evaluating its length and to provide debugStrings for audit failures (tag not found, empty value).
https://github.com/GoogleChrome/lighthouse/pull/3227#discussion_r136660967
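A hedged sketch of that decision (field names like `rawValue` and `debugString` follow Lighthouse conventions of the time, but this is illustrative, not the merged implementation):

```ts
interface AuditResult {
  rawValue: boolean;
  debugString?: string;
}

// Fail with an explanatory debugString when the tag is missing or its
// trimmed content is empty; pass otherwise.
function auditMetaDescription(doc: Document): AuditResult {
  const meta = doc.querySelector<HTMLMetaElement>('head > meta[name=description]');
  if (meta === null) {
    return {rawValue: false, debugString: 'No meta description tag found.'};
  }
  if (meta.content.trim().length === 0) {
    return {rawValue: false, debugString: 'Meta description tag is empty.'};
  }
  return {rawValue: true};
}
```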