Wordpress-seo: Restrict schema output on password-protected pages

Created on 7 Oct 2019 · 9 comments · Source: Yoast/wordpress-seo

If a page is password protected, we should avoid outputting potentially sensitive content in schema.

Specifically, we should:

  • Only output the 'base script' (Organization > WebSite > WebPage).
  • The WebPage should only have the following properties (remove any not whitelisted here):

    • @type, @id, url, name, isPartOf, inLanguage, datePublished, dateModified, breadcrumb

  • If the WebPage @type is an array, or if it is a more specific subtype (e.g., FAQPage, ProfilePage), set it to WebPage.
  • Remove all other nodes (and references to/from them).
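
To make the rules above concrete, here is a minimal sketch in Python (not the plugin's PHP, and not its actual implementation). It assumes the schema is available as a flat list of JSON-LD nodes (the contents of `@graph`) and that the caller knows the WebPage node's `@id`. The names `restrict_graph` and `webpage_id` are hypothetical. It also takes one possible reading of "references to/from them": any property that is only an `{"@id": ...}` pointer to a removed node is dropped, which would include `breadcrumb` if it points at a removed BreadcrumbList node.

```python
from typing import Any

# Properties the issue whitelists for the WebPage node.
ALLOWED_WEBPAGE_PROPS = {
    "@type", "@id", "url", "name", "isPartOf", "inLanguage",
    "datePublished", "dateModified", "breadcrumb",
}
# Besides the WebPage itself, only these node types survive (the "base script").
BASE_TYPES = {"Organization", "WebSite"}

Node = dict[str, Any]


def _types(node: Node) -> list[str]:
    """Return a node's @type as a list, whether it was a string or an array."""
    value = node.get("@type", [])
    return value if isinstance(value, list) else [value]


def _is_ref(value: Any) -> bool:
    """True for a bare reference such as {"@id": "https://example.com/#logo"}."""
    return isinstance(value, dict) and set(value) == {"@id"}


def restrict_graph(graph: list[Node], webpage_id: str) -> list[Node]:
    """Hypothetical helper: reduce a graph to the restricted base script."""
    kept: list[Node] = []
    for node in graph:
        if node.get("@id") == webpage_id:
            # Strip non-whitelisted properties and flatten the type to WebPage.
            page = {k: v for k, v in node.items() if k in ALLOWED_WEBPAGE_PROPS}
            page["@type"] = "WebPage"
            kept.append(page)
        elif BASE_TYPES.intersection(_types(node)):
            kept.append(dict(node))  # shallow copy; the input is left untouched

    # Drop top-level references that now point at removed nodes.
    kept_ids = {n.get("@id") for n in kept}
    for node in kept:
        for key in list(node):
            value = node[key]
            if _is_ref(value) and value["@id"] not in kept_ids:
                del node[key]
            elif isinstance(value, list):
                live = [v for v in value
                        if not (_is_ref(v) and v["@id"] not in kept_ids)]
                if value and not live:
                    del node[key]
                else:
                    node[key] = live
    return kept
```

The reference cleanup is deliberately shallow: it only looks at top-level properties of the kept nodes, so references nested deeper inside an Organization or WebSite node would need extra handling.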

All 9 comments

@jono-alderson Regarding the third bullet point: does this mean the WebPage @type should simply always be set to WebPage on password-protected pages?

Yes please! :)

Always, as in even when the user IS logged in and has the appropriate privileges to see the page? That seems a bit strict, and it may break things that depend on the page type.

Yes; almost all schema consumption use-cases assume an external agent querying a page and extracting the data.
For the very rare use-cases where a person wants to run schema extraction in their browser or via an authenticated agent, they should probably be working with filters and our API; that's way outside of Kansas 😆

I think that's much safer and simpler than trying to do anything based on conditional access, with little/no disadvantage.

Anything which breaks depending on the page type should break gracefully, as per the spec(s).

I'm also wondering whether we should change the name property to "Private page", so that, e.g., plugins which produce invoices with customer names in the title don't accidentally expose them. (We can't simply unset it, as it's a required field for WebPage.)

Sounds reasonable I think?

@jono-alderson So has the decision officially been made and can I implement this, or does it still need to be run by the Product team?

Let's get some feedback. Moving to Slack.

Update:

  • Don't provide special conditions for logged-in users on password-protected pages (i.e., don't "un-remove" the schema)
  • Don't alter the name property.
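
As a rough illustration of the two points in this update, the sketch below (Python, with hypothetical names throughout) decides whether to restrict schema from the page alone and passes the title through unchanged. The `post_password` field is modelled on the assumption that an empty string means the page is not password protected.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical stand-in for a page; not a real plugin or WordPress class."""
    title: str
    post_password: str = ""  # assumption: empty string means "not protected"


def should_restrict_schema(page: Page, viewer_is_logged_in: bool) -> bool:
    """Restrict schema whenever the page has a password.

    viewer_is_logged_in is accepted only to show that it is ignored:
    logged-in users get no special treatment.
    """
    return page.post_password != ""


def webpage_name(page: Page) -> str:
    """The name property is left as the page title, never "Private page"."""
    return page.title
```

Because the result depends only on the page, the same restricted output is produced for anonymous crawlers and authenticated users alike.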