If a page is password protected, we should avoid outputting potentially sensitive content in schema.
Specifically, we should:
- Organization > WebSite > WebPage).
- WebPage should only have the following properties (remove any not whitelisted here): @type, @id, url, name, isPartOf, inLanguage, datePublished, dateModified, breadcrumb
- If the WebPage @type is an array, or if it is a more specific subtype (e.g., FAQPage, ProfilePage), set it to WebPage.

@jono-alderson With regard to the third bullet point: is this to say, in other words, that the WebPage @type should simply always be set to WebPage on password-protected pages?
Yes please! :)
Always, as in also when the user IS logged in and does have the appropriate privileges to see the page? That seems a bit strict, and it may break things that depend on the page type.
Yes; almost all schema consumption use-cases assume an external agent querying a page and extracting the data.
For the very rare use-cases where a person wants to run schema extraction in their browser / via an authenticated agent, they should probably be doing stuff with filtering and our API - that's way outside of Kansas 😆
I think that's much safer and simpler than trying to do anything based on conditional access, with little/no disadvantage.
Anything which breaks depending on the page type should break gracefully, as per the spec(s).
Also... I'm wondering if we should set the name property to "Private page", so that, e.g., plugins which produce invoices with customer names in the title don't accidentally expose those (we can't simply unset it, as it's a required field for WebPage).
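
To make the proposal concrete, here is a minimal sketch of the rules discussed so far, written in Python purely for illustration (it is not the plugin's actual code). The function name `sanitize_protected_webpage`, the constant `ALLOWED_WEBPAGE_PROPERTIES`, and the sample data are all hypothetical, and the "Private page" placeholder is still only a suggestion at this point in the thread.

```python
from copy import deepcopy

# Whitelist taken from the second bullet point above.
ALLOWED_WEBPAGE_PROPERTIES = {
    "@type", "@id", "url", "name", "isPartOf",
    "inLanguage", "datePublished", "dateModified", "breadcrumb",
}


def sanitize_protected_webpage(piece, placeholder_name="Private page"):
    """Return a reduced copy of a WebPage piece for a password-protected page.

    Hypothetical helper: the input dict is left unmodified.
    """
    # Drop every property that is not on the whitelist.
    cleaned = {
        key: deepcopy(value)
        for key, value in piece.items()
        if key in ALLOWED_WEBPAGE_PROPERTIES
    }
    # Collapse arrays and more specific subtypes to plain WebPage,
    # regardless of who is viewing the page.
    cleaned["@type"] = "WebPage"
    # Assumption: replace the title rather than unset it, as proposed above.
    cleaned["name"] = placeholder_name
    return cleaned


if __name__ == "__main__":
    # Made-up example of a piece that would leak a customer name.
    invoice_page = {
        "@type": ["WebPage", "FAQPage"],
        "@id": "https://example.com/invoice-123/#webpage",
        "url": "https://example.com/invoice-123/",
        "name": "Invoice for Jane Doe",
        "description": "Order details for Jane Doe",
        "inLanguage": "en-US",
    }
    print(sanitize_protected_webpage(invoice_page))
```

Applying the same reduction unconditionally, rather than branching on whether the visitor is logged in, keeps the output identical for every consumer, which matches the "safer and simpler" argument above.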
Sounds reasonable I think?
@jono-alderson So has the decision officially been made and can I implement this, or does it still need to be run by the Product team?
Let's get some feedback. Moving to Slack.
Update:
name property.