Mastodon: The problem with the /interact/ path

Created on 25 Feb 2019 · 3 Comments · Source: tootsuite/mastodon

Pitch


Add Disallow: /interact/ to robots.txt.
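
To illustrate what the proposed rule would do, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content below is an assumption made for illustration (it is not Mastodon's actual file), and is_crawlable is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with the proposed rule added (not Mastodon's real file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /interact/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A URL under /interact/ taken from this issue, and an unrelated example path.
interact_url = "https://bgme.me/interact/101594535014331725?type=favourite"
other_url = "https://bgme.me/about"

print(is_crawlable(ROBOTS_TXT, interact_url))
print(is_crawlable(ROBOTS_TXT, other_url))
```

With this rule in place, a crawler that honours robots.txt should skip everything under /interact/ while the rest of the site stays crawlable.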

Motivation


Today I searched Google for site:bgme.me and found pages under the /interact/ path in the results.

The problems with the /interact/ path are as follows.

My account has "Opt-out of search engine indexing" enabled.
If you open one of my toots and view the page source, you will find the meta tag <meta content='noindex' name='robots'>.

But that is not the case for the /interact/ URLs that show up in Google's results, e.g. https://bgme.me/interact/101594535014331725?type=favourite and https://bgme.me/interact/101590864098765986?type=favourite .

You can read my toots through these pages. Pages under the /interact/ path carry no <meta content='noindex' name='robots'> tag, and robots.txt also allows search engines to crawl them. So even though I have enabled "Opt-out of search engine indexing", my toots can still be found through Google.
What's worse, these pages also expose toots posted by users on other instances, e.g. https://bgme.me/interact/101580478936541577?type=reply .
That means that even if you add Disallow: /interact/ to the robots.txt of your own instance, your toots can still be found through Google as long as the instances connected to yours have not added the same rule.
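
The check described above (looking for the robots meta tag in the page source) can be sketched with Python's standard html.parser. The two HTML snippets are simplified stand-ins written for this example, not the real page source of any instance, and RobotsMetaParser and has_noindex are hypothetical names.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            self.directives.extend(
                part.strip().lower() for part in content.split(",")
            )


def has_noindex(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Simplified stand-ins: a toot page with the tag, an /interact/ page without it.
TOOT_PAGE = "<html><head><meta content='noindex' name='robots'></head><body></body></html>"
INTERACT_PAGE = "<html><head><title>Interact</title></head><body></body></html>"

print(has_noindex(TOOT_PAGE))
print(has_noindex(INTERACT_PAGE))
```

Running the same check against the real page source of a toot and of its /interact/ page would show whether the opt-out is applied consistently.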

All 3 comments

I agree, I think the interact pages need <meta content='noindex' name='robots'>

But how do we deal with toots posted by users on other instances?

Once <meta content='noindex' name='robots'> is added to the interact pages, most instances will eventually get the fix when they update.
