Add Disallow: /interact/ to robots.txt.
Today I searched Google for site:bgme.me and found pages under the /interact/ path in the results.
The problems with the /interact/ path are as follows.
My account has "Opt-out of search engine indexing" enabled.
If you open one of my toots and view the page source, you will find the meta tag <meta content='noindex' name='robots'>.
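To check a page for this tag without reading the source by hand, something like the following could be used. It is only a minimal sketch using Python's standard library; the class and function names are made up for illustration, and it inspects HTML you have already fetched rather than downloading anything itself.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None when the attribute has no value.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# A toot page with the opt-out enabled carries the tag described above.
toot_page = "<html><head><meta content='noindex' name='robots'></head></html>"
# Stand-in for a page without the tag; not a real copy of an /interact/ page.
page_without_tag = "<html><head><title>example</title></head></html>"

print(has_noindex(toot_page))
print(has_noindex(page_without_tag))
```

Running the same check on the saved source of a toot page and of another page makes the difference easy to see.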
But open some of the /interact/ URLs that Google returns, e.g. https://bgme.me/interact/101594535014331725?type=favourite , https://bgme.me/interact/101590864098765986?type=favourite .
You can browse my toots through these pages. Pages under the /interact/ path have no <meta content='noindex' name='robots'> tag, and robots.txt also allows search engines to crawl them. So even though I have enabled "Opt-out of search engine indexing", my toots can still be found through Google.
What's worse, these /interact/ pages also let you browse toots posted by users on other instances, e.g. https://bgme.me/interact/101580478936541577?type=reply .
That means that even if you add Disallow: /interact/ to the robots.txt of your own instance, your toots can still be found through Google as long as the instances connected to yours have not added the same rule.
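The effect of the proposed rule can be sanity-checked with Python's standard urllib.robotparser. This is a rough sketch under some assumptions: the robots.txt content below is a hypothetical minimal file containing only the proposed rule (not the file Mastodon actually ships), the /about URL is just a stand-in for any page outside /interact/, and Python's parser may not match Google's crawler in every detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical minimal robots.txt with only the proposed rule.
PROPOSED_ROBOTS_TXT = "User-agent: *\nDisallow: /interact/\n"


def allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(agent, url)


interact_url = "https://bgme.me/interact/101594535014331725?type=favourite"
other_url = "https://bgme.me/about"  # assumed example of a non-/interact/ page

print(allowed(PROPOSED_ROBOTS_TXT, interact_url))
print(allowed(PROPOSED_ROBOTS_TXT, other_url))
```

The check only covers one instance's robots.txt. The same toot served from the /interact/ path of another instance is governed by that instance's own file.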
I agree. I think the interact pages need <meta content='noindex' name='robots'>.
But how to deal with the toots posted by users on other instances?
Once <meta content='noindex' name='robots'> is added to the interact pages, most instances will eventually get the fix when they update.