Gatsby: Sitemap plugin adds trailing / to URLs

Created on 26 Apr 2018 · 4 Comments · Source: gatsbyjs/gatsby

Description

We recently launched a new site on Gatsby and I've been keeping track of Google's re-indexing of our website. I noticed a lot of our pages are showing up as “Excluded – Submitted URL not selected as canonical” in Google Search Console.

Investigating this further, it seems that the sitemap generated by gatsby-plugin-sitemap adds a trailing / to the end of URLs, whereas Google's crawler does not. The end result is that Google determines these are duplicate pages (ones with a trailing / and ones without), prefers the URLs without the / as the canonical, and excludes most, if not all, of the URLs submitted via the sitemap.

More information on Google Search Console’s excluded URLs.

Steps to reproduce

  1. Install gatsby-plugin-sitemap and deploy your website.
  2. Log in to Google Search Console and submit the sitemap.
  3. Wait a few days for Google to index your website and parse the sitemap.
  4. Visit the Search Console, and notice a lot of URLs falling under “Excluded”.
  5. The excluded URLs will have a trailing /, whereas indexed URLs will not.
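Before waiting on Google, the generated sitemap can be checked locally for inconsistent trailing slashes. The following is a minimal sketch, not part of the original report: `groupByTrailingSlash` is a hypothetical helper name, the sample XML uses made-up `example.com` URLs, and the regular expression assumes a plain sitemap with one URL per `<loc>` element.

```javascript
const { URL } = require("url");

// Split the <loc> entries of a sitemap into URLs with and without a trailing slash.
// `sitemapXml` is the contents of the sitemap file as a string.
function groupByTrailingSlash(sitemapXml) {
  const result = { withSlash: [], withoutSlash: [] };
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match;
  while ((match = locPattern.exec(sitemapXml)) !== null) {
    const url = match[1];
    // The site root always has the pathname "/", so it is skipped.
    const pathname = new URL(url).pathname;
    if (pathname === "/") continue;
    (pathname.endsWith("/") ? result.withSlash : result.withoutSlash).push(url);
  }
  return result;
}

// Hypothetical sample data, not the real sitemap from this report.
const sample = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing/</loc></url>
  <url><loc>https://example.com/blog</loc></url>
</urlset>`;

console.log(groupByTrailingSlash(sample));
```

If both lists come back non-empty, the page paths themselves are inconsistent, which points at how the pages were created rather than at the sitemap plugin.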

Expected result

It would be nice if gatsby-plugin-sitemap did not add a trailing / to the end of URLs by default. The sitemap URLs would then match what Google crawls independently, so few or no URLs would be excluded for the “Submitted URL not selected as canonical” reason.

Actual result

Lots of “Excluded – Submitted URL not selected as canonical” URLs in Google Search Console.

Environment

  • gatsby-plugin-sitemap version: 1.2.22
  • Gatsby version: 1.9.250
  • gatsby-cli version: 1.1.48
  • Node.js version: v8.4.0
  • Operating System: macOS High Sierra 10.13.4

File contents:

Only posting the relevant portions for brevity. There are more files, but they're not relevant.

gatsby-config.js:

module.exports = {
  siteMetadata: {
    siteName: "Dovetail",
    siteUrl: "https://dovetailapp.com"
  },
  plugins: [
    "gatsby-plugin-sitemap",
  ]
};

package.json:

{
  "dependencies": {
    "gatsby": "^1.9.244",
    "gatsby-link": "^1.6.40",
    "gatsby-plugin-canonical-urls": "^1.0.18",
    "gatsby-plugin-google-analytics": "^1.0.29",
    "gatsby-plugin-manifest": "^1.0.20",
    "gatsby-plugin-nprogress": "^1.0.14",
    "gatsby-plugin-react-helmet": "^2.0.10",
    "gatsby-plugin-react-next": "^1.0.11",
    "gatsby-plugin-sentry": "^0.0.4",
    "gatsby-plugin-sharp": "^1.6.42",
    "gatsby-plugin-sitemap": "^1.2.22",
    "gatsby-plugin-typescript": "^1.4.19",
    "gatsby-remark-images": "^1.5.61",
    "gatsby-source-filesystem": "^1.5.29",
    "gatsby-transformer-remark": "^1.7.39"
  }
}

All 4 comments

Thanks for the detailed report @humphreybc! Having a quick look at your sitemap it seems there's a mix of paths with and without a trailing slash, which makes me wonder if gatsby-plugin-sitemap is the right place to look.

Could you try adding gatsby-plugin-remove-trailing-slashes to your project and see if that helps?

@humphreybc I was working through this exact issue today. I found that gatsby-plugin-sitemap sets the URL based on how the path is set during createPage.

In my case I wanted to ensure all my pages had a trailing slash so I went back and changed path: `/${post.slug}` to path: `/${post.slug}/` inside gatsby-node.js. I imagine you can do this in reverse to ensure none of your paths and therefore none of your sitemap URLs have a trailing slash. :smile:
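One way to keep this consistent is to build every page path through a single helper, so the trailing-slash policy lives in one place. This is a minimal sketch under assumptions: `pagePath` is a hypothetical helper, and `postTemplate` in the usage comment is a placeholder for whatever component the site passes to `createPage`.

```javascript
// gatsby-node.js (sketch)
// Build a page path from a slug with a consistent trailing-slash policy.
function pagePath(slug, trailingSlash) {
  // Drop any leading or trailing slashes the slug may already carry.
  const clean = String(slug).replace(/^\/+|\/+$/g, "");
  if (clean === "") return "/";
  return trailingSlash ? `/${clean}/` : `/${clean}`;
}

// Inside exports.createPages, with `post.slug` as in the comment above
// and a hypothetical `postTemplate` component:
//   createPage({ path: pagePath(post.slug, false), component: postTemplate });

console.log(pagePath("hello-world", false)); // no trailing slash
console.log(pagePath("/hello-world/", true)); // trailing slash
```

Passing `false` gives the slash-free paths the original report asks for; passing `true` gives the variant this commenter chose.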

Adding gatsby-plugin-remove-trailing-slashes seems to have fixed the issue. I guess it just modifies the ‘path’ in Gatsby, which the sitemap plugin uses. Nice find, @m-allanson and @lightstrike.

Seems like this is resolved! Closing the issue.
