The plugin gatsby-plugin-sitemap removes trailing slashes from paths. For example:
https://www.example.com/test/ becomes https://www.example.com/test in the generated sitemap.xml.
It seems that, by design, Gatsby redirects to paths with trailing slashes server-side. For example:
$ curl -I http://localhost:9000/test
HTTP/1.1 301 Moved Permanently
X-Powered-By: Express
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Origin, X-Requested-With, Content-Type, Accept
Content-Type: text/html; charset=UTF-8
Content-Length: 177
Content-Security-Policy: default-src 'none'
X-Content-Type-Options: nosniff
Location: /test/
Vary: Accept-Encoding
Date: Mon, 14 Oct 2019 14:35:20 GMT
Connection: keep-alive
Here, /test returns a 301 redirect to /test/.
The result is that search engines (e.g. Google) that read the sitemap flag the pages as "Duplicate, submitted URL not selected as canonical". If Gatsby (Express) intends paths to end in a trailing slash, the sitemap.xml should use the same form. However, this function removes the trailing slash in the sitemap.
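To see which sitemap entries are affected, a small check like the one below may help. It is only a sketch: the helper name findLocsWithoutTrailingSlash and the sample XML are hypothetical, and it assumes the sitemap uses plain <loc> elements with absolute URLs.

```javascript
// Hypothetical helper: list the <loc> URLs in a sitemap whose path
// does not end in a trailing slash.
function findLocsWithoutTrailingSlash(sitemapXml) {
  const locs = []
  const pattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g
  let match
  while ((match = pattern.exec(sitemapXml)) !== null) {
    locs.push(match[1])
  }
  // The URL constructor is a Node.js global; pathname excludes query and hash
  return locs.filter(loc => !new URL(loc).pathname.endsWith(`/`))
}

// Sample data for illustration only, not taken from a real build
const sampleSitemap = `
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/</loc></url>
  <url><loc>https://www.example.com/test</loc></url>
  <url><loc>https://www.example.com/about/</loc></url>
</urlset>`

console.log(findLocsWithoutTrailingSlash(sampleSitemap))
```

Any URL this reports is one that the server would answer with a 301 to the trailing-slash form, which is the mismatch described above.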
Hmm, I can't see that behaviour on my test site: https://gatsby-starter-carraway.netlify.com/sitemap.xml
Here's the code showing the plugin use: https://github.com/endymion1818/gatsby-starter-carraway/blob/master/gatsby-config.js#L23
Here is a snippet from my gatsby config:
...
plugins: [
  `gatsby-plugin-react-helmet`,
  `gatsby-plugin-sitemap`,
  {
    resolve: `gatsby-source-mongodb`,
    options: {
      connectionString: process.env.DB_HOST,
      dbName: process.env.DB_NAME,
      collection: `words`,
    },
  },
...
As you can see, I am using the default settings (no additional configuration).
I am running:
...
"gatsby-plugin-sitemap": "^2.0.4",
...
I am using the createPage API to programmatically create pages from entries in a database as documented here.
I would like to keep this open to understand why there is a difference in the resulting sitemap output.
Not stale.
@rudolphfunk We're seeing this too. We're using gatsby-plugins: offline, sitemap, and remove-trailing-slashes.
I'm not able to reproduce this. Are you sure it's not because you're using remove-trailing-slashes?
Example of generated sitemap in production: https://oathall-leavers.web.app/sitemap.xml
We ran into a lot of issues with remove-trailing-slashes, mostly because that plugin caused a redirect from contact to contact/ and back to contact, killing our SEO scores.
We now use this technique instead:
// Remove trailing slash unless page is /
const replacePath = path => (path === `/` ? path : path.replace(/\/$/, ``))

exports.onCreatePage = ({ page, actions }) => {
  const { createPage, deletePage } = actions
  const oldPage = Object.assign({}, page)
  page.path = replacePath(page.path)
  if (page.path !== oldPage.path) {
    deletePage(oldPage)
    createPage(page)
  }
}
This does NOT work on local development environments, but DOES work on production, which is all we care about.
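As a quick illustration of what the replacePath helper above does to individual paths, here is a standalone sketch. The sample paths are hypothetical and chosen only to cover the root page, a path with a trailing slash, and a path without one.

```javascript
// Same helper as in the snippet above: remove the trailing slash unless the page is /
const replacePath = path => (path === `/` ? path : path.replace(/\/$/, ``))

// Hypothetical sample paths, for illustration only
const samples = [`/`, `/contact/`, `/contact`, `/blog/post-1/`]

samples.forEach(samplePath => {
  console.log(`${samplePath} -> ${replacePath(samplePath)}`)
})
```

Paths that already lack a trailing slash come back unchanged, so the onCreatePage hook only deletes and recreates pages whose path actually changed.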
We also turned off gatsby-plugin-offline, as we don't need PWA support really. We are still using the sitemap.xml, which is not generating trailing slashes in production anymore.
This works for all of our pages. Note that most of our pages (90%) are generated based on Markdown content, not the default gatsby pages. Here is a code snippet in gatsby-node.js to that end:
const path = require(`path`)

exports.createPages = async ({ actions, graphql, reporter }) => {
  const { createPage } = actions
  const result = await graphql(`
    ...
  `)
  if (result.errors) {
    reporter.panicOnBuild(`Error while running GraphQL query.`)
    return
  }
  result.data.allMarkdownRemark.edges.forEach(({ node }) => {
    const { frontmatter } = node
    const { meta } = frontmatter
    // Only create a page when the frontmatter provides both a path and a template
    if (meta && meta.path && meta.template) {
      // Posts are created only once they are marked as published
      if (meta.template !== 'post' || meta.published === true) {
        createPage({
          path: meta.path,
          component: path.resolve(`src/templates/${meta.template}Template.js`),
          context: {},
        })
      }
    }
  })
}
So, I think this is a problem with Gatsby itself, not gatsby-plugin-sitemap. I just felt it was related because of some of the redirects we were originally getting.
Hi, I still have the same problem.
There is no trailing slash in the sitemap and no trailing slash in the canonical URLs.
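For sites that want the opposite convention, with trailing slashes kept in both the sitemap and the canonical URLs, one possible approach is to normalise every page path through a single helper before building absolute URLs. This is only a sketch: ensureTrailingSlash, toAbsoluteUrl, and siteUrl are hypothetical names, and how the result is wired into the sitemap or the canonical tag depends on the plugin versions in use.

```javascript
// Hypothetical helper: append a trailing slash to a page path unless it already
// has one or its last segment looks like a file (e.g. /sitemap.xml).
const ensureTrailingSlash = pagePath => {
  if (pagePath.endsWith(`/`)) return pagePath
  const lastSegment = pagePath.slice(pagePath.lastIndexOf(`/`) + 1)
  return lastSegment.includes(`.`) ? pagePath : `${pagePath}/`
}

// Hypothetical site URL, for illustration only
const siteUrl = `https://www.example.com`
const toAbsoluteUrl = pagePath => `${siteUrl}${ensureTrailingSlash(pagePath)}`

console.log(toAbsoluteUrl(`/test`))
console.log(toAbsoluteUrl(`/sitemap.xml`))
```

Using the same helper for both outputs keeps the sitemap entry and the canonical URL of a page identical, which is what the search engine report is checking.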