I am trying to build a Gatsby site with WordPress using ACF fields and gatsby-source-wordpress. Unfortunately, since adding a few posts I have run into the issue that some images are not fetched, and which ones are missing differs on every build. The problem is hard to track down as I don't get any error messages and it appears to be random.
The result is usually even worse when deploying to netlify.
The project I am working on currently has only 13 posts and 109 wordpress__acf_media to fetch in total.
I looked into similar issues but none of the solutions worked for me.
The same issue occurred with previous Node versions and before updating all the modules. I work on two other websites where I use the WordPress source plugin as well, and the result is very similar.
-> wordpress__acf_options fetched : 1
-> wordpress__acf_posts fetched : 13
-> wordpress__acf_pages fetched : 1
-> wordpress__acf_media fetched : 93
-> wordpress__acf_blocks fetched : 0
-> wordpress__acf_categories fetched : 5
-> wordpress__acf_tags fetched : 0
-> wordpress__POST fetched : 13
-> wordpress__PAGE fetched : 1
-> wordpress__wp_media fetched : 109
-> wordpress__wp_taxonomies fetched : 1
-> wordpress__CATEGORY fetched : 5
The result looks different every time:
success Downloading remote files - 6.985s - 10/109 15.60/s
success Downloading remote files - 45.059s - 58/109 2.42/s
success Downloading remote files - 18.224s - 39/109 5.98/s
Possible conflict with a wordpress plugin? Although I tried to deactivate and delete the ones that are not related to acf and the result didn't change.
• ACF to REST Api
• Advanced Custom Fields PRO
• Category Order and Taxonomy Terms Order
• TinyMCE Advanced
• Webhook Netlify Deploy
• WP REST API Cache
{
  resolve: `gatsby-source-wordpress`,
  options: {
    baseUrl: `arc.im-burrow.com`,
    protocol: `https`,
    hostingWPCOM: false,
    useACF: true,
    verboseOutput: false,
    includedRoutes: [
      "**/categories",
      "**/posts",
      "**/pages",
      "**/media",
      "**/tags",
      "**/taxonomies",
    ],
    excludedRoutes: [
      "**/comments",
      "**/users",
    ],
    keepMediaSizes: false,
    concurrentRequests: 10,
  }
},
I assume it is not a GraphQL issue; nevertheless, here is the query:
query {
  allWordpressPost {
    edges {
      node {
        title
        path
        categories {
          slug
        }
        acf {
          index {
            localFile {
              childImageSharp {
                fluid(maxWidth: 700) {
                  ...GatsbyImageSharpFluid_withWebp
                }
              }
            }
          }
        }
      }
    }
  }
  allWordpressCategory {
    edges {
      node {
        slug
      }
    }
  }
}
Expected result: all images should be downloaded and image.localFile should be accessible.
Actual result: a random number of media files are not downloaded, and their localFile returns "null".
System:
  OS: macOS 10.15.2
  CPU: (4) x64 Intel(R) Core(TM) i5-5257U CPU @ 2.70GHz
  Shell: 3.2.57 - /bin/bash
Binaries:
  Node: 12.13.0 - /usr/local/bin/node
  Yarn: 1.10.1 - /usr/local/bin/yarn
  npm: 6.12.0 - /usr/local/bin/npm
Languages:
  Python: 2.7.16 - /usr/bin/python
Browsers:
  Chrome: 78.0.3904.108
  Safari: 13.0.4
npmPackages:
  gatsby: ^2.18.10 => 2.18.10
  gatsby-image: ^2.2.34 => 2.2.34
  gatsby-plugin-manifest: ^2.2.31 => 2.2.31
  gatsby-plugin-offline: ^3.0.27 => 3.0.27
  gatsby-plugin-react-helmet: ^3.1.16 => 3.1.16
  gatsby-plugin-sass: ^2.1.24 => 2.1.24
  gatsby-plugin-sharp: ^2.3.5 => 2.3.5
  gatsby-source-filesystem: ^2.1.40 => 2.1.40
  gatsby-source-wordpress: ^3.1.51 => 3.1.51
  gatsby-transformer-sharp: ^2.3.7 => 2.3.7
npmGlobalPackages:
  gatsby-cli: 2.8.16
Any Ideas?
Hi @BURROO, thanks for opening an issue. By default Gatsby will open 200 connections for remote file downloads at once. My first thought is that this could be due to your server not being able to handle that many concurrent connections.
Try setting the env var GATSBY_CONCURRENT_DOWNLOAD to a lower number (maybe 5 or so) and give it another shot. If this ends up being too slow and you need better performance, using a host that can handle more concurrent connections would speed things up a lot (pantheon is good for this).
Check these docs for more info on that env var and let me know if you need any help getting that set up.
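To illustrate why a lower limit can help, here is a minimal sketch of a concurrency-limited task runner. It is not Gatsby's actual download queue; the function name `runWithConcurrency` and its parameters are made up for illustration. The idea is simply that only `limit` requests are in flight at once, so the server never sees more than that many connections.

```javascript
// Illustrative only: NOT Gatsby's implementation.
// Run async tasks with at most `limit` in flight at any time.
// `tasks` is an array of functions that each return a promise.
async function runWithConcurrency(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  // Each worker pulls the next unstarted task until none are left.
  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  // Start at most `limit` workers; each handles one task at a time.
  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With 109 media files and a limit of 5, this would open five connections, start a sixth download only when one of the five finishes, and so on. The build takes longer, but a small server is less likely to drop requests.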
@Tyler thank you, that actually seemed to fix the problem. Do you think this is an issue with the server where my REST API / Wordpress is hosted? Since this happens locally I assume there shouldn't be a problem with the server where I am hosting the gatsby site.
@BURROO if that fixed your problem, the issue is the number of concurrent requests overloading your server. Local servers can also suffer from this problem, although you could likely tweak your local server until it can handle that many concurrent connections. For example, I'm using Local by Flywheel locally and it's actually far slower and can handle fewer concurrent connections than the Pantheon site I'm using for testing.
I'm currently building out the next version of gatsby-source-wordpress which is a ground-up rewrite. I will be adding a retry with exponential backoff algorithm for fetching images and data, so this shouldn't happen anymore in the next major version.
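For readers unfamiliar with the term, retry with exponential backoff could look roughly like the sketch below. This is a generic illustration under my own assumptions, not the code from the plugin rewrite; `retryWithBackoff`, `retries`, and `baseDelayMs` are hypothetical names.

```javascript
// Illustrative only: NOT the plugin's implementation.
// Retry an async operation, doubling the wait after each failure.
// `operation` is any function returning a promise (e.g. one file download).
async function retryWithBackoff(operation, { retries = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt === retries) break; // out of attempts, give up
      // Wait baseDelayMs, then 2x, then 4x, ... before the next attempt.
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

The growing delay gives an overloaded server time to recover instead of hitting it again immediately, which is why it would help with the randomly missing images described in this thread.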
I also have this problem with gatsby-source-wordpress. The project has 800+ media files and is still growing. The client just added more pages, and the images on these pages are returning null, while the slightly older posts still have their URLs passed to Gatsby.
@TylerBarnes _"Try setting the env var GATSBY_CONCURRENT_DOWNLOAD to a lower number (maybe 5 or so) and give it another shot."_ How do I do this? I've seen the docs that you've linked, but wasn't able to figure it out.
@BURROO what did you do to fix it? Could you link some code as reference?
Thanks!
@TylerBarnes thank you for the clarification.
@jromme It took me some time to figure this out as well.
I added this snippet to the package.json inside "scripts":
"build": "GATSBY_CONCURRENT_DOWNLOAD=5 gatsby build",
"develop": "GATSBY_CONCURRENT_DOWNLOAD=5 gatsby develop",
And then instead of gatsby develop or gatsby build, I was running npm run develop or npm run build.
If you also use Netlify for the remote build, you can set the environment variable in the settings on your Netlify panel.
You can also use the npm package dotenv. Check these docs for more info: https://www.gatsbyjs.org/docs/environment-variables/#server-side-nodejs
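Whichever way you set it, it can be worth checking that the variable actually reaches the Node process. Something like the following could go at the top of gatsby-config.js. This is only a sketch: the helper name `effectiveConcurrentDownloads` is made up, and the fallback of 200 is simply the default mentioned earlier in this thread, not a value read from Gatsby itself.

```javascript
// Hypothetical helper: report the download concurrency the build will see.
function effectiveConcurrentDownloads(env = process.env) {
  const parsed = parseInt(env.GATSBY_CONCURRENT_DOWNLOAD, 10);
  // Assumption: fall back to 200, the default mentioned earlier in this thread.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : 200;
}

// Logs once when the config file is loaded.
console.log(`Concurrent downloads: ${effectiveConcurrentDownloads()}`);
```

If the log still shows 200 after running npm run build or after a Netlify deploy, the variable is not being picked up and the script or the panel setting needs another look.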