Gatsby: "source and transform nodes" is taking a long time with gatsby-source-wordpress

Created on 1 Mar 2018 · 29 Comments · Source: gatsbyjs/gatsby

Context: I'm using gatsby-source-wordpress with lots of ACF fields. I have about 1,000 posts with a custom post type, and attached to those posts are about 1,800 images in total. The images are attached with a gallery ACF field. I have a bunch of other fields, but I expect the images are far and away the most resource-intensive. I have a few custom taxonomies, but they don't have very many terms.

My current issue is with "source and transform nodes":

success source and transform nodes — 459.213 s

The little command line spinner just sits there for 459 seconds without indication of what it's doing.
What can I do to optimize this compile time specifically related to node sourcing?

All 29 comments

That time is probably mostly downloading images.

Add some logging to createRemoteFileNode in gatsby-source-filesystem and you'll have more visibility there.
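As a rough sketch of what that logging could look like: the wrapper below times any promise-returning function and reports how many calls are still in flight. `withTiming` and `fakeDownload` are hypothetical names made up for illustration; in a real project you would wrap the actual `createRemoteFileNode` call (or add equivalent log lines inside it) rather than the stand-in used here.

```javascript
// Hypothetical helper: wraps a promise-returning function and logs
// how long each call took and how many calls are still pending.
function withTiming(label, fn, log = console.log) {
  let inFlight = 0;
  return async function timed(...args) {
    const started = Date.now();
    inFlight += 1;
    try {
      return await fn(...args);
    } finally {
      inFlight -= 1;
      const elapsed = Date.now() - started;
      log(`${label} finished in ${elapsed} ms (${inFlight} still in flight)`);
    }
  };
}

// Stand-in for the real download call so the sketch runs on its own.
// Assumption: the real function also takes an options object with a `url`.
async function fakeDownload({ url }) {
  return { url, id: `file-${url.length}` };
}

const timedDownload = withTiming("createRemoteFileNode", fakeDownload);
timedDownload({ url: "https://example.com/a.jpg" });
```

If the log lines stop while the spinner is still running, the remaining time is being spent somewhere other than the image downloads.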

I want to add a generic jobs logging framework to core so that plugins like this could update their progress there.

@KyleAMathews good call, that does seem to be what's going on. Gatsby already caches downloaded images, so I'm not sure if there's anything else I can really do to speed that up.

I've added logging to createRemoteFileNode and found that there is still additional time spent during source and transform nodes that happens after all the images are downloaded. Where else should I look for long-running tasks that would happen during this step?

How much more time?

10+ minutes in develop mode. Several times I've let it run for a while and then killed it, but at some point it does actually complete. The production build is still only taking 9 minutes in total, though.

Try editing https://github.com/gatsbyjs/gatsby/blob/master/packages/gatsby/src/bootstrap/page-hot-reloader.js (replace src with dist) in your node_modules folder and log whenever CREATE_NODE is called with the action.
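A minimal sketch of such a logger, under the assumption that each action is a plain object with a `type` field and that, for CREATE_NODE, the node sits in `action.payload` with its type in `payload.internal.type`. `createNodeLogger` is a hypothetical helper, not part of Gatsby. It tallies nodes per type and prints the time since the previous CREATE_NODE, so a long pause shows up as a large gap.

```javascript
// Hypothetical helper: returns a function you can call with every action.
// It ignores everything except CREATE_NODE.
function createNodeLogger(log = console.log, now = Date.now) {
  const counts = new Map();
  let last = now();
  return function onAction(action) {
    if (!action || action.type !== "CREATE_NODE") return counts;
    // Assumption: the node type lives at action.payload.internal.type.
    const internal = action.payload && action.payload.internal;
    const nodeType = (internal && internal.type) || "unknown";
    const t = now();
    counts.set(nodeType, (counts.get(nodeType) || 0) + 1);
    log(`CREATE_NODE ${nodeType} #${counts.get(nodeType)} (+${t - last} ms)`);
    last = t;
    return counts;
  };
}

// Example call with a made-up action object.
const logAction = createNodeLogger();
logAction({ type: "CREATE_NODE", payload: { internal: { type: "File" } } });
```

Calling the returned function wherever the actions pass through (for example, next to the existing CREATE_NODE handling in the file above) keeps the patch to a single line.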

What does the logging of the bootstrap process look like so I can see the times?

I've added logging to CREATE_NODE in that file but haven't seen it do anything yet. The develop command hasn't finished and never gets to a point where it can live reload.

I also added a log on actions.createNode in gatsby/dist/redux/actions.js and that shows me the file nodes that are created when the images are loaded for the remote file nodes, but nothing during the time that it's hanging.

I'm still looking at the createNode logs.

After the removeFileNode calls are finished, there is a long wait, followed by wordpress__acf_posts, wordpress__acf_pages, WordPressAcf_image_and_text and a bunch of other ACF fields.

Maybe the delay is Gatsby trying to infer the data schema of all my flexible content fields? I only have 4 layouts in the single Flexible Content field, on about 36 pages total, not a whole lot.

Schema inference doesn't happen until it reaches that point in the bootstrap process.

This might be a good time to get profiling working ;-)

https://github.com/gatsbyjs/gatsby/issues/4218

See what code is running.

Until those profiling changes are merged into the repo, is there anything I can do now to figure out what's taking so long on my build step?

There's nothing that needs to be merged other than instructions. That issue links to a gist that walks through how to set up profiling. As I understand it, the profiling is generic and works for any Node app.

There's also this https://medium.com/@paul_irish/debugging-node-js-nightlies-with-chrome-devtools-7c4a1b95ae27

Hi there! Is it possible to turn off fetching images and saving them locally? It would save a ton of local development time. I am using gatsby-source-contentful. My challenge is that I am working with two different spaces with each having more than 1300 assets. At the moment it takes about 40 minutes to complete "source and transform nodes".

@arminnaimi Unless you're using some image related plugin, gatsby-source-contentful shouldn't download any images. I've got 6 spaces with ~500 posts and ~1500 images each and it takes ~10 minutes to complete the source and transform nodes step. I created a ticket to speed it up as well: https://github.com/gatsbyjs/gatsby/issues/5079.

One thing I am noticing is that when excluding additional locales in the API response from Contentful, the whole build goes down to something more manageable like 40 seconds. The fact that it struggles with locales might be an issue with how effectively GraphQL can handle large JSON responses.

@arminnaimi It might be it - in my case each space has only 1 locale, which is different for each space.

Due to the high volume of issues, we're closing out older ones without recent activity. Please open a new issue if you need help!

@KyleAMathews I am running into this as well, but I don't need images to be downloaded. We are using wordpress offload to s3 to keep images out of our builds. Is there a way to stop it from downloading and just allow the url file path that points to our s3 bucket?

I know it's closed, but in my case, after I tried all the solutions and nothing seemed to fix it, the only thing that worked was:

Resizing the terminal window when it gets stuck!

@KyleAMathews @r1q I can't understand how this method works.
Resizing the terminal window lets the build run to success.
How can I do this on Netlify?

I have this exact behaviour. The terminal resizing part is such a weird thing. Perhaps some "screen" stuff going on there?

Maybe someone from Netlify or iTerm team can help in debugging?

If it's any help, I can reproduce it in the OS X Terminal and in platformio-ide on Atom, so it's not specific to one terminal. It happens in the Gatsby stages where a terminal spinner character appears. The spinner seems to freeze, and then if one resizes the terminal, things move forward.

Like so:
[image]

I'm watching Activity Monitor to see what node is doing. When things freeze, node is at 0%:

[image]

Then upon resize, it starts going at it again.

[image]

It happens with both gatsby build and gatsby develop. macOS Mojave here.

I know it's closed, but in my case, after I tried all the solutions and nothing seemed to fix it, the only thing that worked was:

Resizing the terminal window when it gets stuck!

This fixed it for me, too. How weird. But thank you so much :)

Is there an open issue regarding this bug which gets "fixed" by resizing the window?

Just adding in something that I've just found.

I found this would hang for a long time when I used iTerm. As a test, I booted up the classic Terminal for Mac, and the process stopped hanging. I'm wondering if a memory or buffer issue in iTerm is causing this to hang?

Is there an open issue regarding this bug which gets "fixed" by resizing the window?

Yeah, there's https://github.com/gatsbyjs/gatsby/issues/27325

I'm going to lock this issue.

@shanejones would love it if you could figure out what might cause it. None of us can repro it. Please continue that discussion in the linked PR.
