Gatsby: Build gets stuck at Generating image thumbnails / Update schema on large sites

Created on 16 May 2019 · 18 Comments · Source: gatsbyjs/gatsby

I am querying allImageSharp in gatsby-node in a project with 300 pages and 2000 images.
During gatsby build or gatsby develop, the build process freezes after generating roughly 20 thumbnails:

Generating image thumbnails [==----------------------------] 25/9667 24.0 secs 0%

I also have the same query in a static query on a page. If I remove the query from gatsby-node and use only the one in the page, the thumbnails generate correctly in around 600 seconds:

Generating image thumbnails [=============================-] 9235/9667 560.0 secs 96%

This is what my query looks like:

gallery: allImageSharp {
  edges {
    node {
      fluid(
        sizes: "(orientation: portrait) 50vw, (orientation: portrait) and (min-width: 750px) 30vw, (orientation: landscape) 30vw,(orientation: landscape) and (min-width: 1170px) 18vw,(orientation: landscape) and (min-width: 1800px) 16vw"
      ) {
        aspectRatio
        base64
        presentationHeight
        presentationWidth
        sizes
        src
        srcSet
      }
    }
  }
}

Environment

System:
    OS: macOS 10.14.4
    CPU: (4) x64 Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz
    Shell: 3.2.57 - /bin/bash
  Binaries:
    Node: 10.15.3 - ~/.nvm/versions/node/v10.15.3/bin/node
    Yarn: 1.15.2 - /usr/local/bin/yarn
    npm: 6.9.0 - ~/.nvm/versions/node/v10.15.3/bin/npm
  Languages:
    Python: 2.7.15 - /usr/local/bin/python
  npmPackages:
    gatsby: ^2.4.5 => 2.4.5
    gatsby-image: ^2.0.41 => 2.0.41
    gatsby-plugin-algolia: ^0.3.0 => 0.3.0
    gatsby-plugin-intl: ^0.1.7 => 0.1.7
    gatsby-plugin-manifest: ^2.1.1 => 2.1.1
    gatsby-plugin-matomo: ^0.7.0 => 0.7.0
    gatsby-plugin-offline: ^2.1.0 => 2.1.0
    gatsby-plugin-react-helmet: ^3.0.12 => 3.0.12
    gatsby-plugin-robots-txt: ^1.4.0 => 1.4.0
    gatsby-plugin-sass: ^2.0.11 => 2.0.11
    gatsby-plugin-sharp: ^2.0.37 => 2.0.37
    gatsby-plugin-sitemap: ^2.1.0 => 2.1.0
    gatsby-plugin-typescript: ^2.0.13 => 2.0.13
    gatsby-remark-images: ^3.0.11 => 3.0.11
    gatsby-source-filesystem: ^2.0.34 => 2.0.34
    gatsby-transformer-json: ^2.1.11 => 2.1.11
    gatsby-transformer-remark: ^2.3.12 => 2.3.12
    gatsby-transformer-sharp: ^2.1.19 => 2.1.19
  npmGlobalPackages:
    gatsby-cli: 2.5.14
stale? question or discussion

All 18 comments

I restructured my gatsby-node.js as follows. It seems that createPages and thumbnail generation start before my allImageSharp query, galleryQuery, has resolved. When galleryQuery finally resolves, thumbnail generation crashes, usually around image 20. galleryQuery covers around 2000 images.

I suppose that if createPages and thumbnail generation waited until galleryQuery resolved, the build would probably not freeze anymore. How can I make createPages wait until galleryQuery completes?

[Image: gatsby-query-issue-1]

const path = require(`path`)
const _ = require("lodash")

exports.createPages = async ({ graphql, actions }) => {
  const { createPage } = actions
  const collectionTemplate = path.resolve("src/components/Collection/index.tsx")
  const tagTemplate = path.resolve("src/components/Tag/index.tsx")
  const operaTemplate = path.resolve(`src/components/Opera/index.tsx`)
  const tagsLocale = [
    {
      original: "categoryEn",
      translation: "tags",
    },
    {
      original: "categoryIt",
      translation: "categorie",
    },
  ]

  let collection = []
  let galleryAll = []
  let favicon = []
  let tagsEn = []
  let tagsIt = []

  // Collection query
  const collectionQuery = graphql(`
    {
      collection: allJson {
        totalCount
        edges {
          node {
            parent {
              ... on File {
                relativeDirectory
              }
            }
            price
            titolo
            categoryIt
            categoryEn
          }
        }
      }
    }
  `).then(result => {
    if (result.errors) {
      console.log(result.errors[0].message)
    }
    console.log("resolved collectionQuery")
    collection = result.data.collection.edges
    return collection
  })

  // Gallery query
  const galleryQuery = graphql(`
    {
      galleryAll: allImageSharp {
        edges {
          node {
            fluid(
              sizes: "(orientation: portrait) 50vw, (orientation: portrait) and (min-width: 750px) 30vw, (orientation: landscape) 30vw,(orientation: landscape) and (min-width: 1170px) 18vw,(orientation: landscape) and (min-width: 1800px) 16vw"
            ) {
              aspectRatio
              base64
              presentationHeight
              presentationWidth
              sizes
              src
              srcSet
            }
          }
        }
      }
    }
  `).then(result => {
    if (result.errors) {
      console.log(result.errors[0].message)
    }
    console.log("resolved galleryQuery")
    galleryAll = result.data.galleryAll.edges
    return galleryAll
  })

  // Favicon query
  const faviconQuery = graphql(`
    {
      favicon: imageSharp(original: { src: { regex: "/favicon/" } }) {
        original {
          width
          height
          src
        }
      }
    }
  `).then(result => {
    if (result.errors) {
      console.log(result.errors[0].message)
    }
    console.log("resolved favicon query")
    favicon = result.data.favicon.original
    return favicon
  })

  // TagEn query
  const tagsEnQuery = graphql(`
    {
      tagsEn: allJson {
        group(field: categoryEn) {
          totalCount
          fieldValue
        }
      }
    }
  `).then(result => {
    if (result.errors) {
      console.log(result.errors[0].message)
    }
    console.log("resolved tagsEn query")
    tagsEn = result.data.tagsEn.group
    return tagsEn
  })

  // TagIt query
  const tagsItQuery = graphql(`
    {
      tagsIt: allJson {
        group(field: categoryIt) {
          totalCount
          fieldValue
        }
      }
    }
  `).then(result => {
    if (result.errors) {
      console.log(result.errors[0].message)
    }
    console.log("resolved tagsIt query")
    tagsIt = result.data.tagsIt.group
    return tagsIt
  })

  await Promise.all([
    collectionQuery,
    galleryQuery,
    faviconQuery,
    tagsEnQuery,
    tagsItQuery,
  ]).then(([collection, galleryAll, favicon, tagsEn, tagsIt]) => {

      // Create index / collezione pages
      const itemPerPage = 20
      const numPages = Math.ceil(collection.length / itemPerPage)
      Array.from({ length: numPages }).forEach((_, i) => {
        createPage({
          path: i === 0 ? `/` : `/${i + 1}`,
          component: collectionTemplate,
          context: {
            limit: itemPerPage,
            skip: i * itemPerPage,
            numPages,
            currentPage: i + 1,
            galleryAll: galleryAll,
            favicon: favicon,
            tagsEn: tagsEn,
            tagsIt: tagsIt,
          },
        })
      })

      collection.forEach(({ node }) => {
        createPage({
          path: node.parent.relativeDirectory,
          component: operaTemplate,
          context: {
            titolo: node.titolo,
            relativeDirectory: node.parent.relativeDirectory,
            galleryAll: galleryAll,
            favicon: favicon,
            tagsEn: tagsEn,
            tagsIt: tagsIt,
          },
        })
      })

      const uniqTags = key => {
        let uniqTags = []
        collection.forEach(({ node }) => {
          if (_.get(node, key)) {
            uniqTags = _.concat(uniqTags, node[key])
          }
        })
        return _.uniq(uniqTags)
      }
      const generatePages = (tags, original, translation) => {
        tags.forEach(tag => {
          // Count how many objects for each tag
          const filtered = collection.filter(item =>
            item.node[original].includes(tag)
          )
          const numPages = Math.ceil(filtered.length / itemPerPage)
          Array.from({ length: numPages }).forEach((value, i) => {
            createPage({
              path:
                i === 0
                  ? `/${translation}/${_.kebabCase(tag)}/`
                  : `/${translation}/${_.kebabCase(tag)}/${i + 1}`,
              component: tagTemplate,
              context: {
                limit: itemPerPage,
                skip: i * itemPerPage,
                numPages,
                currentPage: i + 1,
                tag,
                galleryAll: galleryAll,
                favicon: favicon,
                tagsEn: tagsEn,
                tagsIt: tagsIt,
              },
            })
          })
        })
      }
      tagsLocale.map(item => {
        const tags = uniqTags(item.original)
        generatePages(tags, item.original, item.translation)
      })
  })
}
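
On the question of making createPages wait: the code above already awaits Promise.all, and Gatsby waits for the promise returned from createPages, so no page is created before the queries resolve. A more compact way to express the same thing is sketched below. It is only a sketch: runQuery is a hypothetical helper that is not part of the original file, the queries are trimmed, and imageCount is an illustrative context field.

```javascript
// Hypothetical helper (not in the original gatsby-node.js): run one query and
// throw on errors instead of logging and then reading an undefined `data`.
const runQuery = async (graphql, query) => {
  const result = await graphql(query)
  if (result.errors) {
    throw new Error(result.errors.map(error => error.message).join("\n"))
  }
  return result.data
}

// In a real gatsby-node.js, resolve this with path.resolve().
const operaTemplate = "src/components/Opera/index.tsx"

// In gatsby-node.js this function would be assigned to exports.createPages.
const createPages = async ({ graphql, actions }) => {
  const { createPage } = actions

  // Both queries are fully resolved before the first createPage call.
  const { collection } = await runQuery(graphql, `
    {
      collection: allJson {
        edges {
          node {
            titolo
            parent {
              ... on File {
                relativeDirectory
              }
            }
          }
        }
      }
    }
  `)
  const { galleryAll } = await runQuery(graphql, `
    {
      galleryAll: allImageSharp {
        edges {
          node {
            id
          }
        }
      }
    }
  `)

  collection.edges.forEach(({ node }) => {
    createPage({
      path: node.parent.relativeDirectory,
      component: operaTemplate,
      context: { titolo: node.titolo, imageCount: galleryAll.edges.length },
    })
  })
}
```

Because runQuery throws, a failing query stops the build with the GraphQL message instead of a later error about reading data from undefined.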

@gvocale can you share a reproduction so that the issue can be looked at and a solution or an alternative found?

@jonniebigodes sure, here is the reproduction: https://github.com/gvocale/gatsby-image-experiment

It somehow seems to get stuck at update schema. In this reproduction I have 3 createPage functions.

If I run only one createPage, then everything works fine and the 7000 images get generated fairly quickly. When I run two createPage functions, update-schema takes 60s and the generate thumbnail is paused / frozen until update-schema is finished.

When I run all three createPage functions, update-schema takes much longer and the whole build seems stuck.

@gvocale first of all, sorry for the "radio silence" (pardon the bad pun). I was testing your code and trying to pinpoint your issue.

In this comment I'm going to describe where your issue actually happens and, if you don't mind, offer some considerations.

I cloned your reproduction and installed the dependencies, and even before checking where the actual issue happens, one thing caught my attention.
As you can see in the image below, the build breaks in production, and in development it shows that message for, at a rough estimate, around 50% of the images used in your code.

[Image: gvocale_develop_build]

Left is development and right is production build.

Upon checking the images in question, I can see that there is nothing wrong with them other than the filenames being extremely long. VS Code complains about them for me and, as I've shown you, Gatsby will too.

If I can offer a suggestion: shorten the filenames. I know it might be a daunting task, but it might help you in the future.

Moving on, upon testing your code I was able to pinpoint the hang to this piece of code:

// Create tags pages
    const uniqTags = key => {
      let uniqTags = []

      const concatTags = (node) => {
        if (node[key]) {
          uniqTags = uniqTags.concat(node[key])
        }
      }

      collection.forEach(({ node }) => {
        concatTags(node)
      })

      return [...new Set(uniqTags)]
    }
    const generatePages = (tags, original, translation) => {
      tags.forEach(tag => {
        // Count how many objects for each tag
        const filtered = collection.filter(item =>
          item.node[original].includes(tag)
        )
        const numPages = Math.ceil(filtered.length / itemPerPage)
        Array.from({ length: numPages }).forEach((value, i) => {
          console.log(`createPage ${tag}`)
          createPage({
            path:
              i === 0
                ? `/${translation}/${toKebabCase(tag)}/`
                : `/${translation}/${toKebabCase(tag)}/${i + 1}`,
            component: tagTemplate,
            context: {
              limit: itemPerPage,
              skip: i * itemPerPage,
              numPages,
              currentPage: i + 1,
              tag,
              galleryAll: galleryAll,
              tagsEn: tagsEn,
              tagsIt: tagsIt,
            },
          })
        })
      })
    }
    tagsLocale.map(item => {
      const tags = uniqTags(item.original)
      generatePages(tags, item.original, item.translation)
    })

I ran gatsby develop and, while starting dinner, left the code to "simmer" (pardon the bad pun) for about an hour, and it was still stuck there. With that part of the code removed, the build runs to completion, apart from the image issue.
Below is a screenshot after an hour of the code running.

[Image: gvocale_develop_hangs]

If I understand correctly, and feel free to correct me if I'm wrong, you're implementing a set of internationalized pages with a set of unique tags, correct?

It would be wise to go back to that piece of code and possibly approach the problem from another perspective.
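
One possible alternative, as a sketch rather than a tested fix: group the collection by tag in a single pass and build plain page descriptors first, then hand them to createPage. The names groupByTag and buildTagPages are hypothetical, toKebabCase is a simplified ASCII-only stand-in for the helper used in the reproduction, and whether this removes the hang has not been verified.

```javascript
// Simplified stand-in for the reproduction's toKebabCase helper (assumption).
const toKebabCase = text =>
  text.trim().toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "")

// Group collection edges by the tags found under `key`, in a single pass.
const groupByTag = (edges, key) => {
  const groups = new Map()
  edges.forEach(({ node }) => {
    // node[key] may be missing, a single string, or an array of strings.
    const tags = [].concat(node[key] || [])
    tags.forEach(tag => {
      if (!groups.has(tag)) groups.set(tag, [])
      groups.get(tag).push(node)
    })
  })
  return groups
}

// Build paginated page descriptors (path + context) for every tag.
const buildTagPages = (edges, { original, translation }, itemPerPage = 20) => {
  const pages = []
  groupByTag(edges, original).forEach((nodes, tag) => {
    const numPages = Math.ceil(nodes.length / itemPerPage)
    const base = `/${translation}/${toKebabCase(tag)}/`
    for (let i = 0; i < numPages; i++) {
      pages.push({
        path: i === 0 ? base : `${base}${i + 1}`,
        context: {
          limit: itemPerPage,
          skip: i * itemPerPage,
          numPages,
          currentPage: i + 1,
          tag,
        },
      })
    }
  })
  return pages
}
```

Inside createPages this could be used as `tagsLocale.forEach(item => buildTagPages(collection, item).forEach(page => createPage({ ...page, component: tagTemplate })))`. Note that the descriptors deliberately leave the full galleryAll array out of the context.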

Now for some considerations, if you don't mind.
Starting from the top, I saw that you're aliasing your GraphQL queries, which is a good move. But since they are already aliased, why not issue a single GraphQL query that returns all the data? That would transform your createPages API call into something like the following:

const path = require("path")

exports.createPages = ({ graphql, actions }) => {
  const { createPage } = actions
  const collectionTemplate = path.resolve("src/components/Collection/index.tsx")
  const tagTemplate = path.resolve("src/components/Tag/index.tsx")
  const operaTemplate = path.resolve(`src/components/Opera/index.tsx`)
  const tagsLocale = [
    {
      original: "categoryEn",
      translation: "tags",
    },
    {
      original: "categoryIt",
      translation: "categorie",
    },
  ]
  return graphql(
    `
      {
        collection: allIndexJson {
          totalCount
          edges {
            node {
              parent {
                ... on File {
                  relativeDirectory
                }
              }
              price
              titolo
              categoryIt
              categoryEn
            }
          }
        }
        galleryAll: allImageSharp {
          edges {
            node {
              fluid(
                sizes: "(orientation: portrait) 50vw, (orientation: portrait) and (min-width: 750px) 30vw, (orientation: landscape) 30vw,(orientation: landscape) and (min-width: 1170px) 18vw,(orientation: landscape) and (min-width: 1800px) 16vw"
              ) {
                aspectRatio
                base64
                presentationHeight
                presentationWidth
                sizes
                src
                srcSet
              }
            }
          }
        }
        tagsEn: allIndexJson {
          group(field: categoryEn) {
            totalCount
            fieldValue
          }
        }
        tagsIt: allIndexJson {
          group(field: categoryIt) {
            totalCount
            fieldValue
          }
        }
      }
    `
  ).then(result => {
    if (result.errors) {
      throw result.errors
    }
    const { data } = result
    const { collection, galleryAll, tagsEn, tagsIt } = data
    const itemPerPage = 20

    const numPages = Math.ceil(collection.totalCount / itemPerPage)
    // Create index pages
    for (let index = 0; index < numPages; index++) {
      const nextItem = index + 1
      createPage({
        path: index === 0 ? "/" : `/${nextItem}/`,
        component: collectionTemplate,
        context: {
          limit: itemPerPage,
          skip: index * itemPerPage,
          numPages: numPages,
          currentPage: nextItem,
          galleryAll: galleryAll,
          tagsEn: tagsEn,
          tagsIt: tagsIt,
        },
      })
    }
    // Create opera pages
    const { edges } = collection
    edges.forEach(({ node }) => {
      createPage({
        /* path: `/${node.parent.relativeDirectory}/`, */
        path: node.parent.relativeDirectory,
        component: operaTemplate,
        context: {
          titolo: node.titolo,
          relativeDirectory: node.parent.relativeDirectory,
        },
      })
    })

    // Create tags pages

  })
}

In the code block above, you'll see that I've also changed the way the pages are created, based on your reproduction; I think the result is a bit more streamlined and easier to read.

I would like to point out the following lines in your code:

collection.forEach(({ node }) => {
      console.log(`createPage ${node.titolo}`)
      createPage({
        path: node.parent.relativeDirectory,
        component: operaTemplate,
        context: {
          titolo: node.titolo,
          relativeDirectory: node.parent.relativeDirectory,
        },
      })
    })

From my reading, it seems that you're taking the relative directory where the JSON files are housed and using it to create the paths for some pages. With that in mind, I would like to bring to your attention what is happening to those paths/pages.

[Image: gvocale_paths]

The approach is sound, but the paths need some "treatment". For me, Gatsby would not show the page at the path as is, even with / added as a prefix and suffix.
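
If the problem is the path separator (on Windows, relativeDirectory can come back with backslashes), one possible "treatment" is sketched below. toPagePath is a hypothetical helper, not something from the reproduction, and it does not deal with spaces or other special characters in directory names.

```javascript
// Hypothetical helper: turn a File node's relativeDirectory into a page path
// with forward slashes and a leading and trailing "/".
const toPagePath = relativeDirectory => {
  const segments = relativeDirectory
    .split(/[\\/]+/) // split on both Windows and POSIX separators
    .filter(Boolean) // drop empty segments left by leading/trailing separators
  return segments.length === 0 ? "/" : `/${segments.join("/")}/`
}
```

It would be called where the page is created, for example `path: toPagePath(node.parent.relativeDirectory)`.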

Also, upon checking your templates, something came to my attention.

For instance, in ./src/Components/Opera/index.tsx:
```typescript
import React from "react"
// Not a good practice: you can use import "./styles.module.scss" directly, or as described here:
// https://medium.com/@thetrevorharmon/how-to-make-a-super-fast-static-site-with-gatsby-typescript-and-sass-3742c00d4524
let styles = require("./styles.module.scss")

const Opera = ({ pageContext }) => {
  // The JSX elements that wrapped the title were lost when this page was
  // extracted; a fragment stands in for them here.
  return <>{pageContext.titolo}</>
}

export default Opera
```

Feel free to provide feedback, so that we can close this issue or continue to work on it till we find a solution. And sorry for the extremely long comment.

Hi @jonniebigodes! Thank you for taking the time to download the repo and run it. Much appreciated.

I didn't use import * as styles from './styles.module.scss' because TypeScript was marking it as an error, but the blog post you shared links to a solution for it: https://medium.com/@thetrevorharmon/how-to-silence-false-sass-warnings-in-react-16d2a7158aff. I am going to update it to use import for the scss.

I am surprised you're getting Error: input file is missing. When I run build or develop on my machine I don't run into that error. Those image files also seem to be present in the GitHub repo.

Thank you for pinpointing the issue to the code creating the tags pages. I will try locally to remove and rewrite that code as well and see if it helps.

You're correct, I am using internationalization (2 languages) and I was trying to create translated slugs for the tags:
en/lamps
it/lampade

I used to use just one large query, but the build was still failing, so I later split it into separate queries while experimenting.

About the page path, is the issue you're pointing out that the path comes out as B\bologna... rather than B/bologna? If so, on my Mac the paths come out correctly, using /.

Thanks again for your help. I'll incorporate your feedback and concentrate my attention on the code generating the tags.

Meanwhile, I have managed to get the site working using a different approach. I've moved just the image query away from gatsby node and into layout.tsx. There I have a hook doing a static query just for the image gallery, then put that image gallery into a useContext and use it in all components that need the gallery. In this way the build generates thumbnails without freezing soon after the start.

Depending on how large the site is, I have to repeat the build a couple of times, because it breaks at some point while generating the thumbnails with Segmentation fault: 11. I am developing multiple sites using this same template, but with different content. The ones that have 100 pages or so (before internationalization) may complete the build correctly on the first try. The ones that have 1000 to 10000 pages (before internationalization) usually run into Segmentation fault: 11 a few times while generating thumbnails. I am having issues publishing some of them to Netlify as well.

@gvocale no need to thank me; glad I was able to shed some light on your issue.

A couple of things.
On my end those errors are due to Windows filesystem handling. I only mentioned them because it may be relevant for you or your team to be aware that the issue might happen.

Now regarding the following:

> Meanwhile, I have managed to get the site working using a different approach. I've moved just the image query away from gatsby node and into layout.tsx. There I have a hook doing a static query just for the image gallery, then put that image gallery into a useContext and use it in all components that need the gallery. In this way the build generates thumbnails without freezing soon after the start.

If I'm reading this correctly, you've moved the all-images query from gatsby-node.js into a hook that is used by a component. Based on this alone, it does not seem like good practice. Why? If my understanding of hooks and Gatsby (and my math) is correct, the work scales as n*m, where n is the number of pages that use the component, and so consume the data coming from the hook, and m is the number of images. For small sites this should work, but the problem appears when a lot of pages consume the data: the query keeps running, and it can start throwing those errors, probably because resources are not freed before the next iteration starts. I might be completely wrong about this, and someone more knowledgeable might provide more insight, but that is what seems to be happening. Also bear in mind that there are a lot of moving parts here.

My recommendation for the time being would be to leave the images query in gatsby-node.js and, as you are already creating the pages dynamically, "feed" them the absolute minimum information needed to display the images and the rest of the page. With this you remove GraphQL from part of the equation and leave the components purely presentational, which avoids the complications you're having now.
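
To make the "absolute minimum information" idea concrete: the gatsby-node.js earlier in the thread passes the whole galleryAll array in every page's context, so with roughly 300 pages and 2000 images that is on the order of 300 × 2000 = 600,000 image entries stored across page contexts. A minimal sketch of the alternative follows. It assumes that the allImageSharp query also selects parent { ... on File { relativeDirectory } } and that each opera's images sit in the same directory as its JSON file; both are assumptions, and indexImagesByDirectory and buildOperaContext are hypothetical names.

```javascript
// Index image edges by the directory of their parent File node (one pass).
// Assumes each node was queried with parent { ... on File { relativeDirectory } }.
const indexImagesByDirectory = imageEdges => {
  const byDirectory = new Map()
  imageEdges.forEach(({ node }) => {
    const directory = node.parent.relativeDirectory
    if (!byDirectory.has(directory)) byDirectory.set(directory, [])
    // Keep only the fields a presentational image component needs.
    const { aspectRatio, src, srcSet, sizes } = node.fluid
    byDirectory.get(directory).push({ aspectRatio, src, srcSet, sizes })
  })
  return byDirectory
}

// Context for one opera page: its own images, not the whole gallery.
const buildOperaContext = (node, imagesByDirectory) => ({
  titolo: node.titolo,
  relativeDirectory: node.parent.relativeDirectory,
  images: imagesByDirectory.get(node.parent.relativeDirectory) || [],
})
```

In createPages the index would be built once from galleryAll.edges, and each createPage call would receive `context: buildOperaContext(node, imagesByDirectory)`.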

Hiya!

This issue has gone quiet. Spooky quiet. 👻

We get a lot of issues, so we currently close issues after 30 days of inactivity. It’s been at least 20 days since the last update here.

If we missed this issue or if you want to keep it open, please reply here. You can also add the label "not stale" to keep this issue open!

As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing!

Thanks for being a part of the Gatsby community! 💪💜

Hey again!

It’s been 30 days since anything happened on this issue, so our friendly neighborhood robot (that’s me!) is going to close it.

Please keep in mind that I’m only a robot, so if I’ve closed this issue in error, I’m HUMAN_EMOTION_SORRY. Please feel free to reopen this issue or create a new one if you need anything else.

As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing!

Thanks again for being part of the Gatsby community!

@jonniebigodes I still haven't managed to speed up the largest of my sites, which has 10000 pages. To test further what is slowing the build down, I have created a fresh, bare-bones Gatsby project. It creates 9923 pages, with just a title and blank HTML, out of an allIndexJson GraphQL query. No extra plugins, no images.

This is the performance I'm getting when building locally:

createPages - 4924 s / 1h36m
It starts fast, then progressively becomes slower and slower.

run page queries — 33526s / 9h00 / 0.30 queries/second
My page has no static query. What query is this meant to be?

Is this performance normal for a 10000 pages build, or is something wrong? Could you please have a look at the repo and see if you spot anything suspicious?

https://github.com/gvocale/gatsby-large-site-demo

@gvocale thank you for the repro, that should be very helpful to work out what’s going on.

@wardpeet or @sidharthachatterjee do one of you have a chance to investigate this?

I'm having a similar issue, though not on a large site.

I implemented the createPages API in gatsby-node with markdown files. The page template used to create these pages has a page query based on a context variable passed to the createPage function.

Whether 1 or 10 pages are generated, with 10 to 15 images per page, the develop command goes through successfully, but the build command finishes before the "Generating image thumbnails" process does.

[Image: Screenshot 2019-08-01 at 17 34 43]

I tried gatsby clean and removed the node_modules folder before reinstalling and running a fresh build, but it always gets stuck there. (The percentage of images completed is random, generally between 85% and 99%, and no error is thrown.)

I can't figure out why the sharp process doesn't complete before Gatsby finishes building.

What has worked for me:
_Downgrade node.js_

  1. Delete node_modules & package-lock.json & yarn.lock
  2. Use nvm (Node Version Manager) and switch to Node version 10.18.1:
    nvm install 10.18.1
    nvm use 10.18.1
  3. yarn install
  4. gatsby clean
  5. gatsby develop / gatsby build
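
To avoid accidentally building on a different Node version later, a small guard could be placed at the top of gatsby-node.js. This is only a sketch: the "working" major version is taken from the reports in this thread and is not an official compatibility statement, and nodeVersionWarning and parseMajor are hypothetical names.

```javascript
// Major version reported as working by commenters in this thread.
// An assumption drawn from the thread, not an official support matrix.
const REPORTED_WORKING_MAJOR = 10

// Extract the major version from strings such as "10.18.1" or "v12.16.1".
const parseMajor = version =>
  Number(String(version).replace(/^v/, "").split(".")[0])

// Returns a warning message, or null when the major version matches.
const nodeVersionWarning = (version = process.versions.node) => {
  const major = parseMajor(version)
  return major === REPORTED_WORKING_MAJOR
    ? null
    : `Node ${major}.x detected; this build was only reported to complete on Node ${REPORTED_WORKING_MAJOR}.x`
}

// Warn (without failing the build) when running on another major version.
const warning = nodeVersionWarning()
if (warning) console.warn(warning)
```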

The same happened to me on Node 12.

First it wasn’t installing because of watchpack, so I bumped gatsby to latest version (2.20.6) and now I’m getting stuck at:

[============================]   6.098 s 17/17 100% Generating image thumbnails

@i-kk's workaround solved the issue for me.

Is this still happening with latest sharp?

@wardpeet, unfortunately, yes.

EDIT: Sorry that was not really a constructive answer. So, after onPostBuild it looks like the thumbnails are being generated again. After downgrading to Node version 10.18.1 this issue was resolved. The problem that persists is that the images are not displaying on gatsby build, but only on gatsby develop. It seems like it happens after more than 5 images get loaded.

I'm currently running into this issue trying to get Gatsby to build on a vanilla Ubuntu server. Running Node 12 there. Each time the build hits "generating image previews" and starts processing (what seems to be) the first PNG, it dies with a WorkerError with the stack only mentioning:

  • jobs-manager.js:314 exports.enqueueJob
    [site]/[gatsby]/dist/utils/jobs-manager.js:314:23

  • task_queues.js:97 processTicksAndRejections
    internal/process/task_queues.js:97:5

Same for me on Node 12. So far I have managed to build the bundle only after downgrading Node to 10.x.
