We run a documentation aggregation website that pulls in documentation for hundreds of thousands of packages from a variety of services, and I was interested in rebuilding it with Gatsby. However, after taking 20 minutes to throw together a proof of concept, I realized it quickly falls over because of how many pages we try to generate.
During the create pages build step, I get the following heap out of memory error. Have others run into this while building large sites? Are there any workarounds? Or is Gatsby just not the right tool for this use case? Thanks a bunch!
<--- Last few GCs --->
[7767:0x104000c00] 22408 ms: Mark-sweep 1270.7 (1430.3) -> 1269.8 (1436.3) MB, 553.0 / 0.0 ms (average mu = 0.144, current mu = 0.034) allocation failure scavenge might not succeed
[7767:0x104000c00] 22987 ms: Mark-sweep 1277.6 (1436.3) -> 1276.0 (1437.8) MB, 552.1 / 0.0 ms (average mu = 0.106, current mu = 0.046) allocation failure scavenge might not succeed
<--- JS stacktrace --->
==== JS stack trace =========================================
Security context: 0x1a2a1761d969 <JSObject>
0: builtin exit frame: concat(this=0x1a2a8136b119 <JSArray[17150]>,0x1a2a8136c0e1 <String[23]: /page/example>,0x1a2a8136b119 <JSArray[17150]>)
1: pages(aka pages) [0x1a2acfeefaa1] [/Users/jaredsilver/projects/poc-rdocs/node_modules/gatsby/dist/redux/machines/page-component.js:112] [bytecode=0x1a2ab76bc199 offset=96](this=0x1a2a842825b1 <undefined>,0x1a2a8136b139 <Object map = ...
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
1: 0x10003a9d9 node::Abort() [/Users/jaredsilver/.nvm/versions/node/v11.0.0/bin/node]
2: 0x10003abe4 node::FatalTryCatch::~FatalTryCatch() [/Users/jaredsilver/.nvm/versions/node/v11.0.0/bin/node]
3: 0x10019ed17 v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/Users/jaredsilver/.nvm/versions/node/v11.0.0/bin/node]
4: 0x10019ecb4 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/Users/jaredsilver/.nvm/versions/node/v11.0.0/bin/node]
5: 0x1005a5882 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/Users/jaredsilver/.nvm/versions/node/v11.0.0/bin/node]
6: 0x1005a4838 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/Users/jaredsilver/.nvm/versions/node/v11.0.0/bin/node]
7: 0x1005a2443 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/Users/jaredsilver/.nvm/versions/node/v11.0.0/bin/node]
8: 0x1005aecbc v8::internal::Heap::AllocateRawWithLightRetry(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/Users/jaredsilver/.nvm/versions/node/v11.0.0/bin/node]
9: 0x1005aed3f v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/Users/jaredsilver/.nvm/versions/node/v11.0.0/bin/node]
10: 0x10057d3c6 v8::internal::Factory::NewFixedArrayWithFiller(v8::internal::Heap::RootListIndex, int, v8::internal::Object*, v8::internal::PretenureFlag) [/Users/jaredsilver/.nvm/versions/node/v11.0.0/bin/node]
11: 0x1002299c4 v8::internal::Builtin_Impl_ArrayConcat(v8::internal::BuiltinArguments, v8::internal::Isolate*) [/Users/jaredsilver/.nvm/versions/node/v11.0.0/bin/node]
12: 0x2df2feecfcdd
13: 0x2df2fee8e458
Abort trap: 6
System:
OS: macOS High Sierra 10.13.3
CPU: (8) x64 Intel(R) Core(TM) i7-4870HQ CPU @ 2.50GHz
Shell: 3.2.57 - /bin/bash
Binaries:
Node: 11.0.0 - ~/.nvm/versions/node/v11.0.0/bin/node
Yarn: 1.10.1 - /usr/local/bin/yarn
npm: 6.4.1 - ~/.nvm/versions/node/v11.0.0/bin/npm
Languages:
Python: 2.7.10 - /usr/bin/python
Browsers:
Chrome: 74.0.3729.157
Firefox: 59.0.2
Safari: 11.0.3
npmPackages:
gatsby: ^2.6.0 => 2.6.0
gatsby-plugin-react-helmet: ^3.0.12 => 3.0.12
npmGlobalPackages:
gatsby-cli: 2.6.0
gatsby-node.js: We wrote a source plugin that creates 100,000 nodes of the shape { name: name }, then edited the primary gatsby-node.js file to generate a page for each node.
For larger sites like this, you'll want to increase Node's default heap limit of roughly 1.4 GB: https://medium.com/@vuongtran/how-to-solve-process-out-of-memory-in-node-js-5f0de8f8464c
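Concretely, that means passing V8's `--max-old-space-size` flag (value in megabytes) to the Node process that runs the build, for example:

```shell
# Raise the heap limit to 8 GB for this build (default is ~1.4 GB on 64-bit Node)
NODE_OPTIONS="--max-old-space-size=8192" gatsby build

# Or invoke node directly against the Gatsby CLI entry point:
node --max-old-space-size=8192 node_modules/.bin/gatsby build
```

The `NODE_OPTIONS` form is handy because it also applies to any child Node processes the build spawns.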
Node 12 also raised this default limit (the heap size is now derived from available memory), so you could try upgrading as well.
We also have benchmark sites that we use for testing https://github.com/gatsbyjs/gatsby/tree/master/benchmarks — you can configure an arbitrary number of pages there.
Thanks, Kyle! Upgrading to Node 12 worked. Looking forward to generating hundreds of thousands of pages with Gatsby! 😄
Unfortunately, we still run into the out-of-memory issue on Node 12 if we're generating a lot of pages :(
Is it correct to say that the createPages API holds all of the pages in memory as it builds? If so, are there any workarounds? (For example, we tried building different groups of pages in different plugins to see if that would make them build "incrementally" rather than all in the same step.)
An application with this many pages just might not be the right use case for a static site generator, much to my chagrin.
With how many pages?
People have split up sites into chunks to get around problems e.g. https://www.gatsbyjs.org/blog/2019-01-28-building-a-large-ecommerce-website-with-gatsby-at-daniel-wellington/
Everything is kept in memory. The trick to getting past this is probably allowing Gatsby to store data in a database, e.g. SQLite.
More than 2,500,000 pages 😄 But we can't get past 25,000 in our POC.
We'll keep using Gatsby for everything else, but this one probably should use a more traditional setup.
Haha woah. Yeah we're working that direction but not quite there sadly.
How far could you push the benchmark sites? I can get 100k+ page sites to build pretty easily. The exact upper limit depends on the complexity of pages obviously.