Next.js: Garbage explosion (memory leak type issue)

Created on 7 Apr 2020 · 5 Comments · Source: vercel/next.js

Bug report

After updating Next.js, the Node garbage collector (scavenger) in our SSR service started to run very frequently, to the point of consuming all CPU and making the service basically non-responsive.

Describe the bug

While tracking down the issue we isolated it to some change in https://github.com/zeit/next.js/releases/tag/v9.0.6-canary.1 . I can't see any obvious culprit in there, but all versions up to and including canary.0 are stable and canary.1 and later are not. It also happens in the latest stable version, 9.3.4.

The issue isn't a classic memory leak, where something stays in memory for too long. Instead, a lot of garbage is generated in new space and immediately collected. We concluded this because total memory usage is stable, increasing the size of new space (--max-semi-space-size) increases the time it takes to exhaust the CPU, there are no major leftovers between heap snapshots, and the CPU is consumed by the scavenger (the minor GC that runs only in new space), whose frequency increases very rapidly. See the screenshot below.
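For anyone trying to confirm the same pattern, scavenge frequency and time can be measured from inside the process with Node's perf_hooks module. The following is a minimal sketch, not what was used for this report: the names GcTotals, recordGc, scavengeShare and startGcMonitor are made up for illustration, and where the GC kind lives on the entry (entry.kind or entry.detail.kind) depends on the Node.js version, so both are checked.

```typescript
import { PerformanceObserver, constants } from "perf_hooks";

// Running totals for minor (scavenge) GCs and for every other GC kind.
export interface GcTotals {
  minorCount: number;
  minorMs: number;
  otherCount: number;
  otherMs: number;
}

export function emptyTotals(): GcTotals {
  return { minorCount: 0, minorMs: 0, otherCount: 0, otherMs: 0 };
}

// Add one GC event to the totals. `kind` is expected to be one of the
// perf_hooks.constants.NODE_PERFORMANCE_GC_* values.
export function recordGc(
  totals: GcTotals,
  kind: number | undefined,
  durationMs: number
): void {
  if (kind === constants.NODE_PERFORMANCE_GC_MINOR) {
    totals.minorCount += 1;
    totals.minorMs += durationMs;
  } else {
    totals.otherCount += 1;
    totals.otherMs += durationMs;
  }
}

// Fraction (0..1) of a wall-clock window that was spent in scavenges.
export function scavengeShare(totals: GcTotals, windowMs: number): number {
  return windowMs > 0 ? totals.minorMs / windowMs : 0;
}

// Start observing GC events. Returns the observer so the caller can
// call disconnect() when monitoring is no longer needed.
export function startGcMonitor(totals: GcTotals): PerformanceObserver {
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      // Older Node.js releases expose `kind` directly on the entry,
      // newer ones under `detail`; read whichever is present.
      const e = entry as unknown as {
        kind?: number;
        detail?: { kind?: number };
      };
      recordGc(totals, e.detail?.kind ?? e.kind, entry.duration);
    }
  });
  observer.observe({ entryTypes: ["gc"] });
  return observer;
}
```

Calling startGcMonitor(totals) once at server start and logging scavengeShare(totals, windowMs) at a fixed interval (resetting the totals each time) would show whether the share of time spent in minor GC climbs while total memory stays flat.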

In the profiler and memory debugger we haven't found any clear suspect. The only thing that didn't seem right was memory allocations within require; see the attached screenshot. (Also head@reduceComponents, but we ruled that out / fixed it by replacing it with helmet.)

To Reproduce

Unfortunately I wasn't able to reproduce it anywhere other than in the production environment, but that is probably due to the shape of the traffic.

Expected behavior

The service should not crash.

Screenshots

[Screenshot: newrelic]

[Screenshot: require]

System information

  • OS: node 12.16.0 (tested also on v13), inside docker & kubernetes
  • Version of Next.js: >= 9.0.6-canary.1
  • Memory limit is set to 1GB and max-old-space-size to 900MB
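When comparing setups like the one above, it can help to log what V8 actually reports at startup, since flags passed through Docker or npm scripts are easy to lose. A small sketch using Node's v8 module follows; the function name readHeapLimits and the returned shape are illustrative only, and the reported heap limit should land near the configured --max-old-space-size rather than match it exactly.

```typescript
import * as v8 from "v8";

export interface HeapLimits {
  heapSizeLimitMb: number; // overall heap limit as reported by V8
  newSpaceSizeMb: number; // currently committed size of new space
}

// Read the effective heap limit and the current new-space size.
export function readHeapLimits(): HeapLimits {
  const toMb = (bytes: number): number =>
    Math.round((bytes / 1024 / 1024) * 10) / 10;
  const newSpace = v8
    .getHeapSpaceStatistics()
    .find((space) => space.space_name === "new_space");
  return {
    heapSizeLimitMb: toMb(v8.getHeapStatistics().heap_size_limit),
    newSpaceSizeMb: newSpace ? toMb(newSpace.space_size) : 0,
  };
}
```

Logging readHeapLimits() once when the custom server boots makes it easy to check that changes to --max-old-space-size or --max-semi-space-size really reached the Node process inside the container.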

Additional context

Here are our .babelrc.js, next.config.js, tsconfig.json https://gist.github.com/jakubriedl/d5c306d5cf2862ea468387681eb3a53d

Some of the main packages we combine Next.js with:

  • Apollo
  • StyledComponents
  • Custom server using express
  • Typescript

Also, all our pages are server-side rendered and cached on a CDN where possible; we don't use SSG at the moment.

Most helpful comment

We are doing some initial investigation to see whether https://github.com/zeit/next.js/issues/11526 is related to this. So far the reproduction provided in that repo doesn't seem to reproduce the problem either: after over 6000 requests with a concurrency of 50 to a running Next.js instance on v9.3.3 with Node.js v12.16.1, memory never seems to go above 60MB outside of a Docker container on macOS 10.14.6.

We are continuing to investigate, although it doesn't seem to be easily reproducible on a minimal Next.js application.

Update
We are now able to reproduce https://github.com/zeit/next.js/issues/11526 and are investigating a fix; it does seem to be related to this issue as well.
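A load test of the shape described above (a fixed number of requests with bounded concurrency) can be sketched in a few lines. This is an illustration only, not the tool that was used: the helper names runWithConcurrency, hit and memoryMb are hypothetical, and the URL in the usage note is an assumed local address.

```typescript
import * as http from "http";

// Run `total` async tasks with at most `concurrency` in flight at once.
export async function runWithConcurrency(
  total: number,
  concurrency: number,
  task: (index: number) => Promise<void>
): Promise<void> {
  let next = 0;
  const worker = async (): Promise<void> => {
    // Each worker keeps claiming the next index until none are left.
    while (next < total) {
      const index = next++;
      await task(index);
    }
  };
  const workers: Promise<void>[] = [];
  for (let i = 0; i < Math.min(concurrency, total); i++) {
    workers.push(worker());
  }
  await Promise.all(workers);
}

// Issue one GET request and resolve once the response body is drained.
export function hit(url: string): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    http
      .get(url, (res) => {
        res.resume(); // discard the body, we only care about completion
        res.on("end", () => resolve());
      })
      .on("error", reject);
  });
}

// Memory of the *current* process in MB (useful when the server under
// test runs in the same process, otherwise sample it on the server side).
export function memoryMb(): { rss: number; heapUsed: number } {
  const usage = process.memoryUsage();
  const toMb = (bytes: number): number => Math.round(bytes / 1024 / 1024);
  return { rss: toMb(usage.rss), heapUsed: toMb(usage.heapUsed) };
}
```

For example, runWithConcurrency(6000, 50, () => hit("http://localhost:3000/")) would approximate the run described above against a locally running instance. As noted in the original report, synthetic traffic like this may not match the shape of production traffic.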

All 5 comments

Our team is also seeing this in production when upgrading from 9.0.5 to 9.3.1.

It's even worse when not using a CDN via assetPrefix -- the service would last about 20 minutes before crashing due to CPU and memory overload.

@tom-con can you see anything in the attached files that is a bit non-standard and that we both have in common?

We're going to investigate this. It'd be useful to get full access to your application, though, so that we can trace it. Feel free to reach out on twitter.com/timneutkens about that.

> @tom-con can you see anything in the attached files that is a bit non-standard and that we both have in common?

@jakubriedl I looked through the gist and found nothing that stood out. Here are the similarities:

  • Both running next in docker
  • Both using ['next/babel'] preset
  • styled-jsx (you're on 4.3.2 | we are on 4.4.0)

Beyond that we have some devDependencies in common, such as storybook -- but I can't conceive of them causing the problem.

@timneutkens Thank you, today I'll be trying to recreate the issue. I'll reach out to you when I have it.

