Gatsby: Loading chunks while a new release is deployed

Created on 20 Oct 2019 · 57 Comments · Source: gatsbyjs/gatsby

Description

I think there is a structural issue with the lazy loading of page chunks.
I came across various ChunkLoadErrors in Sentry reports, and just now it happened to me in Chrome. I had loaded a page, then deployed a new version (via Netlify), and then tried to navigate to another page without a full page reload. The newly deployed chunks have different paths, so the Gatsby Links can't find the chunks they knew about: the requests return 404s and navigation becomes impossible!

Steps to reproduce

You can easily reproduce this by deploying, going to the website, changing code & deploying again, and then trying to navigate to another page.

Otherwise, let me know and I'll set up a test website.

Expected result

The navigation should always work.

Actual result

Nothing happens when the visitor tries to navigate to another page.

Possible fixes

A fix could be to fall back to normal navigation (a full page reload), instead of leaving the visitor on the same page without any reaction...

Now, I read in other issues that it's supposed to fail because the user may be offline and we can't detect it. I disagree: there is a difference between a fetch that fails because of network issues and a 404, and it can be detected in JavaScript (maybe using NavigatorOnLine too, if available).
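As an illustration of that distinction, here is a minimal sketch in TypeScript. It is not Gatsby's loader code: `ProbeResult` and `classifyChunkFailure` are hypothetical names, and the only browser behaviour assumed is that `fetch` rejects when no response arrives but resolves normally (with `status` set to 404) when the server answers that the file is gone.

```typescript
// Outcome categories for a failed chunk request (hypothetical names).
type ChunkFailure = "offline" | "missing" | "server-error" | "unknown";

// What a probe of the chunk URL found out.
interface ProbeResult {
  // HTTP status if a response arrived, or null if the request itself failed.
  status: number | null;
  // navigator.onLine at the time of the failure, when the browser exposes it.
  onLine?: boolean;
}

function classifyChunkFailure(probe: ProbeResult): ChunkFailure {
  if (probe.status === null) {
    // No response at all: a network problem or a blocked request.
    return probe.onLine === false ? "offline" : "unknown";
  }
  // The server answered, so the visitor is online; the file is simply gone.
  if (probe.status === 404 || probe.status === 410) return "missing";
  if (probe.status >= 500) return "server-error";
  return "unknown";
}

// In a browser, a probe could be built roughly like this:
//   fetch(chunkUrl, { method: "HEAD" })
//     .then((res) => ({ status: res.status, onLine: navigator.onLine }))
//     .catch(() => ({ status: null, onLine: navigator.onLine }));
```

Only the "missing" case points at a mid-deploy situation where a full page reload is likely to help; the "offline" case can keep the current behaviour.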

Environment

  System:
    OS: macOS 10.14.6
    CPU: (8) x64 Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz
    Shell: 5.3 - /bin/zsh
  Binaries:
    Node: 10.15.3 - ~/.nvm/versions/node/v10.15.3/bin/node
    Yarn: 1.19.0 - /usr/local/bin/yarn
    npm: 6.10.3 - ~/.nvm/versions/node/v10.15.3/bin/npm
  Languages:
    Python: 2.7.16 - /usr/local/bin/python
  Browsers:
    Chrome: 77.0.3865.120
    Firefox: 69.0
    Safari: 13.0.2
  npmPackages:
    gatsby: ^2.15.29 => 2.15.29
    gatsby-image: ^2.2.24 => 2.2.24
    gatsby-plugin-google-analytics: ^2.1.20 => 2.1.20
    gatsby-plugin-manifest: ^2.2.20 => 2.2.20
    gatsby-plugin-netlify: ^2.1.17 => 2.1.17
    gatsby-plugin-offline: ^3.0.11 => 3.0.11
    gatsby-plugin-react-helmet: ^3.1.10 => 3.1.10
    gatsby-plugin-sharp: ^2.2.28 => 2.2.28
    gatsby-plugin-sitemap: ^2.2.16 => 2.2.16
    gatsby-plugin-styled-components: ^3.1.8 => 3.1.8
    gatsby-plugin-typescript: ^2.1.11 => 2.1.11
    gatsby-source-contentful: ^2.1.45 => 2.1.45
    gatsby-source-filesystem: ^2.1.29 => 2.1.29
    gatsby-transformer-inline-svg: ^0.0.1 => 0.0.1
    gatsby-transformer-sharp: ^2.2.20 => 2.2.20

Related issues

#11194

#2954

Labels: not stale, confirmed, webpack/babel, bug

All 57 comments

We are having a similar issue, although different cause. We are running behind an authentication proxy, and when the users session expires, navigation fails due to ChunkLoadErrors.

Is this possibly related to a regression from this change? https://github.com/gatsbyjs/gatsby/pull/2996

Same issue here.

To replicate (as of v2.16.1):

  1. Clone the base Gatsby starter.
  2. Run gatsby build.
  3. Run gatsby serve.
  4. Go to localhost:9000/, open network tools and hover over the "Go to page 2" link and see the network requests get made for component---src-pages-page-2-js-[HASH].js.
  5. Go to your ./public/ folder and delete component---src-pages-page-2-js-[HASH].js.
  6. Go back to localhost:9000/ do not refresh and hover over the "Go to page 2" link. This time you will see a 404 for loading the chunk.

The above steps are what a user would experience if they were navigating a site as a new build was deployed. The user will go to click something and nothing will happen - only a page refresh will fix the issue in those cases.

I would be happy to help fix this but might need pointing in the right direction, as I haven't poked around in the Gatsby code too much.

/cc @pieh

I've also got this issue since upgrading gatsby from v2.15.34 to v2.17.1
Rolling back to v2.15.37 works with no problem :)
In my case, the gatsby target in package.json was ^2.15.28.
Calling yarn upgrade has updated the package to v2.17.1, resulting in this issue.

Having the same issues with Netlify and AWS CloudFront.

loadComponent is never caught

https://github.com/gatsbyjs/gatsby/blob/90f0401399a573b5f8bede570a019169a05bd972/packages/gatsby/cache-dir/loader.js#L195

I guess we don't even need to call loadComponent if the current compilation hash has changed.

I'm not sure navigation should keep comparing the hash itself; it could get this information from the loadPage call instead.

https://github.com/gatsbyjs/gatsby/blob/90f0401399a573b5f8bede570a019169a05bd972/packages/gatsby/cache-dir/navigation.js#L99
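To make the idea concrete, here is a rough sketch of deciding between client-side navigation and a full reload based on the compilation hash that arrives with the page data. It is not the code from the linked files: `LoadedPage`, `planNavigation`, and the field name `webpackCompilationHash` are assumptions made for illustration.

```typescript
// Hypothetical shape of the metadata that comes back with a page's data.
interface LoadedPage {
  path: string;
  // Hash of the build that produced this page (field name is an assumption).
  webpackCompilationHash: string;
}

type NavigationPlan =
  | { kind: "client-side"; path: string }
  | { kind: "full-reload"; path: string };

// runningHash is the hash of the build that the code in the open tab came from.
function planNavigation(runningHash: string, page: LoadedPage): NavigationPlan {
  if (page.webpackCompilationHash !== runningHash) {
    // The site was redeployed since this tab loaded: the old component chunks
    // may no longer exist, so skip loading them and reload instead.
    return { kind: "full-reload", path: page.path };
  }
  return { kind: "client-side", path: page.path };
}
```

With this shape, the component chunk is only requested when the plan is "client-side", which is the point made above about not calling loadComponent after the hash has changed.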

The e2e test is skipped :thinking:

https://github.com/gatsbyjs/gatsby/blob/90f0401399a573b5f8bede570a019169a05bd972/e2e-tests/production-runtime/cypress/integration/compilation-hash.js#L37

@pieh, do you need help moving forward?

We're seeing this same issue as well.

Looks like I'm facing a similar issue as this as well, running gatsby 2.15.33 and hosted on Netlify.

Following are the logs in the console from Firefox:

Error: Loading CSS chunk 1 failed.
(/styles.372f89c282666a74bcb5.css) webpack-runtime-a06d2033887a729e229a.js:1:1913
Source map error: Error: request failed with status 404
Resource URL: https://www.websitename.com/webpack-runtime-a06d2033887a729e229a.js
Source Map URL: webpack-runtime-a06d2033887a729e229a.js.map

Edit:
Found a similar issue while browsing starters on Gatsby website:
[screenshot]

This is a major bug, which contradicts the main goal of a CI/CD flow.

Can anyone identify a version of Gatsby that doesn't exhibit this bug? Rolling back to 2.15.37 as suggested by @jjang16 didn't seem to work for us. In fact we're running 2.15.18 which is also affected.

Or rather is this an issue with one of Gatsby's dependencies?

Sorry, I don't have much more to contribute at this stage apart from questions.

Strange that the tests were disabled because they were failing. Isn't that the idea of tests?? 😅 I'd have thought that a regression which breaks a test (disabled or not) would necessitate a pretty hasty hotfix! @pieh I see you're assigned to this one - do you have the context to address it or do you need some help getting to the bottom of it?

@pieh @ryami333 @editkid I can only guess at this point, but I think it could be an issue with gatsby-plugin-offline

@JustFly1984 we have the issue but don't have gatsby-plugin-offline enabled

@johndaskovsky thank you for the clarification, this is even worse than I thought.

Yeah, no gatsby-plugin-offline for us either - it's not that.

@sever1an can you check if you have the same issue?

We're running an e-commerce site with Gatsby on Netlify and have noticed a number of these ChunkLoadError logs in Sentry whenever we deploy. I've tested this with the official Gatsby starter and it seems to have the same behavior.

I wonder if more people are having this issue and simply aren't noticing because they aren't monitoring client errors or don't have a lot of active sessions during deployment. For now, we're watching real-time sessions and deploying at odd hours to avoid the issue but it would be great not to have to worry about it.

Minimal Steps to Reproduce

I've uploaded a screen recording of the process here.
edit: video no longer available

  1. Create a new gatsby site e.g., gatsby new gatsby-issue-18866
  2. Publish it to Netlify
  3. Navigate to the Netlify link, but do not hover over or click the "Go to page 2" link
  4. Keep the window open and deploy some changes to ./src/pages/page-2
  5. Wait for Netlify to process the build.
  6. Now hover or click the "Go to page 2" link

Result

The link does not function and throws a ChunkLoadError on each hover and click.

The output looks similar to this:

webpack-runtime-fcfa123cba71df322157.js:1 GET https://cocky-jackson-37ee43.netlify.com/component---src-pages-page-2-js-9091273db8b77923343d.js net::ERR_ABORTED 404
webpack-runtime-fcfa123cba71df322157.js:1 Uncaught (in promise) ChunkLoadError: Loading chunk 5 failed.
(error: https://cocky-jackson-37ee43.netlify.com/component---src-pages-page-2-js-9091273db8b77923343d.js)
    at Function.u.e (https://cocky-jackson-37ee43.netlify.com/webpack-runtime-fcfa123cba71df322157.js:1:2172)
    at Object.component---src-pages-page-2-js (https://cocky-jackson-37ee43.netlify.com/app-41cbafd5610a9d24601a.js:1:87364)
    at r.loadComponent (https://cocky-jackson-37ee43.netlify.com/app-41cbafd5610a9d24601a.js:1:78493)
    at https://cocky-jackson-37ee43.netlify.com/app-41cbafd5610a9d24601a.js:1:76405

I wonder if more people are having this issue and simply aren't noticing because they aren't monitoring client errors.

I think so, yes.

We've noticed ChunkLoadError happen many days after a deploy, likely due to people leaving our website open in their browser for prolonged periods. It's especially problematic when we deploy backwards-incompatible changes to our server-side API and need all our clients to refresh.

We are also running an e-commerce site with Gatsby on Netlify (which is a lovely experience overall, I must say!), but we too have been running into a bunch of not-so-nice experiences for customers who were in the middle of the checkout flow. The flow relies on the user being navigated between different pages, which breaks when we do a new deploy (it was not built with this issue in mind).

Since there are users in the checkout flow basically all the time, there is no 'safe' window where we can do a deploy without putting ourselves in a really bad spot. We've basically had to resort to deploying less often, which is obviously not a nice experience.

I'm not sure this is all on Gatsby though, e.g. if Netlify avoided deleting all old files on new deploys (i.e. the chunks would still be there), this wouldn't be an issue, I believe? But it feels like Gatsby should have a safe fallback for this?

We have the same issues on both Netlify and AWS CloudFront/S3.

The only fix I have been able to implement has been to remove all use of gatsby-link and just use anchor tags. Full page reloads are not nice but nothing can be done until this is fixed.

@GooBall in our case a page reload is not acceptable, because we have authenticated state, and Safari drops cookies for cross-domain requests.

Our site's design relies heavily on page transitions hence removing gatsby-link is not an option for us either.

I've done a bit of investigation into this myself, as it's causing a lot of alerts for us when we deploy, due to active users on our website. Some alerts arrive days later, which I assume come from users who had opened the site before a deploy and continued to use it some time later, like @editkid describes.

Here's the PR that I believe introduced this bug: https://github.com/gatsbyjs/gatsby/pull/16686

If you look at the CI logs for this PR, the disabled test in e2e-tests/production-runtime/cypress/integration/compilation-hash.js#L37 that @DevSide has pointed out was skipped, so I guess this bug was missed.

The most recent change to packages/gatsby/cache-dir/loader.js prior to the one from the linked PR was run against the now-disabled test, and passed.

Unfortunately I don't know enough about this side of webpack or Gatsby to be of much further use, but hopefully this is of some help @pieh

~E: we've rolled back to Gatsby ~2.16.0 and it appears that we're no longer experiencing this issue.~ The issue persists.

Sorry to "at" you @KyleAMathews, but there hasn't been any comment or activity from @pieh on this since it was assigned to him a couple of months ago, so I'm worried it's just flying under the radar because of the "assigned" status. Is there someone else who might be able to check this out? There seems to be a reasonable number of people affected by this, and it's easily and reliably reproducible with the latest version of Gatsby (including any given "starter").

@KyleAMathews please fix this issue

Apologies for the silence here, folks. We're currently on holiday but will be sure to take a look at this once we're back.

@DSchau This issue is actually impacting everybody. It's just that not everybody is noticing.

FWIW, we just upgraded from 2.17.7 to 2.18.17, and now we're seeing a deluge of errors like this in Sentry. We do not use gatsby-link.

It's not as easy as it seems to be.

Depending on the volume of pages, updates can take a while, and Gatsby doesn't know whether all the necessary files have been uploaded. For now, it can't predict the right moment to reload.
Also, some people may want to reload when navigation occurs, but others would prefer a button that lets the user do it manually.

A likely solution would be to create a custom hook like "onLoadPageFailed" which gives the maximum amount of information and lets the dev do what they want.

There is clearly an error, and any errors can be handled with fallback.

I would rather redirect the user to a maintenance page while retrying the navigation to the requested page several times, and do a page reload after 5 unsuccessful attempts. It would be useful if Gatsby.js could provide maintenance-page functionality similar to the 404 page, with internationalization support and createPages API support. Currently it is a bit hacky to provide internationalization (react-intl) support for the 404 page.
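The retry-then-reload idea can be sketched as a small bookkeeping class. This is only an illustration under stated assumptions: `NavigationRetryPolicy` and its methods are hypothetical names, not a Gatsby API, and showing the maintenance page between attempts is left to the caller.

```typescript
// What the caller should do after a failed navigation attempt.
type FailureAction = "retry" | "reload";

// Counts failed attempts per target path (hypothetical helper, not a Gatsby API).
class NavigationRetryPolicy {
  private failures = new Map<string, number>();

  constructor(private readonly maxAttempts: number = 5) {}

  // Record one failed attempt and say whether to retry or give up and reload.
  recordFailure(path: string): FailureAction {
    const count = (this.failures.get(path) ?? 0) + 1;
    this.failures.set(path, count);
    return count >= this.maxAttempts ? "reload" : "retry";
  }

  // A successful navigation clears the counter for that path.
  recordSuccess(path: string): void {
    this.failures.delete(path);
  }
}
```

A caller would show the maintenance page while the answer is "retry" and trigger a full page load of the requested path once the answer becomes "reload".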

Also I'm worried there could be issues for client-side only routes.

One more thing: we are currently using AWS CloudFront, and a deploy consists of 5 steps before the errors start:

  • clear the /dist dir
  • build to the /dist dir
  • sync the /dist dir with S3, deleting outdated files from S3
  • create an invalidation for CloudFront
  • wait a couple of minutes for the invalidation to update the cache on edge locations worldwide

At this point, all users on the website start getting errors on navigation, which we track in Sentry.

In the CloudFront case, all files have definitely been updated, so we could reload the page on navigation, without page preload, to update the offline service-worker cache and the UI. I'd bet the same applies to Netlify, because a Docker container swap happens during the build process.

Please correct me if I'm wrong.

We've seen similar issues on a non-Gatsby app before when using the skipWaiting option for service workers. With skipWaiting, the new service worker installs and becomes active straight away, which clears out old asset caches. If the already loaded page references one of these assets that are now deleted, it will error. Afaik, the Gatsby offline plugin uses skipWaiting by default.

The way we "fixed" it was by not using skipWaiting and rather show the user an "Updates Available" notification that prompts them to refresh for the changes to take effect. A great fix would be if the already loaded code could hot replace any old chunk references with new ones, but that sounds like a rabbit hole no-one wants to go down.
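A minimal sketch of such an "Updates Available" prompt follows. `UpdatePromptDeps` and `promptForUpdate` are hypothetical names, and the browser functions are passed in so the logic stays independent of any particular service worker setup.

```typescript
// Browser capabilities the prompt needs, injected by the caller.
interface UpdatePromptDeps {
  confirm: (message: string) => boolean;
  reload: () => void;
}

// Ask the visitor before reloading; returns whether they accepted.
function promptForUpdate(deps: UpdatePromptDeps): boolean {
  const accepted = deps.confirm(
    "A new version of this site is available. Reload now?"
  );
  if (accepted) {
    deps.reload();
  }
  return accepted;
}

// Possible wiring in a Gatsby site that uses gatsby-plugin-offline, assuming the
// onServiceWorkerUpdateReady browser API in gatsby-browser.js:
//   export const onServiceWorkerUpdateReady = () =>
//     promptForUpdate({
//       confirm: (message) => window.confirm(message),
//       reload: () => window.location.reload(),
//     });
```

The visitor who declines keeps running the old code, so this only helps if the old assets remain available until they reload.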

Thanks, although I'm not using any service workers and it still happens.

We're also having big problems with this. I can see that this issue is in the "Prioritized" column in the Roadmap, but I'm not sure what this means. Does anyone know whether it's something that might be addressed in the near future?

What about just resorting to full page reload when there's an error within navigation? It would solve most issues. It's probably most cases of chunk load errors.

@josepjaume yes, it's what I proposed in the issue description: we should fall back to a full page reload

This seems to be another reason people are led to attempt the fixes detailed in https://github.com/gatsbyjs/gatsby/issues/15080, and it is related to my last comment in that thread

Just a note: there is work being done on this in https://github.com/gatsbyjs/gatsby/pull/18051 (the description of the PR is outdated, though, as I shifted the direction of the PR and didn't update it, and I needed to change the proposed e2e changes there to assert that the user IS able to navigate to a page even if some resources fail to load). The runtime is a pretty important thing and we need to be careful about making changes in navigation code, so it takes time and a lot of manual testing (on top of the e2e tests that are there now).

I had a call today with @wardpeet and @blainekasten about this issue and the work in progress that I've done so far. There were valid concerns raised about some of the changes that I made. There will be more changes coming there (including some code readability work so it will be easier to iterate on in the future)

So it seems the best solution, for now, is to do a page reload when the error is thrown.

Does anyone have a suggestion about how best to do this? I am trying to use an Error Boundary in gatsby-browser that wraps the app. However, in trying to simulate the issue locally it isn't actually being caught by this.
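One likely reason the Error Boundary sees nothing: React error boundaries only catch errors thrown while rendering, and the console output earlier in this thread shows the failure as "Uncaught (in promise) ChunkLoadError", which is a rejected promise outside of rendering. A possible workaround, sketched here under assumptions, is a global handler for unhandled rejections. `isChunkLoadError` and `reloadOnChunkError` are hypothetical names, and the message patterns are taken from the logs quoted above.

```typescript
// Minimal shape of a browser "unhandledrejection" event.
interface RejectionLike {
  reason: unknown;
}

// Recognise webpack chunk failures by error name or by the messages seen in this thread.
function isChunkLoadError(reason: unknown): boolean {
  if (typeof reason !== "object" || reason === null) return false;
  const { name, message } = reason as { name?: unknown; message?: unknown };
  if (name === "ChunkLoadError") return true;
  return (
    typeof message === "string" &&
    /Loading (CSS )?chunk \S+ failed/.test(message)
  );
}

// Reload when a chunk failed to load; returns whether a reload was triggered.
function reloadOnChunkError(event: RejectionLike, reload: () => void): boolean {
  if (!isChunkLoadError(event.reason)) return false;
  reload();
  return true;
}

// Possible wiring, e.g. from onClientEntry in gatsby-browser.js:
//   window.addEventListener("unhandledrejection", (event) =>
//     reloadOnChunkError(event, () => window.location.reload())
//   );
```

A real handler would also want a guard against reload loops (for example, remembering in sessionStorage that a reload was already attempted), and it only works as long as the loader lets the rejection go unhandled.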

I just want to mention that we just published a new version of gatsby ([email protected]) with somewhat improved loader resilience (the PR that I mentioned, https://github.com/gatsbyjs/gatsby/pull/18051, was merged). It will not fix all the issues, but a large chunk of them are handled.

We also re-added some e2e tests to verify navigation correctness in a bunch of scenarios where resources fail to load (for whatever reason: network issues, adblockers, mid-deploy scenarios)

"large chunk of them" 😂

Currently I can’t test the 41 version, because it contains 2 TypeScript errors and fails on test

@pieh We upgraded to 2.19.41 yesterday, but are still getting ChunkLoadErrors quite frequently. If we can provide more information, please let me know.

We're also hitting this on our site using gatsby and hosted in Netlify.

That said. This looks like an age-old problem. I've seen this problem before with many other systems and I'm reluctant to call this a gatsby issue. Basic website versioning says you change hashes when the file changes so it busts the client cache.

If your website is navigating via JS (like gatsby sites do) then you either need to signal to the running JS code that there's a new version available so it can either force reload or do something "smarter" to load the new page chunks.

The usual way I've solved this in the past was to leave the last version up for some time (few days was sufficient...though storage for static sites is generally cheap enough that you could do months if you wanted). That reduced the possibility that anyone still has a running JS page with the old code references. With a straight S3 hosted site this is relatively easy to do in a CI/CD system by setting expirations for the older pages. That's outside of gatsby's purview though.
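As a sketch of that retention approach, a deploy script could keep every file referenced by the current build or by any build that was still live within the retention window, and delete only the rest. Everything here is an assumption made for illustration: `BuildManifest` and `selectDeletableKeys` are hypothetical names, and the script is assumed to record a manifest (deploy time plus file list) for each build.

```typescript
// Record kept by the deploy script for each build (hypothetical).
interface BuildManifest {
  deployedAt: Date;
  files: string[];
}

// Returns the stored keys that no recent build still needs.
function selectDeletableKeys(
  storedKeys: string[],
  builds: BuildManifest[],
  now: Date,
  retentionDays: number
): string[] {
  const cutoff = now.getTime() - retentionDays * 24 * 60 * 60 * 1000;
  const ordered = builds
    .slice()
    .sort((a, b) => a.deployedAt.getTime() - b.deployedAt.getTime());

  const keep = new Set<string>();
  ordered.forEach((build, index) => {
    // A build stops being live when the next one is deployed.
    const next = ordered[index + 1];
    const stillNeeded =
      next === undefined || next.deployedAt.getTime() >= cutoff;
    if (stillNeeded) {
      build.files.forEach((file) => keep.add(file));
    }
  });

  return storedKeys.filter((key) => !keep.has(key));
}
```

Measuring retention from the moment a build was replaced, rather than from when it was first deployed, means a long-lived build is not purged the instant its successor goes out.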

I kinda think of this more of a Netlify issue. But...if there's a solution that can be done via Gatsby, I'm all for it too since I'd prefer we didn't have users hitting this either. :)

I was going to add a "Me too" comment. I upgraded to the latest of everything, hoping a transient dead-page error was fixed based on recent changes. Things got worse: nearly every page would display, then disappear.

Turns out it was an AWS error, swallowed-up by my CI: An error occurred (InvalidAccessKeyId) when calling the PutObject operation: The AWS Access Key Id you provided does not exist in our records.

One put of one .js and my site silently broke. Yikes.

So the cause is found, but goodness, I thought Gatsby was cool because it worked without JS. Ironically, my site would have worked with JS disabled, but was broken with JS.

Do you mean the key is wrong because it's running old site code from a previous deploy? To my knowledge, Gatsby doesn't deal with AWS at all so it doesn't really sound like a problem with Gatsby. :)

@stuckj Correct, I was about to blame Gatsby but then realized it was a buried AWS warning. Thought I'd share that as FYI for others.

That said, an SSG which loads the generated HTML, then erases it because a .js couldn't be found, possibly has a flaw in the strategy.

Hiya!

This issue has gone quiet. Spooky quiet. 👻

We get a lot of issues, so we currently close issues after 30 days of inactivity. It’s been at least 20 days since the last update here.
If we missed this issue or if you want to keep it open, please reply here. You can also add the label "not stale" to keep this issue open!
As a friendly reminder: the best way to see this issue, or any other, fixed is to open a Pull Request. Check out gatsby.dev/contribute for more information about opening PRs, triaging issues, and contributing!

Thanks for being a part of the Gatsby community! 💪💜

not stale!

It's been eight months since this issue was opened and I'm curious if anyone's found a good way to work around it?

I can see Netlify is claiming that "More than half of all Gatsby sites are deployed on Netlify.". As far as I can tell there's no way to force Netlify to maintain cache from previous builds. Does that mean that most production Gatsby sites break all active and recent sessions on deploy?

@wardpeet @pieh is there any progress on this issue?

@wardpeet @pieh please pay attention to this issue

Also seeing this error

Get this issue a lot

On my team, we've gotten used to clearing the cache after each deployment.

I'm also seeing loads and loads of this error in my sentry dashboard
