I have a scenario that I would like to discuss, and hopefully it can become a source for docs on what a developer should really think about to avoid causing hash changes.
I'm making what might seem like a very small change in my code, yet it has quite a large effect on long-term caching.
I'm filing this issue against the docs because I'm sure the behaviour is expected; I just want to understand why it happens, and hopefully help others as well.
Example repo is here (using [email protected] and [email protected]):
https://github.com/jouni-kantola/webpack-output-by-build-type/tree/why-hash-change
Here goes the scenario:
1) I build:
| Asset | Size | Chunks | |
|----------------------------------------|-----------|--------|-----------|
| 0.9bbf83cf4b0bde5cdd21.dev.js.map | 512 bytes | 0, 6 | [emitted] |
| 0.9bbf83cf4b0bde5cdd21.dev.js | 756 bytes | 0, 6 | [emitted] |
| 2.a01031186175de379984.dev.js | 455 bytes | 2, 6 | [emitted] |
| 3.64a0d1a6257d07bf9c8e.dev.js | 455 bytes | 3, 6 | [emitted] |
| vendor.72eb1607924636054b21.dev.js | 254 kB | 4, 6 | [emitted] |
| app.d67c37259cfc27adf14e.dev.js | 16.9 kB | 5, 6 | [emitted] |
| 1.b255a8ecec8dd054c7f3.dev.js | 756 bytes | 1, 6 | [emitted] |
| 1.b255a8ecec8dd054c7f3.dev.js.map | 512 bytes | 1, 6 | [emitted] |
| 2.a01031186175de379984.dev.js.map | 424 bytes | 2, 6 | [emitted] |
| 3.64a0d1a6257d07bf9c8e.dev.js.map | 430 bytes | 3, 6 | [emitted] |
| vendor.72eb1607924636054b21.dev.js.map | 335 kB | 4, 6 | [emitted] |
| app.d67c37259cfc27adf14e.dev.js.map | 18.9 kB | 5, 6 | [emitted] |
| index.html | 6.35 kB | | [emitted] |
2) I change module-a.js from:
import is from 'is-thirteen';
export const a = () => console.log('module a says 12 + 1 === 13', is(12 + 1).thirteen());
to:
export const a = () => console.log('hello world');
3) I build yet again:
| Asset | Size | Chunks | |
|----------------------------------------|-----------|--------|-----------|
| 0.3f19b2e7372b486647de.dev.js.map | 512 bytes | 0, 6 | [emitted] |
| 0.3f19b2e7372b486647de.dev.js | 756 bytes | 0, 6 | [emitted] |
| 2.a01031186175de379984.dev.js | 455 bytes | 2, 6 | [emitted] |
| 3.64a0d1a6257d07bf9c8e.dev.js | 455 bytes | 3, 6 | [emitted] |
| vendor.503164abc295ee5b7ca1.dev.js | 254 kB | 4, 6 | [emitted] |
| app.928d94b1c06014c2234b.dev.js | 16.9 kB | 5, 6 | [emitted] |
| 1.84b1d26493dcab2a97e9.dev.js | 429 bytes | 1, 6 | [emitted] |
| 1.84b1d26493dcab2a97e9.dev.js.map | 370 bytes | 1, 6 | [emitted] |
| 2.a01031186175de379984.dev.js.map | 424 bytes | 2, 6 | [emitted] |
| 3.64a0d1a6257d07bf9c8e.dev.js.map | 430 bytes | 3, 6 | [emitted] |
| vendor.503164abc295ee5b7ca1.dev.js.map | 335 kB | 4, 6 | [emitted] |
| app.928d94b1c06014c2234b.dev.js.map | 18.9 kB | 5, 6 | [emitted] |
| index.html | 6.35 kB | | [emitted] |
So to what I think needs to be documented:
59 to 85, and that caused three bundles to be cache busted. In a larger app, that could be quite a lot of code that needs to be shipped to the end user again. Is there any way to prevent this small change from having such a large effect?
I think I've done the most important bit, extracting the manifest, but it is still very easy to get chunks cache busted. There are loads of blog posts discussing this, but I still believe the truly nitty-gritty details have yet to be documented.
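
To see at a glance which chunks were cache busted between the two builds above, the asset names can be compared programmatically. This is only a minimal sketch and not part of the example repo: `parseAsset` and `changedChunks` are hypothetical helper names, and the file-name pattern assumes the `[name].[chunkhash].dev.js` naming visible in the tables.

```javascript
// Split e.g. "vendor.72eb1607924636054b21.dev.js" into a name and a hash.
// Returns null for anything that does not match (source maps, index.html).
function parseAsset(fileName) {
  const match = /^(.+?)\.([0-9a-f]+)\.dev\.js$/.exec(fileName);
  return match ? { name: match[1], hash: match[2] } : null;
}

// List the chunk names whose hash differs between two builds.
// A chunk that only exists in the second build is also reported.
function changedChunks(beforeFiles, afterFiles) {
  const before = new Map();
  beforeFiles
    .map(parseAsset)
    .filter(Boolean)
    .forEach((asset) => before.set(asset.name, asset.hash));

  return afterFiles
    .map(parseAsset)
    .filter(Boolean)
    .filter((asset) => before.get(asset.name) !== asset.hash)
    .map((asset) => asset.name);
}

// Asset names copied from the two build outputs above.
const firstBuild = [
  '0.9bbf83cf4b0bde5cdd21.dev.js',
  '1.b255a8ecec8dd054c7f3.dev.js',
  '2.a01031186175de379984.dev.js',
  '3.64a0d1a6257d07bf9c8e.dev.js',
  'vendor.72eb1607924636054b21.dev.js',
  'app.d67c37259cfc27adf14e.dev.js'
];
const secondBuild = [
  '0.3f19b2e7372b486647de.dev.js',
  '1.84b1d26493dcab2a97e9.dev.js',
  '2.a01031186175de379984.dev.js',
  '3.64a0d1a6257d07bf9c8e.dev.js',
  'vendor.503164abc295ee5b7ca1.dev.js',
  'app.928d94b1c06014c2234b.dev.js'
];

console.log(changedChunks(firstBuild, secondBuild)); // [ '0', '1', 'vendor', 'app' ]
```

Chunk 1 is the only one whose size changed, so it is presumably the chunk containing module-a.js; the other three changed names are the ones that were busted without any edit to their own source.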
If I replace require('webpack-chunk-hash') with webpack.HashedModuleIdsPlugin() all seems to play nice.
The only bundle hash that changes after that is 1.f948e0e64b866b43bc18.dev.js to 1.41c92261bb86681d218a.dev.js. The rest stay the same.
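
A rough way to picture why hashed module IDs help: with numeric IDs, a module's ID depends on its position among all modules, so removing one import can shift the IDs of unrelated modules. With IDs derived from the module path, each ID depends only on the module itself. The sketch below is a simplified model, not webpack's actual implementation; the real plugin's hashing details may differ, `numericIds` and `hashedIds` are hypothetical names, and `./src/module-b.js` is a made-up path for illustration.

```javascript
const crypto = require('crypto');

// Numeric IDs: each module gets its position in resolution order.
function numericIds(modulePaths) {
  const ids = {};
  modulePaths.forEach((modulePath, index) => {
    ids[modulePath] = index;
  });
  return ids;
}

// Hashed IDs: each module's ID is derived from its own path only.
function hashedIds(modulePaths) {
  const ids = {};
  modulePaths.forEach((modulePath) => {
    ids[modulePath] = crypto
      .createHash('md5')
      .update(modulePath)
      .digest('hex')
      .slice(0, 4);
  });
  return ids;
}

// Hypothetical module lists before and after dropping the is-thirteen import.
const before = [
  './src/module-a.js',
  './node_modules/is-thirteen/index.js',
  './src/module-b.js'
];
const after = before.filter((modulePath) => !modulePath.includes('is-thirteen'));

// The numeric ID of module-b shifts even though module-b itself is unchanged.
console.log(numericIds(before)['./src/module-b.js']); // 2
console.log(numericIds(after)['./src/module-b.js']); // 1

// The hashed ID of module-b is identical in both builds.
console.log(
  hashedIds(before)['./src/module-b.js'] === hashedIds(after)['./src/module-b.js']
); // true
```

Any chunk that references a shifted numeric ID has different content, and therefore a different content-based hash, which matches the cache busting observed above.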
So I removed the inlining of the manifest to generate a manifest file, and it seems like it updates fine.
1.41c92261bb86681d218a.dev.js -> 1.f948e0e64b866b43bc18.dev.js
manifest.9262854228a18bb6d1d6.dev.js -> manifest.ff60e3d222a7b3d2143b.dev.js
Rest stays the same.
All bundles grew a bit, but in the long run I think that is definitely worth it.
Is there anything else I need to be aware of, or is this the change that should be recommended in https://webpack.js.org/guides/caching/#deterministic-hashes? That is, moving from webpack-chunk-hash/webpack-md5-hash to webpack.HashedModuleIdsPlugin.
@bebraw and @okonet, do you have anything you want to add?
Btw, the reason I extracted the manifest is this issue, which occurs when the manifest has _not_ been inlined:
https://github.com/erm0l0v/webpack-md5-hash/issues/9
@jouni-kantola HashedModuleIdsPlugin does most of the work (easier than module ids). I would recommend using NamedModulesPlugin (same but without hashing) during development (better output). Extracting the manifest one way or another becomes crucial when you split bundles (otherwise manifest changes can invalidate something you don't want invalidated).
There's one more option - records. Storing records across builds is apparently the ultimate option. You get an extra file to track in your repository then. The nice thing is that this works reliably with code splitting.
I think webpack should come with stronger defaults and treat numbered module ids as a special case that's enabled through a plugin rather than vice versa. I would default with NamedModulesPlugin myself. If webpack knew something about build targets, I would pick the hashed variant for production. In fact I might go as far as to merge these plugins as one and expose hashing through a plugin option. They are essentially the same after all apart from that tiny bit. Less to remember.
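
The development/production split suggested above could be sketched as follows. This is only an illustration using the webpack 2/3-era plugin names mentioned in this thread; reading the build type from `NODE_ENV` is an assumption, not something the thread prescribes.

```javascript
// webpack.config.js fragment (webpack 2/3-era plugin names)
const webpack = require('webpack');

// Assumption: the build type is signalled through NODE_ENV.
const isProduction = process.env.NODE_ENV === 'production';

module.exports = {
  // ...entry, output and other settings omitted...
  plugins: [
    // Readable module names while developing, short stable hashes in production.
    isProduction
      ? new webpack.HashedModuleIdsPlugin()
      : new webpack.NamedModulesPlugin()
  ]
};
```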
I'm working on updating the docs at the moment. I realised yesterday that I had got things a bit backwards after all.
Yes, use HashedModuleIdsPlugin to generate IDs that are preserved across builds. However, webpack-chunk-hash/webpack-md5-hash needs to be used to base the file hashes on file contents.
So this is the simplest config I'm suggesting:
var path = require('path');
var webpack = require('webpack');
var ChunkManifestPlugin = require('chunk-manifest-webpack-plugin');
var WebpackChunkHash = require('webpack-chunk-hash');

module.exports = {
  entry: {
    vendor: './src/vendor.js',
    main: './src/index.js'
  },
  output: {
    path: path.join(__dirname, 'build'),
    // Put the chunk hash in every file name
    filename: '[name].[chunkhash].js',
    chunkFilename: '[name].[chunkhash].js'
  },
  plugins: [
    // Extract the vendor chunk and a separate manifest chunk
    new webpack.optimize.CommonsChunkPlugin({
      name: ["vendor", "manifest"],
      minChunks: Infinity,
    }),
    // Module IDs that are preserved across builds
    new webpack.HashedModuleIdsPlugin(),
    // File hashes based on file contents
    new WebpackChunkHash(),
    // Write the chunk manifest to a separate JSON file
    new ChunkManifestPlugin({
      filename: "chunk-manifest.json",
      manifestVariable: "webpackManifest"
    })
  ]
};
Directly related issue: https://github.com/webpack/webpack/issues/1479
(which in-turn links to the resource: https://github.com/dmitry/webpack-hash-test)
All right. I went through a whole bunch of scenarios to check what updated hash.
I use:
- HashedModuleIdsPlugin
- WebpackChunkHash
- CommonsChunkPlugin for vendor chunk
- import()

This is what I came up with:
https://gist.github.com/jouni-kantola/1c1e2bfaebf30de50d1b6a71b869da13
I'll try to document it in the docs, in some kind of overview. I'll probably leave out the files and only list the actions.
I also tried adding recordsPath, recordsPath: path.resolve(__dirname, './recordsPath.json').
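
For context, `recordsPath` is a top-level configuration option; webpack writes its records to that file and reads them back on the next build. A minimal sketch of where the line above would sit, with everything else omitted:

```javascript
// webpack.config.js fragment
const path = require('path');

module.exports = {
  // ...entry, output and plugins omitted...

  // Records are stored here and reused by subsequent builds,
  // so the file has to be kept between builds.
  recordsPath: path.resolve(__dirname, './recordsPath.json')
};
```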
@sokra: Wouldn't long-term caching at least be fixed for code-split chunks if they weren't prefixed with the ordering number?
Related: #131.
> I'll try to document it in the docs, in some kind of overview. I'll probably leave out the files and only list the actions.
@jouni-kantola are you still interested in pr-ing some updates or a new article? It seems this is somewhat covered in the caching guide but could probably be improved.
@jouni-kantola I found your Gist very helpful, thanks for that. Can I ask why you added WebpackChunkHash to your set? Is HashedModuleIdsPlugin documented anywhere? Is it considered best practice to use name in chunkFilename: 'js/[name]-[chunkhash].js' or not?
I think between @timse's post and the caching guide we can probably push this past the finish line. We should definitely link to @timse's post, and maybe this tool for testing, from that guide and make any other changes necessary to bring it up to date. Once that happens, I think this issue can be closed.
Hello! Sorry for not getting back to you guys in quite some time. The reason why I created this issue and went through the steps in the gist is to make developers aware that many things will affect long-term caching.
If @timse's post is the baseline for creating long-term cacheable assets, I think that should be the caching docs from now on. On top of that, I think many steps could be elaborated a bit more, just to show the effect of, e.g., an extra module dependency.
Basically I just repeated what @skipjack had already renamed the issue to 😉
Regarding official webpack blog posts that are basically written to be docs, I think they should be created in the docs repo directly. Redundant information is one of the reasons https://github.com/webpack/webpack/issues/1315 is what it is. Many sources of truth just add to the confusion (/cc @TheLarkInn & @sokra).
@jouni-kantola no worries, and I'm glad we agree on how it should be updated. I've been cleaning up issues and trying to turn the remaining ones into actionable items that we can continue knocking off the list.
> Regarding official webpack blog posts that are basically written to be docs, I think they should be created in the docs repo directly. Redundant information is one of the reasons webpack/webpack#1315 is what it is. Many sources of truth just add to the confusion.

I agree, although I do think the blog serves a slightly different purpose: looser discussion and walkthroughs. However, we need to be better about first, _finishing the backlog_, and second, _keeping things up to date_ and _staying on top of it_ so we don't find ourselves in the same situation, with a huge backlog of things that need to be documented or resolved. The problem is, webpack is a big tool with a lot of things that need doing and, as with all open source work, people only have so much time to spare. I'm hopeful that we'll get there but it is definitely taking some time.
Re this issue though, yeah I think we basically need to just review the caching guide, see what's missing, and add that. Then this issue, and probably a whole list of other ones on the main repo, can be closed out.
Going to try to tackle this now...
@jouni-kantola so I think I've pretty much covered everything in #1436. I didn't mention webpack-chunk-hash within the guide as it seems we can use [chunkhash] within chunkFilename without any extra plugins in webpack 3. Could you remind me why that plugin was/is necessary?
@skipjack: When I was testing, the hashes were more consistent with the extra hashing plugins. However, after a while I've seen this result in runtime errors. It's better to leave the hashing to webpack's substitutions alone.
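
Leaving the hashing to webpack's own substitutions would reduce the earlier suggested config to something like the sketch below. It is an assumption about what "no extra hashing plugins" looks like in a webpack 3 setup, reusing the entry names and paths from the config earlier in this thread; it is not copied from the caching guide.

```javascript
// webpack.config.js (webpack 3-era API), without webpack-chunk-hash
const path = require('path');
const webpack = require('webpack');

module.exports = {
  entry: {
    vendor: './src/vendor.js',
    main: './src/index.js'
  },
  output: {
    path: path.join(__dirname, 'build'),
    // Hashing is left entirely to the built-in [chunkhash] substitution.
    filename: '[name].[chunkhash].js',
    chunkFilename: '[name].[chunkhash].js'
  },
  plugins: [
    // Module IDs that stay the same across builds
    new webpack.HashedModuleIdsPlugin(),
    // Separate vendor chunk plus an extracted manifest chunk
    new webpack.optimize.CommonsChunkPlugin({
      names: ['vendor', 'manifest'],
      minChunks: Infinity
    })
  ]
};
```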
@jouni-kantola yeah I agree -- I think the simpler we can keep that guide the better. That's why I started from scratch with TheDutchCoder/webpack-guides-code-examples#17 and tried to only use the plugins that seemed necessary while testing. I think if any issues arise with the current guide we should first try to reproduce them in the examples repo, and then decide if it's a common enough issue to mention in the guide.