I have recently switched over to using Webpack as my front-end build tool of choice. I had previously used Gulp with Aurelia projects before webpack was a thing. I was hoping that the learning curve with Webpack would be easier and the configuration easier to read and understand...

TL;DR: Use webpack v4.26.1 (released on 25 Nov 2018) along with the optimization.splitChunks options chunks: 'all', splitChunks.minSize and splitChunks.maxSize. If you are using HTTP/2 and want more control over chunk creation, or you are fussy about chunk names, use chunks: 'initial' and specify your own named cacheGroups accordingly.

UPDATE 2018-12-04: Webpack v4.27.0 has just been released. It fixes a remaining issue relating to maxSize and introduces a slight behaviour change, which is explained in an addendum at the bottom of this post.

At first glance, a fully featured Webpack configuration file looks rather intimidating unless you have first read some introductory tutorials and documentation to gain an understanding of what it does.

This blog post focuses on a single key area of the Webpack configuration: how to split your bundle into smaller chunks. If you are new to webpack, I would recommend reading some tutorials and the documentation at https://webpack.js.org/concepts/ before continuing, so that you have a basic understanding of the build system. There are many good blog posts covering webpack basics to be found with a quick search, so I will not attempt to reinvent the wheel here.

There have been many changes between Webpack v3 and v4, so ideally try to find material that covers Webpack v4. Many configs that work in Webpack v3 also work in v4 but are not optimal and may be deprecated.

Many of the examples below rely on at least Webpack v4.16.0.

What is meant by the term "chunk"?

A "chunk" is a file which forms part of your app's JS bundle. Chunks are created by the bundle splitting process; without bundle splitting, all of your app's code would be in a single JS bundle file. Each chunk is typically a JS file containing one or more of your app's JavaScript modules. Each of the chunks needs to be linked via a script tag in your index.html file, which is usually achieved by using the "HtmlWebpackPlugin". Note: A chunk can also be a CSS file if using "MiniCssExtractPlugin".

Since Webpack v4, the order of the chunks listed in index.html does not matter, because the Webpack runtime does not execute any code until all of the initial chunks (initial is explained below) have been downloaded. I have been unable to find a reference to this in any documentation, but testing shows it to be the case: commenting out a non-essential chunk breaks the app and no JS code is executed.

So what's wrong with a single JS file bundle? Well there are two things, cacheability and reliability. Most JavaScript Frameworks such as React, Angular and Aurelia are quite large, having all the app and framework code in a single JS file means that any small update to the published app's code would require the single large JS bundle file to be downloaded again to deliver the updated code to the browser. If the code were spread across many smaller files, we can take advantage of the web browser cache. Also, downloading large files over slower and less reliable connections such as mobile data is more prone to failure than smaller files.

SplitChunks Plugin

Since webpack v4, the CommonsChunkPlugin has been removed in favour of the SplitChunksPlugin and the configuration is provided by the webpack configuration options: optimization.splitChunks.

Announcement: https://gist.github.com/sokra/1522d586b8e5c0f5072d7565c2bee693

Documentation link: https://webpack.js.org/plugins/split-chunks-plugin/

In the docs:

By default it only affects on-demand chunks, because changing initial chunks would affect the script tags the HTML file should include to run the project.

That statement is only relevant for projects that do not use HtmlWebpackPlugin to inject the correct script tags into index.html.

The term initial chunks refers to chunks created for modules that are imported statically which is the most common use: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import

The term on-demand chunks refers to chunks created for asynchronously loaded (lazy loaded) modules which are loaded using dynamic imports: https://developers.google.com/web/updates/2017/11/dynamic-import

With this in mind there are 2 important things to be done:

  • Make use of HtmlWebpackPlugin so that we can split initial chunks without having to keep updating index.html to avoid breaking the app when adding modules.

  • Update the Webpack config so that SplitChunksPlugin affects 'initial' as well as on-demand (async) chunks.

The following config will split both 'initial' and 'async' chunks:

optimization: {
    runtimeChunk: true,
    moduleIds: 'hashed',
    splitChunks: {
        chunks: 'all' // options: 'initial', 'async' , 'all'
    }
}

Note: The 'runtimeChunk: true' and 'moduleIds: "hashed"' settings are important for long term cacheability: https://developers.google.com/web/fundamentals/performance/webpack/use-long-term-caching Note that the HashedModuleIdsPlugin / 'optimization.hashedModuleIds' mentioned in the above link was deprecated in Webpack v4.16.0 https://webpack.js.org/configuration/optimization/#optimization-moduleids

The above config will result in (depending on your project) a separate 'vendors' chunk containing everything from the 'node_modules' dir, this means that any JavaScript Framework code is kept separate from your app code. This behaviour is explained by looking at the default cacheGroups config: https://webpack.js.org/plugins/split-chunks-plugin/#optimization-splitchunks

When I say "depending on your project": if the sum of all the node modules is less than 30KB, the separate 'vendors' chunk is not created. This is determined by the splitChunks.minSize setting (default 30000 bytes), which sets a rule that a chunk should not be created unless it is going to end up at least minSize in size. Note that the sizes are compared to the module sizes before any minification takes place.
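As a rough illustration of the minSize rule, the sketch below checks whether a candidate chunk would be large enough to be created. The function name and the module sizes are made up for this example, and it is a simplification rather than webpack's actual implementation.

```javascript
// Hypothetical helper: decides whether a candidate chunk is big enough to be created.
// Sizes are unminified source sizes in bytes, which is what minSize is compared against.
function meetsMinSize(moduleSizes, minSize = 30000) {
  const total = moduleSizes.reduce((sum, size) => sum + size, 0);
  return total >= minSize;
}

// Made-up sizes for illustration only.
const smallVendors = [12000, 9000];        // 21000 bytes in total
const largeVendors = [12000, 9000, 15000]; // 36000 bytes in total

console.log(meetsMinSize(smallVendors)); // false: the modules stay in the main chunk
console.log(meetsMinSize(largeVendors)); // true: a separate 'vendors' chunk is created
```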

The cacheGroups config determines how Webpack will split the bundle into multiple chunks; the default cacheGroups provide a minimum setup. As long as chunks: 'all' is used, projects that do not use dynamic imports will end up with a 'main' chunk and a separate 'vendors' chunk. The problem with this is that the vendors chunk is usually the largest part of the bundle and needs to be split further to avoid large chunks.

Another important config is to specify the naming of the chunks: https://webpack.js.org/configuration/output/#output-filename

output: {
    filename: '[name].[chunkhash].bundle.js',
    chunkFilename: '[name].[chunkhash].chunk.js'
},

The above config names the chunks so that we can see which part of the optimization.splitChunks config they were created from and includes a hash based on the content of each chunk. The chunkhash is used as a cache-busting technique to force the browser to re-download any chunks which contain modules that have been changed since the last build.
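To make the cache-busting idea concrete, here is a minimal sketch of content-based naming. The helper name is hypothetical, and sha256 is only a stand-in for the hash that webpack computes internally for [chunkhash].

```javascript
const crypto = require('crypto');

// Hypothetical helper: builds a filename from a chunk name and a hash of its content.
// webpack computes [chunkhash] itself; sha256 is used here purely for illustration.
function hashedFilename(name, content) {
  const hash = crypto.createHash('sha256').update(content).digest('hex').slice(0, 20);
  return `${name}.${hash}.chunk.js`;
}

const before = hashedFilename('vendors', 'console.log("v1");');
const unchanged = hashedFilename('vendors', 'console.log("v1");');
const after = hashedFilename('vendors', 'console.log("v2");');

console.log(before === unchanged); // true: same content, same URL, served from the cache
console.log(before === after);     // false: changed content, new URL, downloaded again
```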

What's wrong with large chunks?

  • Downloading large files over unreliable connections such as mobile data is more likely to fail.

  • Webpack supports building bundles for long term cacheability, this doesn't work well if all modules are in a single chunk.

So why not have each module in its own chunk? Too many chunks will result in a longer download time under HTTP/1.1, as the browser can only make 6 parallel requests. Further requests will add a small delay due to the latency of the connection to the server: each request uses up a small amount of time before the actual file data is sent to the browser, and this can add up to become significant with a large number of requests. Each request also has some overhead in terms of HTTP headers and protocol data.
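A very rough way to picture this is to treat the downloads as "waves" of 6 parallel requests, each wave paying one round trip of latency. This is my own simplified model with made-up numbers, not a measurement, and it ignores transfer time and header overhead.

```javascript
// Rough model (an assumption): with at most maxParallel connections, chunks download
// in waves, and each wave costs one round trip of latency before data starts arriving.
function latencyOverheadMs(chunkCount, latencyMs, maxParallel = 6) {
  const waves = Math.ceil(chunkCount / maxParallel);
  return waves * latencyMs;
}

console.log(latencyOverheadMs(6, 100));  // 100
console.log(latencyOverheadMs(30, 100)); // 500
```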

HTTP/2

Using HTTP/2 means that many chunks can be downloaded in parallel without incurring delays. This is because HTTP/2 uses Multiplexing, avoiding the latency and additional data overhead of multiple requests.

With this in mind it is clear that the Webpack splitChunks config needs to be optimised for either HTTP/1.1 or HTTP/2.

Optimisation for HTTP/1.1 means limiting the chunks to ideally no more than 6 (although many more than 6 are needed to create a notable difference in overall download time).

Optimisation for HTTP/2 means creating many chunks, but not so many that the overall size of the bundle is affected too much (splitting incurs a small bundle size overhead).

OK what next?

Before webpack v4.15.0 was released in July, the only way to split the bundle into smaller chunks using 'splitChunks' was to create additional cacheGroups to group modules together in a number of separate chunks. The following config example splits jQuery into a separate chunk named 'vendor.jquery', with the remaining node_modules caught by the built-in catch-all 'vendors' chunk.

optimization: {
    runtimeChunk: true,
    moduleIds: 'hashed',
    splitChunks: {
        chunks: 'all', // options: 'initial', 'async' , 'all'
        cacheGroups: {
            jquery: {
                test: /[\\/]node_modules[\\/]jquery[\\/]/,
                name: 'vendor.jquery',
                enforce: true, // create chunk regardless of the size of the chunk
                priority: 90
            }
        }
    }
}

Note: The priority needs to be zero or greater so that it is processed before the built-in cacheGroups, which have negative priority. The 'jquery.test' property specifies a regular expression to match the path of the module.
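The effect of the test pattern can be checked directly in Node. The file paths below are hypothetical; the pattern is the same one used in the 'jquery' cacheGroup, and the [\\/] parts let it match both POSIX and Windows path separators.

```javascript
// The same pattern as the 'jquery' cacheGroup's test property.
const jqueryTest = /[\\/]node_modules[\\/]jquery[\\/]/;

// Example paths are made up for illustration.
const posixPath = '/project/node_modules/jquery/dist/jquery.js';
const windowsPath = 'C:\\project\\node_modules\\jquery\\dist\\jquery.js';
const otherPackage = '/project/node_modules/jquery-ui/ui/widget.js';

console.log(jqueryTest.test(posixPath));    // true
console.log(jqueryTest.test(windowsPath));  // true
console.log(jqueryTest.test(otherPackage)); // false: 'jquery' is not followed by a path separator
```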

Alternatively we can disable the built-in cacheGroups and create our own catch-all 'vendors' cacheGroup:

optimization: {
    runtimeChunk: true,
    moduleIds: 'hashed',
    splitChunks: {
      chunks: 'initial',
      cacheGroups: {
        default: false, // disable the built-in groups, default & vendors (vendors is overwritten below)
        jquery: {
          test: /[\\/]node_modules[\\/]jquery[\\/]/, // matches /node_modules/jquery/
          name: 'vendor.jquery',
          enforce: true,
          priority: 90
        },
        vendors: {
          test: /[\\/]node_modules[\\/]/, // matches /node_modules/
          name: 'vendors',
          priority: 10,
          enforce: true, // create chunk regardless of the size of the chunk
        }
      }
    }
  }

Depending on your project, the above config should result in 4 chunks: 'runtime', 'main' (app chunk) , 'vendor.jquery' and 'vendors'. Other large modules could also be split out from the 'vendors' chunk by using the above method.
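The way priority decides between overlapping cacheGroups can be sketched as follows. This is a simplified model with hypothetical paths: it only looks at test and priority, and falls back to 'main' for anything that no cacheGroup matches.

```javascript
// Simplified sketch: of the cacheGroups whose test matches, the highest priority wins.
const cacheGroups = {
  jquery: { test: /[\\/]node_modules[\\/]jquery[\\/]/, name: 'vendor.jquery', priority: 90 },
  vendors: { test: /[\\/]node_modules[\\/]/, name: 'vendors', priority: 10 }
};

// Hypothetical helper, not part of webpack's API.
function chunkNameFor(modulePath) {
  const matches = Object.values(cacheGroups)
    .filter(group => group.test.test(modulePath))
    .sort((a, b) => b.priority - a.priority);
  return matches.length > 0 ? matches[0].name : 'main';
}

console.log(chunkNameFor('/project/node_modules/jquery/dist/jquery.js')); // vendor.jquery
console.log(chunkNameFor('/project/node_modules/lodash/lodash.js'));      // vendors
console.log(chunkNameFor('/project/src/app.js'));                         // main
```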

I have recently done some testing on the impact of various splitChunks configs using both HTTP/1.1 and HTTP/2, the summary of which can be read in this comment here: https://github.com/aurelia/cli/issues/969#issuecomment-438692909

But that sounds rather tedious?

Fortunately, Webpack v4.15.0 introduced a new option, 'splitChunks.maxSize', which proves to be very useful. The option specifies a limit on the size of a single chunk. If the chunk consists of multiple modules, it is split into smaller parts in an attempt to satisfy maxSize. An additional hash is added to the chunk filenames to make them unique. If a single module is larger than maxSize, it will end up in its own chunk.

https://webpack.js.org/plugins/split-chunks-plugin/#splitchunks-maxsize
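To get a feel for what maxSize does, here is a deliberately naive greedy grouping. It is not webpack's algorithm, and the function name and module sizes are made up. It only shows the two behaviours described above: a chunk is split into parts that try to stay under maxSize, and a single oversized module ends up on its own.

```javascript
// Hypothetical greedy grouping, NOT webpack's algorithm: walk the modules in order
// and start a new part whenever adding the next module would exceed maxSize.
function splitByMaxSize(moduleSizes, maxSize) {
  const parts = [];
  let current = [];
  let currentSize = 0;
  for (const size of moduleSizes) {
    if (current.length > 0 && currentSize + size > maxSize) {
      parts.push(current);
      current = [];
      currentSize = 0;
    }
    current.push(size);
    currentSize += size;
  }
  if (current.length > 0) parts.push(current);
  return parts;
}

// Made-up module sizes in bytes; the 250000 byte module is larger than maxSize.
const parts = splitByMaxSize([90000, 80000, 70000, 250000, 20000], 200000);
console.log(parts); // four parts: [90000, 80000], [70000], [250000], [20000]
```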

The recommendations I am going to make here are based on my own testing along with other investigation work.

When testing the maxSize option, I discovered and reported some issues which have now been resolved in the Webpack v4.26.1 release: https://github.com/webpack/webpack/issues/8407

It is worth reading through the above issue discussion as there is still an edge case remaining that may be encountered.

Using splitChunks.maxSize

The most basic configuration looks like this:

optimization: {
    runtimeChunk: true,
    moduleIds: 'hashed',
    splitChunks: {
        chunks: 'all', // options: 'initial', 'async' , 'all'
        maxSize: 200000 // size in bytes
    }
}

As in the first example at the beginning of this post, if the sum of all the node modules is bigger than minSize (default 30000 bytes), the above config would result in a separate 'vendors' chunk containing everything from the 'node_modules' dir. But now with maxSize, if this chunk is bigger than 200KB (195.3KB to be exact) it will be split into multiple 'vendors' chunks.
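The 195.3KB figure comes from treating 1KB as 1024 bytes, which can be checked quickly:

```javascript
const maxSizeBytes = 200000;
const maxSizeKB = maxSizeBytes / 1024; // 1KB taken as 1024 bytes

console.log(maxSizeKB.toFixed(1)); // 195.3
```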

To create more meaningful names, you will want to specify some named cache groups, for example:

optimization: {
    runtimeChunk: true,
    moduleIds: 'hashed',
    splitChunks: {
        hidePathInfo: true, // prevents the path from being used in the filename when using maxSize
        chunks: 'initial', // options: 'initial', 'async' , 'all'
        //minSize: 30000, // default is 30000 bytes
        maxSize: 200000, // size in bytes
        cacheGroups: {
            default: false, // disable the built-in groups, default & vendors (vendors is overwritten below)
            vendors: { // picks up everything from node_modules as long as the sum of node modules is larger than minSize
                test: /[\\/]node_modules[\\/]/,
                name: 'vendors',
                priority: 19,
                enforce: true, // causes maxInitialRequests to be ignored, minSize still respected if specified in cacheGroup
                minSize: 30000 // use the default minSize
            },
            vendorsAsync: { // vendors async chunk, remaining asynchronously used node modules as single chunk file
                test: /[\\/]node_modules[\\/]/,
                name: 'vendors.async',
                chunks: 'async',
                priority: 9,
                reuseExistingChunk: true,
                minSize: 10000  // use smaller minSize to avoid too much potential bundle bloat due to module duplication.
            },
            commonsAsync: { // commons async chunk, remaining asynchronously used modules as single chunk file
                name: 'commons.async',
                minChunks: 2, // Minimum number of chunks that must share a module before splitting
                chunks: 'async',
                priority: 0,
                reuseExistingChunk: true,
                minSize: 10000  // use smaller minSize to avoid too much potential bundle bloat due to module duplication.
            }
        }
    }
}

Notice that the above config uses chunks: 'initial'. This is because asynchronously loaded (lazy loaded) modules are picked up separately by the 'vendorsAsync' and 'commonsAsync' cacheGroups. This is especially important when using dynamic module imports.

The 'vendors' cacheGroup states that if all of the statically imported node modules add up to 30KB or more, separate them from the app chunk and place them all in a 'vendors' chunk. The config also states that if this 'vendors' chunk is bigger than 200KB, split it up into several 'vendors' chunks. Using 'enforce: true' causes splitChunks.maxInitialRequests, splitChunks.maxAsyncRequests and splitChunks.minSize to be ignored. However, if minSize is also specified on the cacheGroup, it is respected in the decision to create the chunk and is also used in the splitting logic when maxSize is used.

The 'vendorsAsync' cacheGroup in the above example covers any node modules which are imported in your app's asynchronously loaded modules.

The 'commonsAsync' cacheGroup in the above example is similar to the built-in 'default' cacheGroup shown here: https://webpack.js.org/plugins/split-chunks-plugin/#optimization-splitchunks which covers any of your app's modules which are imported (think shared) by at least 2 of your app's asynchronously loaded modules. (any modules from 'node_modules' dir have already been picked up by 'vendorsAsync')
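The interaction between 'enforce: true' and minSize described for the 'vendors' cacheGroup can be summarised in a few lines. This is a sketch of the rule as I understand it, with a hypothetical function name, not a copy of webpack's code.

```javascript
// Sketch of the rule: a minSize set on the cacheGroup always wins, otherwise
// enforce drops the limit, otherwise the splitChunks-level minSize applies.
function effectiveMinSize(splitChunks, cacheGroup) {
  if (cacheGroup.minSize !== undefined) return cacheGroup.minSize;
  if (cacheGroup.enforce) return 0;
  return splitChunks.minSize;
}

const splitChunks = { minSize: 30000 };

console.log(effectiveMinSize(splitChunks, { enforce: true }));                 // 0
console.log(effectiveMinSize(splitChunks, { enforce: true, minSize: 10000 })); // 10000
console.log(effectiveMinSize(splitChunks, {}));                                // 30000
```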

What are async chunks?

Specifying chunks: 'async' on a cacheGroup means that the cacheGroup can only contain modules that are imported inside asynchronously loaded modules. The wording "imported inside asynchronously loaded modules" is significant because the actual asynchronously loaded module itself, which is created by using a dynamic import such as import('some-module' /* webpackChunkName: 'some-module' */), is always put into its own async chunk using the chunk name specified in the magic comment.

Effectively, an 'async' cacheGroup contains modules that are statically imported inside dynamically (asynchronously) loaded modules. If a module is dynamically imported inside a dynamically imported module, it would automatically end up in its own separate async chunk in the same way that its parent module would be, regardless of any splitChunks configuration.

If we didn't specify an 'async' cacheGroup in the above example, anything imported inside the dynamic (async) 'some-module' would remain part of the 'some-module' async chunk (subject to maxSize). In the example above, the 'commonsAsync' cacheGroup allows these imports (depending on minSize) to be split out if another asynchronously loaded module uses the same import, this prevents duplication of modules across async chunks.

In the above example, the 'vendorsAsync' cacheGroup does not care about how many app modules share a node module, the default minChunks is 1, this means that a node module imported inside an asynchronously loaded app module will always be put into the 'vendorsAsync' chunk, subject to minSize. This is an opinionated decision to keep vendor code separate from app code regardless of whether or not it is shared.

minSize and async chunks

Using a smaller minSize of 10KB on the async cacheGroups means modules will be split out and placed into the 'vendorsAsync' / 'commonsAsync' chunks only if larger than 10KB, rather than the default 30KB. This means that app modules smaller than the 10KB minSize will instead be included and duplicated across async chunks when used in more than 1 asynchronously loaded module.

In the above example, the minSize setting on the async cacheGroups is a trade-off between module duplication and the number of requests required to get all of the required chunks when an asynchronously loaded module is used. A typical use case for asynchronously loaded modules is the views in an Aurelia/Angular app, which, if configured as such, would be requested when navigating to a view.

Navigating to an async view would only require a single request if all the view's dependencies were included in the view's async chunk, the trade-off being that the async chunk would be larger and contain modules that may already have been downloaded as part of other async chunks.

Having the same module placed in more than 1 chunk is detrimental to bundle size and network traffic, so a lower minSize was used in the above example so that it only happens for the smaller modules. Tuning minSize for async cacheGroups can prevent the creation of small async chunks, preventing multiple requests being required when using and navigating around the app. Multiple requests are mainly a concern for HTTP/1.1; with HTTP/2, the requests for the required async chunks are multiplexed over a single connection.

Note: 'prefetch' can be used to instruct the browser to request async chunks up-front during browser idle time, but that is out of the scope of this post and may be covered in a future post.
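The trade-off can be pictured with a small model. This is my own simplification with hypothetical names and sizes: minSize is applied to a single shared module here, whereas webpack applies it to the total size of the modules that would form the new chunk.

```javascript
// Simplified model of the trade-off (an assumption, not webpack's implementation).
function placement(mod, { minChunks = 2, minSize = 10000 } = {}) {
  if (mod.usedByAsyncChunks < minChunks) return 'stays in its own async chunk';
  if (mod.size >= minSize) return 'split into a shared async chunk';
  return 'duplicated in each async chunk that uses it';
}

// Hypothetical modules: size in bytes and number of async chunks importing them.
console.log(placement({ size: 25000, usedByAsyncChunks: 3 })); // split into a shared async chunk
console.log(placement({ size: 4000, usedByAsyncChunks: 3 }));  // duplicated in each async chunk that uses it
console.log(placement({ size: 25000, usedByAsyncChunks: 1 })); // stays in its own async chunk
```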

Optimising the bundle for HTTP/2

Optimising for HTTP/2 is really more a case of enabling better cacheability and more reliability on unreliable connections.

This is because, for an app bundle that has been optimised for HTTP/1.1, using HTTP/2 won't really make any noticeable difference to overall bundle download time. A difference would only start to be measurable if the bundle consisted of a great many small files.

To adjust the Webpack splitChunks config for HTTP/2 we should adjust minSize / maxSize to create more chunks and also add some config to place the larger node modules into their own named chunks.

From the webpack docs:

maxSize options is intended to be used with HTTP/2 and long term caching. It increase the request count for better caching. It could also be used to decrease the file size for faster rebuilding.

With more splits, updating a single module will result in less code having to be downloaded after re-publishing the app because there will be fewer other modules in the same chunk file which contains the updated module.
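As a small worked example of that claim, compare how much a returning visitor has to download again after one module changes. The layouts and sizes are made up, and the sketch ignores the runtime chunk and compression.

```javascript
// Hypothetical chunk layouts: each chunk maps module name -> size in bytes.
const singleChunk = [{ a: 40000, b: 40000, c: 40000 }];
const manyChunks = [{ a: 40000 }, { b: 40000 }, { c: 40000 }];

// Bytes to download again after one module changes: the whole chunk that
// contains it, because that chunk's hash (and therefore its URL) changes.
function bytesToRedownload(chunks, changedModule) {
  const chunk = chunks.find(c => changedModule in c);
  return chunk ? Object.values(chunk).reduce((sum, size) => sum + size, 0) : 0;
}

console.log(bytesToRedownload(singleChunk, 'b')); // 120000
console.log(bytesToRedownload(manyChunks, 'b'));  // 40000
```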

The following config uses 10KB and 40KB for the min and max sizes and also creates individual splits for node modules:

optimization: {
    runtimeChunk: true,
    moduleIds: 'hashed',
    splitChunks: {
      hidePathInfo: true, // prevents the path from being used in the filename when using maxSize
      chunks: 'initial', // default is async, set to initial and then use async inside cacheGroups instead
      maxInitialRequests: Infinity, // Default is 3, make this unlimited if using HTTP/2
      maxAsyncRequests: Infinity, // Default is 5, make this unlimited if using HTTP/2
      // sizes are compared against source before minification
      minSize: 10000, // chunk is only created if it would be bigger than minSize
      maxSize: 40000, // splits chunks if bigger than 40k, added in webpack v4.15
      cacheGroups: { // create separate js files for bluebird, jQuery, bootstrap, aurelia and one for the remaining node modules
        default: false, // disable the built-in groups, default & vendors (vendors is overwritten below)

        // generic 'initial/sync' vendor node module splits: separates out larger modules
        vendorSplit: { // each node module as separate chunk file if module is bigger than minSize
          test: /[\\/]node_modules[\\/]/,
          name(module) {
            // Extract the name of the package from the path segment after node_modules
            const packageName = module.context.match(/[\\/]node_modules[\\/](.*?)([\\/]|$)/)[1];
            return `vendor.${packageName.replace('@', '')}`;
          },
          priority: 20
        },
        vendors: { // picks up everything else being used from node_modules that is less than minSize
          test: /[\\/]node_modules[\\/]/,
          name: 'vendors',
          priority: 19,
          enforce: true // create chunk regardless of the size of the chunk
        },

        // generic 'async' vendor node module splits: separates out larger modules
        vendorAsyncSplit: { // vendor async chunks, create each asynchronously used node module as separate chunk file if module is bigger than minSize
          test: /[\\/]node_modules[\\/]/,
          name(module) {
            const packageName = module.context.match(/[\\/]node_modules[\\/](.*?)([\\/]|$)/)[1];
            return `vendor.async.${packageName.replace('@', '')}`;
          },
          chunks: 'async',
          priority: 10,
          reuseExistingChunk: true,
          minSize: 5000 // only create if 5k or larger
        },
        vendorsAsync: { // vendors async chunk, remaining asynchronously used node modules as single chunk file
          test: /[\\/]node_modules[\\/]/,
          name: 'vendors.async',
          chunks: 'async',
          priority: 9,
          reuseExistingChunk: true,
          enforce: true // create chunk regardless of the size of the chunk
        },

        // generic 'async' common module splits: separates out larger modules
        commonAsync: { // common async chunks, each asynchronously used module as a separate chunk files
          name(module) {
            // Extract the name of the module from last path component. 'src/modulename/' results in 'modulename'
            const moduleName = module.context.match(/[^\\/]+(?=\/$|$)/)[0];
            return `common.async.${moduleName.replace('@', '')}`;
          },
          minChunks: 2, // Minimum number of chunks that must share a module before splitting
          chunks: 'async',
          priority: 1,
          reuseExistingChunk: true,
          minSize: 5000 // only create if 5k or larger
        },
        commonsAsync: { // commons async chunk, remaining asynchronously used modules as single chunk file
          name: 'commons.async',
          minChunks: 2, // Minimum number of chunks that must share a module before splitting
          chunks: 'async',
          priority: 0,
          reuseExistingChunk: true,
          enforce: true // create chunk regardless of the size of the chunk
        }
      }
    }
  },

Here we set maxInitialRequests and maxAsyncRequests to Infinity so that those settings do not prevent the 'vendorSplit' cacheGroup from working as intended. This wasn't required in the previous example, where 'enforce: true' was used.

The 'vendorSplit' cacheGroup creates a chunk for each node module while still respecting minSize, so that we do not end up with too many very small chunks. This is important because with each chunk comes a small amount of overhead in overall bundle size, so splitting for cacheability results in diminishing returns when balancing against bundle size.

Any node modules that are smaller than minSize are gathered up by the 'vendors' cacheGroup as used in the previous example, this time with 'enforce: true' and no minSize, so that the chunk is always created, this keeps node modules always separate from the main app chunk.

The other 'async' cacheGroups in the above example are configured similarly to create individual chunks for each module and then gather up the remaining small modules into a single chunk (or multiple if it ends up larger than maxSize).
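The two name functions used in the config can be tried out in isolation. The objects passed in below are stand-ins with only a context property (the real argument is webpack's module object), and the paths are hypothetical.

```javascript
// Same logic as the name function of the 'vendorSplit' cacheGroup.
function vendorChunkName(module) {
  const packageName = module.context.match(/[\\/]node_modules[\\/](.*?)([\\/]|$)/)[1];
  return `vendor.${packageName.replace('@', '')}`;
}

// Same logic as the name function of the 'commonAsync' cacheGroup.
function commonChunkName(module) {
  const moduleName = module.context.match(/[^\\/]+(?=\/$|$)/)[0];
  return `common.async.${moduleName.replace('@', '')}`;
}

console.log(vendorChunkName({ context: '/project/node_modules/jquery/dist' }));     // vendor.jquery
console.log(vendorChunkName({ context: '/project/node_modules/@scope/pkg/dist' })); // vendor.scope
console.log(commonChunkName({ context: 'src/modulename/' }));                       // common.async.modulename
console.log(commonChunkName({ context: '/project/src/shared' }));                   // common.async.shared
```

Note that for a scoped package only the scope is captured, so packages under the same scope would share a chunk name with this pattern.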

The only downside to creating many smaller chunks is a small increase in overall bundle size, so as usual we have a compromise to balance out.

Depending on the size of your project, min and max size will need to be tuned to achieve the desired number of chunks and degree of module separation.

Summary

Webpack is a powerful and flexible tool. Understanding it properly requires some effort, but the reward is well worth it. Big improvements have been made with the release of v4.x and further improvements are continuously being made.

Working examples of the configs covered here can be found in the following repo: https://github.com/chrisckc/aurelia-cli-skeleton-typescript-webpack

The 'lazy-loaded-views' branch demonstrates the async chunks config.

Update / Addendum

UPDATE 2018-12-04: Webpack v4.27.0 has been released: https://github.com/webpack/webpack/releases/tag/v4.27.0

The last remaining issue from my issue log here: https://github.com/webpack/webpack/issues/8407 has been resolved in: https://github.com/webpack/webpack/pull/8451 and released in v4.27.0

The change means that when maxSize is specified on a cacheGroup, minSize from splitChunks is now respected when splitting the chunk, previously this required minSize to also be specified on the cacheGroup.
