Wasm-bindgen: Build size considerations

Created on 21 Dec 2018 · 12 Comments · Source: rustwasm/wasm-bindgen

No doubt build size is an ongoing consideration, for example in #826 and #1078 - so I apologize if this is a bit redundant.

However - I didn't see some of the following specific questions covered in those issues, so figured it's worth opening a new one...

In a basic app using just the minimum of what's needed to render a quad on the screen via webgl, I'm seeing around a 66KB output wasm. That includes some js_sys and web_sys usage, and it's measured after cargo build --release and running wasm-bindgen.

So - first question is, what can one expect for a final output using some average requirements from wasm_bindgen, js_sys and web_sys - is this number in the right ballpark?

Secondly - I am structuring things using workspaces. To get it to compile I had to include "rlib" in the local crate dependencies. Will that add bloat? If so - is there a recommended way around it?

Here are some of the Cargo.toml snippets, adapted with generic names (note: I haven't tried wee_alloc - not sure if that will make a major difference)

root: ./Cargo.toml

[workspace]
members = [ "crates/*", "examples/*" ]

[profile.release]
lto = true

library: ./crates/foo/Cargo.toml

[package]
name = "foo"
edition = "2018"

[lib]
crate-type = ["cdylib", "rlib"]

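# Note: Cargo only reads [profile] settings from the workspace root manifest,
# so this section is ignored in a member crate.
[profile.release]
lto = true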

[dependencies]
wasm-bindgen = "0.2.29"
js-sys = "0.3.6"

[dependencies.web-sys]
version = "0.3.6"
features = [
  'CanvasRenderingContext2d',
  'WebGlRenderingContext',
  'WebGl2RenderingContext',
  'HtmlCanvasElement',
  'WebGlProgram',
  'WebGlShader',
  'Window'
]

app: ./examples/foo-demo/Cargo.toml

[package]
name = "foo-demo"
edition = "2018"

[lib]
crate-type = ["cdylib"]

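# Note: Cargo only reads [profile] settings from the workspace root manifest,
# so this section is ignored in a member crate.
[profile.release]
lto = true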

[dependencies]
wasm-bindgen = "0.2.29"
js-sys = "0.3.6"
foo = { path = "../../crates/foo" }

[dependencies.web-sys]
version = "0.3.6"
features = [
    'WebGlRenderingContext',
    'WebGl2RenderingContext',
    'WebGlProgram',
    'WebGlBuffer',
    'HtmlCanvasElement',
    'console',
    ]

All 12 comments

> In a basic app using just the minimum of what's needed to render a quad on the screen via webgl, I'm seeing around a 66KB output wasm.

There are things you can do to make that smaller, but for a standard non-optimized build that sounds about right to me.

Keep in mind that almost all of that is from the Rust stdlib, so it's a one-time cost (it's not linear): as you increase the size of your app, the filesize will grow very slowly.

Thanks - so neither rlib nor duplication of dependencies is adding anything?

@dakom I don't know, I'll let others weigh in on that.

Thanks for opening an issue here, always good to have more records of what's going on! Whenever first optimizing for wasm size I'd recommend reviewing https://rustwasm.github.io/book/game-of-life/code-size.html and https://rustwasm.github.io/book/reference/code-size.html which should help diagnose some problems. Sounds like you've already got all the really-low-hanging fruit out of the way though!

The average size of a wasm app sort of depends on what you're doing, but 66KB does seem a little high for simplistic operations. You can typically optimize here and there for libstd usage and whatnot, but that'll typically shave off 20KB or so in the limit (and that is often very hard to fully achieve).

I actually think that the rlib part may affect this though in a weird fashion. I believe Cargo ignores LTO settings (one of the biggest wins to file size) if rlib as a crate-type is present, and that's because there's a bug in the compiler where it'll generate an error if rlib + LTO is present. Can you test out removing the rlib and/or refactoring the crate so the cdylib is a standalone crate and see if that improves things?

Other than that though some analysis along the lines of what the book says would be the next best bet to dig in here, figuring out which crate is contributing the most to file size (aka which function is the largest). If possible it'd be best if we could poke around the code, but can definitely understand if that's not feasible!

> If possible it'd be best if we could poke around the code

For sure! It's an ongoing spare-time project (and my way of learning Rust - I'm a newbie), so there will be a fair amount of bikeshedding and changes over time, but it'll probably remain fairly basic for the next week or so, at least as far as the overall workspace setup is concerned:

https://github.com/dakom/pure3d

Specifically, "rlib" is included here: https://github.com/dakom/pure3d/blob/master/crates/webgl/Cargo.toml#L16

I also tried moving all the code into one crate which allowed me to remove "rlib" in the list and it unfortunately didn't make a difference to the file size.

Ok cool, thanks! Looks like with the current commit LTO is indeed happening. Following these instructions to analyze the binary we see:

$ twiggy top -n 10 integration_tests_bg.wasm
 Shallow Bytes │ Shallow % │ Item
───────────────┼───────────┼──────────────────────────────────────────────────────────────────────
          7431 ┊    10.67% ┊ data[0]
          7086 ┊    10.17% ┊ core::fmt::float::float_to_decimal_common_shortest::h28e660428e55af7b
          6939 ┊     9.96% ┊ dlmalloc::dlmalloc::Dlmalloc::malloc::hcd84cc7763e958b1
          6014 ┊     8.63% ┊ core::fmt::float::float_to_decimal_common_exact::h09a7972005228cb0
          5801 ┊     8.33% ┊ "function names" subsection
          4609 ┊     6.62% ┊ core::num::flt2dec::strategy::dragon::mul_pow10::h3c162bf0fec5df7c
          3209 ┊     4.61% ┊ load_assets
          2833 ┊     4.07% ┊ data[1]
          2450 ┊     3.52% ┊ <wasm_bindgen::JsValue as core::fmt::Debug>::fmt::ha5bdcf489482357c
          2020 ┊     2.90% ┊ dlmalloc::dlmalloc::Dlmalloc::free::h241af8935332b649
         21213 ┊    30.45% ┊ ... and 219 more.
         69605 ┊    99.92% ┊ Σ [229 Total Rows]

One of the huge parts that stands out here is floats! Printing floats is no easy task (nor is parsing them!), and it looks like that may be coming in through the Debug impl for JsValue (which transitively includes float printing). If you remove the instances of debugging JsValue, that float-formatting code should drop out of the binary.

After applying a small patch I get:

$ twiggy top -n 10 integration_tests_bg.wasm
 Shallow Bytes │ Shallow % │ Item
───────────────┼───────────┼─────────────────────────────────────────────────────────────────────────────────
          6939 ┊    23.92% ┊ dlmalloc::dlmalloc::Dlmalloc::malloc::h66c8dcaaa3bebf05
          4566 ┊    15.74% ┊ "function names" subsection
          3656 ┊    12.60% ┊ load_assets
          2020 ┊     6.96% ┊ dlmalloc::dlmalloc::Dlmalloc::free::h79fb8feb59e45f48
          1686 ┊     5.81% ┊ dlmalloc::dlmalloc::Dlmalloc::dispose_chunk::h36073841969c5319
          1388 ┊     4.78% ┊ core::fmt::Formatter::pad::hd92048ae484e93e9
           883 ┊     3.04% ┊ data[0]
           761 ┊     2.62% ┊ pure3d_webgl::shader::compile_source::h2d2c462bbaaf0cc7
           545 ┊     1.88% ┊ data[2]
           408 ┊     1.41% ┊ integration_tests::start_ticker::_$u7b$$u7b$closure$u7d$$u7d$::h7d251a52f0e95145
          6104 ┊    21.04% ┊ ... and 165 more.
         28956 ┊    99.81% ┊ Σ [175 Total Rows]

Over a 50% reduction in size!
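
For reference, the figure can be checked against the two twiggy totals quoted in this comment (69605 and 28956 bytes). The helper below is only an illustrative sketch; the function name is made up for this example, and the totals are the row sums twiggy reports, which may differ slightly from the on-disk file size.

```rust
/// Hypothetical helper: percentage by which `after` is smaller than `before`.
pub fn reduction_percent(before: u64, after: u64) -> f64 {
    // Bytes saved, expressed as a share of the original size.
    (before as f64 - after as f64) / before as f64 * 100.0
}
```

Calling reduction_percent(69605, 28956) gives a bit over 58, which matches the "over 50%" claim.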

Whoa, that's great!

Just pushed a commit with the changes and updated build pipeline which also adds wasm-opt into the production build, brought it down to ~17K :D

Thanks for taking the time to go through and show how to improve it!

Off-topic: can't say I totally understand the patch yet ... gotta learn more about ? and how it differs from .and(Ok(())) or unwrap... if there's a good article about all that I'd appreciate it (tried googling and found this which I'm reading through now)

@dakom my understanding is that calling .unwrap() and .expect() on a Result whose error type is JsValue means that you are relying on the Debug implementation for JsValue, which includes some code to print out floats, and that was bloating your binary.

By instead just returning the Result, JsValue's Debug implementation got optimized away as completely unused dead code and was no longer present in your binary.


So, in short, you stopped using JsValue's Debug implementation by no longer calling methods that rely on it.
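
A minimal sketch of that difference, using only the standard library. OpaqueError, compile_step and run are hypothetical names invented for this illustration; OpaqueError stands in for an error value such as JsValue and deliberately has no Debug impl, to show that propagating with ? never needs one, while Result::unwrap does.

```rust
// Hypothetical stand-in for an error value such as `JsValue`.
// It deliberately does NOT implement `Debug`.
pub struct OpaqueError(pub f64);

// Hypothetical fallible step (think: compiling a shader).
pub fn compile_step(ok: bool) -> Result<u32, OpaqueError> {
    if ok {
        Ok(1)
    } else {
        Err(OpaqueError(0.5))
    }
}

// Propagates the error to the caller with `?`.
// No `Debug` bound is involved, so no formatting code is referenced here.
pub fn run(ok: bool) -> Result<u32, OpaqueError> {
    let id = compile_step(ok)?;
    Ok(id + 1)
}

// By contrast, this would not compile, because `Result::unwrap`
// requires the error type to implement `Debug`:
//
// pub fn run_unwrap(ok: bool) -> u32 {
//     compile_step(ok).unwrap() + 1
// }
```

With a real JsValue the unwrap version does compile, and that is the point made above: it keeps the Debug impl, and the float printing behind it, reachable from the final binary.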

Oh, very cool - I see now in the wasm_bindgen guide that returning a Result from wasm will throw an exception... so now I moved all the error handling 100% outside of wasm, including getting rid of the "on_error" callback I had.

Shaved another ~1K off the build :D

Not sure if I love this approach since it introduces the need to distinguish between expected errors and bugs on the JS side, but it's cool to know it exists and can help save a few bytes (at least in this example so far)

(as an aside - I also went back to brush up on the ? operator from The Rust Book and that's awesome! I'm still not sure which style I like better, but so far - keeping everything flat is super clean and easy to reason about, so for now I do prefer the ? - so thanks @alexcrichton for the style hint on that possibility too!)

@dakom I would just keep in mind that that ~30KB was mostly (to my understanding) a one-time cost of pulling in some bits of std::fmt.

That's important to recognize because if the extra KB aren't an issue for your application you can save yourself a lot of headache worrying about whether anything you do is pulling in things like float parsing / formatting.

So, in short, just calling out that that extra weight should be mostly constant, in case you find yourself in a situation where it's getting cumbersome to avoid std::fmt.

Cheers!

Similarly, it's really hard to completely avoid formatting in significant apps. At some point you likely will need formatting (in some form or other), and then suddenly you get a big increase in file size.

So developing your app under the assumption that you won't need formatting could give a nasty surprise later.

Ok glad things worked out, and thanks for the assistance as well @chinedufn! Sounds like this is mostly solved now though so I'm going to close.
