Revision 57b1a57c5e221d5d66a34df9e6152c45de8da561 added an ASCII bypass of TextEncoder for passing strings from JS to Wasm. From my perspective as a TextEncoder developer, the need to add such a special case is a performance bug on the TextEncoder side. Therefore, I made TextEncoder.encodeInto faster in Firefox Nightly especially for ASCII strings.
Notably, for ropes deeper than one level, charCodeAt linearizes the string in Firefox but TextEncoder no longer does. If the string is a more than one-level concatenation of relatively long ASCII pieces, I expect TextEncoder.encodeInto to now outperform wasm-bindgen's manual ASCII path.
Could you, please, re-run the benchmarks that motivated the manual TextDecoder bypass in wasm-bindgen in Firefox Nightly to see if the manual by-pass can be removed or if the manual bypass should be limited to tiny strings (where JIT inlining of charCodeAt may still be a win)?
(FWIW, there's also a similar TextDecoder optimization in flight for Firefox.)
CC @RReverser
Hmm I might need to defer this to someone else as I'm working on something else now, but I think the difference was noticeable on V8 too, so fix in one engine doesn't necessarily fix it. I might be wrong and happy for someone to show otherwise by re-running benchmarks that were included in original PR.
Using this patch on current master using today's Firefox nightly built from Built from https://hg.mozilla.org/mozilla-central/rev/19704452bd548d0c36d601d609cfcfe2c3e0caa2 and using these benchmarks sourced from this directory I'm getting:
| Benchmark | Current master | Master with patch |
|-----------|--------------|------------------|
| Pass ascii_small to/from wasm-bindgen |1,739,343/s ±1.33% | 1,028,842/s ±2.03%|
|Pass ascii_medium to/from wasm-bindgen |1,553,687/s ±1.37% | 990,149/s ±2.37%|
|Pass ascii_number to/from wasm-bindgen |1,461,455/s ±1.38% | 988,315/s ±2.01%|
|Pass ascii_date to/from wasm-bindgen |1,182,276/s ±1.25% | 845,042/s ±2.17%|
|Pass ascii_url to/from wasm-bindgen |1,179,832/s ±1.82% | 891,663/s ±1.95%|
|Pass ascii_link to/from wasm-bindgen |1,202,142/s ±0.83% | 887,017/s ±1.59%|
|Pass unicode to/from wasm-bindgen |406,366/s ±3.59% | 365,823/s ±2.78%|
@alexcrichton Nice! For what it's worth, my original benchmarks from the PR are still here: https://github.com/RReverser/wasm-bindgen-string-benches.
Either way, can you please check and show results with Chrome as well? As stated in the original message, Firefox was already getting just 45% speed up from my original patch, whereas Chrome was getting 80%; I wonder if it still benefits from these changes.
1,739,343/s ±1.33% | 1,028,842/s ±2.03%
So these are something per second and, therefore, higher is better? I.e. the bypass is still faster than encodeInto? Even for the Unicode case? (Seem weird that the ASCII prefix in the Unicode case makes that big a difference.)
If so, the results are disappointing from my point of view. :-(
Since these strings are literals and not the result of JavaScript programs putting them together by concatenation, etc., they don't end up testing rope issues at all. Also, testing round-tripping can be noisy, since encodeInto doesn't produce garbage but decode does.
Do you happen to know what the usage patterns are like? Do strings that get passed to Wasm usually originate from DOM API return values without further processing (not ropes). Or from arbitrary JS code (potentially ropes)?
@RReverser I think this issue is largely about Firefox, so it's probably not worth checking Chrome until we think we want to change it for Firefox. Or at least that's what I'm personally gonna do!
@hsivonen right yeah, these are operations/section where the higher numbers are better. And yeah these are showing that the current code is still faster than unconditionally calling encodeInto.
As for usage patterns, I'm not really entirely sure myself. I think that usage is general enough though that there's not really a typical pattern, but rather there's a lot of different use cases for where strings come from and such.
I'm going to close this since the numbers above I think show that this is still a beneficial optimization to have.
Most helpful comment
Using this patch on current master using today's Firefox nightly built from Built from https://hg.mozilla.org/mozilla-central/rev/19704452bd548d0c36d601d609cfcfe2c3e0caa2 and using these benchmarks sourced from this directory I'm getting:
| Benchmark | Current master | Master with patch |
|-----------|--------------|------------------|
| Pass ascii_small to/from wasm-bindgen |1,739,343/s ±1.33% | 1,028,842/s ±2.03%|
|Pass ascii_medium to/from wasm-bindgen |1,553,687/s ±1.37% | 990,149/s ±2.37%|
|Pass ascii_number to/from wasm-bindgen |1,461,455/s ±1.38% | 988,315/s ±2.01%|
|Pass ascii_date to/from wasm-bindgen |1,182,276/s ±1.25% | 845,042/s ±2.17%|
|Pass ascii_url to/from wasm-bindgen |1,179,832/s ±1.82% | 891,663/s ±1.95%|
|Pass ascii_link to/from wasm-bindgen |1,202,142/s ±0.83% | 887,017/s ±1.59%|
|Pass unicode to/from wasm-bindgen |406,366/s ±3.59% | 365,823/s ±2.78%|