Emscripten: Evaluate hex encoding for embedding files?

Created on 2 Oct 2018 · 5 Comments · Source: emscripten-core/emscripten

It was just pointed out to me in a Hacker News thread that hex might be more efficient than base64 after gzip/brotli compression for SINGLE_FILE's embedding.

His test showed that a hex-encoded wasm file was 65% the size of a base64-encoded wasm file after compression. My own test with a random 4.7 MB PNG I had sitting around showed no real difference. With brotli, all the results were about the same (hex had the edge over base64 by 215 B and over the compressed original file by ~2.5 KB). Gzip was similar (in this case hex was ~600 KB bigger). So it's unclear what the impact would be.

Just calling attention to this on the off chance that it would actually be an improvement, as I don't recall hex being considered when originally writing the SINGLE_FILE PR, and the previous discussion on encoding efficiencies (https://github.com/kripken/emscripten/pull/3326#issuecomment-91352434) has no mention of hex.

balls.png

help wanted

All 5 comments

Well, a PNG file isn't a good example, as it's already highly compressed. As @jedisct1 said in #7213, the base64 encoding means that repeated opcode sequences are being obscured, but with a PNG image there shouldn't be any repetitions left.
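A rough way to check this reasoning locally is sketched below. It compares compressed sizes of the same payload as-is, hex-encoded, and base64-encoded. Everything here is an assumption made for illustration: the function names are hypothetical, the "repetitive" payload is a synthetic stand-in rather than real wasm, and zlib (DEFLATE, as used by gzip) stands in for both compressors because brotli is not in the Python standard library.

```python
import base64
import binascii
import random
import zlib


def compressed_sizes(payload: bytes) -> dict:
    """zlib-compressed size in bytes of the payload as-is, hex-encoded, and base64-encoded."""
    encodings = {
        "raw": payload,
        "hex": binascii.hexlify(payload),  # lowercase a-f
        "base64": base64.b64encode(payload),
    }
    return {name: len(zlib.compress(data, 9)) for name, data in encodings.items()}


def synthetic_repetitive_payload(seed: int = 0, picks: int = 4000) -> bytes:
    """Hypothetical stand-in for wasm: a few byte sequences repeated at arbitrary offsets."""
    rng = random.Random(seed)
    chunks = [
        bytes(rng.getrandbits(8) for _ in range(rng.randint(8, 24)))
        for _ in range(32)
    ]
    return b"".join(rng.choice(chunks) for _ in range(picks))


def random_payload(seed: int = 0, size: int = 64 * 1024) -> bytes:
    """Stand-in for already-compressed data such as a PNG: no repetition left to find."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


if __name__ == "__main__":
    for label, payload in [
        ("repetitive", synthetic_repetitive_payload()),
        ("random", random_payload()),
    ]:
        print(label, len(payload), compressed_sizes(payload))
```

Because base64 maps 3 input bytes to 4 characters, the same byte sequence can encode to different text depending on its offset modulo 3, while hex keeps every repeat byte-aligned. Real wasm output would need to be measured directly before drawing conclusions.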

I hadn't thought of this before, but it makes sense. Patterns in the original are preserved, Huffman encoding means using only 16/256 symbols won't be a big problem, and decoding should be simple as you can pre-allocate a buffer half the string length. Use lowercase a-f to get the most benefit from Huffman encoding. Or be extra crazy and use "etnris" for optimal Huffman benefit. (This is from an old analysis we did of the most used characters in jQuery for UglifyJS. If you wanted to try such an approach, we could do a fresh analysis of Emscripten output.)
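The decoding step described above can be sketched as follows. This is an illustration in Python only; an actual decoder would live in Emscripten's JavaScript output. The names `encode_nibbles`, `decode_nibbles`, and the `alphabet` parameter are hypothetical, and the alternative alphabet that swaps a-f for "etnris" is just one reading of the suggestion.

```python
HEX_ALPHABET = "0123456789abcdef"


def encode_nibbles(data: bytes, alphabet: str = HEX_ALPHABET) -> str:
    """Encode each byte as two characters, one per 4-bit nibble."""
    return "".join(alphabet[b >> 4] + alphabet[b & 0x0F] for b in data)


def decode_nibbles(text: str, alphabet: str = HEX_ALPHABET) -> bytearray:
    """Decode into a buffer pre-allocated at half the string length."""
    if len(text) % 2 != 0:
        raise ValueError("encoded text must have an even length")
    value_of = {char: value for value, char in enumerate(alphabet)}
    out = bytearray(len(text) // 2)  # output size is known up front
    for i in range(len(out)):
        high = value_of[text[2 * i]]
        low = value_of[text[2 * i + 1]]
        out[i] = (high << 4) | low
    return out


if __name__ == "__main__":
    # Assumed alternative alphabet: digits plus "etnris" in place of a-f.
    alt = "0123456789etnris"
    sample = bytes([0x00, 0xAB, 0xFF])
    print(encode_nibbles(sample), encode_nibbles(sample, alt))
    print(decode_nibbles(encode_nibbles(sample, alt), alt) == sample)
```

The output length is always exactly half the input length, so no resizing is needed during decoding, unlike base64, where padding has to be accounted for.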

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 7 days. Feel free to re-open at any time if this issue is still relevant.

Booh, don't close me!

This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.

🥕
