Node: v12.17.0 zlib.gzipSync changed compression result

Created on 28 May 2020 · 4 comments · Source: nodejs/node

Probably related to: https://github.com/nodejs/node/pull/31201

I don't know if this is a real world issue, but...

  • Version: v12.17.0
  • Platform: MacOS (19.5.0 Darwin)
  • Subsystem: zlib

What steps will reproduce the bug?

Compressing a large string in v12.17.0 produces a different result than the same compression in v12.16.3.

Also attaching the (zipped) test.js file...
test.js.zip

This is the code that shows what I mean:

const zlib = require('zlib')

function compress (toCompr) {
  return zlib.gzipSync(toCompr).toString('base64')
}

function decompress (toDecompr) {
  return zlib.gunzipSync(Buffer.from(toDecompr, 'base64')).toString()
}

const obj = {}
for (let index = 0; index < 20000; index++) {
  obj[`a${index}`] = {
    b: index,
    c: {
      [`c${index}`]: index / 0.1234,
      d: {
        [`d${index}`]: index / 19
      }
    }
  }
}

const str = JSON.stringify(obj)
console.log('string length: ' + str.length) // 1692782

const compressed = compress(str)
console.log('compressed length: ' + compressed.length) // 543272 for v12.16.3 and 543276 for v12.17.0

const compressionProducedWith_12_16_3 = '...6FO/Begf79B1Gh7XZu1BkA' // this is just the end of the output... (because it's too big...)
const compressionProducedWith_12_17_0 = '...14r0D//gNRoe12btQZAA=='
// on v12.16.3 produces: compressionProducedWith_12_16_3 => ends with ...6FO/Begf79B1Gh7XZu1BkA
// on v12.17.0 produces: compressionProducedWith_12_17_0 => ends with ...14r0D//gNRoe12btQZAA==

console.log('is decompression working? (v12.16.3)', decompress(compressionProducedWith_12_16_3) === str) // true
console.log('is decompression working? (v12.17.0)', decompress(compressionProducedWith_12_17_0) === str) // true

Decompression works with both versions, in either direction (data compressed on one version decompresses correctly on the other), so this shouldn't be a big problem in practice.

But if someone compares the compressed output byte-for-byte, this is a problem.

Additional information

Shouldn't this be a major version regarding semver?

All 4 comments

I'm not sure why anyone would be comparing compressed output like that, especially when different zlib compression parameters could easily change the resulting output. The only important check should be that the data can be decompressed successfully and that the resulting plaintext matches what you expect.

That's just what I have observed, feel free to close it.

@mscdex

I'm not sure why anyone would be comparing compressed output like that, especially when different zlib compression parameters could easily change the resulting output.

For example, we used binary comparison of output to verify the JS port of zlib, with all possible options. See https://github.com/nodeca/pako/issues/193

It would be nice to understand what exactly caused the change in the deflate binary layer. It's very stable, and there is not much room for deviation. I can imagine only one reasonable possibility: if the input is sliced for multithreaded processing, then the output will not be a single stream but a sequence of properly-finished streams (each with a flushed end).


I'm not going to cry about breaking changes :). My case is specific. But I'd appreciate any help understanding the difference in the deflate output.

@puzrin As for the why, you would have to ask that question to the Chromium team as we are using their zlib fork.
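As an aside, the zlib version a Node binary bundles is exposed at runtime, which can help when tracking output changes across releases (note this reports a version string, not which fork is in use):

```javascript
// The version string of the zlib bundled with this Node binary.
console.log(process.versions.zlib)
```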
