Victoriametrics: Cannot start after update v1.31.1⇢v1.31.3 with error "unexpected tail left after unmarshaling"

Created on 24 Dec 2019  ·  13Comments  ·  Source: VictoriaMetrics/VictoriaMetrics

Describe the bug

Cannot start VictoriaMetrics (single node) after update v1.31.1v1.31.3 with error unexpected tail left after unmarshaling. Installation on AWS ECS docker with bind-mounted data volume.

↩️Workaround: rollback to v1.31.1

The whole error text:

panic: FATAL: unrecoverable error when merging small parts in the partition "/data/data/small/2019_12": error when merging parts to "/data/data/small/2019_12/tmp/15E359E2B9656648": cannot merge 5 streams: [/data/data/small/2019_12/163272_58790_20191224154344.480_20191224154834.255_15E3590D1A4F1792 /data/data/small/2019_12/670520_59950_20191224152901.968_20191224154354.109_15E3590D1A4F175B /data/data/small/2019_12/5824198_40834_20191224073850.360_20191224100848.782_15E0EA0C59B42523 /data/data/small/2019_12/7704696_64181_20191224100605.164_20191224125118.986_15E0EA0C59B42CBC /data/data/small/2019_12/9531486_79228_20191224124943.113_20191224153217.241_15E0EA0C59B43380]: cannot unmarshal and calibrate scale for blocks to be merged: cannot unmarshal 163 timestamps from len(src)=31 bytes: cannot unmarshal nearest delta2 data after zstd decompression: unexpected tail left after unmarshaling 163 items; tail size=158, value=0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

To Reproduce

  1. Run database (single node) v1.31.1 in Docker CT, data volume bind-mounted
  2. Upgrade docker container to v1.31.3
  3. Application cannot start with error described above

Expected behavior

Application works as before upgrade

Version

  • Before upgrade: victoria-metrics-20191215-231551-tags-v1.31.1-0-g96ff8d9a
  • After upgrade: victoria-metrics-20191224-105707-tags-v1.31.3-0-g5cb8d977
bug

Most helpful comment

Thanks for help, @Cyan4973 and @klauspost !

I found and fixed the issue in github.com/valyala/gozstd. It was triggered when the following conditions are met:

  • Decompress is called on a block without single segment property.
  • Destination buffer capacity is too small to hold the decompressed data.

In this case streaming decompression was called on the destination buffer expanded to its capacity, so the resulting decompressed data has been prefixed with garbage with the length equal to the original destination buffer capacity. See the corresponding commit with the fix. The fix is available starting from gozstd release v1.6.4 .

The fix will be available in the next release of VictoriaMetrics after the commit 17a4dc978218e3abea04e14c45ddb55ffbf1f748 .

All 13 comments

This could be related to accidental switch from cgo-based library for zstd compression to pure Go library for GOARCH=amd64 in VictoriaMetrics releases v1.31.0 , v1.31.1 and v1.31.2 . All the previous VictoriaMetrics releases and the latest v1.31.3 release for GOARCH=amd64 use cgo-based library for zstd compression.

It looks like compressed block has been created by pure Go zstd library - github.com/klauspost/compress/zstd in VictoriaMetrics v1.31.1 and this block is incorrectly unpacked by cgo-based zstd library - github.com/valyala/gozstd. The error message contains the following details about the block:

  • Compressed block size - 31 bytes. The block is compressed by github.com/klauspost/compress/zstd.
  • The size of the original uncompressed block is at least 163 bytes, since each item is represented by at least one byte.
  • github.com/valyala/gozstd successfully unpacks the block compressed by github.com/klauspost/compress/zstd, but the resulting unpacked data size exceeds the original data size by 158 bytes. The unexpected tail contains zero bytes.

cc'ing @klauspost , so he could spot possible discrepancy in zstd codecs according to the details above.

Temporary workarounds for the issue:

1) Continue using VictoriaMetrics v1.31.1 or v1.31.2 until the root cause of the issue is detected and fixed.
2) Switch to newer VictoriaMetrics releases and manually remove source folders with incorrect blocks, which are mentioned in the error message. Note that these folders must be removed while VictoriaMetrics is stopped. This would remove parts of data on the time ranges mentioned in folder names. For instance, the folder /data/data/small/2019_12/9531486_79228_20191224124943.113_20191224153217.241_15E0EA0C59B43 contains data on the time range 2019-12-24 12:49:43 - 2019-12-24 15:32:17.

Having the block in question would help a lot.

Added logging of hex contents of both packed and unpacked blocks. @glebsam , could you build VictoriaMetrics from commit 4ba4abe666fd7223dec7e29cff2029d96c46ca98 according to these docs and try reproducing the issue on this build? It should show more details about the issue in the log output, which will help detecting the root cause of the issue.

I will try, as soon as the issue will happen again. So far 3 of 3 installations successfully works with v1.31.3

(for the context, after downgrade to v1.31.1 I cannot find mentioned blocks on file system, so I rolled out v1.31.3 back and it works so far)

FYI, the improved error logging has been added to v1.31.4, so it is recommended updating to this version.

@klauspost , our user caught the error with the improved information:

panic VictoriaMetrics@/lib/storage/partition.go:859 FATAL: unrecoverable error when merging small parts in the partition "/victoria-metrics-data/data/small/2019_12": error when merging parts to "/victoria-metrics-data/data/big/2019_12/tmp/15E39FF2D1DB439A": cannot merge 3 streams: [/victoria-metrics-data/data/small/2019_12/722777418_15961984_20191225125100.000_20191225131804.731_15E2A1BB0D59CBBD /victoria-metrics-data/data/small/2019_12/1084559679_16460769_20191225121700.000_20191225125640.711_15E2A1BB0D59C660 /victoria-metrics-data/data/small/2019_12/2399414764_18362695_20191225110200.000_20191225122316.375_15E2A1BB0D59BE4D]: cannot unmarshal and calibrate scale for blocks to be merged: cannot unmarshal 149 timestamps from len(src)=24 bytes: cannot unmarshal nearest delta2 data after zstd decompression: unexpected tail left after unmarshaling 149 items from 184 bytes; tail size=25; src=E2D403030200B0A62699BA4760141868F7470101000000000000000000000000C0A907DFD4030000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000; tail=00000000000000000000000000000000000000000000000000; src_zstd=28B52FFD00007D000038C0A907DFD40300015407022B0E02

It has the following information:

  • zstd block in hex, which has been been crated by pure Go library for zstd: 28B52FFD00007D000038C0A907DFD40300015407022B0E02
  • It has been unpacked by cgo-based library for zstd into the following hex block:
    E2D403030200B0A62699BA4760141868F7470101000000000000000000000000C0A907DFD4030000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
  • The unpacked block has unexpected tail with 25 zero bytes.

Cool. I will take a look at it. I suspect an implementation detail, as this very likely would have been caught by the fuzz tests.

So my initial hunch is that the data was somehow in the input - otherwise a crc error would have been returned.

ok, using zstd -d in.ref.zst produces 152 bytes from the input, same as Go zstd. So it seems like something is up with valyala/gozstd.

My guess is that it has problems because "single segment" is not set and no output size is given in the frame header. This is perfectly valid, but may trip up something. IIRC zstd does this automatically on small blocks so this may be why you have not seen this before.

"Single Segment" is optional and output size is really just a decoder hint and will in most cases just add overhead. I did however intend to enable this automatically for small blocks, but flipped a compare wrong. I will fix that, but the decoder should really be fixed.

github.com/valyala/gozstd uses ZSTD_decompressDCtx for block decoding. It also uses ZSTD_getFrameContentSize for determining decoded block size. Probably, this function returns unexpected results for small blocks without single segment. If this function returns ZSTD_CONTENTSIZE_UNKNOWN, then stream decompression is applied. Probably there is a bug with stream decompression in gozstd if missing single segment leads to ZSTD_CONTENTSIZE_UNKNOWN return value from ZSTD_getFramContentSize function.

See the corresponding gozstd source code for details.

Upstream zstd version used - 1.4.4 .

cc'ing @Cyan4973 , zstd author, for the input.

This looks like a plausible scenario.

I'm unfortunately unable to go through the gozstd's streaming decoder implementation and hunt for the bug, due to lack of go fluency.

All I can tell is that the C streaming interface does not need nor care about single segment property, and should always be able to regenerate original data, whatever the sizes of input and output buffers.

A classical issue is forgetting to ensure that the decoder properly flushed all data, by checking the return value of the streaming function (must be 0 to indicate the end of frame).
However, given that the symptom is an _excess_ of bytes, rather than an early truncation, this is not consistent with missed flushed bytes.

Another blind guess scenario : the wrapper may pick a wrong value to determine the amount of bytes generated. This is ZSTD_outBuffer.pos, not to be confused with ZSTD_outBuffer.size, nor the return value of the function.
But as stated, this is merely a blind guess, not a conclusion derived from the code.

Thanks for help, @Cyan4973 and @klauspost !

I found and fixed the issue in github.com/valyala/gozstd. It was triggered when the following conditions are met:

  • Decompress is called on a block without single segment property.
  • Destination buffer capacity is too small to hold the decompressed data.

In this case streaming decompression was called on the destination buffer expanded to its capacity, so the resulting decompressed data has been prefixed with garbage with the length equal to the original destination buffer capacity. See the corresponding commit with the fix. The fix is available starting from gozstd release v1.6.4 .

The fix will be available in the next release of VictoriaMetrics after the commit 17a4dc978218e3abea04e14c45ddb55ffbf1f748 .

The bugfix is available in v1.31.5.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

v98765 picture v98765  ·  3Comments

prdatur picture prdatur  ·  3Comments

Serrvosky picture Serrvosky  ·  3Comments

oOHenry picture oOHenry  ·  4Comments

genericgithubuser picture genericgithubuser  ·  4Comments