borg did no compression with auto,lzma,6

Created on 23 Mar 2017 · 14 Comments · Source: borgbackup/borg

Looking back on my logs, I just realized that the first backup I did into my "system" repo (basically, everything except /home) didn't do any compression. Right before that, the first backup of /home into my "home" repo, launched by the same script, did do compression. I tested redoing the system backup in a fresh repo, and it does do significant compression (> 50%), so it's not because the data isn't compressible. Moreover, I can't reproduce this with the same borg version and command line, which is what really puzzles me. I'm 99% sure I was using the prebuilt binary for 1.1.0b3 when these backups ran. Any ideas what could have caused this?

First comes the logged output from when the bad backup ran and then the output of borg info for the bad one. Then the same two things for the good one.

borg create --stats --compression auto,lzma,6 "--exclude-from=/etc/borg/volumes/system.exclude" "/Backups/borg/system.borg::jdc-system-20170225-13:34:33" / 
------------------------------------------------------------------------------
Archive name: jdc-system-20170225-13:34:33
Archive fingerprint: 92fa914204c5ae24ec3d4e4f4f8e814bc7f076b3bffc21b2a289ebb013c834d9
Time (start): Sat, 2017-02-25 13:34:33
Time (end):   Sat, 2017-02-25 13:58:46
Duration: 24 minutes 13.06 seconds
Number of files: 340225
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               13.87 GB             13.87 GB             13.08 GB
All archives:               13.87 GB             13.87 GB             13.08 GB

                       Unique chunks         Total chunks
Chunk index:                  286790               328115
------------------------------------------------------------------------------
# borg info '/Backups/borg/system.borg::jdc-system-20170225-13:34:33'
Archive name: jdc-system-20170225-13:34:33
Archive fingerprint: 92fa914204c5ae24ec3d4e4f4f8e814bc7f076b3bffc21b2a289ebb013c834d9
Comment: 
Hostname: jdc
Username: root
Time (start): Sat, 2017-02-25 13:34:33
Time (end):   Sat, 2017-02-25 13:58:46
Duration: 24 minutes 13.06 seconds
Number of files: 340225
Command line: borg create --stats --compression auto,lzma,6 --exclude-from=/etc/borg/volumes/system.exclude /Backups/borg/system.borg::jdc-system-20170225-13:34:33 /
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               13.87 GB             13.87 GB              2.79 GB
All archives:              335.69 GB            307.93 GB             21.23 GB

                       Unique chunks         Total chunks
Chunk index:                  375649              9483069
borg create --stats --compression auto,lzma,6 "--exclude-from=/etc/borg/volumes/home.exclude" "/Backups/borg/home.borg::jdc-home-20170225-12:44:33" /home 
------------------------------------------------------------------------------
Archive name: jdc-home-20170225-12:44:33
Archive fingerprint: ae9b39e1207273862bd284850eef9f6229effe16c90d1e158202c26ed64d238e
Time (start): Sat, 2017-02-25 12:44:34
Time (end):   Sat, 2017-02-25 13:34:21
Duration: 49 minutes 46.86 seconds
Number of files: 198455
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               13.78 GB              7.42 GB              6.48 GB
All archives:               13.78 GB              7.42 GB              6.48 GB

                       Unique chunks         Total chunks
Chunk index:                  164581               189165
------------------------------------------------------------------------------
# borg info '/Backups/borg/home.borg::jdc-home-20170225-12:44:33'
Archive name: jdc-home-20170225-12:44:33
Archive fingerprint: ae9b39e1207273862bd284850eef9f6229effe16c90d1e158202c26ed64d238e
Comment: 
Hostname: jdc
Username: root
Time (start): Sat, 2017-02-25 12:44:34
Time (end):   Sat, 2017-02-25 13:34:21
Duration: 49 minutes 46.86 seconds
Number of files: 198457
Command line: borg create --stats --compression auto,lzma,6 --exclude-from=/etc/borg/volumes/home.exclude /Backups/borg/home.borg::jdc-home-20170225-12:44:33 /home
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               13.78 GB              7.42 GB            858.78 MB
All archives:              299.99 GB            175.12 GB              8.86 GB

                       Unique chunks         Total chunks
Chunk index:                  213232              4388611
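
A quick way to spot this kind of silent failure in logged output is to compare the "Original size" and "Compressed size" columns of the "This archive:" line. The following is a minimal sketch, not part of borg: the helper names, the regular expression, and the 1% warning threshold are all assumptions made for illustration, and it assumes the sizes are printed in decimal units (kB, MB, GB, TB) as in the logs above.

```python
import re

# Assumption: sizes are printed in decimal (power-of-1000) units.
UNITS = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}

# Matches e.g. "This archive:    13.87 GB    13.87 GB    13.08 GB"
LINE_RE = re.compile(
    r"^This archive:\s+([\d.]+) (\w+)\s+([\d.]+) (\w+)\s+([\d.]+) (\w+)\s*$",
    re.MULTILINE,
)


def archive_sizes(stats_text):
    """Return (original, compressed, deduplicated) sizes in bytes."""
    match = LINE_RE.search(stats_text)
    if match is None:
        raise ValueError("no 'This archive:' line found")
    fields = match.groups()
    return tuple(float(fields[i]) * UNITS[fields[i + 1]] for i in (0, 2, 4))


def compression_saving(stats_text):
    """Fraction of the original size removed by compression alone."""
    original, compressed, _ = archive_sizes(stats_text)
    return 1 - compressed / original


# The two "This archive:" lines from the create logs above.
system_stats = "This archive:               13.87 GB             13.87 GB             13.08 GB"
home_stats = "This archive:               13.78 GB              7.42 GB              6.48 GB"

for label, text in (("system", system_stats), ("home", home_stats)):
    saving = compression_saving(text)
    # Hypothetical threshold: flag anything under 1% as suspicious.
    flag = "  <-- no compression?" if saving < 0.01 else ""
    print(f"{label}: {saving:.1%} saved by compression{flag}")
```

Run against the two logs, this flags the system archive and not the home archive. The threshold would need tuning for data that really is mostly incompressible.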
bug

All 14 comments

As another data point, after the first backup I switched to auto,zlib, and new chunks seem to be compressed properly. But I don't think it's simply a problem with lzma, since if I start with a fresh repo and use auto,lzma,6, it works fine.

@jdchristensen I'll have a look if I find something in the code that could cause this. Sounds a bit like a heisenbug though. :)

Guess I have it. PR coming soon.

Ah, so if just one chunk was found to be incompressible, none after that were compressed? Glad to hear this is a real bug and not just my silliness!

@jdchristensen exactly. thanks for finding that.

It's strange to me that this bug didn't reveal itself more frequently. I have many files that are gzipped, or jpg, or compressed pdf, that should be incompressible, and yet my overall compression ratio is always very good, except for that one backup in which no compression happened at all.

Oh, I get it. Probably whatever compression was chosen for the first chunk was used for every chunk. And that one backup I did happened to have an incompressible first chunk, but the others happened to have a compressible first chunk, which is not surprising.
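
To make the hypothesis in the comment above concrete, here is a small toy model. It is not borg's code: the function names are invented for illustration, zlib level 1 stands in for the fast lz4 probe that the auto heuristic is documented to use, and the "sticky" variant simply models the guess that the decision made for the first chunk was reused for every later chunk.

```python
import lzma
import os
import zlib


def probe_is_compressible(chunk: bytes) -> bool:
    """Cheap probe: does a fast compressor shrink this chunk at all?"""
    # Assumption: zlib level 1 stands in for the fast probe compressor.
    return len(zlib.compress(chunk, 1)) < len(chunk)


def compress_per_chunk(chunks):
    """Intended behaviour: decide independently for every chunk."""
    out = []
    for chunk in chunks:
        if probe_is_compressible(chunk):
            out.append(lzma.compress(chunk, preset=6))
        else:
            out.append(chunk)  # store incompressible data as-is
    return out


def compress_sticky(chunks):
    """Hypothesized bug: the first decision is reused for all later chunks."""
    out = []
    decision = None
    for chunk in chunks:
        if decision is None:
            decision = probe_is_compressible(chunk)
        out.append(lzma.compress(chunk, preset=6) if decision else chunk)
    return out


def total_size(blobs):
    return sum(len(b) for b in blobs)


# One incompressible chunk first, followed by highly compressible text.
random_chunk = os.urandom(64 * 1024)
text_chunk = b"the quick brown fox jumps over the lazy dog\n" * 1500
chunks = [random_chunk] + [text_chunk] * 20

print("original:  ", total_size(chunks))
print("per chunk: ", total_size(compress_per_chunk(chunks)))
print("sticky:    ", total_size(compress_sticky(chunks)))
```

With the incompressible chunk first, the sticky variant stores everything uncompressed, so its total equals the original size, which matches the 13.87 GB / 13.87 GB symptom. Put a text chunk first and the sticky variant compresses almost as well as the per-chunk one, which would explain why the problem showed up so rarely.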

@ThomasWaldmann Let me know if you want me to test a fix.

@jdchristensen well, there is PR #2332, which hopefully fixes this, but it has not been reviewed yet and thus is not yet merged into master. There are a few other PRs still pending, blocking a b4 release.

now merged into master...

Thanks, I'm testing now. (I wasn't sure before if your commit was ready for testing.)

Can someone tell me how to enable the compression logging output?

--debug --debug-topic file-compression

I ran a test using a mix of jpg files and text files, and found that auto compression is working like it should. In particular, using "auto,zlib" or "auto,lzma" uses only marginally more space, but saves a lot of time. Unencrypted, to and from a local SSD drive. The table shows the compressed size and time for each compression option I tried. The first line shows the uncompressed size.

  none       119.54MB  1.7s
  lz4        105.26MB  1.8s
  auto,zlib  101.26MB  2.6s
  zlib       100.06MB  5.7s
  auto,lzma  100.58MB 10.8s
  lzma        98.91MB 34.8s

I also verified that the debug output makes sense.
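
The trade-off in the table can be put into numbers with a few lines of arithmetic. This is only a sketch over the six measurements quoted above; the variable and function names are made up for the example.

```python
# (mode, stored size in MB, time in seconds), copied from the table above
results = [
    ("none", 119.54, 1.7),
    ("lz4", 105.26, 1.8),
    ("auto,zlib", 101.26, 2.6),
    ("zlib", 100.06, 5.7),
    ("auto,lzma", 100.58, 10.8),
    ("lzma", 98.91, 34.8),
]

# The "none" row is the uncompressed baseline.
base_size = results[0][1]


def saving_percent(size_mb, base_mb=base_size):
    """Space saved relative to the uncompressed size, in percent."""
    return 100 * (1 - size_mb / base_mb)


for mode, size_mb, secs in results:
    print(f"{mode:<10} {saving_percent(size_mb):5.1f}% smaller  {secs:5.1f}s")

by_mode = {mode: (size_mb, secs) for mode, size_mb, secs in results}

# Cost and benefit of the auto heuristic compared with always compressing.
for algo in ("zlib", "lzma"):
    auto_size, auto_secs = by_mode[f"auto,{algo}"]
    full_size, full_secs = by_mode[algo]
    extra_space = auto_size - full_size
    speedup = full_secs / auto_secs
    print(f"auto,{algo}: {extra_space:.2f} MB more space, {speedup:.1f}x faster")
```

On these measurements, the auto variants cost between one and two extra megabytes out of roughly 100 MB, while cutting the run time to less than half (zlib) or less than a third (lzma).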

\o/ thanks for testing / measuring.
