Borg: Is there a way to manually repair this repository, losing minimal data?

Created on 9 May 2020 · 8 Comments · Source: borgbackup/borg

Have you checked borgbackup docs, FAQ, and open Github issues?

I've checked the docs and FAQ. There are too many issues to look over every single one, but I was not able to find a similar issue when searching through issues marked "question" or those mentioning borg check.

Is this a BUG / ISSUE report or a QUESTION?

Question, perhaps a feature request.

System information. For client/server mode post info for both machines.

Your borg version (borg -V).

borg 1.1.11

Operating system (distribution) and version.

Debian 10 buster, x86_64

Hardware / network configuration, and filesystems used.

An overlay FS combining an external hard drive and a FUSE Google Drive mount. I realize this may be the source of the problem, but I'd still like to know if I can fix it without recreating the repo from the beginning.

How much data is handled by borg?

Repo is currently 4.4TiB, eventually it needs to store ~13TiB of data.

Full borg commandline that led to the problem (leave out excludes and passwords)

borg-env/bin/borg -v -p check --repair /root/borg-overlay

Describe the problem you're observing.

borg check --repair failed to repair the repository. I'm wondering what the best way would be to keep my progress so that I don't have to copy over the 4.4 TiB again.

To be clear, I don't care about data loss in the borg repo for now. I only care about keeping my progress towards a complete, working borg archive. If I could somehow delete emanlooc@2020-03-31-before-stretch-buster-upgrade-attempt-6.checkpoint.3 while still keeping all the deduplicated segments of data, that would seem ideal, but I don't know enough about the internals of borg to say whether that even makes sense.

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

Any attempt to run borg create reliably fails with a similar error message. I'm sorry I don't have the exact message. The borg check command takes multiple days to run, so I have not run it again.

However, since it says Completed repository check, no problems found., I suspect that if I could skip to the archive consistency check, it could quite quickly reproduce the problem.

Additional info

I believe the corruption happened when the drive used for borg's cache filled up completely, and writes failed. This was a simple PEBKAC, and the cache is configured to write elsewhere now.

The archive is encrypted using repokey-blake2.

Include any warning/errors/backtraces from the system logs

Completed repository check, no problems found.
Starting archive consistency check...
Analyzing archive emanlooc@2020-03-31-before-stretch-buster-upgrade (1/12)
Analyzing archive [email protected] (2/12)
Analyzing archive emanlooc@2020-03-31-before-stretch-buster-upgrade-attempt3 (3/12)
Analyzing archive emanlooc@2020-03-31-before-stretch-buster-upgrade-attempt4 (4/12)
Analyzing archive emanlooc@2020-03-31-before-stretch-buster-upgrade-attempt5 (5/12)
Analyzing archive emanlooc@2020-03-31-before-stretch-buster-upgrade-attempt6.checkpoint (6/12)
Analyzing archive emanlooc@2020-03-31-before-stretch-buster-upgrade-attempt-6.checkpoint (7/12)
Analyzing archive emanlooc@2020-03-31-before-stretch-buster-upgrade-attempt-6.checkpoint.1 (8/12)
Analyzing archive emanlooc@2020-03-31-before-stretch-buster-upgrade-attempt-6.checkpoint.2 (9/12)
Analyzing archive emanlooc@2020-03-31-before-stretch-buster-upgrade-attempt-6.checkpoint.3 (10/12)
Data integrity error: Segment entry checksum mismatch [segment 2710, offset 4293464]
Traceback (most recent call last):
  File "/root/borg-env/lib/python3.7/site-packages/borg/archiver.py", line 4529, in main
    exit_code = archiver.run(args)
  File "/root/borg-env/lib/python3.7/site-packages/borg/archiver.py", line 4461, in run
    return set_ec(func(args))
  File "/root/borg-env/lib/python3.7/site-packages/borg/archiver.py", line 166, in wrapper
    return method(self, args, repository=repository, **kwargs)
  File "/root/borg-env/lib/python3.7/site-packages/borg/archiver.py", line 335, in do_check
    verify_data=args.verify_data, save_space=args.save_space):
  File "/root/borg-env/lib/python3.7/site-packages/borg/archive.py", line 1233, in check
    self.rebuild_refcounts(archive=archive, first=first, last=last, sort_by=sort_by, glob=glob)
  File "/root/borg-env/lib/python3.7/site-packages/borg/archive.py", line 1620, in rebuild_refcounts
    for item in robust_iterator(archive):
  File "/root/borg-env/lib/python3.7/site-packages/borg/archive.py", line 1562, in robust_iterator
    for chunk_id, cdata in zip(items, repository.get_many(items)):
  File "/root/borg-env/lib/python3.7/site-packages/borg/remote.py", line 1083, in get_many
    for key, data in zip(keys, self.repository.get_many(keys)):
  File "/root/borg-env/lib/python3.7/site-packages/borg/repository.py", line 1109, in get_many
    yield self.get(id_)
  File "/root/borg-env/lib/python3.7/site-packages/borg/repository.py", line 1103, in get
    return self.io.read(segment, offset, id)
  File "/root/borg-env/lib/python3.7/site-packages/borg/repository.py", line 1470, in read
    size, tag, key, data = self._read(fd, self.put_header_fmt, header, segment, offset, (TAG_PUT, ), read_data)
  File "/root/borg-env/lib/python3.7/site-packages/borg/repository.py", line 1506, in _read
    segment, offset))
borg.helpers.IntegrityError: Data integrity error: Segment entry checksum mismatch [segment 2710, offset 4293464]

Platform: Linux 2esrever 4.19.0-8-amd64 #1 SMP Debian 4.19.98-1 (2020-01-26) x86_64
Linux: debian 10.3
Borg: 1.1.11  Python: CPython 3.7.3 msgpack: 0.5.6
PID: 24284  CWD: /root
sys.argv: ['borg-env/bin/borg', '-v', '-p', 'check', '--repair', '/root/borg-overlay']
SSH_ORIGINAL_COMMAND: None
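
The error message above names a segment number and a byte offset. As a minimal sketch, the helper below pulls both out of the message and guesses which file on disk holds the segment. The function name locate_segment is hypothetical, and the path layout (data/<segment // segments_per_dir>/<segment>) is an assumption about how borg 1.1 lays out a repository; check it against your own repo before relying on it.

```python
import os
import re

# Matches the bracketed part of the message shown in the log above.
# Other borg versions may word the message differently.
ERROR_RE = re.compile(r"\[segment (\d+), offset (\d+)\]")


def locate_segment(message, repo_path, segments_per_dir=1000):
    """Return (segment, offset, path) for an integrity error message."""
    match = ERROR_RE.search(message)
    if match is None:
        raise ValueError("no segment/offset found in message")
    segment, offset = int(match.group(1)), int(match.group(2))
    # Assumed layout: <repo>/data/<segment // segments_per_dir>/<segment>
    subdir = str(segment // segments_per_dir)
    path = os.path.join(repo_path, "data", subdir, str(segment))
    return segment, offset, path


msg = ("Data integrity error: Segment entry checksum mismatch "
       "[segment 2710, offset 4293464]")
segment, offset, path = locate_segment(msg, "/root/borg-overlay")
print(segment, offset, path)
```

With segments_per_dir set to 1000, segment 2710 would be expected in the data/2 subdirectory of the repository.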

All 8 comments

You could try to delete the checkpoint with the corrupt segment.

Just tried that, unfortunately it's the same error:

# borg-env/bin/borg delete /root/borg-overlay 'emanlooc@2020-03-31-before-stretch-buster-upgrade-attempt-6.checkpoint.3'
Data integrity error: Segment entry checksum mismatch [segment 2710, offset 4293464]
Traceback (most recent call last):
  File "/root/borg-env/lib/python3.7/site-packages/borg/archiver.py", line 4529, in main
    exit_code = archiver.run(args)
  File "/root/borg-env/lib/python3.7/site-packages/borg/archiver.py", line 4461, in run
    return set_ec(func(args))
  File "/root/borg-env/lib/python3.7/site-packages/borg/archiver.py", line 166, in wrapper
    return method(self, args, repository=repository, **kwargs)
  File "/root/borg-env/lib/python3.7/site-packages/borg/archiver.py", line 1265, in do_delete
    return self._delete_archives(args, repository)
  File "/root/borg-env/lib/python3.7/site-packages/borg/archiver.py", line 1309, in _delete_archives
    with Cache(repository, key, manifest, progress=args.progress, lock_wait=self.lock_wait) as cache:
  File "/root/borg-env/lib/python3.7/site-packages/borg/cache.py", line 380, in __new__
    return local()
  File "/root/borg-env/lib/python3.7/site-packages/borg/cache.py", line 374, in local
    lock_wait=lock_wait, cache_mode=cache_mode)
  File "/root/borg-env/lib/python3.7/site-packages/borg/cache.py", line 467, in __init__
    self.sync()
  File "/root/borg-env/lib/python3.7/site-packages/borg/cache.py", line 851, in sync
    self.chunks = create_master_idx(self.chunks)
  File "/root/borg-env/lib/python3.7/site-packages/borg/cache.py", line 805, in create_master_idx
    fetch_and_build_idx(archive_id, decrypted_repository, archive_chunk_idx)
  File "/root/borg-env/lib/python3.7/site-packages/borg/cache.py", line 710, in fetch_and_build_idx
    for item_id, (csize, data) in zip(archive.items, decrypted_repository.get_many(archive.items)):
  File "/root/borg-env/lib/python3.7/site-packages/borg/remote.py", line 1083, in get_many
    for key, data in zip(keys, self.repository.get_many(keys)):
  File "/root/borg-env/lib/python3.7/site-packages/borg/repository.py", line 1109, in get_many
    yield self.get(id_)
  File "/root/borg-env/lib/python3.7/site-packages/borg/repository.py", line 1103, in get
    return self.io.read(segment, offset, id)
  File "/root/borg-env/lib/python3.7/site-packages/borg/repository.py", line 1470, in read
    size, tag, key, data = self._read(fd, self.put_header_fmt, header, segment, offset, (TAG_PUT, ), read_data)
  File "/root/borg-env/lib/python3.7/site-packages/borg/repository.py", line 1506, in _read
    segment, offset))
borg.helpers.IntegrityError: Data integrity error: Segment entry checksum mismatch [segment 2710, offset 4293464]

Platform: Linux 2esrever 4.19.0-8-amd64 #1 SMP Debian 4.19.98-1 (2020-01-26) x86_64
Linux: debian 10.4 
Borg: 1.1.11  Python: CPython 3.7.3 msgpack: 0.5.6
PID: 29074  CWD: /root
sys.argv: ['borg-env/bin/borg', 'delete', '/root/borg-overlay', 'emanlooc@2020-03-31-before-stretch-buster-upgrade-attempt-6.checkpoint.3']
SSH_ORIGINAL_COMMAND: None

I was able to fix this by running the following in the Python REPL. (I actually did a lot more trial and error, but if I needed to do it again, this should be all that's needed.)

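from borg.logger import setup_logging
from borg.repository import LoggedIO

setup_logging()
# Arguments: repository path, max_segment_size, segments_per_dir
lo = LoggedIO("/root/borg-overlay", 524288000, 1000)
# Ask borg to recover the segment named in the error message
lo.recover_segment(2710, lo.segment_filename(2710))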

Where /root/borg-overlay is the repo, 524288000 is the max_segment_size from the config file, 1000 is the segments_per_dir config, and 2710 is the segment that it was complaining about.
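
Rather than copying the two numbers by hand, they could be read from the repository's config file. This is only a sketch: read_segment_settings is a hypothetical helper, and it assumes the config file is INI-style with a [repository] section holding max_segment_size and segments_per_dir, which is an assumption about borg 1.1 repositories rather than something stated in this thread.

```python
import configparser
import os


def read_segment_settings(repo_path):
    """Return (max_segment_size, segments_per_dir) from <repo_path>/config."""
    # Interpolation is disabled so that a literal '%' in any value is harmless.
    parser = configparser.ConfigParser(interpolation=None)
    with open(os.path.join(repo_path, "config"), encoding="utf-8") as fh:
        parser.read_file(fh)
    # Assumed section name; a KeyError here means the layout differs.
    section = parser["repository"]
    return section.getint("max_segment_size"), section.getint("segments_per_dir")


# Example (hypothetical usage with the repo from this thread):
# max_segment_size, segments_per_dir = read_segment_settings("/root/borg-overlay")
```

The two returned values correspond to the second and third arguments passed to LoggedIO above.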

borg check --archives-only /root/borg-overlay then reported no problems found.


I'm writing this in the hope that it helps others with the same problem, but I don't think it resolves the underlying issue. Shouldn't borg check --repair have run recover_segment?

Did you ever change the segments_per_dir value after already having some contents in the repo?

What's strange here:

  • the command line and output in your first log / traceback show that you ran a full borg check --repair (including repository and archives checks)
  • the repository check reads all objects stored in the segment files and checks the crc32 - if there is an issue, it should internally get an IntegrityError there and it should (in --repair mode) call recover_segment for the problematic segment, outputting "attempting to recover "
  • but you do not have that in the repository check part, only later in the archives check part
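
To illustrate the kind of per-entry checksum test described in the list above, here is a simplified sketch. It is not borg's code: the entry layout (a crc32 field, a size field, a one-byte tag, then the payload) and the helper names make_entry and entry_is_intact are assumptions made for the example.

```python
import struct
import zlib

# Assumed, simplified entry layout (little-endian):
# crc32 (4 bytes), total entry size (4 bytes), tag (1 byte), then payload.
HEADER = struct.Struct("<IIB")


def make_entry(tag, payload):
    """Build an entry whose crc32 covers everything after the crc field."""
    size = HEADER.size + len(payload)
    body = struct.pack("<IB", size, tag) + payload
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return struct.pack("<I", crc) + body


def entry_is_intact(entry):
    """Recompute the crc32 and compare it with the stored value."""
    if len(entry) < HEADER.size:
        return False
    crc, size, _tag = HEADER.unpack_from(entry)
    if size != len(entry):
        return False
    return (zlib.crc32(entry[4:]) & 0xFFFFFFFF) == crc


good = make_entry(0, b"chunk data")
bad = bytearray(good)
bad[-1] ^= 0x01  # flip one bit in the payload to simulate corruption
print(entry_is_intact(good), entry_is_intact(bytes(bad)))  # True False
```

A check of this shape reports a mismatch as soon as any bit of the stored entry differs from what was written, which is the condition the "Segment entry checksum mismatch" message refers to.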

Did you ever change the segments_per_dir value after already having some contents in the repo?

No, definitely didn't change that. I'm not sure why it was having such strange behaviour.

Just had this happen a few days ago with a separate borg repo with a very similar setup. The one thing I noticed in common in every case was that it was preceded by a full disk on the system running borg, causing borg to fail (as expected).

Can you reproduce it? If so, open a new ticket with reproduction steps, preferably a script working with relative, local paths.
