this summarizes the ideas that came up regarding this:
PR #3918, #3960, #3961 seem to speed up processing a bit
if we would put xattrs / acls / bsdflags into the files cache (we already have the chunks list there), we would not need to read them from the filesystem IF we can reliably detect changes. is comparing ctime and inode_number enough for this? mtime and size are obviously not helpful with that.
if we put that stuff into the files cache, we might produce a resource usage issue for some systems that have a lot of xattrs/acls, but not a lot of memory. so there should be an option to not put it into the files cache and fall back to reading it from disk in that case.
there could be another option not to read and back up xattrs / acls at all. we already have that for bsdflags. if there are technical issues (see #3952) or you're sure there are no acls/xattrs or you don't need them for restore (but you have a megaton of small files), this likely is useful even if there is faster xattr/acl code.
keep in mind that xattrs might be big, see #1681 - we shouldn't store big stuff into files cache.
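To make the caching idea above concrete, here is a minimal sketch, not borg's actual files cache. The names `XattrCache`, `Entry`, `read_xattrs` and the 4096-byte cap are made up for illustration. It treats a file as unchanged when inode number and ctime match, which is exactly the check whose sufficiency is questioned above, so take it as an illustration of the proposal rather than a proven design. Oversized xattrs are never cached, and `cache_xattrs=False` models the option to always fall back to reading from disk.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Assumed cap for this sketch; the thread only says big xattrs should stay out of the cache.
MAX_CACHED_XATTR_BYTES = 4096


def read_xattrs(path: str) -> Dict[str, bytes]:
    """Read all xattrs of path from the filesystem (these os calls are Linux-only)."""
    return {name: os.getxattr(path, name, follow_symlinks=False)
            for name in os.listxattr(path, follow_symlinks=False)}


@dataclass
class Entry:
    inode: int
    ctime_ns: int
    xattrs: Optional[Dict[str, bytes]]  # None: file is known, but its xattrs are not cached


class XattrCache:
    """Hypothetical cache: skip xattr reads when inode and ctime are unchanged."""

    def __init__(self, reader: Callable[[str], Dict[str, bytes]] = read_xattrs,
                 cache_xattrs: bool = True):
        self.reader = reader
        self.cache_xattrs = cache_xattrs  # False: always fall back to reading from disk
        self.entries: Dict[str, Entry] = {}

    def get(self, path: str) -> Dict[str, bytes]:
        st = os.stat(path, follow_symlinks=False)
        entry = self.entries.get(path)
        unchanged = (entry is not None
                     and entry.inode == st.st_ino
                     and entry.ctime_ns == st.st_ctime_ns)
        if unchanged and entry.xattrs is not None:
            return entry.xattrs  # cache hit: no xattr syscalls needed
        xattrs = self.reader(path)
        # Rough size estimate (name lengths plus value lengths).
        size = sum(len(name) + len(value) for name, value in xattrs.items())
        keep = self.cache_xattrs and size <= MAX_CACHED_XATTR_BYTES
        self.entries[path] = Entry(st.st_ino, st.st_ctime_ns, xattrs if keep else None)
        return xattrs
```

The reader is injectable so the change detection can be exercised without real xattrs; acls and bsdflags could be handled the same way with their own readers.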
A minor speedup was achieved by #3918, #3960, #3961 (about 5% for total time, for a 2nd backup of lots of small files (/usr/share) using a cold cache).
Thanks for the improvements!
Are these and following related commits (syscall reduction) too intrusive for backporting to 1.1 branch?
@FabioPedretti The overall speed improvement isn't big, but the change is rather big, so I guess it is better not to backport, but test it a while in 1.2 alpha, beta, rc testing before it enters a production version.
How big the bsdflags overhead is has to be re-evaluated after the PR #4043 is merged (we open the file there rather early anyway to work with the FD).
Try --nobsdflags and see if there is any significant difference now. There is no --noacls or --noxattrs (yet?), but commenting the stuff out in the code should be relatively easy (search for stat_attrs).
I do not see a big difference with --nobsdflags. Test on /home with little changes in between:
Without --nobsdflags:
Duration: 9 minutes 13.67 seconds
Number of files: 1024125
RemoteRepository: 2.04 MB bytes sent, 6.20 kB bytes received, 85 messages sent
With --nobsdflags:
Duration: 8 minutes 7.70 seconds
Number of files: 1024129
RemoteRepository: 1.68 MB bytes sent, 6.33 kB bytes received, 67 messages sent
borg 1.1.9
Ubuntu 14.04 Virtual machine
Quite outdated AMD server CPU; I'll leave out the hardware specs.
It's still a bit unclear to me why borgbackup needs 9 minutes in my situation. For comparison:
cd /home; time ls -lR > /dev/null
real 0m16.602s
user 0m5.781s
sys 0m10.616s
Well, in your case that was already more than 10% (with borg 1.1.9) which is significant for a minor feature.
The code in master is already quite different from 1.1-maint though, which is why I did some benchmarks last night to see the effect of --nobsdflags in old and new code.
I used kernel source trees (maybe unclean with some object files also) on a fast ssd as input data, the repo was on a ramdisk. I did 2 benchmark runs for each item, this is why there are 2 measurements for each.
master 2019/4 normal:
cold cache, first backup: 3:32.43 3:22.26
cold cache, second backup: 0:58.75 0:59.37
hot cache, first backup: 2:01.97 2:01.92
hot cache, second backup: 0:52.60 0:53.03
master 2019/4 --nobsdflags:
cold cache, first backup: 3:15.51 3:12.80
cold cache, second backup: 0:56.56 0:57.08
hot cache, first backup: 2:00.14 2:03.57
hot cache, second backup: 0:51.34 0:51.27
1.1-maint 2019/4 normal:
cold cache, first backup: 3:21.41 3:15.48
cold cache, second backup: 0:50.79 0:51.48
hot cache, first backup: 2:15.50 2:01.91
hot cache, second backup: 0:45.79 0:46.21
1.1-maint 2019/4 --nobsdflags:
cold cache, first backup: 3:11.88 3:20.33
cold cache, second backup: 0:47.47 0:48.87
hot cache, first backup: 1:54.98 1:56.05
hot cache, second backup: 0:42.05 0:41.90
So, there is still a little slowdown by archiving bsdflags, even with the FD based approach.
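To put a number on "a little slowdown", the wall clock values above can be averaged per row and compared. The following is only a helper sketch (the function names `wall_seconds` and `overhead_percent` are made up here); it parses the `m:ss.ss` format printed by `/usr/bin/time -v` and reports how much slower the normal runs are relative to the --nobsdflags runs.

```python
from statistics import mean


def wall_seconds(text: str) -> float:
    """Convert 'm:ss.ss' or 'h:mm:ss' wall clock text to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def overhead_percent(normal, variant) -> float:
    """Slowdown of the normal runs relative to the variant runs, in percent."""
    n = mean(wall_seconds(t) for t in normal)
    v = mean(wall_seconds(t) for t in variant)
    return (n - v) / v * 100


# master 2019/4, cold cache, second backup: normal vs. --nobsdflags
print(f"{overhead_percent(['0:58.75', '0:59.37'], ['0:56.56', '0:57.08']):.1f}%")
```

For the master "cold cache, second backup" rows this gives roughly 3.9%; the same call on the 1.1-maint "hot cache, second backup" rows gives roughly 9.6%. With only two runs per row, these figures are indicative at best.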
On SSD, with a dataset like above, there isn't much reason to use --nobsdflags for performance reasons, but it could be used to avoid troubles with bsdflags in some circumstances.
@cruftex : with 1.1, when backing up big NFS mounts, I experienced a much bigger slowdown with bsdflags, see #3239.
@ThomasWaldmann : is there a reason for the master slowdown vs 1.1? I would have expected master to be faster given new FD approach and lack of compaction.
@FabioPedretti It might be that the little gain due to the FD-based approach is eaten up by doing more at another place. It is likely very little gain due to caching effects.
The main advantage of the FD-based approach is fewer race conditions, not performance.
It would be good to isolate the reason why master is slower, and see if the effect of the relevant commit(s) can be reduced. New issue?
@jdchristensen yeah, new ticket please. can't promise i'll work on it though.
same benchmark, now for "no acls" and "no xattrs, no acls":
master 2019/4 no acls no bsdflags:
cold cache, first backup: 3:30.52
cold cache, second backup: 0:56.31
hot cache, first backup: 1:59.86
hot cache, second backup: 0:49.49
master 2019/4 no xattrs no acls no bsdflags:
cold cache, first backup: 3:15.08
cold cache, second backup: 0:52.63
hot cache, first backup: 1:58.79
hot cache, second backup: 0:44.91
1.1-maint 2019/4 no acls no bsdflags:
cold cache, first backup: 3:22.98
cold cache, second backup: 0:45.47
hot cache, first backup: 1:59.02
hot cache, second backup: 0:40.88
1.1-maint 2019/4 no xattrs no acls no bsdflags:
cold cache, first backup: 3:09.40
cold cache, second backup: 0:38.48
hot cache, first backup: 1:45.77
hot cache, second backup: 0:33.91
somehow all the "first backup" (esp. cold cache) values seem to vary quite a lot.
bench.sh:
export BORG_REPO=/run/user/1000/repo
SRC=/mnt/fast/kernel
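# drop the page cache; a function (unlike an alias) also works in a non-interactive script
drop ()
{
    sudo bash -c 'echo 3 > /proc/sys/vm/drop_caches'
}
# run one borg create and print only the elapsed wall clock time
bench_create ()
{
    /usr/bin/time -v borg create --nobsdflags "$@" 2>&1 | grep wall.clock
}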
borg init -e none
drop
echo cold cache, first backup:
bench_create ::1 $SRC
drop
echo cold cache, second backup:
bench_create ::2 $SRC
borg delete ::1
borg delete ::2
borg compact $BORG_REPO # only valid / needed for master branch
borg create ::0 $SRC 2> /dev/null
borg delete ::0
borg compact $BORG_REPO # only valid / needed for master branch
echo hot cache, first backup:
bench_create ::1 $SRC
echo hot cache, second backup:
bench_create ::2 $SRC
borg delete
@jdchristensen #4498
@ThomasWaldmann Thanks, I was going to create a ticket, but didn't have the time yet. Glad to see you are working on it!
I used a slightly modified script, it's here: https://gist.github.com/FabioPedretti/350e18f79134a58eacdcb0793bb36735
Output with 1.1.9 using the content of linux-5.1-rc4.tar.gz with my use case: share on NFS, repo on SSHFS:
default cold cache, first backup: 7:27.12
default cold cache, second backup: 3:28.67
default hot cache, first backup: 7:25.16
default hot cache, second backup: 2:50.20
nobsdflags cold cache, first backup: 5:58.58
nobsdflags cold cache, second backup: 1:13.94
nobsdflags hot cache, first backup: 6:17.71
nobsdflags hot cache, second backup: 2:05.55
For some reason (likely because other systems, which are out of my control, are using the NFS share as well as the SSHFS, making the test unreliable) nobsdflags hot cache is slower than cold cache.
I'll try with 1.2 as soon as a binary is released.
Comparison done here. In 1.2.0a6 default is about as fast as nobsdflags, but they are slower than 1.1.9.
TODO: add --noacls and --noxattrs for admins who have tons of small files and are sure they don't need to back up acls or xattrs.
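As a rough illustration of the TODO, here is one way such switches could be wired up. This is a standalone sketch, not borg's argument parser: --nobsdflags already exists in borg, while --noacls and --noxattrs are only the proposal from this thread, and `build_parser` / `metadata_to_collect` are hypothetical names.

```python
import argparse

KINDS = ("bsdflags", "acls", "xattrs")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="create-sketch")
    # --nobsdflags exists in borg; --noacls / --noxattrs are the options proposed in the TODO.
    for kind in KINDS:
        parser.add_argument(f"--no{kind}", action="store_true",
                            help=f"do not read and store {kind}")
    return parser


def metadata_to_collect(args: argparse.Namespace) -> list:
    """Return the metadata kinds that still have to be read for each file."""
    return [kind for kind in KINDS if not getattr(args, f"no{kind}")]


args = build_parser().parse_args(["--noacls", "--noxattrs"])
print(metadata_to_collect(args))  # ['bsdflags']
```

The per-file code would then only call the readers for the kinds returned by `metadata_to_collect`, so a skipped kind costs no syscalls at all.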