This is a fresh bucket and fresh upload to B2
restic version
restic 0.7.3
compiled with go1.9 on freebsd/amd64
restic -r $RESTIC_REPO backup -o b2.connections=10 /share_data/
enter password for repository:
scan [/share_data]
scanned 108338 directories, 270163 files in 0:12
[6:07] 0.01% 138.515 KiB/s 49.643 MiB / 467.293 GiB 55 / 378501 items 0 errors ETA 982:31:51
[26:27] 0.01% 32.031 KiB/s 49.643 MiB / 467.293 GiB 55 / 378501 items 0 errors ETA 4248:49:03
B2
Hey, what upstream bandwidth do you have available? What's your location (network-wise)? B2 is not the fastest service, the latency to the API (at least from Europe) is pretty high.
How did you run restic exactly?
Did you check the bandwidth used with external tools?
What throughput do you get using other programs like rclone?
anecdote: I back up two identical servers, one in New Jersey and one in Germany. The former takes about 20 minutes and the latter 90 minutes backing up to B2.
Hi,
Adding my feedback here. My server is in Amsterdam, upload speed is about 100Mb/s.
Restic version 0.7.3
I'm backing up my nextcloud data folder, about 65-70GB of various size unencrypted files
I created two repositories, one local and one on B2. The local one seems to process the data at ~5MB/s
restic backup -r ./resticbackup /srv/nextcloud
scanned 6320 directories, 47379 files in 0:00
[2:53] 1.12% 4.326 MiB/s 748.373 MiB / 65.470 GiB 619 / 53699 items 0 errors ETA 4:15:47
When I run the same initial backup to B2, after a short burst, the speed quickly goes down
I've waited a few minutes after the initial burst and the speed settled at around 1MB/s. This is confirmed by using the tool "nethogs" to monitor bandwidth by process: restic seems to oscillate between 500KB/s and 2000KB/s

I then used rclone with no special options to upload a similar folder structure as the backup repository and got around the same speeds

It oscillated mostly between 500 and 1500 KB/s during my tests
So imho the issue is not with restic itself but with B2.
Edit:
I've re-tried the backup with a much larger number of B2 connections (-o b2.connections=50) and the speed increased drastically.
I'm currently backing up at nearly 10MB/s (between 6.5MB/s and 10MB/s), so the limit seems to be per-connection to the API
That's interesting, thanks for reporting back!
I think I found my issue: I tried a different ISP and got very different results. I wonder, is there logic in restic to detect "dead" connections?
B2 is known to have limited bandwidth per connection, I've seen this same issue with other backup software as well. The solution is always to use multiple connections to upload. I'm glad I found this thread because I didn't know about the b2.connections option. Where are all the options documented?
I second the solution with b2.connections, from 0.5 to 15 MB/s is a nice relief :-)
Restic definitely has potential but not sure if it is ready for prime time yet especially with Backblaze.
When I back up to Backblaze B2 with duplicity it is blazingly fast.
--------------[ Backup Statistics ]--------------
StartTime 1511881648.02 (Tue Nov 28 10:07:28 2017)
EndTime 1511881650.63 (Tue Nov 28 10:07:30 2017)
ElapsedTime 2.60 (2.60 seconds)
SourceFiles 10915
SourceFileSize 7991047699 (7.44 GB)
Backing up to Azure using restic is fast but not quite as good as duplicity:
[0:00] 96 directories, 10819 files, 7.441 GiB
scanned 96 directories, 10819 files in 0:00
[0:08] 100.00% 0B/s 7.441 GiB / 7.441 GiB 10915 / 10915 items 0 errors ETA 0:00
duration: 0:08, 861.53MiB/s
And now for the slowest of the group: backing up to Backblaze B2 using restic:
scanned 4 directories, 2781 files in 0:00
[0:23] 100.00% 24.855 MiB/s 571.663 MiB / 571.663 MiB 2785 / 2785 items 0 errors ETA 0:00
duration: 0:23, 24.60MiB/s
I don't have enough storage on my account to back up the 7.4GB directory in the last test as in the first two examples, but either way you can see that using duplicity to back up to Backblaze is the fastest.
And to even get restic down to 23 seconds, I had to use the b2.connections option and specify 500 connections.
@CurtWarfield can you please retry with rclone and report back?
@CurtWarfield 2.6GiB/s to B2 seems kind of high. Can you confirm this is for a dataset that is not already partially backed up? It takes longer than 2.5s for me just to stat-walk a directory tree of about that size.
If duplicity really is getting that kind of performance I would like to run some tests.
No kurin, that speed with duplicity is on a re-sync with no new data.
I just ran a new 570MB directory backup to B2 using rclone as requested. The new backup took 10 minutes total which seems quicker than with restic.
Are the restic backups listed also resyncs?
yes kurin. Only takes duplicity 2.6 seconds for a re-sync and restic on Azure takes 8 seconds and restic on B2 takes 23 seconds.
I just checked the B2 re-sync with rclone. rclone is more in line with the Azure performance.
2017/11/28 16:58:20 INFO :
Transferred: 0 Bytes (0 Bytes/s)
Errors: 0
Checks: 2781
Transferred: 0
Elapsed time: 8.7s
Okay, gotcha. I'm not an expert on restic internals (I'm here for B2 muckery). I think there's performance to squeeze out of the actual data transfer but that wouldn't help for your examples.
However, your examples probably do highlight the most common use case, which is backing up a dataset that is 95%+ unchanged, with lots of small files. It would be nice if restic could quickly identify and skip these files.
We should probably split this into two bugs, one being specific to B2 bandwidth, the other being restic handling unchanged data (if that doesn't already exist).
It does skip the files; it just takes restic 12 times longer on B2 than duplicity does.
Unfortunately the initial backup is also slow, even when using the b2.connections option.
The B2 bandwidth is still too slow using restic, not just for handling unchanged data but for the initial backup too.
Do we have any data on how long it takes Duplicity to talk to B2? 570MB in 10min is ~8mbps, which is already in line with the speeds I see from restic to b2 today.
I'm going to run the same test for a new back up with duplicity and post results here.
Thanks!
The proof is in the pudding !!!!
So I backed up the same directory from my server to the SAME B2 bucket using restic and duplicity.
Restic actually took about 12 minutes to back up.
Duplicity took only 4 minutes to back up the same directory to the same B2 bucket.
Last full backup date: none
Reuse configured PASSPHRASE as SIGN_PASSPHRASE
No signatures found, switching to full backup.
--------------[ Backup Statistics ]--------------
StartTime 1511908655.75 (Tue Nov 28 17:37:35 2017)
EndTime 1511908907.31 (Tue Nov 28 17:41:47 2017)
ElapsedTime 251.57 (4 minutes 11.57 seconds)
SourceFiles 2785
SourceFileSize 599784317 (572 MB)
NewFiles 2785
NewFileSize 599784317 (572 MB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 2785
RawDeltaSize 599432061 (572 MB)
TotalDestinationSizeChange 297380160 (284 MB)
Errors 0
Running my backup script the 2nd time on the B2 bucket was again so much faster than restic. So it's not Backblaze that is slow. Out of all my cloud services I use, Backblaze is quicker than all of them. That includes AWS and Google Cloud.
Last full backup date: Tue Nov 28 17:37:34 2017
Reuse configured PASSPHRASE as SIGN_PASSPHRASE
--------------[ Backup Statistics ]--------------
StartTime 1511909417.28 (Tue Nov 28 17:50:17 2017)
EndTime 1511909417.76 (Tue Nov 28 17:50:17 2017)
ElapsedTime 0.48 (0.48 seconds)
SourceFiles 2785
SourceFileSize 599784317 (572 MB)
NewFiles 0
NewFileSize 0 (0 bytes)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 0
RawDeltaSize 0 (0 bytes)
TotalDestinationSizeChange 716 (716 bytes)
Errors 0
That's ~20Mbps, which is totally achievable today in restic with B2 (I just got 750Mbps from a GCE instance on a single large file).
I believe most of this is in restic's handling of directory trees. But that doesn't account for the difference between Azure and B2.
@CurtWarfield sorry for not having asked this earlier: Which version of restic did you use in your tests? What's the output of restic version?
The background is that with restic 0.8.0 which was released just a few days ago we've added a local cache for metadata. Before that (e.g. with restic 0.7.3) it would by default query the B2 API for each directory it encounters, at least for non-initial backups. So that's slow just because the B2 API has a high latency for getting small chunks of data.
Would you mind re-running your tests with restic 0.8.0 or the master branch?
You can find compiled binaries of the restic master branch here: https://beta.restic.net/?sort=time&order=desc
@fd0 pretty sure he is using 0.8.0. It will be interesting to see what it does when he uses the build from the master branch.
Hello,
Yes I am using 0.8.0 :-)
Hm, I've had at least one integration test on Travis fail due to a too slow B2 backend test, so I suppose something at B2 is/was wrong.
I would like to switch all my backups from duplicity to restic but B2 is too slow right now :-(
Just did some more testing. Duplicity and rclone are all about the same on AWS, Google Cloud, Azure, and B2. It's only B2 that has slow restic performance. So for now I need to stick with either duplicity or rclone for my B2 buckets. But for my AWS, Google Cloud, and Azure buckets I can convert to restic.
How many b2 connections do you have restic configured to use?
I've tried everything from 1 to 500
@CurtWarfield can you paste the invocation and output of restic for each service?
Oh wait I see you have one from Azure above.
Not sure if this is helpful, but here is an additional data point
restic 0.8.0 (v0.8.0-35-g9d0f13c4)
built from source with go 1.9.2 linux/arm on rpi3 using -p 4 option
./restic -o b2.connections=100 backup ~/bin
Starts off with saturated upload bandwidth ( ~2-3 MiB/s) and quickly falls to 100-200 KiB/s for remainder of upload.
scanned 6 directories, 68 files in 0:00
[2:19] 100.00% 137.554 KiB/s 18.672 MiB / 18.672 MiB 74 / 74 items 0 errors ETA 0:00
duration: 2:19, 0.13MiB/s
The bursty behavior there is caused by the fact that the b2 client lib buffers the output before sending it (because it has to send the sha1 hash as part of the request). To restic, this looks like a really fast connection and then a single Write() or Close() that blocks for a while.
In my case it runs at line speed for exactly 45 megs and then the speed drops. I get that this is buffering, but is this a limitation of restic?
I don't believe so but I'm not positive. I can say that all the tests I tried from my VPS (few large files, many small files, a medium amount of large files) were unable to produce a marked difference to B2 compared to GCS, either for fresh uploads or unchanged data. B2 is the slower one, but it is slower by a constant amount, usually 2-5s, because roundtrips to B2 (e.g. for grabbing a lock, or recording a snapshot) aren't as fast.
0.13MiB/s is just over 1Mbps, which isn't great but is believable for a home connection, and this number should be fairly accurate at the end of the upload.
There could also be connection issues that restic is automatically recovering from. If you can produce a debug log (remember to scrub PII) I could take a look at that.
Can you clarify how this debug log is supposed to be created?
Yes, sorry. If you are able to build from source, check out the "v0.8.0" tag and then follow the instructions in https://github.com/restic/restic/blob/master/CONTRIBUTING.md for building a debug binary.
Be aware that the logfile will have sensitive information including auth tokens. Run sed -i -e '/Authorization:.*/d' /tmp/restic-debug.log to remove auth info, and sed -i -e 's/\(GET\|POST\) [^ ]\+/\1 <redacted>/' -e '/X-Bz-File-Name/d' restic-debug.log to remove filenames, if those are sensitive.
Kurin,
I've attached debug logs showing the slow B2 performance.
Thanks Curt. @fd0 should take a look at this as well in case there's something going on within restic. I do see a lot of B2 roundtrips at a glance; it looks like there are many data chunks it's sending.
One thing I noticed is 282 requests for "Range: 0-0", which I suspect is effectively a Stat call. I could well believe that a couple hundred of these are responsible for pushing a backup from seconds to minutes.
I can't see anything out of the ordinary, sorry. Did you try --limit-upload yet? We've just recently merged a change that improves limiting the bandwidth a lot... Maybe there's some bufferbloat somewhere, so some HTTP requests are delayed...
Can you run the bufferbloat test here? http://www.dslreports.com/speedtest
fd0,
Can you specify how to use the --limit-upload option?
Uhm, like the help says:
--limit-upload int limits uploads to a maximum rate in KiB/s. (default: unlimited)
For the DSL connection at home I have 40Mbit upstream bandwidth (so ~5MiB/s), so I could specify --limit-upload 4500 to limit the upstream bandwidth to ~4.4MiB/s and not saturate the upstream completely.
I tried different combinations of --limit-upload and it actually makes it worse !
I wonder if restic is triggering some sort of ISP traffic shaping. I got annoyed enough and switched from a DSL provider to cable and no longer have any issues with upload.
I would doubt there is any sort of ISP traffic shaping because it's slow at work (enterprise grade internet) and at home (300/20 TWC). No difference between either location. Plus remember duplicity is blazing fast for Backblaze (B2)
Are there other things that can be done to determine why there is such a performance discrepancy?
Maybe someone can take a look at the duplicity code to see what calls they are doing to have such fast performance with B2
@curtwarfield Since neither I nor @kurin can reproduce what you're seeing, is it maybe possible to get temporary access to a Linux VM or container on your network so I can try to reproduce it there? Otherwise I don't have any idea on how to debug this further...
Unfortunately I don't have a way to make that happen from my location. So from your location, syncing to B2 performs normally?
Yes, I can easily saturate my upstream bandwidth (Germany) to B2 (in the US).
I second this. Location: Poland, Orange, FTTH.
@curtwarfield Is there any way @fd0 can get temporary access for reproducing the problem? Not sure how else we could progress debugging this.
Just sign up to Backblaze; you get 10GB of free storage. I'm not sure if they charge you for API calls, but either way it's much cheaper than AWS.
I'm afraid it's a B2 issue. I've tried the same with their official client and the speed was pathetic. Outside the US it's useless.
er1z, I respectfully disagree because when I use duplicity with GPG encryption for B2 it's blazing fast. If it was a B2 issue, it would be slow with duplicity as well. Don't get me wrong. I think restic is a great solution for other cloud back-ends, just not ready for B2 support yet.
@er1z I also disagree. It may be an issue with B2, it may also be a corner case not correctly handled somewhere (B2 client library, Go standard library, Linux kernel, ...) which triggers this behavior, but we don't know that yet.
Let me summarise this issue:
So, we need to find out what duplicity does differently from restic and which influencing factors (region, network, ...) there are. @kurin and I tried to do that, but without having access to such a networking situation and being able to debug a bit, we won't get any further information.
Does anybody have any idea on how to debug this further? Otherwise, I'm inclined to close this issue for now. Thoughts?
I'm in the UK, and for me B2 is quicker on the initial upload of a snapshot (the first one) of 230GB, but if I Ctrl+C the upload and then start it again, progress is really slow once it gets to the upload stage.
I enabled debugging and used DEBUG_FILES=b2.go, but the debug output just stops and the process appears to hang (or it's just going very slowly). I'm on a very memory-limited system (512MB), so maybe it's that.
I tried using DEBUG_LOG=/tmp/restic-debug.log but that swamped my disk IO and everything slowed down so much the system crashed. It's a ReadyNAS R102, so not the best spec box.
I'm using:
$ restic version
debug enabled
restic 0.8.1 (2debb5c)
compiled with go1.9.2 on linux/amd64
Other than b2.go what other files could I get relevant debug info from?
That said, I have a fully updated Ubuntu 16.04 box on the same network and that runs just fine... but it's only uploading 8GB of data, so it's not really doing much.
The times I've had too little memory I've gotten a crash instead of just a hang, so I don't know if this is relevant to you, but you can make restic GC more aggressively by setting the environment variable GOGC to e.g. 20 when you run restic. That's what I use in a memory-constrained VM where I run restic.
After all that was said, today's fresh backup to B2 is going nicely. I deleted the old bucket and started fresh. Even with debug logging going to a tmp file, things are moving nicely. I update my restic binary from master via a Jenkins build, so whenever you update the code, my next backup will use that version. Could the "Relax backend timeout test" change have helped?
Over the last 18hrs, I've uploaded 65GB, that's about 8Mbps. So far I've spent $0.24 on Class B Transactions and $0.10 on storage.
In my debug logs I saw a lot of "404 Not Found" errors, followed by an upload of that exact same object. It is the same in the log posted earlier in this thread.
Surely that can't be right?
That's the way we're using the B2 client library right now:
https://github.com/restic/restic/blob/4eb9df63cf416c9654224cffc841a8e16a179915/internal/backend/b2/b2.go#L199-L205
I'm not sure if this is the most efficient way. We're also making sure that the file does not already exist before writing to it.
Maybe @kurin can have a look where the class B transactions come from...
24c on class B transactions is ~600k transactions, which seems high. I know that restic chunks data, but if you assume ~5MB/chunk there should only be ~13k chunks in a 65GB data set. The only class B transactions are "stat"-like functions (either get_file_info or download_file, which we use instead of get_file_info for good reasons that escape me atm). So issuing 600k of them during an upload session does seem like a lot. @fd0, does restic do any background maintenance while uploading chunks? Juggling locks etc?
During upload, there's not much going on. A bit of lock juggling, but that's like one file every five minutes or so, should be negligible. For each file that's to be uploaded, Attrs() is called, followed by writing the file itself. Hm. Shouldn't be that many transactions...
I may have found a redundant call to one of the class B APIs. I'll try to confirm and get a fix out tonight.
The restic nightlies have some fixes in them that may help with this issue: https://beta.restic.net/restic-v0.8.2-19-g9f060576/
I'd be interested to see what difference this makes, if any.
For the record, the PRs are: #1634 #1623
I think this issue is resolved with 0.8.2 or later, so I'm going to close it for now. Please leave a comment should you still experience very slow uploads with B2 with 0.8.2 or later. Thanks!
I can confirm that 0.8.2 is quickest for me. The upload speed varies with B2, just the same now as it does with S3.
I'm currently uploading 0.01% every 18 seconds, of 224GB in total.
It's taken 21hrs to upload 62%, which is about 138GB.
Hey now !!! I just tested 0.8.3 and it's finally working as expected !!!! Great job to the developers.
I'm still not sure why people didn't have issues with 0.8.2 but that version was definitely slow for B2. But I am pleasantly pleased with 0.8.3. I just backed up 600MB consisting of 2800 files in only 3 minutes 20 seconds !!!!! Again kudos and GREAT JOB to all those responsible. You gotta winner !
Awesome, thanks for the feedback :)
I've been using restic for about two weeks, backing up to B2 (restic v0.8.3). I have about 300GB of data in about 41,000 files stored on a Raspberry Pi 3 B+ which has a 2TB Western Digital USB disk attached. My internet connection is LTE, with upload speeds of about 20-40 Mbit/s. The first 240GB went fine in under a day. As I'm currently rebuilding my whole network and servers, I had to stop the backup at about 240GB. Since then I've tried to restart it a couple of times.

The first 240GB take about 3-4 hours, during which no uploads are done; I think it only checks the files and whether they exist in the B2 bucket. That's OK for a Raspberry Pi, I think, as I guess this is all about calculating hashes, which is probably pretty slow on a Pi. As soon as it gets to fresh files where uploads are done, it's really slow: about 15 hours for 1GB (the first 240GB took nearly the same time).

I checked which files restic has open using lsof. There were a couple of small 3-5MB files, and one large 2.5GB file. restic worked through those small files over the last three days, and the 2.5GB file is also done. Now it's working on only large files, each about 3-6GB in size, at the same "speed".
This sounds like @richard-scott commented: https://github.com/restic/restic/issues/1383#issuecomment-365862475 - his backup also was slow when it reached uploads. I guess I could solve it by deleting all snapshots of that server, and starting over like richard-scott did. But as long as my backup is in that slow behaviour, maybe I can give some debugging information?
Could a dstat 20 be helpful?

The first 240GB take about 3-4 hours, during which no uploads are done; I think it only checks the files and whether they exist in the B2 bucket.
Roughly, yes.
There were a couple of small 3-5MB files, and one large 2.5GB file. restic worked through those small files over the last three days, and the 2.5GB file is also done. Now it's working on only large files, each about 3-6GB in size, at the same "speed".
This may be caused by several different things:
You can test the following:
iostat. Does it change over time, when restic reaches the point where it reads new files?

Richard's observations may be related, but that was with a much older version of restic (0.8.1); we've improved the handling of B2 a lot since then.
If this still is an issue, please open a new issue so we can discuss it.
@netdesk Have you tried setting this:
--option b2.connections=50
I think the default is something like 5 parallel connections.
I tried the latest dev version as suggested by @fd0. It was a bit faster (and I like the new output details), and the rest of the 300GB is done now after one day, so I can't really reproduce it. But before (regular 0.8.3) lsof showed ten open files, while with the dev version it opened only 2-3 files at the same time. I don't know how many connections to B2 it opened. I did log iostat and vnstat every minute during the backup with the dev version. Gonna check the files and report in a new issue if I find something.
Just now I tried a restic check, which failed because it said a lock was in place which was created five days ago. That was about the day and time when the first try of the 300GB backup was interrupted. I can't say whether that lock could slow down the backup process; it's just something I found. As it's probably hard to reproduce now, digging in the dark isn't worth the time, I guess. Like @fd0 said, I'll open another issue if it occurs again.
Hi,
I'm using restic to back up to b2, and it's painfully slow. A backup of 300GB that I started about 24h ago isn't even finished yet. I am using -o b2.connections=200 but I can see at most 4 open connections, and they are idle most of the time.
I tried hacking a bit in the code and found this. So I have done a few benchmarks.
I did these benchmarks under the following conditions:
I tweaked two of the variables in the code linked above and got these results:
| FileRead | SaveBlob | SaveTree | Total time |
|---|---|---|---|
| 2 | 16 | 1620 | 7min06s (default parameters) |
| 16 | 16 | 1620 | 2min27s |
| 16 | 64 | 64*20 | 2min33s |
It seems that 2 isn't that good a value for this FileRead parameter. If the best value really depends on the environment, maybe it should be made configurable from the command line. Or maybe it is and I missed it; please tell me.
I did some tests on an SSD too, but under different conditions, and saw a similar change: changing it from 2 to 8 reduced the backup time from 52 minutes to 19 minutes.
Finally, I am posting this here because I happen to be using b2 and everybody complaining about speed is using b2 as well, so I thought this was related. If someone can run a similar benchmark on another storage service, we could find out whether this setting really is suboptimal for everyone and not just for b2 customers.
Please tell me if my reasoning is wrong, or if you have any suggestions to make my backups faster.
EDIT: added actual bucket size after the backup
EDIT2: added note about --no-cache
EDIT3: clarified that each run is done on a new repository
Does backing up to a local backup repository change anything regarding the time required for backup?
@blastrock PL here and changing DC to EU increases speed dramatically. US speeds to my systems were ~30 KiB/s (on 20 Mbps uplink).
So the target DC is the most important matter.
Does backing up to a local backup repository change anything regarding the time required for backup?
A local backup of the same folder from the HDD to that same HDD, with default settings, takes 27s. With FileReadConcurrency set to 16, it takes 19s. The difference is smaller here, and this test also proves that the time to read the files is negligible compared to the time of the whole backup to b2.
@blastrock PL here and changing DC to EU increases speed dramatically. US speeds to my systems were ~30 KiB/s (on 20 Mbps uplink).
Actually, the backup I talk about in my first paragraph was being done in the EU region. I got confused with the accounts and did the benchmark on US though. I can try to run that same benchmark on the EU region, just to have comparable numbers.
The difference is smaller here, and this test also proves that the time to read the files is negligible compared to the time of the whole backup to b2.
What I don't understand in that regard is that for SaveBlob=16 the number of FileReaders makes a large difference. The actual upload to B2 happens in the SaveBlob threads, so if the B2 upload is the bottleneck, I would expect the number of FileReaders not to make such a large difference.
In what regard does --no-cache help with ensuring comparable runs? If you start with a new repository then there's no cache for that repository and if the repository already contains the backup data then you'd just measure how fast restic can scan the backup data set, in which case the upload to B2 would no longer be the bottleneck. Did you flush the filesystem cache between the backup variants or did you run a warmup backup run to prime the filesystem cache?
What I don't understand in that regard is that for SaveBlob=16 the number of FileReaders makes a large difference. The actual upload to B2 happens in the SaveBlob threads, so if the B2 upload is the bottleneck, I would expect the number of FileReaders not to make such a large difference.
I agree, but I don't really know how restic works internally. With SaveBlob=16 and b2.connections=200, I can only see 4-5 connections from restic, and they are mostly idle. So my guess was that the "saving" part was not receiving enough data, and that somehow the "reading" part was waiting for the "saving" part to finish its work instead of just sending more.
If you start with a new repository then there's no cache for that repository
Oh, my bad, I thought some part of the cache was shared between repositories, like file modification timestamps. I did start with a new repository for each test. I'll edit my post to clarify that.
Did you flush the filesystem cache between the backup variants or did you run a warmup backup run to prime the filesystem cache?
I haven't thought about it at all actually ^^
I have just re-run the same test with FileRead=2 twice (to make sure the cache is warmed up) and got 6min12s and 6min00s this time. Not quite the same timings as before, but still bad. I have rerun the test just after that with FileRead=16 and got 2min00s.
I think this is in fact a performance problem in restic. Each file_saver splits a file into chunks and passes them to the blob_saver. A file_saver only continues with the next file after all chunks have been uploaded. Storing a blob can either be fast, if the pack is not yet completed, or slow, if the pack gets finalized and uploaded. In other words, the file_saver and the blob_saver are not completely decoupled. For small files the effective upload parallelism is limited to FileRead.
In my opinion this warrants its own issue; could you please open a new issue and add your observations to it? Could you also try how long it takes to upload a 1GB file containing random data (e.g. dd if=/dev/urandom of=test bs=1M count=1024)? I'd expect that to end up in the 2min range.