If I understand it correctly, Borg can eat quite a lot of resources on the client machine. As discussed in #1793, a Raspberry Pi 3 would not quite meet the system requirements for a Borg client. Mr Waldmann writes about the Pi 3:
So, using such a device as a borg server can work ok (basically just key/value storage + ssh, no crypto, no chunking, no hashing), but as a borg client it is a bit much (chunking, hashing, crypto + ssh).
That leads me to the question: Are there any recommended system requirements for a Borg client? (It would probably be interesting to know the requirements for a Borg server as well).
Well, of course it always depends on the amount of data (repo size and backup data set size) and your speed expectations.
The system running the borg client is usually the system you want to back up, so the minimum requirements are usually determined not by borg, but by whatever productive usage you have for that system.
TLDR: if you avoid heavily unbalanced systems, you will be fine.
I didn't compute the following values (just experience and gut feeling), so take them with a grain of salt:
An example of a heavily unbalanced setup would be a system (NAS, raspi, ...) with a low-power ARM or MIPS CPU and <= 1GB of RAM, operating a repo of more than 100GB. These systems often cannot be expanded with more RAM, so it's basically game over if you run out of memory. Also, they can be slow with compression, hashing and encryption. Some (e.g. raspi) are also quite limited in I/O, as everything goes over USB2 multiple times.
Such a system would still be balanced if you only want to back up a few GB of data into a small repository used just for that.
Balanced: "current" (not more than 5y old) celeron/pentium/i3 (or equivalent amd) system (x86/x64 arch), preferably with AES-NI. Usually you can easily have 4, 8, 16, 32GB of RAM with these and they can deal with terabytes of data.
If you really have a lot of data and your data is very important, consider Pentium or Xeon class CPUs with a good amount of ECC memory, AES-NI, and a fast disk or network.
The borg client has medium, sometimes high cpu requirements (depends on compression, encryption, amount of changed data and a lot of other factors). There needs to be enough RAM for the chunks index and files cache (and the repo index, if your repo is local and you do not use a borg server).
The borg server has lower cpu requirements (it is usually just doing ssh and a key/value store). There needs to be enough RAM for the repo index.
There is a formula in the docs to compute the (RAM) resource usage for the caches/indexes.
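A minimal sketch of that kind of estimate, for anyone who wants to plug in their own numbers. The per-entry constants (40, 44, 240 and 80 bytes) are what I recall from the Borg 1.x internals documentation and should be treated as assumptions; check the docs for the version you actually run. The function name and the example data set are made up for illustration. Note that the chunk count is only a rough guess: lots of small files produce more chunks than the simple division suggests, so the result is closer to a lower bound.

```python
# Sketch of the cache/index RAM estimate. All per-entry constants are
# assumptions recalled from the Borg 1.x internals docs, not measurements.

def estimate_borg_memory(total_file_size, total_file_count, hash_mask_bits=21):
    """Return estimated bytes for the repo index, chunks cache and files cache.

    hash_mask_bits=21 corresponds to a 2 MiB target chunk size.
    """
    # Rough chunk count: total data divided by the target chunk size.
    chunk_count = total_file_size // (2 ** hash_mask_bits)
    repo_index = chunk_count * 40          # assumed bytes per repo index entry
    chunks_cache = chunk_count * 44        # assumed bytes per chunks cache entry
    files_cache = total_file_count * 240 + chunk_count * 80
    return {
        "chunk_count": chunk_count,
        "repo_index": repo_index,
        "chunks_cache": chunks_cache,
        "files_cache": files_cache,
        "total": repo_index + chunks_cache + files_cache,
    }


if __name__ == "__main__":
    # Hypothetical data set: 300 GB spread over 200,000 files.
    estimate = estimate_borg_memory(300 * 10**9, 200_000)
    for name, value in estimate.items():
        print(f"{name}: {value:,}")
```

The server side only needs the repo index part; the client needs the chunks cache and files cache (plus the repo index if the repo is local).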
@ThomasWaldmann isn't borg pretty much limited to a single thread?
That would mean that server CPUs (e.g. Xeons), which tend to have a high core count but much lower clock speeds, will lose to something like an overclocked Pentium G3258 with only two cores.
@fsironman that currently holds true, but they are preparing to offload non-GIL CPU work to more cores
I didn't say "buy a lot of cores", but recommended ECC if the data is really important.
ECC is present in some Pentiums and Xeons, but not in Core i3/i5/i7.
Great information, thanks!
"so it's basically game over if you run out of memory."
With game over, does that mean that borg will simply stop in the middle of the backup process with an error like not enough memory? My idea was to back up a NAS (where I couldn't install borg) via a Raspberry Pi. About 300 GB. If you had to guess, would it work even though it would be very slow and take about the whole day to complete, or would it simply fail due to insufficient memory?
I guess the Raspberry Pi is simply too weak to manage backing up 300 GB. Then the question is, how much better hardware do I need? :) For instance, an Asus Tinker Board with a Rockchip Quad-Core RK3288 processor and 2GB Dual Channel DDR3, would you guess that would be enough? It is faster than the Raspberry, but it's still a machine with very limited resources. Maybe the only solution is to test it...
Borg's deduplication is working based on chunks (pieces of files) and it needs to manage all these chunks.
While the default target chunk size is 2MiB, chunks can also be (much) smaller if you have (much) smaller files. So it is hard to predict the chunk count without knowing your data.
The repo index size and the chunks cache size grow linearly with the number of chunks in the repo.
The files cache size is also influenced by that and additionally grows linearly with the (input) file count.
When borg create is running, the number of chunks usually increases (how much depends on how many new chunks it discovers). When the hashtables that implement the repo index and chunks cache reach a certain fill ratio, they need to be enlarged. That means: allocating a bigger one, rebuilding it with the data from the original one, then freeing the original one. This temporarily spikes memory usage to 2.x .. 3 times the original usage for that hashtable; after freeing, it settles at 1.x .. 2 times.
If you are already at the (virtual) memory limit, you will see a MemoryError exception at that point and borg will terminate/crash. Having a good amount of swap space will avoid that and can help to get through the resize spike, but it will not help when the operating size is much larger than physical memory.
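To make the resize spike a bit more concrete, here is a small sketch that applies the 2x to 3x range mentioned above to a given hashtable size and compares it against the memory you have. The function name, the three verdict strings and the idea of passing RAM plus swap as one number are assumptions for illustration; "2.x" is simplified to 2.0 as the lower bound.

```python
def resize_spike(table_bytes, available_bytes, low=2.0, high=3.0):
    """Estimate transient peak memory while one hashtable is enlarged.

    table_bytes:     current size of the hashtable (e.g. the repo index)
    available_bytes: memory the process can get (RAM plus swap), an assumption
    low, high:       spike factors taken from the 2.x .. 3 range in the thread
    """
    peak_low = int(table_bytes * low)
    peak_high = int(table_bytes * high)
    if peak_high <= available_bytes:
        verdict = "fits"
    elif peak_low <= available_bytes:
        verdict = "borderline"
    else:
        verdict = "likely MemoryError"
    return peak_low, peak_high, verdict


if __name__ == "__main__":
    MiB = 2 ** 20
    # Hypothetical case: a 450 MiB index on a machine with 1 GiB available.
    print(resize_spike(450 * MiB, 1024 * MiB))
```

This only looks at a single hashtable; on a client, the chunks cache and files cache come on top of it.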
You can get a rough impression of the repo index size by looking at <repo_dir>/index.NNN.
Practical example from one of my repos:
borg@server:~/repo$ du -sh
443G .
borg@server:~/repo$ ls -lh index*
-rw------- 1 borg borg 450M Okt 2 21:14 index.142662
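In the example above, the index is roughly 0.1% of the repo size (450M out of 443G). If you want the same comparison for your own repo without eyeballing du and ls, something like the following sketch should do. It assumes the index file sits directly in the repo directory and is named index.NNN as described above; the function name is made up. It sums apparent file sizes, so the numbers can differ slightly from what du reports.

```python
from pathlib import Path
import sys


def index_footprint(repo_dir):
    """Return (index_bytes, repo_bytes, ratio) for a local repo directory."""
    repo = Path(repo_dir)
    # Index files directly in the repo directory, named index.NNN.
    index_bytes = sum(
        p.stat().st_size for p in repo.glob("index.*") if p.is_file()
    )
    # Everything in the repo, including the index itself.
    repo_bytes = sum(
        p.stat().st_size for p in repo.rglob("*") if p.is_file()
    )
    ratio = index_bytes / repo_bytes if repo_bytes else 0.0
    return index_bytes, repo_bytes, ratio


if __name__ == "__main__" and len(sys.argv) > 1:
    idx, total, ratio = index_footprint(sys.argv[1])
    print(f"index: {idx:,} bytes, repo: {total:,} bytes, ratio: {ratio:.4%}")
```

The index size is only a rough indicator of the RAM needed for the repo index, not an exact figure.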
BTW, HP Microservers are relatively popular and usually some base model is also relatively cheap.
Currently e.g. the HP Microserver Gen8 starts at about 200 EUR in Germany (incl. 19% tax).
They have e.g.:
The newer Gen10 meanwhile starts at 220 EUR (w/ 8GB RAM) for the cheapest model.
Issues:
I'll close this issue. It would be great to include system requirements in the docs, but I understand that it depends heavily on what you are backing up and with which options.
Yeah, that's the problem. So the short answer would be "it depends", not very helpful.
Maybe some longer writeup with real-life configurations and measurements could give a useful impression.
I think the previous comment about operating size vs available memory is a good guideline to have, even though it's not a full system spec requirement. I found it helpful to figure out that for the stuff I back up, a Pi won't choke (even though it may be slow in throughput).