Azure-storage-azcopy: Recommendations for Nightly/Weekly Backups

Created on 11 Mar 2019 · 5 Comments · Source: Azure/azure-storage-azcopy

Hello,

I don't have a bug or issue to report with the tool. Instead, I would like your recommendation on setting up storage account backup automation on a nightly and/or weekly basis using azcopy. Please let me know if there is a better place to ask this question.

Scenario:
I'm setting up Azure Hybrid Automation workers so we can back up our storage accounts using azcopy. I'm still collecting numbers, but I think our average storage account holds less than 50GB of data. I want to back up those storage accounts on a nightly basis because they're small. I also have one storage account with nearly 2TB of data. I'd like to back up that storage account on a weekly basis due to its size. Let's assume I have about 50 storage accounts.

Question:
Can you recommend how many CPU cores and how much memory each worker VM should have to run azcopy well? In addition, given that I have roughly 50 storage accounts, how many worker VMs should I have so that I don't have too many concurrent backups running on one machine? Is it better to have one VM with powerful hardware or multiple smaller VMs with lower specs?

I'm sorry if the question is vague. I'm having a hard time figuring out how many virtual machines I need, and what size they should be, to run azcopy optimally for the number of storage accounts I have.

I would appreciate your input.

Thanks!

All 5 comments

Hi @AMoghrabi,

Since you asked the question here, I assume you're looking at AzCopy v10 (which is totally different from V8 and earlier).

V10 lets you run account-to-account copies for Blob, so the data doesn't even transit through the machine that's running AzCopy. AzCopy just organizes the work. Therefore it probably won't matter much what size of machine you use. At a guess, I'd suggest you test moving one account using a small-to-medium VM - e.g. 4 CPU cores. Make sure you're using the blob-to-blob copy syntax (so the data doesn't go through the VM) and you might find that's enough. If that's fast (e.g. just a few minutes per account), then you could probably just do your whole nightly backup with that one VM - moving just one account at a time. That's possibly the simplest approach.
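As a rough illustration of the blob-to-blob syntax mentioned above, here is a minimal Python sketch that builds the argument list for a service-to-service copy. The account names, container name, and the use of SAS tokens for authentication are assumptions made for the example, and build_copy_command is a hypothetical helper, not part of AzCopy. Only the azcopy copy command and its --recursive flag come from AzCopy v10 itself.

```python
from typing import List


def build_copy_command(src_account: str, dst_account: str, container: str,
                       src_sas: str, dst_sas: str) -> List[str]:
    """Build an argument list for an AzCopy v10 blob-to-blob copy.

    Both source and destination are blob URLs, so the data is copied
    between the storage accounts rather than through the machine
    running AzCopy. SAS-based auth is an assumption of this sketch.
    """
    # Accept SAS tokens with or without a leading "?".
    src = (f"https://{src_account}.blob.core.windows.net/"
           f"{container}?{src_sas.lstrip('?')}")
    dst = (f"https://{dst_account}.blob.core.windows.net/"
           f"{container}?{dst_sas.lstrip('?')}")
    return ["azcopy", "copy", src, dst, "--recursive"]
```

The returned list could be passed to subprocess.run(cmd, check=True) on the worker VM. Passing a list rather than a single shell string avoids quoting problems with the "&" characters in SAS tokens.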

If you want to move several accounts at a time then again, because the data won't transit the VM, you could run several instances of AzCopy on the one machine. I don't really know what sizings to suggest there. At a wild guess, maybe 2 to 4 cores per instance of AzCopy.
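If several accounts are moved at once, it may help to cap how many AzCopy instances run at the same time. The sketch below shows one way to do that in Python with a thread pool. The function names, the default limit of 2, and the shape of the jobs dictionary are all assumptions for illustration, not a recommendation from the AzCopy team.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Sequence


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_backups(jobs: Dict[str, List[str]], max_parallel: int = 2,
                runner: Callable[[Sequence[str]], int] = run_command
                ) -> Dict[str, int]:
    """Run each job's command, at most max_parallel at a time.

    jobs maps a (hypothetical) storage account name to the command
    that backs it up. Returns the exit code for each account.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # The pool queues extra jobs until a worker thread is free.
        futures = {name: pool.submit(runner, cmd)
                   for name, cmd in jobs.items()}
        return {name: fut.result() for name, fut in futures.items()}
```

Threads are enough here because each worker only waits on an external process. The runner parameter exists so the scheduling can be tried out with a stand-in function before pointing it at real AzCopy commands.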

See "Copy Data Between Storage Accounts" here: https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10

I don't know what speeds you should expect in this scenario, because I look after a different part of AzCopy's feature set. I'm guessing you'll see multi-gigabit speeds, and at those speeds 50GB is not much at all. So I do not think you will need a lot of VMs. Probably just one or maybe two (at a guess).
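To put "not much at all" into numbers, here is a back-of-the-envelope calculation in Python. The 2 Gbit/s figure is purely an assumed value standing in for "multi-gigabit"; nobody in this thread measured it. Sizes use decimal units (1 GB = 8 gigabits, 1 TB = 1000 GB).

```python
def estimate_transfer_seconds(size_gb: float, throughput_gbit_s: float) -> float:
    """Estimate transfer time in seconds for size_gb gigabytes.

    throughput_gbit_s is an assumed sustained rate in gigabits per
    second; real rates depend on the accounts and the service.
    """
    if throughput_gbit_s <= 0:
        raise ValueError("throughput must be positive")
    return size_gb * 8 / throughput_gbit_s


# Assumed 2 Gbit/s sustained throughput (hypothetical figure).
small_account = estimate_transfer_seconds(50, 2.0)    # 50 GB account
large_account = estimate_transfer_seconds(2000, 2.0)  # 2 TB account
```

Under that assumption a 50GB account takes about 200 seconds, so 50 such accounts copied one after another would take roughly 10,000 seconds (under three hours), and the 2TB account would take about 8,000 seconds (a little over two hours). Testing against one real account is the only way to replace the assumed rate with a real one.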

Thanks @JohnRusk! Yes I was planning to use AzCopy v10. Sorry for not mentioning that.

I appreciate the information John, this is helpful. I'm going to experiment with a few different VM sizes tomorrow. I'll reopen/comment on this ticket if I have any additional questions.

Thanks again!

@AMoghrabi I have a similar use case: I want to replicate my file share snapshots using AzCopy sync running on Azure Hybrid Automation workers. I am very interested in your experience with AzCopy and Hybrid Automation workers, and in whether you ended up doing it differently. Thank you.

Hey @kibnelbachyr! AzCopy is working very well for us actually, especially the latest version. We didn't go for the Hybrid Automation worker though. In our CI/CD tool (Azure DevOps) we run a script that downloads the azcopy binary and runs through all of our storage accounts in every subscription. We back up all storage accounts that are tagged with something specific. Likewise, we have a different script that removes backups older than X days.

Hope that helps!
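The selection logic described in the comment above (back up only tagged accounts, remove backups older than X days) could look something like this in Python. This is only a sketch of the idea, not the actual script: the tag name, the dictionary shapes, and both function names are hypothetical, and listing accounts or deleting backups in Azure is left out.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List


def select_tagged(accounts: Dict[str, Dict[str, str]],
                  tag: str, value: str) -> List[str]:
    """Return the names of accounts whose tag equals value."""
    return sorted(name for name, tags in accounts.items()
                  if tags.get(tag) == value)


def expired_backups(backups: Dict[str, datetime], now: datetime,
                    max_age_days: int) -> List[str]:
    """Return the names of backups created more than max_age_days ago."""
    cutoff = now - timedelta(days=max_age_days)
    return sorted(name for name, created in backups.items()
                  if created < cutoff)


# Example with made-up data.
accounts = {"acct1": {"backup": "nightly"}, "acct2": {}}
now = datetime(2019, 3, 31, tzinfo=timezone.utc)
backups = {
    "acct1-2019-03-01": datetime(2019, 3, 1, tzinfo=timezone.utc),
    "acct1-2019-03-30": datetime(2019, 3, 30, tzinfo=timezone.utc),
}
to_back_up = select_tagged(accounts, "backup", "nightly")
to_delete = expired_backups(backups, now, max_age_days=14)
```

Keeping the selection separate from the actual copy and delete calls makes it easy to dry-run the script and review what it would touch before letting it remove anything.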

@AMoghrabi, thank you for your response. Ok, automating this through CI/CD is a good option as well.
