Azure-storage-azcopy: Parallel level in sync

Created on 17 May 2019 · 11 Comments · Source: Azure/azure-storage-azcopy

Which version of the AzCopy was used?

azcopy version 10.1.0

Which platform are you using? (ex: Windows, Mac, Linux)

Linux

What command did you run?

azcopy sync "$folder" "$container/$saskey" --recursive=true --parallel-level 20 --delete-destination=true

What problem was encountered?

unknown flag: --parallel-level

All 11 comments

See the AZCOPY_CONCURRENCY_VALUE environment variable documented here https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10#advanced-configuration.

There is no command line parameter for parallelism.

Btw the default behaviour is already highly parallel. There is seldom any need to change it.
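A minimal sketch of what this reply suggests: set the environment variable before running the sync, and drop the unrecognised flag. The value 20 is only carried over from the `--parallel-level 20` attempt in the report, and `sync_folder` is a hypothetical wrapper name, not part of AzCopy.

```shell
# AZCOPY_CONCURRENCY_VALUE is the variable named in the reply above.
# 20 is illustrative; it mirrors the --parallel-level 20 attempt in the report.
export AZCOPY_CONCURRENCY_VALUE=20

# Hypothetical wrapper: the same sync as in the report, minus the unknown flag.
# $1 = local folder, $2 = container URL, $3 = SAS token
sync_folder() {
    azcopy sync "$1" "$2/$3" --recursive=true --delete-destination=true
}
```

It would then be called as `sync_folder "$folder" "$container" "$saskey"`.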



Hi @JohnRusk,
I came here trying to find out how to tell azcopy to sync only a few files at a time.
Let me explain a scenario where limiting the number of files synced simultaneously is very useful.
I have a folder with about 70 big files that I want to put in Azure Blob storage.
The first time I run AzCopy, it starts uploading all 70 files at once (which is crazy). Suppose that after 8 hours of running, none of them has finished uploading, and at that point, for whatever reason, you have to stop azcopy.
Since none of the file uploads has completed, if you cannot resume later, all your effort is lost.
I'm uploading to Azure Blob and I think resuming is not possible. Am I right?
Not everyone has a lot of bandwidth, and sometimes the upload slows down the whole network, so there are plenty of reasons to stop azcopy.
If you could restrict the sync to a few files at a time, you would have a better chance of success, or at least would not lose all the work done so far.
I'm facing this problem right now. :(
It would be good to be able to tell AzCopy to sync a few files at a time (rather than tackling all the files at once).
I hope my point is clear.
Regards.

Hi @roberdaniel, thanks for the feedback!

The tool was indeed optimized for high bandwidth scenarios, and that's why we try to do as many files simultaneously as possible, since the service side scales better this way.

I see that for your scenario, the current behavior is not ideal. I'll discuss with @JohnRusk and the team to figure out how to better support it.

Robert and Ze, I've been thinking for some time of adding an optional environment variable to configure the number of files that are read from disk at the same time. If that is added, you'll be able to set it to a low value, to achieve what you want.

However, we are trying to minimize the number of "tuning knobs" that users have to think about when using the tool. So we'll need to think about it a bit more to decide whether that really is the right way to do this.

@roberdaniel What would you think of setting an environment variable, called for example AZCOPY_CONCURRENT_DISK_FILES? If you set it to, e.g., 10, then it won't start reading the 11th file until one of the first 10 has been fully read.
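The behaviour proposed here (a new file is not started until one of the files in flight has finished) is a bounded-concurrency pattern. As a rough illustration only, and not a description of how AzCopy implements it, the same idea can be sketched with `xargs -P`; the file names and the `sleep` standing in for "read and upload one file" are made up for the example.

```shell
# Illustration only: at most $max_files jobs run at once, and a new one
# starts only when a running one exits.
max_files=3

# "sleep 1" is a stand-in for reading and uploading one file.
results=$(printf '%s\n' a.bin b.bin c.bin d.bin e.bin f.bin g.bin |
    xargs -P "$max_files" -I{} sh -c 'sleep 1; echo "done $1"' _ {})

# Completion order is not guaranteed, so the lines may appear in any order.
printf '%s\n' "$results"
```

With seven placeholder files and a limit of three, the fourth job only starts after one of the first three has exited.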


Hi John and Ze
I think that AZCOPY_CONCURRENT_DISK_FILES will achieve exactly what I need.
I'm very pleased that this issue has been reopened.
Thank you very much.
Regards.

I will add this environment variable to PR #511

Glad to hear that.
Thanks

BTW, I might change the name slightly, e.g. to AZCOPY_CONCURRENT_FILES.

Sure! Regards.

The release is out now (version 10.3), and I can confirm the name is AZCOPY_CONCURRENT_FILES.
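For anyone landing here later, a minimal sketch of using the new variable, assuming AzCopy 10.3 or newer. The variable name comes from this thread; the wrapper name `sync_few_files` and the idea of passing the limit as an argument are hypothetical, for illustration only.

```shell
# Hypothetical wrapper: run one sync with a cap on files processed at once.
# $1 = max concurrent files, $2 = local folder, $3 = destination URL (with SAS)
sync_few_files() {
    # The prefix assignment applies AZCOPY_CONCURRENT_FILES to this call only.
    AZCOPY_CONCURRENT_FILES="$1" azcopy sync "$2" "$3" \
        --recursive=true --delete-destination=true
}
```

For the 70-file scenario described earlier, something like `sync_few_files 5 "$folder" "$container/$saskey"` would be one way to try it; 5 is an arbitrary example value.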

Many thanks, I will try it.
Regards!!!
