AzcopyVersion 10.2.1
OS-Environment linux
OS-Architecture amd64
Job-Command sync /home/yada https://redacted.blob.core.windows.net/media?se=2020-09-03t10%3A57%3A00z&sig=-REDACTED-&sp=racwdl&sr=c&st=2019-09-02t10%3A57%3A38z&sv=2018-03-28 --recursive --cap-mbps=10
Number of CPUs: 32
Max file buffer RAM 16.000 GB
Max open files when downloading: 1047774
Max concurrent transfer initiation routines: 64
Network speed is 300+ down / 25 up
95% of uploads fail. Files are usually 1 - 8 Gb. Errors in log are http request failed -> context canceled / context deadline exceeded / connection reset by peer + entries for 'perf: primary performance constraint is Unknown'
Hi @digiface, thanks for reaching out.
Could you please search for the keyword UPLOADFAILED instead? E.g. cat xxxx.log | grep UPLOADFAILED to see what are the actually failures.
Aha. Ok this is more like it: INFO: [P#0-T#0] Early close of chunk in singleChunkReader with context still active
'UPLOADFAILED' does not appear in most log files. I get sections like these:
2019/09/02 16:28:37 PERF: primary performance constraint is Unknown. States: R: 4, D: 0, W: 1236, F: 0, B: 300, T: 1540
2019/09/02 16:28:39 JobID=eb3235e1-e98f-7846-4407-c0cde955c58a canceled
2019/09/02 16:28:39 ==> REQUEST/RESPONSE (Try=3/3m49.575929024s[SLOW >3s], OpTime=26m31.717406729s) -- REQUEST ERROR
PUT https://redacted.blob.core.windows.net/some.file?blockid=n2yynmrlzgqtmwqzmy04mjq3ltyynwetodawndnjzwm2ota1&comp=block&se=2020-09-03t10%3A57%3A00z&sig=-REDACTED-&sp=racwdl&sr=c&st=2019-09-02t10%3A57%3A38z&sv=2018-03-28&timeout=901
Content-Length: [8388608]
User-Agent: [AzCopy/10.2.1 Azure-Storage/0.7 (go1.12; linux)]
X-Ms-Client-Request-Id: [85174742-40b3-409b-6302-52545c702cfa]
X-Ms-Version: [2018-03-28]
--------------------------------------------------------------------------------
ERROR:
-> github.com/Azure/azure-pipeline-go/pipeline.NewError, /home/vsts/go/pkg/mod/github.com/!azure/[email protected]/pipeline/error.go:154
HTTP request failed
Put https://redacted.blob.core.windows.net/some.file?blockid=N2YyNmRlZGQtMWQzMy04MjQ3LTYyNWEtODAwNDNjZWM2OTA1&comp=block&se=2020-09-03T10%3A57%3A00Z&sig=-REDACTED-&sp=racwdl&sr=c&st=2019-09-02T10%3A57%3A38Z&sv=2018-03-28&timeout=901: context canceled
goroutine 104 [running]:
And
2019/09/02 16:07:16 ==> REQUEST/RESPONSE (Try=1/5m8.965240092s[SLOW >3s], OpTime=5m8.965502592s) -- REQUEST ERROR
PUT https://redacted.blob.core.windows.net/z.mp4?blockid=nzk2zwu0ymitngflzc0znzqzltvmywutm2flmmzjzdnjzwqz&comp=block&se=2020-09-03t10%3A57%3A00z&sig=-REDACTED-&sp=racwdl&sr=c&st=2019-09-02t10%3A57%3A38z&sv=2018-03-28&timeout=901
Content-Length: [8388608]
User-Agent: [AzCopy/10.2.1 Azure-Storage/0.7 (go1.12; linux)]
X-Ms-Client-Request-Id: [8eb2a167-522e-4e12-51d6-5b18adfcf16e]
X-Ms-Version: [2018-03-28]
--------------------------------------------------------------------------------
ERROR:
-> github.com/Azure/azure-pipeline-go/pipeline.NewError, /home/vsts/go/pkg/mod/github.com/!azure/[email protected]/pipeline/error.go:154
HTTP request failed
Put https://redacted.blob.core.windows.net/some.file?blockid=Nzk2ZWU0YmItNGFlZC0zNzQzLTVmYWUtM2FlMmZjZDNjZWQz&comp=block&se=2020-09-03T10%3A57%3A00Z&sig=-REDACTED-&sp=racwdl&sr=c&st=2019-09-02T10%3A57%3A38Z&sv=2018-03-28&timeout=901: read tcp 172.16.15.1:43966->13.74.208.64:443: read: connection reset by peer
goroutine 274 [running]:
github.com/Azure/azure-storage-azcopy/ste.stack(0xc007ee40c0, 0xc000e3f000, 0x0)
/home/vsts/work/1/s/ste/xferLogPolicy.go:139 +0x9d
Occasionally one that says HTTP/1.x transport connection broken
@JohnRusk could you please take a look? The line that got logged originates from here.
Got a lot of them too in my problem with token refreshes . File size is sometimes only KBs
Any update/insights here would be appreciated. We now have 51Mpbs upload bandwidth and it gets degraded over about 30 minutes to 20Mpbs with AzCopy running capped at 10Mpbs. Uploads start off succeeding then everything gets chewed up and there's not enough meaningful information in the logs to know why it's happening.
Thanks for the reminder. I was travelling when the original issue came through, and have not caught up yet.
Can I make sure I've understood correctly: you have 51 Mbps in total. AzCopy is running capped at 10 Mbps, and over time the bandwidth available to _other_ traffic degrades to 20 Mbps (when you would have expected 40). Is that right?
Given your CPU count, the current version of AzCopy is going to default to using a very high connection count, even though that's unnecessary at your bandwidth. The best thing I can think of would be to do a test where you set the environment variable AZCOPY_CONCURRENCY_VALUE to about 4 or 8 (try 4 first). That will make it use only 4 connections, which is a much more reasonable number for your network. (Basically, you have a big machine - 32 CPUs - on an small network, and that's not something that the current release handles very well by default. But it should, in theory, be fine if you set that environment variable to a low value.)
Please let us know how that test works for you.
Ah I wondered where the nc flag had gone. Will give it a go now, thanks
Yeah that's fixed this issue. Excellent , thanks!
Most helpful comment
Yeah that's fixed this issue. Excellent , thanks!