Conan: Package download retry. (conan.io ECONNRESET error)

Created on 18 Apr 2017 · 18 comments · Source: conan-io/conan

Hello!

Server sometimes responds with:
('Connection broken: error("(104, 'ECONNRESET')",)', error("(104, 'ECONNRESET')",)). [Remote: conan.io].

It is reproduced in Travis jobs. E.g. https://travis-ci.org/theirix/conan-libssh2/jobs/222844602 :

zlib/1.2.11@lasote/stable: Looking for package 8767e89412729a395e7a478e291eeb99291fe27b in remote 'conan.io' 
Downloading conanmanifest.txt
[==================================================]                  
Downloading conaninfo.txt
[==================================================]                  
Downloading conan_package.tgz
ERROR: Download failed, check server, possibly try again
('Connection broken: error("(104, \'ECONNRESET\')",)', error("(104, 'ECONNRESET')",)). [Remote: conan.io]
Error while executing:
     conan test_package . -s arch="x86" -s build_type="Release" -s compiler="gcc" -s compiler.version="5.4" -o libssh2:shared="True" -o OpenSSL:no_electric_fence="True" 



All 18 comments

Thanks for the feedback. Unfortunately, these transfers are served directly from Amazon S3, so they are out of our control. S3 transfers are not perfect and fail from time to time, and many factors, such as network conditions, can play a role. For downloads this has never been a big issue, but we implemented the "retry" arguments in "upload" for exactly this reason. Maybe the download could be made configurable too.

Thank you for the explanation. I'll wait until the Amazon problem is resolved.

I don't think the Amazon issue will ever be fully resolved, because it also depends on many conditions, like the whole internet path between S3 and consumers. Please keep us updated about this issue and its frequency; if it keeps being too frequent, let us know and we will consider adding retry arguments to package downloads. Maybe that way we could minimize the impact.

Download retries would be good anyway. Sometimes I experience connection problems with a private Conan server (caused by bugs in an old Linux kernel while using Docker). So these retries could help to mitigate a lot of problems.

Hey @memsharded,

I also run into connection-broken errors during package installs from time to time (also against a private Conan instance). It would be really great if there were a retry option for download/install, as there is for upload. At the moment I get connection errors in CI builds, which is really annoying (using package tools).

I think you run your private conan_server on a Windows machine. Have you considered running it on a Linux machine? It is not strong evidence, but network transfers might be slightly more stable on Linux. Did you launch it with gunicorn? With which configuration?

What do you think @lasote? Maybe add retry on package download?

I don't know... I don't remember the last time I had an issue downloading a package. It would be better to understand the origin of the errors first than to just retry; no one can ensure that the failure won't repeat during the retries.

I have experienced timeouts during package downloading on CI several times in the last few days. It usually looks like this:

boost_asio/1.65.1@bincrafters/stable: ERROR: Exception:
<class 'conans.errors.ConanException'>
Error downloading file https://api.bintray.com/conan/bincrafters/public-conan/v1/files/bincrafters/boost_asio/1.65.1/stable/package/5ab84d6acfe1f23c4fae0ab88f26e3a396351ac9/conan_package.tgz:
'HTTPSConnectionPool(host='akamai.bintray.com', port=443): Read timed out. (read timeout=60.0)'. [Remote: remote0]

But there once was this:

File "/usr/local/lib/python2.7/dist-packages/conans/client/remote_manager.py", line 285, in _call_remote
    raise exc.__class__("%s. [Remote: %s]" % (exception_message_safe(exc), remote.name))
conans.errors.ConanException: Error downloading file https://api.bintray.com/conan/bincrafters/public-conan/v1/files/bincrafters/boost_ratio/1.65.1/stable/export/conanfile.py: '('Connection aborted.', BadStatusLine("''",))'. [Remote: remote0]

I also stumbled upon a failed boost build; even though it happened a while ago, it suggests the problem might be present not only during download.

Download retries are now the default (since Conan 1.7) for package transfers, so I think this issue can be closed. But please re-open or comment otherwise.

So, I have set up a local CI at my place with Artifactory + GitLab CI. The problem is silly because they're on the same machine and there are no problems in Artifactory's logs.
The thing is that I still hit the problem even with download/install retries:

Downloading conan_package.tgz
ERROR: Download failed, check server, possibly try again
Transfer interrupted before complete: 1097971 < 44245525
Waiting 0 seconds to retry...
ERROR: Download failed, check server, possibly try again
Transfer interrupted before complete: 2514683 < 44245525
Waiting 0 seconds to retry...
qt/5.12.0@bincrafters/stable: ERROR: Exception while getting package: 2030dcc00573b7997a1cd6663d2e5801f3e78359
qt/5.12.0@bincrafters/stable: ERROR: Exception: <class 'conans.errors.ConanConnectionError'> Download failed, check server, possibly try again
Transfer interrupted before complete: 403779 < 44245525. [Remote: asd]

I kinda "fixed" it with explicit

timeout 2h bash -c 'until conan install .; do echo "Connection lost. Trying again..."; done'

But I'd prefer an explicit option to set the number of retries for big packages, for example to use it with conan-package-tools.

Hi @Minimonium

I have reopened the issue, I think it is possible to add configuration for the number of retries.

However, I have assigned low priority because this sounds more like a workaround than a real solution. Trying 3 times should be more than enough, and if it isn't, there could be something wrong there. It would make sense to keep investigating what might be happening. How many times on average do you have to retry? What is the size of the packages, like 4.5Gb? Does it happen sometimes with other smaller packages, or only with huge ones? Does it happen when running on that machine manually, not in a CI job? What machine and OS is it?

Thanks for reporting!

@memsharded sure, it's not a hot problem since I have a workaround. I'm actually not sure what the root of the problem is; it's more of a general networking issue.

  • I'm afraid I won't be able to provide precise data on retries until Monday. It happens quite sporadically and, again, is mostly a problem of the current network layout.
  • The main culprit is the qt package, which is 4.5Gb, though in one of the tries I got a few retries on one of the boost-level packages. So it happens only with huge ones, but I would wait for the retry-test data to be sure.
  • It never happens manually, only in CI. The current CI image under test is a Debian Stretch image running via dind from a virtual machine and writing to NFS, which I guess could be a problem too. But that's off-topic in the context of the current issue.

I was mostly thinking of the symmetry of the upload/download(/install/create?) retry options. I'll try to pick it up if it blocks me and no one else is interested by then.

@memsharded there is something fishy, since it's not Gb but Mb. Haha.
I have tried running it multiple times and it fails on other small packages too from time to time. And for qt it needs up to 20 retries (it's pretty sporadic). I have checked the frontend/Artifactory settings and can't find an issue on that end.
I'll look into implementing the thing then. Though would such things be better off as global Conan settings instead?

I ran into this too; to work around it on CI I applied:

- net_pyc_file=$(python -c "from conans.client.tools import net; print(net.__file__)")
- sudo sed -i 's/retry = 2/retry = 99/g' ${net_pyc_file::-1}
- sudo python -m compileall ${net_pyc_file::-1}

P.S. I understand this workaround is ugly; I just cannot apply the previous workaround https://github.com/conan-io/conan/issues/1215#issuecomment-457615706
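For custom scripts that talk to the remote directly, the same effect can be obtained at the HTTP layer without patching compiled files. This is a sketch using requests/urllib3 (libraries Conan itself depends on), not a supported Conan hook:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry transient failures (connection resets, 5xx responses)
# with exponential backoff between attempts.
retries = Retry(total=5, backoff_factor=1,
                status_forcelist=[500, 502, 503, 504])

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retries))

# session.get("https://...") now retries automatically on those failures.
```

The advantage over the sed approach is that the retry policy lives in the script, not in a patched site-packages file that a reinstall silently reverts.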

I definitely agree with @Minimonium, we should 'export' retry_count to ~/.conan/conan.conf alongside request_timeout

And maybe think about connection_timeout too
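If exported, the knobs might look like this in ~/.conan/conan.conf; the `retry` and `retry_wait` key names below are an assumption modelled on the existing `request_timeout` entry:

```ini
[general]
request_timeout = 60   # seconds; already configurable
retry = 5              # hypothetical: number of transfer retries
retry_wait = 10        # hypothetical: seconds to wait between retries
```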

I'm seeing this quite regularly on a CI build - say about 25-50% of the time.

libjpeg/9c@bincrafters/stable: ERROR: Exception: <class 'conans.errors.ConanConnectionError'> ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

Unable to connect to conan-center=https://conan.bintray.com
libjpeg/9c@bincrafters/stable: WARN: Trying to remove package folder: ...
ERROR: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

Thanks for reporting @planetmarshall. Could you please send us a couple of exact times (UTC) and the package reference (libjpeg/9c@bincrafters/stable in this case)? I want to check with the Bintray team if something is going wrong. Did it start failing at some point, or has it always failed?

Implemented; it will be released in Conan 1.16.
