Scrapy: twisted.python.failure

Created on 12 Mar 2018 · 1 Comment · Source: scrapy/scrapy

Scrapy : 1.5.0
lxml : 4.1.1.0
libxml2 : 2.9.5
cssselect : 1.0.3
parsel : 1.4.0
w3lib : 1.19.0
Twisted : 17.9.0
Python : 2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)]
pyOpenSSL : 17.6.0.dev0 (OpenSSL 1.1.0g 2 Nov 2017)
cryptography : 2.1.4
Platform : Windows-7-6.1.7601-SP1

2018-03-12 21:20:46 [scrapy.core.engine] DEBUG: Crawled (200) (referer: None)
2018-03-12 21:20:46 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying (failed 1 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2018-03-12 21:20:50 [scrapy.downloadermiddlewares.retry] DEBUG: Retrying (failed 2 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
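2018-03-12 21:20:50 [scrapy.downloadermiddlewares.retry] DEBUG: Gave up retrying (failed 3 times): [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]
2018-03-12 21:20:50 [scrapy.core.scraper] ERROR: Error downloading : [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]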
2018-03-12 21:20:51 [scrapy.core.engine] INFO: Closing spider (finished)
2018-03-12 21:20:51 [scrapy.statscollectors] INFO: Dumping Scrapy stats:

bug https


Most helpful comment

Hi @Svickie7 - This appears to be a duplicate of #3065. I checked the site, and it uses the same ciphersuite as the failing site in that issue.

If it's possible to crawl this site without HTTPS, then that might help solve this.
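
As a rough illustration of the plain-HTTP workaround, the sketch below rewrites an `https://` URL to `http://` before it is handed to a spider. The helper name `to_plain_http` and the example URL are hypothetical, the code uses Python 3's `urllib.parse` (the report above was filed against Python 2.7), and it only helps if the site actually serves the same content over HTTP without redirecting back to HTTPS.

```python
from urllib.parse import urlsplit, urlunsplit


def to_plain_http(url):
    """Return `url` with an https scheme swapped for http.

    URLs that are not https are returned unchanged. An explicit port
    in the URL (for example ":443") is kept as-is, so such URLs need
    separate handling.
    """
    parts = urlsplit(url)
    if parts.scheme != "https":
        return url
    return urlunsplit(parts._replace(scheme="http"))


# Hypothetical start URL, used only for illustration.
start_urls = [to_plain_http("https://example.com/listing?page=1")]
```

The resulting list could be assigned to a spider's `start_urls`; whether the crawl then succeeds depends entirely on the target server.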

Another temporary workaround is to run Scrapy through a local HTTP proxy that accepts the bad certificate.
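
A minimal sketch of the proxy idea, assuming a proxy is already listening locally: Scrapy's built-in `HttpProxyMiddleware` reads the proxy address from the `proxy` key of a request's `meta`, so the workaround amounts to setting that key. The helper name `proxy_meta` and the address `http://127.0.0.1:8080` are assumptions for illustration, and the proxy itself must be one that can negotiate TLS with the failing site.

```python
# Hypothetical address of a locally running proxy; adjust to your setup.
LOCAL_PROXY = "http://127.0.0.1:8080"


def proxy_meta(proxy_url, base_meta=None):
    """Return a copy of `base_meta` with the "proxy" key set.

    Scrapy's HttpProxyMiddleware looks up request.meta["proxy"], so the
    returned dict can be passed as the `meta` argument of a request.
    """
    meta = dict(base_meta or {})  # copy so the caller's dict is not mutated
    meta["proxy"] = proxy_url
    return meta


meta = proxy_meta(LOCAL_PROXY)
```

In a spider this would typically be used as `scrapy.Request(url, meta=proxy_meta(LOCAL_PROXY))`.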

I'll close this, as the other issue should be fine to continue discussing a fix for this problem. Thank you for reporting it; it's good for us to see how many people are affected by this.

cathalgarvey on 15 Mar 2018

