Scrapy: scrapy shell <url> raises ValueError: invalid hostname

Created on 1 Jul 2018 · 3 Comments · Source: scrapy/scrapy

I was following the official Scrapy documentation. When I run the command scrapy shell 'quotes.toscrape.com/page/1/', I get the error below:

C:\WINDOWS\system32>scrapy shell 'http://quotes.toscrape.com/page/1/'
2018-07-01 20:54:02 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: scrapybot)
2018-07-01 20:54:02 [scrapy.utils.log] INFO: Versions: lxml 4.1.1.0, libxml2 2.9.7, cssselect 1.0.3, parsel 1.4.0, w3lib 1.19.0, Twisted 17.5.0, Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 10:22:32) [MSC v.1900 64 bit (AMD64)], pyOpenSSL 17.5.0 (OpenSSL 1.0.2n 7 Dec 2017), cryptography 2.1.4, Platform Windows-10-10.0.14393-SP0
2018-07-01 20:54:02 [scrapy.crawler] INFO: Overridden settings: {'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter', 'LOGSTATS_INTERVAL': 0}
2018-07-01 20:54:02 [scrapy.middleware] INFO: Enabled extensions: ['scrapy.extensions.corestats.CoreStats', 'scrapy.extensions.telnet.TelnetConsole']
2018-07-01 20:54:03 [scrapy.middleware] INFO: Enabled downloader middlewares: ['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware', 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware', 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware', 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware', 'scrapy.downloadermiddlewares.retry.RetryMiddleware', 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware', 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware', 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware', 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware', 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware', 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2018-07-01 20:54:03 [scrapy.middleware] INFO: Enabled spider middlewares: ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware', 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware', 'scrapy.spidermiddlewares.referer.RefererMiddleware', 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware', 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2018-07-01 20:54:03 [scrapy.middleware] INFO: Enabled item pipelines: []
2018-07-01 20:54:03 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-07-01 20:54:03 [scrapy.core.engine] INFO: Spider opened
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\Scripts\scrapy-script.py", line 10, in <module>
    sys.exit(execute())
  File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 150, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 90, in _run_print_help
    func(*a, **kw)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 157, in _run_command
    cmd.run(args, opts)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\commands\shell.py", line 73, in run
    shell.start(url=url, redirect=not opts.no_redirect)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\shell.py", line 48, in start
    self.fetch(url, spider, redirect=redirect)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\shell.py", line 115, in fetch
    reactor, self._schedule, request, spider)
  File "C:\ProgramData\Anaconda3\lib\site-packages\twisted\internet\threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "C:\ProgramData\Anaconda3\lib\site-packages\twisted\python\failure.py", line 372, in raiseException
    raise self.value.with_traceback(self.tb)
ValueError: invalid hostname: 'http

However, when I run scrapy shell on its own, without a URL, it starts and I can do basic operations such as working with XPath.

I'm using Windows 10, Scrapy 1.5.0 in Anaconda 1.6.9

Most helpful comment

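The solution is to use double quotes, as the official Scrapy documentation advises for Windows. The Windows command prompt does not treat single quotes as quoting characters, so they are passed to Scrapy as part of the URL. Run the command like this: scrapy shell "http://quotes.toscrape.com/page/1/"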
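
To illustrate why the error message ends in 'http, here is a minimal sketch using only the Python standard library. It assumes that Scrapy prepends http:// when the argument has no recognizable scheme; the helper add_http_if_missing is a hypothetical stand-in written for this example, not Scrapy's actual code.

```python
import re
from urllib.parse import urlparse


def add_http_if_missing(url: str) -> str:
    """Hypothetical helper (not Scrapy's code): prepend http:// if no scheme is found."""
    if re.match(r"^\w+://", url):
        return url
    return "http://" + url


# On Windows, cmd.exe leaves the single quotes in the argument.
single_quoted_arg = "'http://quotes.toscrape.com/page/1/'"
# With double quotes, the program receives the bare URL.
double_quoted_arg = "http://quotes.toscrape.com/page/1/"

broken = urlparse(add_http_if_missing(single_quoted_arg))
fixed = urlparse(add_http_if_missing(double_quoted_arg))

# The leading quote hides the real scheme, so the "host" ends at the first colon.
print(repr(broken.hostname))  # "'http"
print(repr(fixed.hostname))   # 'quotes.toscrape.com'
```

Under that assumption, the leading single quote prevents the scheme from being recognized, a second http:// is prepended, and the parsed hostname becomes 'http, which matches the ValueError in the traceback.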

Scrapy should revise this example: https://docs.scrapy.org/en/latest/topics/shell.html

@dedenhabibi Could you open a separate issue about that, and detail the parts that need a review?
