Scrapy: Scrapy Shell <url> ValueError: invalid hostname:

Created on 1 Jul 2018 · 3 Comments · Source: scrapy/scrapy

I was following the official Scrapy documentation. When I run the command scrapy shell 'http://quotes.toscrape.com/page/1/', I get the error below:

C:\WINDOWS\system32>scrapy shell 'http://quotes.toscrape.com/page/1/'
2018-07-01 20:54:02 [scrapy.utils.log] INFO: Scrapy 1.5.0 started (bot: scrapybot)
2018-07-01 20:54:02 [scrapy.utils.log] INFO: Versions: lxml 4.1.1.0, libxml2 2.9.7, cssselect 1.0.3, parsel 1.4.0, w3lib 1.19.0, Twisted 17.5.0, Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 10:22:32) [MSC v.1900 64 bit (AMD64)], pyOpenSSL 17.5.0 (OpenSSL 1.0.2n 7 Dec 2017), cryptography 2.1.4, Platform Windows-10-10.0.14393-SP0
2018-07-01 20:54:02 [scrapy.crawler] INFO: Overridden settings: {'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter', 'LOGSTATS_INTERVAL': 0}
2018-07-01 20:54:02 [scrapy.middleware] INFO: Enabled extensions: ['scrapy.extensions.corestats.CoreStats', 'scrapy.extensions.telnet.TelnetConsole']
2018-07-01 20:54:03 [scrapy.middleware] INFO: Enabled downloader middlewares: ['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware', 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware', 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware', 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware', 'scrapy.downloadermiddlewares.retry.RetryMiddleware', 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware', 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware', 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware', 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware', 'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware', 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2018-07-01 20:54:03 [scrapy.middleware] INFO: Enabled spider middlewares: ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware', 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware', 'scrapy.spidermiddlewares.referer.RefererMiddleware', 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware', 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2018-07-01 20:54:03 [scrapy.middleware] INFO: Enabled item pipelines: []
2018-07-01 20:54:03 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-07-01 20:54:03 [scrapy.core.engine] INFO: Spider opened
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\Scripts\scrapy-script.py", line 10, in <module>
    sys.exit(execute())
  File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 150, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 90, in _run_print_help
    func(*a, **kw)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 157, in _run_command
    cmd.run(args, opts)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\commands\shell.py", line 73, in run
    shell.start(url=url, redirect=not opts.no_redirect)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\shell.py", line 48, in start
    self.fetch(url, spider, redirect=redirect)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scrapy\shell.py", line 115, in fetch
    reactor, self._schedule, request, spider)
  File "C:\ProgramData\Anaconda3\lib\site-packages\twisted\internet\threads.py", line 122, in blockingCallFromThread
    result.raiseException()
  File "C:\ProgramData\Anaconda3\lib\site-packages\twisted\python\failure.py", line 372, in raiseException
    raise self.value.with_traceback(self.tb)
ValueError: invalid hostname: 'http

However, when I run scrapy shell on its own, without a URL, it starts normally and I can do basic operations such as working with XPath.

I'm using Windows 10 with Scrapy 1.5.0 in Anaconda 1.6.9.

mah1212

Most helpful comment

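The solution is to use double quotes, as the official Scrapy documentation says to do on Windows. The Windows command prompt does not treat single quotes as quoting characters, so they are passed to Scrapy as part of the URL, which is why the hostname in the error begins with a stray quote ('http). The command has to be written as:

scrapy shell "http://quotes.toscrape.com/page/1/"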

mah1212 on 1 Jul 2018

👍17
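
To see how a leading single quote can end up as the hostname 'http, the following is a minimal sketch using only Python's standard urllib.parse. It assumes that a URL with no recognizable scheme gets http:// prepended before it is parsed; the helper names prepend_http_if_no_scheme and effective_hostname are hypothetical and are not Scrapy's actual code.

```python
from typing import Optional
from urllib.parse import urlsplit


def prepend_http_if_no_scheme(url: str) -> str:
    """Hypothetical stand-in for scheme guessing: add http:// if no scheme is found."""
    if urlsplit(url).scheme:
        return url
    return "http://" + url


def effective_hostname(argument: str) -> Optional[str]:
    """Return the hostname that would be extracted from a raw command-line argument."""
    return urlsplit(prepend_http_if_no_scheme(argument)).hostname


# cmd.exe passes single quotes through literally, so they become part of the URL.
single_quoted = "'http://quotes.toscrape.com/page/1/'"
# Double quotes are consumed by the command line, leaving the bare URL.
double_quoted = "http://quotes.toscrape.com/page/1/"

print(repr(effective_hostname(single_quoted)))
print(repr(effective_hostname(double_quoted)))
```

Under that assumption, the single-quoted argument has no valid scheme (a scheme cannot start with a quote), so everything up to the first colon is treated as the host. This gives 'http with the stray quote, the same value the traceback reports, while the double-quoted form yields quotes.toscrape.com.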

All 3 comments

Scrapy should revise this example: https://docs.scrapy.org/en/latest/topics/shell.html