Scraping nyaa.se fails, possibly because of Cloudflare's browser check (the 5-second wait).
The error below was generated while I was testing the indexer, trying to figure out why it wasn't fetching anything even though a manual search found results.
Unable to connect to indexer: HTTP request failed: [503:ServiceUnavailable] [GET] at [http://www.nyaa.se/?page=rss&cats=1_37&filter=1&offset=1]
Full output:
http://pastebin.com/BumR7XPh
Same problem here.
This is from the log:
[v2.0.0.4326] NzbDrone.Common.Http.HttpException: HTTP request failed: [503:ServiceUnavailable] [GET] at [http://www.nyaa.se/?page=rss&cats=1_37&filter=1&offset=1]
at NzbDrone.Common.Http.HttpClient.Execute (NzbDrone.Common.Http.HttpRequest request) [0x001a1] in <4f799f5bb7104eb690bf5945746e79cf>:0
at NzbDrone.Core.Indexers.HttpIndexerBase`1[TSettings].FetchIndexerResponse (NzbDrone.Core.Indexers.IndexerRequest request) [0x00058] in <cf3b823821944161902a2ac2cbb870dd>:0
at NzbDrone.Core.Indexers.HttpIndexerBase`1[TSettings].FetchPage (NzbDrone.Core.Indexers.IndexerRequest request, NzbDrone.Core.Indexers.IParseIndexerResponse parser) [0x00000] in <cf3b823821944161902a2ac2cbb870dd>:0
at NzbDrone.Core.Indexers.HttpIndexerBase`1[TSettings].TestConnection () [0x00024] in <cf3b823821944161902a2ac2cbb870dd>:0
<!DOCTYPE HTML>
<html lang="en-US">
<head>
<meta charset="UTF-8" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=Edge,chrome=1" />
<meta name="robots" content="noindex, nofollow" />
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1" />
<title>Just a moment...</title>
<style type="text/css">
html, body {width: 100%; height: 100%; margin: 0; padding: 0;}
body {background-color: #ffffff; font-family: Helvetica, Arial, sans-serif; font-size: 100%;}
h1 {font-size: 1.5em; color: #404040; text-align: center;}
p {font-size: 1em; color: #404040; text-align: center; margin: 10px 0 0 0;}
#spinner {margin: 0 auto 30px auto; display: block;}
.attribution {margin-top: 20px;}
@-webkit-keyframes bubbles { 33%: { -webkit-transform: translateY(10px); transform: translateY(10px); } 66% { -webkit-transform: translateY(-10px); transform: translateY(-10px); } 100% { -webkit-transform: translateY(0); transform: translateY(0); } }
@keyframes bubbles { 33%: { -webkit-transform: translateY(10px); transform: translateY(10px); } 66% { -webkit-transform: translateY(-10px); transform: translateY(-10px); } 100% { -webkit-transform: translateY(0); transform: translateY(0); } }
.bubbles { background-color: #404040; width:15px; height: 15px; margin:2px; border-radius:100%; -webkit-animation:bubbles 0.6s 0.07s infinite ease-in-out; animation:bubbles 0.6s 0.07s infinite ease-in-out; -webkit-animation-fill-mode:both; animation-fill-mode:both; display:inline-block; }
</style>
<script type="text/javascript">
//<![CDATA[
(function(){
var a = function() {try{return !!window.addEventListener} catch(e) {return !1} },
b = function(b, c) {a() ? document.addEventListener("DOMContentLoaded", b, c) : document.attachEvent("onreadystatechange", b)};
b(function(){
var a = document.getElementById('cf-content');a.style.display = 'block';
setTimeout(function(){
var s,t,o,p,b,r,e,a,k,i,n,g,f, WJcZhCe={"KCJIYY":+((!+[]+!![]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]+!![]+!![]))};
t = document.createElement('div');
t.innerHTML="<a href='/'>x</a>";
t = t.firstChild.href;r = t.match(/https?:\/\//)[0];
t = t.substr(r.length); t = t.substr(0,t.length-1);
a = document.getElementById('jschl-answer');
f = document.getElementById('challenge-form');
;WJcZhCe.KCJIYY*=+((!+[]+!![]+[])+(!+[]+!![]));WJcZhCe.KCJIYY*=!+[]+!![]+!![]+!![]+!![]+!![]+!![]+!![];WJcZhCe.KCJIYY-=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]+!![]+!![]+!![]));WJcZhCe.KCJIYY-=+((!+[]+!![]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]));WJcZhCe.KCJIYY+=+((!+[]+!![]+!![]+[])+(!+[]+!![]));WJcZhCe.KCJIYY+=+((!+[]+!![]+!![]+[])+(!+[]+!![]));WJcZhCe.KCJIYY*=+((!+[]+!![]+[])+(!+[]+!![]+!![]+!![]+!![]+!![]+!![]));a.value = parseInt(WJcZhCe.KCJIYY, 10) + t.length; '; 121'
f.submit();
}, 4000);
}, false);
})();
//]]>
</script>
</head>
<body>
<table width="100%" height="100%" cellpadding="20">
<tr>
<td align="center" valign="middle">
<div class="cf-browser-verification cf-im-under-attack">
<noscript><h1 data-translate="turn_on_js" style="color:#bd2426;">Please turn JavaScript on and reload the page.</h1></noscript>
<div id="cf-content" style="display:none">
<div>
<div class="bubbles"></div>
<div class="bubbles"></div>
<div class="bubbles"></div>
</div>
<h1><span data-translate="checking_browser">Checking your browser before accessing</span> nyaa.se.</h1>
<p data-translate="process_is_automatic">This process is automatic. Your browser will redirect to your requested content shortly.</p>
<p data-translate="allow_5_secs">Please allow up to 5 seconds…</p>
</div>
<form id="challenge-form" action="/cdn-cgi/l/chk_jschl" method="get">
<input type="hidden" name="jschl_vc" value="2dc1e85c19219e5b0b4fb522dda0a377"/>
<input type="hidden" name="pass" value="1476033760.95-cuRx12hte/"/>
<input type="hidden" id="jschl-answer" name="jschl_answer"/>
</form>
</div>
<div class="attribution">
<a href="https://www.cloudflare.com/5xx-error-landing?utm_source=iuam" target="_blank" style="font-size: 12px;">DDoS protection by CloudFlare</a>
<br>
Ray ID: 2ef38404f71117da
</div>
</td>
</tr>
</table>
</body>
</html>
16-10-9 23:22:31.1|Warn|NzbDroneErrorPipeline|Invalid request Validation failed:
-- Unable to connect to indexer, check the log for more details
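For anyone trying to tell this case apart from a genuine outage: the log above is a 503 whose body is Cloudflare's "Checking your browser" page rather than a feed. A minimal sketch of how a client could recognise that, assuming the markers seen in this particular dump (`cf-browser-verification`, `jschl_vc`, `chk_jschl`) are representative. The function name is hypothetical and this is not how Sonarr handles it:

```python
def looks_like_cloudflare_challenge(status_code: int, body: str) -> bool:
    """Heuristic check: a 503 whose body contains markers from the pasted page."""
    if status_code != 503:
        return False
    # Marker strings copied from the HTML in the log above (assumed typical).
    markers = ("cf-browser-verification", "jschl_vc", "chk_jschl")
    return any(marker in body for marker in markers)
```

A plain 503 without those markers would still be treated as an ordinary "service unavailable".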
Ran into the same issue today. Hoping there'll be a fix; Nyaa was our main and only anime indexer for Sonarr.
Maybe this library could be useful: https://github.com/codemanki/cloudscraper
Same problem here. Accessing the website in a regular web browser first and then retrying with Sonarr also has no effect (probably obvious, but I thought I'd mention it).
This has been going on and off for the past few days at least, but I think this is the longest Nyaa has had the Cloudflare protection running so far.
I'm assuming it's based on browser cookies, so nzbdrone would need to be given one; that seems to be the point of the library mentioned above.
I've only just started with Sonarr, but I'll see if there's a way to hack that library in.
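To make the cookie idea a bit more concrete: the challenge page in the log contains a form (`challenge-form`) with hidden fields, and a client would presumably have to read those, compute the `jschl_answer` value from the script, submit the form, and keep whatever cookie comes back. Here is a rough sketch of only the first step, reading the form, using Python's standard `html.parser`. The class and function names are made up for illustration, and computing the answer is deliberately left out:

```python
from html.parser import HTMLParser


class ChallengeFormParser(HTMLParser):
    """Collects the action and input fields of the form with id 'challenge-form'."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = {}
        self._in_form = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == "challenge-form":
            self._in_form = True
            self.action = attrs.get("action")
        elif tag == "input" and self._in_form and "name" in attrs:
            # Inputs without a value attribute (jschl_answer) are stored as None.
            self.fields[attrs["name"]] = attrs.get("value")

    def handle_endtag(self, tag):
        if tag == "form":
            self._in_form = False


def parse_challenge_form(html: str):
    """Return (action, fields) for the challenge form, or (None, {}) if absent."""
    parser = ChallengeFormParser()
    parser.feed(html)
    return parser.action, parser.fields
```

Run against the page in the log, this should give the action `/cdn-cgi/l/chk_jschl` plus the `jschl_vc` and `pass` values, with `jschl_answer` still empty.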
Guys, from what I understand, Nyaa is under a DDoS attack, so they had to enable Cloudflare's "I'm Under Attack" mode.
The blocks are intermittent, just weather the storm.
_And be sure not to use VPNs, those usually are graded a higher risk by CF (VPNs are often used to hurl crap at sites, so they're the first to get slapped by CF protection mechanisms)._
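If the blocks really are intermittent, "weathering the storm" in code would amount to retrying on 503 with increasing pauses. A small sketch under that assumption; `fetch` is a hypothetical callable returning `(status_code, body)`, and the attempt count and delays are arbitrary choices, not anything Sonarr uses:

```python
import time


def fetch_with_backoff(fetch, attempts=4, base_delay=30.0, sleep=time.sleep):
    """Call fetch() until it returns something other than 503 or attempts run out.

    The pause doubles after every failed attempt; the last result is returned
    either way so the caller can still inspect a final 503.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    for attempt in range(attempts):
        status, body = fetch()
        if status != 503:
            return status, body
        if attempt < attempts - 1:
            sleep(delay)  # wait before the next try
            delay *= 2
    return status, body
```

The `sleep` parameter is only there so the waiting can be swapped out when trying this without real delays.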
I wouldn't really call an indexer that tests as unavailable for nearly a full day "intermittent." The point of Sonarr is defeated if it is unable to access the indexer over this long a period of time (at least for anime via torrents).
No VPNs being used here.
I'd like to figure out how to get the cloudscraper library mentioned above working for me too, if that can be the solution. Unfortunately I don't know how to code.
I'm using the Nyaa RSS feed as a backup for now; it uses the same category filter as the default Nyaa indexer settings in Sonarr.
This has long since subsided; they are no longer blocking it. To avoid this in the future, Nyaa would need to let their API bypass Cloudflare's browser check (which they may have done with the RSS feed).