Attach (recommended) or Link to PDF file here:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.144.7135&rep=rep1&type=pdf
Configuration:
Steps to reproduce the problem:
What is the expected behavior? (add screenshot)
This is what I see when I paste the chrome-extension-prefixed URL, or when I reload the PDF page that shows the error.
(chrome-extension://oemmndcbldboiebfnladdacbdfmadadm/https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.144.7135&rep=rep1&type=pdf)
What went wrong? (add screenshot)
This is what I see when I click the link above.
Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):
https://chrome.google.com/webstore/detail/pdf-viewer/oemmndcbldboiebfnladdacbdfmadadm
Possibly a duplicate of #10562.
@Snuffleupagus I think this is a different issue from #10562.
Here, some PDFs are not opened properly, whereas #10562 is about PDFs being opened by Chrome's internal PDF viewer (PDFium?) instead of the pdf.js extension.
I found the cause of this issue.
The server receives two requests within a very short interval.
For the second request, the server redirects to downloadsexceeded.html instead of returning the PDF content, so pdf.js complains that the file is an invalid or corrupted PDF.
I think this may be the server's DDoS protection.
Why are two requests issued when the user clicks that link?
The first is issued by the browser in response to the user's click.
The pdf.js extension then intercepts the response headers and redirects to the extension URL.
Finally, pdf.js itself issues one more request.
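The sequence above can be sketched as a small simulation. This is only an illustration under assumptions: `createRateLimitedServer`, the client id, and the 1000 ms window are hypothetical stand-ins, not the real server's behavior or any pdf.js code. The only hard fact it relies on is that a PDF file begins with the `%PDF-` marker.

```javascript
// Hypothetical stand-in for the server: it serves the PDF once per client and
// answers any repeat request inside `windowMs` with the redirect target page.
// The 1000 ms window is an assumed value, not one measured from the real site.
function createRateLimitedServer(windowMs = 1000) {
  const lastServed = new Map(); // client id -> time the PDF was last served
  return function handle(clientId, now) {
    const previous = lastServed.get(clientId);
    if (previous !== undefined && now - previous < windowMs) {
      // `body` is what the client ends up with after following the redirect.
      return { redirectedTo: "downloadsexceeded.html", body: "<html>...</html>" };
    }
    lastServed.set(clientId, now);
    return { redirectedTo: null, body: "%PDF-1.4 ..." };
  };
}

// A PDF file starts with the "%PDF-" marker; an HTML page does not.
function looksLikePdf(body) {
  return body.startsWith("%PDF-");
}

const server = createRateLimitedServer();
const first = server("user", 0);   // request 1: the browser follows the link
const second = server("user", 50); // request 2: the viewer fetches the same URL again
console.log(looksLikePdf(first.body), looksLikePdf(second.body)); // true false
```

The second response is an HTML page, which is why the viewer reports an invalid or corrupted PDF even though the file on the server is fine.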
I think we can improve this.
How about reusing the content received from the first request instead of requesting it again?
This is just an idea.
(Sorry if it doesn't make sense; I don't fully understand the pdf.js extension implementation yet.)
WDYT? @timvandermeij @Rob--W
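Independent of any extension API, the idea of reusing the first response can be sketched as a tiny cache that hands the result of the first request to every later caller. `createSingleFetch`, `fakeFetcher`, and the example.org URL are hypothetical names for illustration; whether the extension can actually capture the body of the browser's first request is not established here.

```javascript
// Hypothetical helper: remember the promise of the first request per URL so
// that later callers reuse its result instead of contacting the server again.
function createSingleFetch(fetcher) {
  const inFlight = new Map(); // url -> Promise of the response body
  return function fetchOnce(url) {
    if (!inFlight.has(url)) {
      inFlight.set(url, Promise.resolve(fetcher(url)));
    }
    return inFlight.get(url);
  };
}

// Stand-in fetcher that counts how often the "server" is contacted.
let serverHits = 0;
const fakeFetcher = async (url) => {
  serverHits += 1;
  return "%PDF-1.4 body for " + url;
};

const fetchOnce = createSingleFetch(fakeFetcher);
const fromBrowser = fetchOnce("https://example.org/doc"); // first request
const fromViewer = fetchOnce("https://example.org/doc");  // reuses the first result
fromViewer.then((body) => console.log(body.startsWith("%PDF-"), serverHits));
```

With this shape the server sees a single request per URL, so a rate limiter like the one described above would never be triggered by the viewer.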
@simonhong
Would this work as a temporary fix for this problem?
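// Rewrite every link that ends in ".pdf" so it opens in the extension's viewer.
document.querySelectorAll('a[href]').forEach(function (a) {
  // The dot is escaped so that only a literal ".pdf" suffix matches.
  if (/\.pdf$/i.test(a.href)) {
    a.setAttribute('href', 'chrome-extension://oemmndcbldboiebfnladdacbdfmadadm/' + a.href);
  }
});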
@shge Good try. I think it would work for links that end with the .pdf suffix.
However, it is easy to find PDF links that don't have that suffix (the link in this report ends in type=pdf, for example).
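One way around the suffix problem is to look at the response's Content-Type header rather than at the URL. The sketch below is an assumption-laden illustration, not the extension's actual code: `isPdfResponse` is a hypothetical helper, and it assumes the headers arrive as a plain object with lowercase keys.

```javascript
// Hypothetical helper: decide whether a response is a PDF from its headers,
// falling back to the URL path only when the server sends a generic type.
function isPdfResponse(url, headers) {
  // Drop parameters such as "; charset=utf-8" and normalize the case.
  const type = (headers["content-type"] || "").split(";")[0].trim().toLowerCase();
  if (type === "application/pdf") {
    return true;
  }
  if (type === "application/octet-stream") {
    // Ignore the query string and fragment before checking the suffix.
    const path = url.split(/[?#]/)[0];
    return /\.pdf$/i.test(path);
  }
  return false;
}

const reported =
  "https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.144.7135&rep=rep1&type=pdf";
console.log(isPdfResponse(reported, { "content-type": "application/pdf" })); // true
console.log(isPdfResponse("https://example.org/a.pdf", { "content-type": "text/html" })); // false
```

A header check like this needs the response to have started, so it cannot be done by rewriting links in the page ahead of time.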
Did you try reloading the web page after this exception is displayed? That should solve the issue. The exception is thrown when the PDF data is loaded for the first time.
@864534182 "Did you try reloading the web page after this exception is displayed?"
Yes, but that does not work on some pages that require referer information.
I came across a website that blocks the extension's second request because it is "a direct request".
It lets me download the file only when I reach it by clicking a link on a specific web page (the referer has to be that page).
In any case, the file should be requested only once, with referer information.
https://github.com/brave/brave-browser/issues/3474#issuecomment-473666538
The referrer thing is a regression caused by a change in Chrome - see https://github.com/mozilla/pdf.js/issues/10645
I will post this on the brave-browser repository too:
Browser: Brave-browser.
In this link:
https://projecteuclid.org/euclid.rmjm/1181072068
there is a button linking to a PDF file. When I click the button, it shows the already mentioned "Invalid or corrupted PDF file" message.
This appears to have been resolved when I updated today. My issues with this have been resolved by the most recent update.
Closing since this seems to work again.