Pdf.js: PDFJS.getDocument - Unable to catch thrown Error when file is corrupted

Created on 4 Jul 2017 · 10 Comments · Source: mozilla/pdf.js

Link to PDF file (or attach file here):
BrokenPdf.pdf

Configuration:

  • Web browser and its version: Chrome 59.0.3071.115
  • Operating system and its version: Mac OSX Sierra 10.12.5
  • PDF.js version: 1.7.225
  • Is an extension: NO

Steps to reproduce the problem:

  1. Try _PDFJS.getDocument(fileReader.result).then( ... ).catch( ... )_ with the attached "corrupted file"

What is the expected behavior? (add screenshot)
As the file is corrupted, PDF.js throws an Error (Error: page dictionary kid reference points to wrong type of object).


What went wrong? (add screenshot)
I am trying to catch this error in the promise, but it is never caught.

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):
To simplify, I have a similar scenario (found on the web):

http://jsbin.com/qulinaweho/1/edit?html,console,output

But when the file is corrupted, the library prints the error in the console log, and I am not able to catch it in order to handle the error.

Thank you.


All 10 comments

Try PDFJS.getDocument(fileReader.result).then( ... ).catch( ... ) with the attached "corrupted file"

I am trying to catch this error in the promise, but its never being caught.

Considering that the error in question, FormatError: page dictionary kid reference points to wrong type of object, isn't thrown when calling getDocument, it's expected that you cannot catch it like that.

Please note that the error originates in src/core/obj.js#L492-L493, i.e. as a result of a PDFDocumentProxy.getPage() call. In order to catch this error, you'll need to add catch handlers to those calls; e.g. something like this (using your sample code):

PDFJS.getDocument(fileReader.result).then((pdfDocument) => {
  // Fetching a page, e.g. the first.
  pdfDocument.getPage(1).then((pdfPage) => {
    // The page is now available for use...
  }).catch((ex) => {
    // This will catch errors such as:
    // "FormatError: page dictionary kid reference points to wrong type of object"
  });
}).catch( ... )

Edit: In order for this to work as described, you'll need to use PDF.js version 1.8.564 (or greater). The reason for this is that prior to PR #8684, we didn't correctly propagate these kinds of errors from the worker side to PDFDocumentProxy.getPage on the API side.
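
For readers who want a single handler rather than one per call, here is a minimal sketch of the same idea: returning the inner getPage promise from the then callback flattens the chain, so one catch sees rejections from either step. Note that fakeGetDocument and loadFirstPage are hypothetical names invented for this illustration; the stub only imitates the shape of the API so the snippet runs without PDF.js, and it assumes a version in which getPage rejects as described above.

```javascript
// Hypothetical stand-in for PDFJS.getDocument, used only so this sketch
// runs without the library. It is NOT the real implementation.
function fakeGetDocument() {
  // Loading succeeds: the file parses far enough to be opened.
  return Promise.resolve({
    getPage(pageNumber) {
      // Fetching a page fails, imitating the corrupted page tree.
      return Promise.reject(
        new Error('page dictionary kid reference points to wrong type of object'));
    },
  });
}

// Hypothetical helper: because the getPage promise is returned from the
// then callback, its rejection propagates to the single catch below.
function loadFirstPage(getDocument) {
  return getDocument()
    .then((pdfDocument) => pdfDocument.getPage(1))
    .then((pdfPage) => ({ ok: true, page: pdfPage }))
    .catch((ex) => ({ ok: false, message: ex.message }));
}

loadFirstPage(fakeGetDocument).then((result) => {
  console.log(result.ok, result.message);
});
```

If the inner promise is not returned (as in a nested call without return), the outer catch never sees the page error, which matches the behaviour reported in this issue.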

Hi @Snuffleupagus, thank you for your response.
Yes, I have tried putting the catch on getPage. In fact, when I was testing I put a catch on every promise, but none of them caught the error; the code appeared to continue its normal flow (except that pdf.js printed the error in the console and stopped working). Unfortunately, execution never enters the catch, and the library throws an exception in pdf.worker.js:

// Fatal errors that should trigger the fallback UI and halt execution by
// throwing an exception.
function error(msg) {
  if (verbosity >= VERBOSITY_LEVELS.errors) {
    console.log('Error: ' + msg);
    console.log(backtrace());
  }
  throw new Error(msg);
}

Thanks.

Do you have the same issue with a newer version of PDF.js?

@yurydelendik I am using version 1.7.225, which I downloaded a couple of months ago. I haven't tried newer versions (if any).

Please also note that newer versions of PDF.js have a stopAtErrors setting.

I am using version 1.7.225, which I downloaded a couple of months ago. I haven't tried newer versions (if any).

Please note that I specifically mentioned at the end of https://github.com/mozilla/pdf.js/issues/8608#issuecomment-320069449 that you need at least version 1.8.564 for this to work :-)

@ironal WFM at http://jsbin.com/quqinecojo/edit?html,console

Thanks both for the info.
Looking at the documentation, I see that the beta version released today is 1.8.188 and that the stable version is the one I am using.
Would it be a risk to use 1.8.564 in Prod? Is it probably better to wait until it is released?

Would it be a risk to use 1.8.564 in Prod?

Maybe. This risk might have a lower cost than the defect you are experiencing. It's up to you to judge.

Ok I will evaluate what's better ;-)
Edit: any idea when 1.8.564 could get stable?
