Pdf.js: DPUScan pdf shows black screen

Created on 1 Sep 2018 · 14 Comments · Source: mozilla/pdf.js

Attach (recommended) or Link to PDF file here:
I am unable to share the official document due to restrictions, but I have attached a screenshot. The issue seems to happen for all DPUScan documents with PDF version 1.5.
img-20180831-wa0006

Configuration:

  • Web browser and its version: Firefox 45. Also tested on Firefox 52
  • Operating system and its version: Windows 7
  • PDF.js version: Tested with 2.0.55.0 and 1.10.100
  • Is a browser extension: No

Steps to reproduce the problem:

  1. Rendering a DPUScan PDF 1.5 document via pdf.js gives a black screen. I also tried opening the same document in the online viewer and ended up with a black screen as well.
  2. Attached the screenshot of the problem.

What is the expected behavior? (add screenshot)
PDF should be rendered correctly

What went wrong? (add screenshot)
Black screen is shown

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):

1-core 3-pdf-broken

All 14 comments

From https://github.com/mozilla/pdf.js/blob/master/.github/CONTRIBUTING.md:

If the issue is related to errors produced by a specific PDF, please always include the PDF by providing a URL where contributors can download it. Without a PDF for reproduction, such issues will be closed.

Closing since there is nothing we can do without an example PDF file. Perhaps you can make a non-classified PDF file with the same tool to share here, after which we can reopen this.

I won't be able to share the PDF file here. Can I have your personal email address so I can send you the document?

Finally managed to get a non-classified pdf file. I have attached the same here. Can someone have a look?
test.pdf

The file also fails in PDFBox. In Java the image has an RGB ICC colorspace, but the raster has only 1 band with 1-bit pixels. One would have to analyze the JPEG2000 image with a good tool that tells what metadata is really there.
PDFJS-10026-image.zip
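
One way to see what the codestream itself declares, without a dedicated tool, is to read its SIZ marker segment, which carries the image size and the per-component bit depth. The following is only a sketch under assumptions: `readSizHeader` is a hypothetical helper (not a pdf.js or PDFBox API), and it assumes the input bytes contain a JPEG2000 codestream whose SOC/SIZ marker pair (0xFF4F 0xFF51) can be located with a plain byte search.

```javascript
// Hypothetical helper, not part of pdf.js: reads the SIZ marker segment
// of a JPEG2000 codestream and reports what the stream itself declares.
function readSizHeader(bytes) {
  // Locate SOC (0xFF4F) immediately followed by SIZ (0xFF51).
  let p = -1;
  for (let i = 0; i + 3 < bytes.length; i++) {
    if (bytes[i] === 0xff && bytes[i + 1] === 0x4f &&
        bytes[i + 2] === 0xff && bytes[i + 3] === 0x51) {
      p = i + 4; // offset of Lsiz, the first field after the SIZ marker
      break;
    }
  }
  if (p < 0) {
    throw new Error("No SOC/SIZ marker pair found");
  }
  // All multi-byte fields are big-endian (the DataView default).
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const xsiz = view.getUint32(p + 4);
  const ysiz = view.getUint32(p + 8);
  const xosiz = view.getUint32(p + 12);
  const yosiz = view.getUint32(p + 16);
  const componentCount = view.getUint16(p + 36);
  const components = [];
  for (let c = 0; c < componentCount; c++) {
    const base = p + 38 + 3 * c; // Ssiz, XRsiz, YRsiz: one byte each
    const ssiz = view.getUint8(base);
    components.push({
      bitDepth: (ssiz & 0x7f) + 1, // low 7 bits store depth minus one
      signed: (ssiz & 0x80) !== 0, // high bit marks signed samples
      xStep: view.getUint8(base + 1),
      yStep: view.getUint8(base + 2),
    });
  }
  return { width: xsiz - xosiz, height: ysiz - yosiz, components };
}
```

Calling `readSizHeader(new Uint8Array(imageBytes))` on the extracted image data (where `imageBytes` is a placeholder for however the bytes were obtained) would show whether the stream really declares a single 1-bit component, independent of the colorspace the PDF assigns to it.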

Is there any change we can make on the pdf.js side to render this PDF file?

Here's another exotic PDF file with a JPEG2000 image that can't be rendered with PDF.js (page 11, bottom right); the JPEG2000 image has 4 bits per pixel.
https://issues.apache.org/jira/secure/attachment/12655396/PDFBOX-2204-012411.pdf

Any temporary fixes we can apply to the pdf.js code to render this pdf? Please suggest.

Hello, can you please provide an update?

Hello! I'm having the same problem :/

Could you please take a look at this? Same problem here!

Actually, the whole team is depending on this.

Same problem : /

I did some debugging and looked at jpx_stream.js. After jpxImage.parse() the object has 1 tile which is only zeroes. The size is correct, i.e. 2496 x 3512 = 8765952.

(With the other file I attached, the decoded bytes were all 255)

So this suggests that the problem is in the JPEG2000 decoder and not with the colorspace as in PDFBox.
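
The check described above can be repeated with a few lines of code. This is a minimal sketch, not pdf.js code: `summarizeSamples` is a hypothetical helper, and the all-zero array below only stands in for the decoded tile data (which I am assuming is available as a typed array after `jpxImage.parse()`).

```javascript
// Hypothetical debugging helper, not part of pdf.js: reports the value
// range of a decoded sample buffer so a uniform tile is easy to spot.
function summarizeSamples(samples) {
  if (samples.length === 0) {
    return { length: 0, min: undefined, max: undefined, uniform: false };
  }
  let min = samples[0];
  let max = samples[0];
  for (let i = 1; i < samples.length; i++) {
    const value = samples[i];
    if (value < min) min = value;
    if (value > max) max = value;
  }
  // A tile whose minimum equals its maximum holds one value everywhere.
  return { length: samples.length, min, max, uniform: min === max };
}

// Stand-in for the decoded tile: 2496 x 3512 samples, all zero.
const standInTile = new Uint8ClampedArray(2496 * 3512);
console.log(summarizeSamples(standInTile));
```

A result with `uniform: true` and `min` of 0 (or 255 for the other file) means the decoder produced no image detail at all, which points at the decoding step rather than at the later colorspace conversion.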

@rafaelcaviquioli this is an open source project and powered by volunteers who may or may not have time. Re "my head depending on this solution": if it is so, then free somebody or several people of your team for a few days and have them debug the JPEG2000 decoder. The show happens in
https://github.com/mozilla/pdf.js/blob/master/src/core/jpx.js
near "case 0xFF93:".
Compare what is happening with the JPEG2000 specs (see https://jpeg.org/jpeg2000/ ) or compare what is happening to a working JPEG2000 decoder, e.g. the one from Java,
https://github.com/jai-imageio/jai-imageio-jpeg2000/
