Pdf.js: PDF with huge embedded image is displayed as a blank page

Created on 10 Dec 2015 · 16 Comments · Source: mozilla/pdf.js

PDF: DC000933-uncompressed.pdf

If I open the above PDF in Chrome or Firefox, it is shown as blank at best. Firefox sometimes crashes if I (re)load it a couple of times (even right after a fresh start-up). I haven't built Firefox with symbols, but GDB shows that the stack trace ends in mozalloc_abort, which suggests that this crash is caused by a memory allocation error.

This PDF is quite unusual. Its embedded image has an excessive width and height:

<< /Contents 10 0 R /MediaBox [ 0 0 4444 2592 ] /Parent 6 0 R /Resources << /ExtGState << /G0 11 0 R /G1 12 0 R /G2 13 0 R >> /ProcSet [ /ImageB /ImageC /ImageI ] /XObject << /I0 14 0 R >> >> /Type /Page >>

...

<< /BitsPerComponent 1 /ColorSpace /DeviceGray /DecodeParms << /Columns 24688 /K -1 /Rows 14400 >> /Filter /CCITTFaxDecode /Height 14400 /Name /I0 /Subtype /Image /Type /XObject /Width 24688 /Length 972551 >>

It was allegedly created by MetaPrint.
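
For a sense of scale, the dimensions in the image dictionary above can be turned into a rough memory estimate. The sketch below is only back-of-the-envelope arithmetic: `estimateImageBytes` is a hypothetical helper, and the 32-bits-per-pixel case assumes the decoded image ends up as RGBA data (as canvas image data is), which may not match what pdf.js does internally for 1-bit images.

```javascript
// Hypothetical helper: rough size of an uncompressed image buffer in bytes.
function estimateImageBytes(width, height, bitsPerPixel) {
  // Each row is padded up to a whole number of bytes.
  const bytesPerRow = Math.ceil((width * bitsPerPixel) / 8);
  return bytesPerRow * height;
}

const width = 24688;  // /Width from the image dictionary
const height = 14400; // /Height from the image dictionary

const pixels = width * height;
// 1 bit per pixel, matching /BitsPerComponent 1 with /DeviceGray.
const packedBytes = estimateImageBytes(width, height, 1);
// Assumption: expanded to RGBA at 4 bytes per pixel.
const rgbaBytes = estimateImageBytes(width, height, 32);

console.log(pixels + " pixels");
console.log((packedBytes / 2 ** 20).toFixed(1) + " MiB packed at 1 bit per pixel");
console.log((rgbaBytes / 2 ** 30).toFixed(2) + " GiB if expanded to RGBA");
```

Under that assumption the uncompressed bitmap is well over a gigabyte, which would be consistent with the allocation failure seen in the stack trace.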

1-core 2-performance

All 16 comments

Yes, we have seen such PDF files before with images with a huge width and height. We might need to implement something to recognize that and downscale the image.
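
The downscaling idea can be illustrated with a small calculation. This is a minimal sketch, not pdf.js code: `fitWithinPixelBudget` is a hypothetical helper, and the 16-megapixel budget is an arbitrary example value chosen only for illustration.

```javascript
// Hypothetical helper: pick target dimensions so that width * height
// stays within maxPixels while preserving the aspect ratio.
function fitWithinPixelBudget(width, height, maxPixels) {
  const pixels = width * height;
  if (pixels <= maxPixels) {
    return { scale: 1, width, height }; // already small enough
  }
  // The same factor is applied to both axes, so the pixel count
  // shrinks by scale * scale.
  const scale = Math.sqrt(maxPixels / pixels);
  return {
    scale,
    width: Math.max(1, Math.floor(width * scale)),
    height: Math.max(1, Math.floor(height * scale)),
  };
}

// The image from the report, with an arbitrary example budget.
const budget = 16 * 1024 * 1024;
const target = fitWithinPixelBudget(24688, 14400, budget);
console.log(target);
```

Rounding the dimensions down keeps the result within the budget. A real implementation would also have to decide where in the decoding pipeline the resampling happens, which this sketch does not address.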

I think it should return an error so that developers can handle the failure to open the PDF.

These files render fine at least with Evince and PDFium.

Is there any news on this issue yet? We run into the same problem with half of our PDFs, as they consist of scanned building plans. Or is there a known cause of this problem?

@redlum94 Scanned building plans sound like large images (especially with a high resolution). In the first comment, @timvandermeij already suggested that we should downscale the image when needed.

I have worked on image downscaling before in PR #6606, but that patch has bitrotten.

Does anyone know if there is a global variable I can look at to see if the viewer is empty or not?

I am another user affected by the bug. My tool also works with architectural diagrams, where page sizes can be A1 or larger.

This bug was triggered by a PDF which was 841 (width) × 1007 (height) mm in size. What is interesting is that pdf.js handles A0 Landscape PDFs (1189 × 841 mm) and A0 Portrait PDFs (841 × 1189 mm) no problem. So it is not just a case of PDFs being "too big". Certain dimensions cause problems - even those smaller than other dimensions that don't cause problems.

The issue here is related to the size of images within the PDF, not the size of the PDF. A good architectural PDF would use vector graphics, so the problem here wouldn't occur at all for such files.

Is that the case, @THausherr? I experimented with two A0 Landscape pdfs containing one image each (with no vector graphics whatsoever), and pdf.js handled it fine on my computer. It just choked on the 841 x 1007 mm pdf.

It also depends on the size of the image within the PDF. The image from the PDF mentioned on top is 14400 x 24688. It would be interesting to see the size and colorspace of your PDFs.

I've got a lot to learn, obviously. Opening the 841 x 1007 PDF in a text editor, I get "/Width 13248 /Height 15858 /ColorSpace /DeviceGray". The A0 PDF has "/ColorSpace/DeviceRGB /Width 6622 /Height 9362". I hope that answers your question, @THausherr.
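
The two sets of dimensions quoted above can be compared directly. The sketch below is illustrative arithmetic only: `describeScan` is a hypothetical helper, and it assumes that each image fills its page and that the image width spans the 841 mm side.

```javascript
const MM_PER_INCH = 25.4;

// Hypothetical helper: pixel count and effective resolution of a scan,
// assuming the image width spans the given page width.
function describeScan(widthPx, heightPx, pageWidthMm) {
  return {
    pixels: widthPx * heightPx,
    dpi: Math.round(widthPx / (pageWidthMm / MM_PER_INCH)),
  };
}

const failing = describeScan(13248, 15858, 841); // the 841 x 1007 mm page
const working = describeScan(6622, 9362, 841);   // the A0 page, 841 x 1189 mm

console.log(failing);
console.log(working);
console.log((failing.pixels / working.pixels).toFixed(1) + "x as many pixels");
```

On these numbers the physically smaller page carries more than three times as many pixels as the A0 page (roughly 400 dpi versus roughly 200 dpi under the stated assumptions), which fits the explanation that the image size matters rather than the page size.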

So is there any news on this issue yet?

I'm experiencing the same bug with files scanned at 1200 dpi, so I can confirm what @THausherr said: the scanner produces a one-page PDF with an image sized 9824x13934. The color space is "/DeviceGray", but I think that the main issue here is the picture size.

Changing the maxImageSize option can solve this problem:

pdfjsLib.getDocument({ url: url, maxImageSize: MAX_IMAGE_SIZE })

@jxintang It does not work for me. Besides, the documentation gives maxImageSize a default value of -1, which means "render everything", but that does not solve the issue:

` * @property {number} maxImageSize - (optional) The maximum allowed image size in total pixels, i.e. width * height. Images above this value will not be rendered. Use -1 for no limit, which is also the default value.`

Am I wrong?

In my case, I can see a warning like this in Chrome:

Warning: Image exceeded maximum allowed size and was removed.

So I checked the source code in pdf.worker.js. When I changed the maxImageSize option, the problem was solved.
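
The behaviour described by that warning and by the documentation quoted earlier in the thread can be sketched as a simple predicate. This is not the actual pdf.worker.js source: `exceedsMaxImageSize` is a hypothetical helper that only mirrors the documented semantics (limit in total pixels, -1 for no limit).

```javascript
// Hypothetical helper mirroring the documented maxImageSize semantics:
// the limit is in total pixels (width * height), and -1 means "no limit".
function exceedsMaxImageSize(width, height, maxImageSize) {
  if (maxImageSize === -1) {
    return false; // no limit configured, nothing is removed
  }
  return width * height > maxImageSize;
}

// Scan size reported earlier in the thread.
const scanWidth = 9824;
const scanHeight = 13934;

const withNoLimit = exceedsMaxImageSize(scanWidth, scanHeight, -1);
const withSmallLimit = exceedsMaxImageSize(scanWidth, scanHeight, 1024 * 1024);
const withExactLimit = exceedsMaxImageSize(scanWidth, scanHeight, scanWidth * scanHeight);

console.log(withNoLimit, withSmallLimit, withExactLimit);
```

If a limit like this is what removes the image, then passing a `maxImageSize` at least as large as `width * height` (or -1) to `pdfjsLib.getDocument` would let the image through. It would not reduce the memory needed to render it, which may be why the option helps in some setups and not in others.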
