PDF.js: API to get images?

Created on 29 Feb 2016 · 2 comments · Source: mozilla/pdf.js

Hi everyone,

pdf.js is great! I was just wondering what I need to do in order to get all the images of a page in Node.js.
Seems the API is not quite there yet?

Maybe you could give me a hint on how to accomplish that.

Thanks a lot
Andreas

deepflame

All 2 comments

At the moment you have to use getOperatorList (see https://github.com/mozilla/pdf.js/blob/master/src/display/api.js#L1033 and the SVG converter as an example at https://github.com/mozilla/pdf.js/blob/master/examples/svgviewer/viewer.js#L39). There are multiple ways images might be stored in a PDF: as JPEG with or without a mask, as PNG, as scanned pages, as black-and-white bitmap data, or as a pattern; sometimes an image is split into several small pieces. Please find out which types of images are used in your files and process only the operations you need from the operator list. Closing as answered. Recovering the original images from a PDF has little value for the viewer, so as it stands this requirement is out of scope for this project.

yurydelendik on 29 Feb 2016

👍3
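
To make the getOperatorList suggestion a bit more concrete, here is a minimal sketch of the filtering step only. It assumes the operator list has the `fnArray` / `argsArray` shape that pdf.js returns and that an `OPS` name-to-code table is available. The helper `collectImageOps`, the stand-in `fakeOPS` table, and the sample arguments are hypothetical and exist only so the snippet runs without pdf.js installed; the numeric codes are not the real pdf.js values.

```javascript
// Names of image-painting operators as they appear in pdf.js's OPS table.
// Assumption: the exact set differs between pdf.js versions, so any name
// missing from the supplied OPS table is simply skipped.
const IMAGE_OP_NAMES = [
  "paintJpegXObject",
  "paintImageXObject",
  "paintInlineImageXObject",
  "paintImageMaskXObject",
];

// Hypothetical helper: pick the image-related entries out of an operator list.
function collectImageOps(operatorList, OPS) {
  // Reverse lookup from numeric operator code to operator name.
  const codeToName = new Map();
  for (const name of IMAGE_OP_NAMES) {
    if (name in OPS) {
      codeToName.set(OPS[name], name);
    }
  }

  const found = [];
  operatorList.fnArray.forEach((fn, index) => {
    if (codeToName.has(fn)) {
      found.push({
        index,
        op: codeToName.get(fn),
        args: operatorList.argsArray[index],
      });
    }
  });
  return found;
}

// Stand-in data (made-up codes and arguments, not real pdf.js output).
const fakeOPS = { save: 1, restore: 2, paintImageXObject: 3, paintJpegXObject: 4 };
const fakeList = {
  fnArray: [1, 3, 2, 4],
  argsArray: [null, ["img_p0_1", 640, 480], null, ["img_p0_2", 100, 50]],
};

const images = collectImageOps(fakeList, fakeOPS);
console.log(images.map((entry) => `${entry.index}: ${entry.op}`));
```

With pdf.js itself, the list would come from `page.getOperatorList()` and the table from the library's exported `OPS` object. How the pixel data behind each entry is then retrieved depends on the image type and the pdf.js version, which is why the comment above recommends handling only the operations your files actually use.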

Hi @yurydelendik, thank you so much for your detailed and really fast response. This is highly appreciated! Wishing you a nice day ahead.

deepflame on 29 Feb 2016
