Pdf.js: Example using a textLayer

Created on 6 Oct 2019 · 8 Comments · Source: mozilla/pdf.js

I have the basic pdf.js API working to render onto the canvas but I can't get it to work with the textLayer. The only examples I can find are very old and no longer work.

The TypeScript definition shows that I have to pass:

interface PDFRenderTextLayer {
    beginLayout(): void;
    endLayout(): void;
    appendText(): void;
}

... but the text layer code from the viewer doesn't actually seem to use this pattern.

There need to be more complete examples, as the ones we have now are very basic.

I appreciate that there's an amazing viewer app but it requires reverse engineering to actually determine how to do this. I can submit a PR for more examples if someone points me in the right direction.

All 8 comments

Hello,

I think creating a PR with more complex examples is a good idea, and we could also document the API better, because today the best documentation is the code.

As for your question, here is how I build my PdfViewer in TypeScript. I use the PDFViewer object and inject the features I need.

import { PDFJSStatic, PDFPageViewport, PDFViewer, PDFFindController, PDFLinkService } from 'pdfjs-dist';
import * as pdfjsLib from 'pdfjs-dist/build/pdf';
import * as pdfjsViewer from 'pdfjs-dist/lib/web/pdf_viewer';
import * as pdfjsLinkService from 'pdfjs-dist/lib/web/pdf_link_service';
import * as pdfjsFindController from 'pdfjs-dist/lib/web/pdf_find_controller';

// Method of a component class that declares pdfLinkService, pdfFindController,
// pdfElement, pdfService and docId.
private async pdfViewer(): Promise<void> {
    const container = document.getElementById('viewerContainer');

    // The link service and find controller are created first so they can be
    // injected into the viewer.
    this.pdfLinkService = new pdfjsLinkService.PDFLinkService();
    this.pdfFindController = new pdfjsFindController.PDFFindController({
      linkService: this.pdfLinkService,
    });

    // The viewer builds the text layer itself when textLayerMode is set.
    this.pdfElement = new pdfjsViewer.PDFViewer({
      container: container,
      linkService: this.pdfLinkService,
      findController: this.pdfFindController,
      enhanceTextSelection: true,
      textLayerMode: 2
    });

    this.pdfLinkService.setViewer(this.pdfElement);

    // Loading document.
    // The find controller is wired up through PDFViewer above, so it is not
    // passed to getDocument.
    const loadingTask = pdfjsLib.getDocument({
      url: this.pdfService.getPdfDocUrl(this.docId),
      cMapUrl: '../../node_modules/pdfjs-dist/cmaps/',
      cMapPacked: true,
    });

    const pdfDocument = await loadingTask.promise;

    // Hand the loaded document to the viewer and to each injected service.
    this.pdfElement.setDocument(pdfDocument);
    this.pdfLinkService.setDocument(pdfDocument, null);
    this.pdfFindController.setDocument(pdfDocument);
}

Thanks... I will try this. Why do you use the PDFViewer object and not the raw PDF API?

@zagoa how did you get those imports to work from pdfjs?

I'm using @types/pdfjs-dist for my types... is there a better TypeScript definition set I should use?

@zagoa looks like 2.2.228 doesn't have types either.

Did you check the viewer components example at https://github.com/mozilla/pdf.js/blob/master/examples/components/pageviewer.js#L55? It's an up-to-date example on how to get a text layer. If that is not sufficient, additional examples are always welcome.
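For readers wondering what a text layer has to compute, here is a minimal sketch of the positioning step, written without any pdf.js imports. It assumes each text item carries a six-element transform and that the viewport exposes a six-element transform as well. The names `TextItemLike`, `SpanPlacement` and `placeTextItem` are hypothetical and not part of the pdf.js API. The sketch ignores rotation and treats the font ascent as the full font height, so it is an approximation of the idea rather than the library's implementation.

```typescript
// Hypothetical helper types; they only mirror the shape of the data,
// they are not imported from pdf.js.
type Matrix = [number, number, number, number, number, number];

interface TextItemLike {
  str: string;
  transform: Matrix; // text matrix in PDF user space
}

interface SpanPlacement {
  left: number;     // CSS pixels from the left edge of the page container
  top: number;      // CSS pixels from the top edge of the page container
  fontSize: number; // CSS pixels
}

// Multiply two 2D affine matrices stored as [a, b, c, d, e, f].
function multiply(m1: Matrix, m2: Matrix): Matrix {
  return [
    m1[0] * m2[0] + m1[2] * m2[1],
    m1[1] * m2[0] + m1[3] * m2[1],
    m1[0] * m2[2] + m1[2] * m2[3],
    m1[1] * m2[2] + m1[3] * m2[3],
    m1[0] * m2[4] + m1[2] * m2[5] + m1[4],
    m1[1] * m2[4] + m1[3] * m2[5] + m1[5],
  ];
}

// Map one text item into page-container coordinates.
// Assumption: no rotation, ascent equal to the font height.
function placeTextItem(viewportTransform: Matrix, item: TextItemLike): SpanPlacement {
  const tx = multiply(viewportTransform, item.transform);
  const fontSize = Math.sqrt(tx[2] * tx[2] + tx[3] * tx[3]);
  return { left: tx[4], top: tx[5] - fontSize, fontSize };
}

// Usage: a 792pt-high page rendered at scale 2 flips the y axis,
// giving the viewport transform [2, 0, 0, -2, 0, 1584].
const placement = placeTextItem(
  [2, 0, 0, -2, 0, 1584],
  { str: "Hello", transform: [12, 0, 0, 12, 72, 700] }
);
console.log(placement); // { left: 144, top: 160, fontSize: 24 }
```

Each placement would then become an absolutely positioned, transparent span over the canvas, which is what makes the rendered text selectable.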

TypeScript improvements are already tracked in other open issues and even PRs. We're discussing what the best path forward is; most likely we're going forward with generating the typings from JSDoc comments, but let's track that elsewhere.

@timvandermeij Thanks for this! Really appreciate it. I'm about to rewrite Polar's pdf.js support using our own PDF viewer from the ground up, so the TypeScript information is super helpful!

Really appreciate all your help!

The TypeScript improvements are tracked in #7909 and API documentation improvements are tracked in #6526 (but both are closely related). The existing viewer components example should suffice for how to use the text layer. However, if there are specific usages that are not covered by the examples, please create a follow-up issue with a clear description of what cannot be achieved with the current examples so we can consider expanding the set of examples. Thank you!

These examples are not even close to understandable when it comes to what we have to do to enable search on a page...
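On the search question, the underlying idea can be sketched without the viewer: join the strings of a page's text items, look for the query in the joined text, and map each hit back to the item (and so the text-layer span) where it starts. The code below is a conceptual sketch under that assumption. `TextItemLike`, `MatchPosition` and `findMatches` are hypothetical names, and this is not how PDFFindController is implemented internally.

```typescript
// Hypothetical shapes for illustration only; not pdf.js types.
interface TextItemLike {
  str: string;
}

interface MatchPosition {
  itemIndex: number; // index of the text item in which the match starts
  offset: number;    // character offset inside that item's string
}

// Case-insensitive search across the concatenated strings of a page.
// Note: plain toLowerCase() is a naive form of case folding.
function findMatches(items: TextItemLike[], query: string): MatchPosition[] {
  const results: MatchPosition[] = [];
  if (query.length === 0) {
    return results;
  }

  // Join all strings and remember where each item starts in the joined text.
  const starts: number[] = [];
  let pageText = "";
  for (const item of items) {
    starts.push(pageText.length);
    pageText += item.str;
  }

  const haystack = pageText.toLowerCase();
  const needle = query.toLowerCase();
  let from = haystack.indexOf(needle);
  while (from !== -1) {
    // Pick the last item whose start offset is not past the match position.
    let itemIndex = 0;
    while (itemIndex + 1 < starts.length && starts[itemIndex + 1] <= from) {
      itemIndex++;
    }
    results.push({ itemIndex, offset: from - starts[itemIndex] });
    from = haystack.indexOf(needle, from + 1);
  }
  return results;
}

// Usage: the page text "Hello world hello" is split across three items.
const matches = findMatches(
  [{ str: "Hello " }, { str: "wor" }, { str: "ld hello" }],
  "hello"
);
console.log(matches); // [ { itemIndex: 0, offset: 0 }, { itemIndex: 2, offset: 3 } ]
```

A match can start in one item and end in another (searching for "world" above starts in item 1 and ends in item 2), so highlighting code has to handle spans that cover several text-layer elements.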
