Pdf.js: helloworld example can't render some PDF files that include Chinese characters

Created on 19 Jan 2018 · 6 Comments · Source: mozilla/pdf.js

The helloworld example fails to render Chinese characters on the canvas,
but the same file opens fine in the official pdf.js viewer.

helloworld example: [image]

pdf.js viewer: [image]

import React, { Component } from 'react'
import { List } from 'react-virtualized'
// PDFJS is assumed to be the pdf.js 1.x global, loaded separately via a <script> tag

class Page extends Component {
  componentDidMount() {
    this.renderPDF()
  }

  renderPDF = async () => {
    const { pdf, pageNumber, pageWidth, pageHeight, scale, } = this.props
    const page = await pdf.getPage(pageNumber)
    const viewport = page.getViewport(scale)
    const canvas = this.refs.page
    const context = canvas.getContext('2d')
    canvas.height = viewport.height
    canvas.width = viewport.width
    page.render({
      canvasContext: context,
      viewport: viewport,
    })
  }

  render() {
    const { index, style, } = this.props
    return (
      <div className='watermark-layer' key={index} style={style}>
        <canvas ref='page'></canvas>
      </div>
    )
  }
}

export default class PDFViewer extends Component {
  state = {
    pdf: null,
    numPages: 0,
    pageWidth: 0,
    pageHeight: 0,
    scale: 0,
  }

  style = {
    minHeight: '100vh',
  }

  async componentDidMount() {
    const { file } = this.props
    const pdf = await PDFJS.getDocument(file)
    const { numPages } = pdf
    const { pageWidth, pageHeight, scale } = await this.getSize(pdf)
    this.setState({ pdf, numPages, pageWidth, pageHeight, scale, })
  }

  getSize = async (pdf, full) => {
    const page = await pdf.getPage(1)
    const { width, height } = page.getViewport(1)
    const { pdfdiv } = this.refs
    let pageWidth, pageHeight, scale

    if (full) {
      pageWidth = pdfdiv.clientWidth
      pageHeight = pageWidth * height / width
      scale = pageWidth / width
    } else {
      pageHeight = pdfdiv.clientHeight
      pageWidth = pageHeight * width / height
      scale = pageWidth / width
    }

    return {
      pageWidth,
      pageHeight,
      scale,
    }
  }

  render() {
    const { pdf, numPages, pageWidth, pageHeight, scale, } = this.state
    return (
      <div ref='pdfdiv' style={this.style}>
        <List
          style={{ margin: '0 auto' }}
          width={pageWidth}
          height={pageHeight}
          rowHeight={pageHeight}
          rowCount={numPages}
          rowRenderer={({ index, style, }) => {
            return (
              <Page
                key={index}
                index={index}
                style={style}
                pdf={pdf}
                pageNumber={index + 1}
                pageWidth={pageWidth}
                pageHeight={pageHeight}
                scale={scale}
              />
            )
          }}
        />
      </div>
    )
  }
}

All 6 comments

"It fails to render Chinese characters on the canvas,
but the same file opens fine in the official pdf.js viewer."

Since most of the required information is missing, e.g. the PDF file isn't attached to the issue and no complete (runnable) example is provided, which is requested in both ISSUE_TEMPLATE.md and CONTRIBUTING.md, the following is a best guess:
You need to include the CMap files and set the following options accordingly:
https://github.com/mozilla/pdf.js/blob/75dc2bbd359990ebb1c1484f204acf66c3cb8221/src/display/global.js#L101-L112
See how that's done in, e.g., the simpleviewer components example:
https://github.com/mozilla/pdf.js/blob/75dc2bbd359990ebb1c1484f204acf66c3cb8221/examples/components/simpleviewer.js#L27-L30
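
As a rough illustration of the advice above, the two CMap options can be built in one place and applied before the document is loaded. This is only a sketch: `buildCMapOptions` is a hypothetical helper, not part of pdf.js, and the `/static/pdfjs/cmaps` path is a placeholder for wherever the `cmaps/` directory is actually hosted.

```javascript
// Hypothetical helper (not part of pdf.js): builds the CMap-related options.
function buildCMapOptions(cMapBaseUrl) {
  // The CMap name is appended directly to cMapUrl, so the base must end in "/".
  const cMapUrl = cMapBaseUrl.endsWith('/') ? cMapBaseUrl : cMapBaseUrl + '/';
  // The cmaps/ directory shipped with pdfjs-dist holds packed binary (.bcmap) files.
  return { cMapUrl, cMapPacked: true };
}

// Placeholder location; point it at the real cmaps/ directory.
const cMapOptions = buildCMapOptions('/static/pdfjs/cmaps');

// pdf.js 1.x, as used in this thread: copy the options onto the PDFJS global
// before calling PDFJS.getDocument(file).
if (typeof PDFJS !== 'undefined') {
  Object.assign(PDFJS, cMapOptions);
}

// Later pdf.js releases (2.x onward) dropped the global settings; there the
// same two options are passed per document, e.g.
// getDocument({ url: file, ...cMapOptions }).
```

In the React component from the question, this would run once before `componentDidMount` calls `PDFJS.getDocument(file)`.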

Closing as answered.

I can email you the PDF file that fails to render.

@Snuffleupagus I've just sent you a copy of the problematic PDF file.

"I've just sent you a copy of the problematic PDF file."

For future reference: You should attach the PDF file to the issue rather than emailing it, since arbitrary contributors shouldn't be expected to provide you personalized support.

Besides, the answer already given in https://github.com/mozilla/pdf.js/issues/9380#issuecomment-358929935 still stands (regarding CMaps), so I'm not sure what the problem is here.

Okay, my fault. I emailed it because the file includes some "not so private but" data. Will try it.

Working:

<script src='https://cdn.jsdelivr.net/npm/[email protected]/build/pdf.min.js'></script>
<script src='https://cdn.jsdelivr.net/npm/[email protected]/build/pdf.worker.min.js'></script>
<script>
// cMapUrl must end with "/"
PDFJS.cMapUrl = 'https://cdn.jsdelivr.net/npm/[email protected]/cmaps/'
// The CMaps in this directory are packed; setting cMapPacked = true avoids
// Warning: Ignoring invalid character "121" in hex string
PDFJS.cMapPacked = true
</script>
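
Both comments in the snippet above make more sense once you see how a CMap URL is put together. The sketch below assumes pdf.js simply concatenates the base URL and the CMap name, adding a `.bcmap` suffix when the packed format is enabled. `resolveCMapUrl` is a hypothetical stand-in for that internal logic, and `UniGB-UCS2-H` is just one example of a predefined CMap name used for Simplified Chinese text.

```javascript
// Hypothetical stand-in for how pdf.js is assumed to compose a CMap URL:
// base + name, plus ".bcmap" when the packed binary format is enabled.
function resolveCMapUrl(cMapUrl, name, cMapPacked) {
  return cMapUrl + name + (cMapPacked ? '.bcmap' : '');
}

const cmapName = 'UniGB-UCS2-H'; // example predefined CMap name

console.log(resolveCMapUrl('/cmaps/', cmapName, true));  // /cmaps/UniGB-UCS2-H.bcmap
console.log(resolveCMapUrl('/cmaps', cmapName, true));   // /cmapsUniGB-UCS2-H.bcmap (missing "/")
console.log(resolveCMapUrl('/cmaps/', cmapName, false)); // /cmaps/UniGB-UCS2-H (text CMap expected)
```

Under that assumption, leaving off the trailing slash yields a URL that doesn't exist, and leaving `cMapPacked` unset makes pdf.js expect a plain-text CMap where the distribution only provides packed ones.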