Puppeteer: Possibility for page numbers in PDF?

Created on 18 Aug 2017 · 51 Comments · Source: puppeteer/puppeteer

Hi everyone! Currently I use PhantomJS to generate PDF documents from html output. Because PhantomJS isn't really up-to-date, I'd like to switch to Headless Chrome.

The thing that keeps me from making the switch is that it's not possible to print the page number on the bottom of every page. I found out that it's possible to set a header and/or a footer on every page with CSS.

header {
  display: block;
  position: fixed;
  top: 0px;
  left: 0px;
  right: 0px;
}

footer {
  display: block;
  position: fixed;
  bottom: 0px;
  left: 0px;
  right: 0px;
}

The only downside is that this only works for static data, as there is no way to detect and print the page number in the footer or header. PhantomJS lets you set a header and footer for a document and print the page number in them.

feature

All 51 comments

displayHeaderFooter will add a page number to the bottom of the page. Unfortunately, Chrome's print-to-PDF doesn't allow much customization of this, and few browsers support the full set of CSS @page features.

The following worked for me. To see the header, I had to add a top/bottom margin so the page's content wouldn't cover them up in the final pdf.

await page.pdf({
  path: 'page.pdf',
  displayHeaderFooter: true,
  margin: {top: 40, bottom: 40}
});

@JoelEinbinder @aslushnikov might have a better way.

I think CSS counters can help you get the page number in your CSS:
https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Lists_and_Counters/Using_CSS_counters

I tried that together with the CSS from above and a :after pseudo selector to set the content of the footer. It prints the total number of pages on every page because in fact it's only one element.

Has anyone found any workaround for that?

The only thing that worked is displayHeaderFooter. The downside is that it also prints the page title and page date at the top. Page title can be removed by removing the title tag in the html file. Date is OK for me. The only thing that's a no go for me is the filepath/pageurl in the bottom left corner.

Sadly, Puppeteer has no support for adding page numbers, and there is no way to do this in CSS. There is a workaround, however: edit the PDF file after Puppeteer has done its magic.

For my project I've used HummusJS (https://github.com/galkahana/HummusJS) to add page numbers after Puppeteer. Performance is quite good with Puppeteer + Hummus, and even better than wkhtmltopdf or PhantomJS.

HummusJS example to replace strings in an existing pdf file:
https://github.com/galkahana/HummusJS/issues/71

Here you can find different ways to do this:
https://stackoverflow.com/questions/1603301/how-to-add-page-numbers-to-postscript-pdf

Is there something I can do or help with to get this implemented? Does this mean that we have to add/change things in Chromium directly? Because the Page.PrintToPdf debug protocol only accepts a boolean for displayHeaderFooter. I would be happy to help you guys out with this but then I need a pointer at where I should/can start.

// @aslushnikov

I'm not familiar with the PDF generation code in Chromium, but our entry point is here:
https://cs.chromium.org/chromium/src/headless/lib/browser/headless_devtools_manager_delegate.cc?type=cs&l=395

I've been fixing the height of each "page" myself. For example, if you set the PDF size to A4, you can give a container a fixed height of 1122px. Then a little JS fills in the page numbers dynamically.

await page.goto('https://bl.ocks.org/seripap/raw/81241195e182b62adc3c87c27258f85f/', {waitUntil: 'networkidle'});
await page.pdf({
  path: 'hacks.pdf',
  format: 'A4'
});
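A minimal sketch of the fixed-height idea above, assuming each printed page is wrapped in its own container and holds an empty element for the number. The class names `.page` and `.page-number` and the function names `pageLabels` and `numberPages` are hypothetical; they are not taken from the linked example.

```javascript
// Hypothetical helper: builds the label text for each fixed-height page.
function pageLabels(totalPages) {
  const labels = [];
  for (let n = 1; n <= totalPages; n++) {
    labels.push(`${n} / ${totalPages}`);
  }
  return labels;
}

// Meant to run inside the page itself, before printing.
// Assumes markup like:
//   <div class="page"> ... <span class="page-number"></span></div>
// where each .page container has the fixed page height (e.g. 1122px for A4).
function numberPages(doc) {
  const pages = doc.querySelectorAll('.page');
  const labels = pageLabels(pages.length);
  pages.forEach((pageEl, i) => {
    const slot = pageEl.querySelector('.page-number');
    if (slot) slot.textContent = labels[i];
  });
  return labels.length; // number of pages that were labelled
}
```

If these functions ship in a script tag of the page, calling `numberPages(document)` once the content has loaded should fill in the numbers before `page.pdf` runs. This only holds while the content of each container really fits within one printed page.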

A cleaner option to customize the header and footer would be awesome.

Specifically looking for this spec to be implemented: https://www.w3.org/TR/css3-page/#page-based-counters

IMHO, this isn't really a Puppeteer issue; the discussion should be moved to https://bugs.chromium.org/p/chromium/issues/detail?id=368053 or https://bugs.chromium.org/p/chromium/issues/detail?id=740496

Thank you very much @aslushnikov 🍻!

:thumbsup: :thumbsup: :thumbsup:

@aslushnikov omg yes!! Thank you so much!! 👍👍👍

All credits go to @pavelfeldman and @aj-dev! 🎉

Thank you as well and some feedback... I have two points.

  1. The header seems to be several times smaller than the content in the output PDF. A simple <h1>...</h1> placed in both the header and the content comes out much smaller in the header, at least on my Windows machine.

  2. The headerTemplate doesn't seem to evaluate JavaScript. Is this expected? I would like to use it to, for example, hide the page number on the first PDF page.

I was fiddling with the package installed through npm i GoogleChrome/puppeteer. I apologize if I missed something or was too impatient.

@pofider I've noticed the same small header and footer fonts but had no time to dig deeper. There's some scaling issue, I guess. It works OK if I declare an inline style, e.g. style="font-size: 12px;"

What is the expected format of header/footerTemplate?

I set the following as options in page.pdf:

        displayHeaderFooter: true,
        footerTemplate: '<h1>Test Footer</h1>'

but nothing is shown in the generated pdf file. I have tried also with simple text and other html tags, but nothing produces any results.

@acgrama Me too. It seems the Chrome version is not right. Maybe @pofider can give some advice, thanks.

@aslushnikov can headerTemplate accept a function, like the PhantomJS header? That may be more flexible:

header: {
    contents: phantom.callback(function(pageNum, numPages) {
        if (pageNum % 2 != 0) {
            return '<div style="text-align:center">' + pageNum + ' / ' + numPages + '</div>';
        }
        return "";
    }),
},

Thanks very much.

Have you tried not using displayHeaderFooter: true? AFAIK that option shows the default header and footer that Chrome puts on printed pages, so it doesn't make sense alongside headerTemplate or footerTemplate and you are likely getting into a conflict. Try omitting it or setting displayHeaderFooter: false.

@bjrmatos yes, I had set displayHeaderFooter: true.

Here is what Chrome itself uses. You can use the date, title, url, pageNumber, and totalPages classes to mark the nodes that the values are injected into.

headerTemplate = `<div class='date'></div><div class='title'></div>`
footerTemplate = `<div class='url'></div>
    <div class='...'></div>
    <div class='...'>
      <span class='pageNumber'></span>/<span class='totalPages'></span>
    </div>`

@pavelfeldman thanks, this method is perfect, but I only want to add a header on odd-numbered pages, so I can't use it.

@yale8848 did you try nth-child?

@bjrmatos @pavelfeldman I left out the displayHeaderFooter option and tried the suggested footerTemplate, but I can still see no footer.

Are you sure you installed the version from Github? The new feature is not yet released through npm.

@SamVerschueren Ok, then that's the problem! We installed through npm, as part of our docker image build chain.
Thanks a lot, looking forward to testing out the npm-based version!

@k15a thanks, I will try it.

Hey @pavelfeldman ,

I tried the following code, but it is not showing the template as expected:

index.js

var puppeteer = require('puppeteer');

(async () => {
    console.time('pdf-gen');
    const browser = await puppeteer.launch({
        args: ['--no-sandbox', '--disable-setuid-sandbox']
    });
    console.log("Puppeteer Browser Initiation Successful!");
    const page = await browser.newPage();
    var response = await page.goto('https://github.com/GoogleChrome/puppeteer/releases', {
        waitUntil: 'networkidle0'
    });
    if (response.ok) {
        var report = await page.pdf({
            path: 'hn.pdf',
            headerTemplate: `<div><span class='pageNumber'></span> 
            OF <span class='totalPages'></span></div>`
        });
    }
    await browser.close();
    console.timeEnd('pdf-gen');
})();

package.json

{
  "name": "puppeteer-sample",
  "version": "1.0.0",
  "main": "index.js",
  "author": "suri",
  "license": "MIT",
  "private": true,
  "dependencies": {
    "puppeteer": "https://github.com/GoogleChrome/puppeteer"
  }
}

It is not printing the headers as expected.

Any help would be appreciated.

Thank you.

So here's my two cents on how I got the custom templates to work:
Contrary to what @bjrmatos suggested, I did set displayHeaderFooter to true, and set the margin to, say, 40 ({margin: {top: 40, bottom: 40}}).
Then, like @pofider said, the text in the header/footer was very small, almost invisible (as in size 1 or 2, like little ants).
I was not able to correct this with the page's own CSS, so I used an inline style attribute with a normal font-size in the header/footer template itself, like so (use as the value for footerTemplate/headerTemplate):

<div style="font-size:10px!important;color:grey!important;padding-left:10px;" class="pdfheader">
<span>Page: </span><span class="pageNumber"></span>/<span class="totalPages"></span>
</div>

Hope this saves someone a headache.
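Putting the recipe above together, a minimal sketch of the options object might look like the following. The helper names `footerTemplate` and `pdfOptions` are hypothetical; the option names (displayHeaderFooter, footerTemplate, margin) and the pageNumber/totalPages classes are the ones mentioned in this thread.

```javascript
// Hypothetical helper: footer markup with an inline font-size, since the
// comments above report that template text otherwise renders very small.
function footerTemplate(fontSizePx = 10) {
  return (
    `<div style="font-size:${fontSizePx}px;color:grey;padding-left:10px;">` +
    '<span>Page: </span><span class="pageNumber"></span>/' +
    '<span class="totalPages"></span></div>'
  );
}

// Hypothetical helper: options for page.pdf() following the recipe above.
function pdfOptions(path) {
  return {
    path,
    displayHeaderFooter: true,       // templates were only shown with this set
    footerTemplate: footerTemplate(),
    margin: { top: 40, bottom: 40 }, // leaves room so content doesn't cover the footer
  };
}

// Usage, inside an async function with an open Puppeteer page:
//   await page.pdf(pdfOptions('page.pdf'));
```

As noted elsewhere in the thread, this needs a Puppeteer and Chrome build recent enough to include the template feature.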

when will this feature be released to npm? thanks.

It looks like 1.0.0 was released 3 days ago.

https://chromium-review.googlesource.com/c/chromium/src/+/813177 landed in Chrome 65.0.3294.0, which is currently unstable/Canary. Make sure you have that version or else the custom headers/footers won't work!

Has anyone found out how to insert an image into a footer or a header? All I get is the broken-image icon.

Thanks!

@ovheurdrive It seems that the header image is taken from the HTML page's cache, so you can try an image URL that the current HTML already uses, or base64 data.
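A minimal sketch of the base64 suggestion, assuming the image bytes are already in memory as a Node.js Buffer. The helper names `imageDataUri` and `headerWithLogo`, the default MIME type, and the 20px height are illustrative assumptions.

```javascript
// Hypothetical helper: turns raw image bytes into a data: URI usable in <img src>.
function imageDataUri(bytes, mimeType = 'image/png') {
  return `data:${mimeType};base64,${Buffer.from(bytes).toString('base64')}`;
}

// Hypothetical helper: header markup that embeds the image inline, so the
// template does not need to fetch anything when the PDF is printed.
function headerWithLogo(bytes, mimeType = 'image/png') {
  const src = imageDataUri(bytes, mimeType);
  return `<div style="font-size:10px;"><img src="${src}" style="height:20px;" /></div>`;
}

// Usage, assuming a local file named logo.png:
//   const logo = require('fs').readFileSync('logo.png');
//   await page.pdf({ displayHeaderFooter: true, headerTemplate: headerWithLogo(logo) });
```

Large images make the template string correspondingly large, so a small logo is the safer choice here.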

@ayal I had the same issue, and managed to style the header/footer without using inline styles, without the !important marks. All I do is set @page {margin: 1cm} and then embed a <style> element in the template.

Here is what my variable footerTemplate looks like (produces "Page 1/30" etc.):

<style type="text/css">
.pdfheader {
  font-size: 10px;
  font-family: 'Raleway';
  font-weight: bold;
  width: 1000px;
  text-align: center;
  color: grey;
  padding-left: 10px;
}
</style>

<div class="pdfheader">
  <span>Page </span>
  <span class="pageNumber"></span> / <span class="totalPages"></span>
</div>

It seems that some stylings just won't work though, like background-color.

@yale8848 do you have any progress with nth-child?

@paul-pro

<style type="text/css">
.test {
  font-size: 10px;
  font-family: 'Raleway';
  font-weight: bold;
  width: 1000px;
  text-align: center;
  color: grey;
  padding-left: 10px;
}
div.test:nth-child(odd)
{
    color:#ff0000;
}
</style>
<div class="test">test1</div>
<div class="test">test2</div>
<div class="test">test3</div>

nth-child works within each page, but not across the pages of the whole PDF.

Did you try using