PyMuPDF: Question / Comment: How can I use this to combine two PDFs?

Created on 25 Aug 2020 · 2 Comments · Source: pymupdf/PyMuPDF

I am using Arch Linux with Python 3.7.3 installed.

I want to try this to combine a few PDFs along with their bookmarks.

Can you help me get started?

I have mupdf-gl. It is lightning fast, even when opening a document of 100,000+ pages. I use it as my default PDF viewer.

But having heard that it can be used for more than just viewing, I want to use it for this.

I have also started using MuPDF mini for Android. It is simply super fast.

Also, how can I do the things below, which I currently do with pdftk, i.e. dump the outline and later update it?

pdftk file.pdf dump_data output dump_data.txt
pdftk file.pdf update_info dump_data.txt output file_index_updated.pdf
question resolved

Most helpful comment

I am afraid you must read the documentation. It has sections called "Tutorial" and "Collection of Recipes", where you will find example code for all of the questions you asked above.
To give you at least the simplest case, here are a few lines that join two PDFs and also combine their tables of contents.

import fitz  # PyMuPDF

doc1 = fitz.open("file1.pdf")  # open file 1
toc1 = doc1.get_toc(False)  # its table of contents (list); getToC in older releases
pc1 = len(doc1)  # number of its pages

doc2 = fitz.open("file2.pdf")  # open file 2
toc2 = doc2.get_toc(False)  # its table of contents

new_toc2 = []  # modified toc2 with increased page numbers
for line in toc2:  # walk through the 2nd TOC to update its page numbers
    line[2] += pc1  # add file 1 page count to this page number
    new_toc2.append(line)

doc1.insert_pdf(doc2)  # append file 2 to file 1 (insertPDF in older releases)
doc1.set_toc(toc1 + new_toc2)  # set table of contents for the result (setToC in older releases)
doc1.save("file3.pdf")  # save result to new file
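
The same idea can probably be stretched to cover the other two requests: joining more than two files, and a rough counterpart to the pdftk dump/update round trip. What follows is only a sketch under some assumptions. The helper names (shift_toc, merge_pdfs, dump_toc, update_toc) are made up for illustration and are not part of PyMuPDF. The dump is a plain JSON file, not pdftk's dump_data format. The snake_case method names (get_toc, insert_pdf, set_toc) are those of recent PyMuPDF releases; older releases spell them getToC, insertPDF and setToC.

```python
import json


def shift_toc(toc, offset):
    """Return a copy of a TOC list with every page number raised by offset.

    Entries look like [level, title, page, ...]. Page numbers are 1-based;
    entries whose page is not positive (no target page) are left unchanged.
    """
    return [
        [lvl, title, page + offset if page > 0 else page, *rest]
        for lvl, title, page, *rest in toc
    ]


def merge_pdfs(paths, out_path):
    """Append the PDFs in paths to a new document and combine their TOCs."""
    import fitz  # PyMuPDF, imported here so shift_toc works without it

    result = fitz.open()  # new, empty PDF
    toc = []
    for path in paths:
        src = fitz.open(path)
        # Pages already in the result are the offset for this file's TOC.
        toc += shift_toc(src.get_toc(), result.page_count)
        result.insert_pdf(src)  # append all pages of this file
        src.close()
    result.set_toc(toc)
    result.save(out_path)


def dump_toc(pdf_path, dump_path):
    """Write the outline as JSON: a list of [level, title, page] items."""
    import fitz

    doc = fitz.open(pdf_path)
    with open(dump_path, "w", encoding="utf-8") as f:
        json.dump(doc.get_toc(), f, ensure_ascii=False, indent=1)
    doc.close()


def update_toc(pdf_path, dump_path, out_path):
    """Replace the outline with the (possibly hand-edited) JSON dump."""
    import fitz

    with open(dump_path, encoding="utf-8") as f:
        toc = json.load(f)
    doc = fitz.open(pdf_path)
    doc.set_toc(toc)
    doc.save(out_path)
    doc.close()
```

With these helpers, merge_pdfs(["file1.pdf", "file2.pdf", "file3.pdf"], "all.pdf") would join three files, and dump_toc("file.pdf", "toc.json") followed by update_toc("file.pdf", "toc.json", "file_index_updated.pdf") would mirror the two pdftk commands for the outline only. Other document metadata that pdftk dumps is not covered here.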

All 2 comments

Wow. Thank you very much for showing this with such a clear example.
