Pymupdf: Fields “Created” and “Modified” in Document Properties (PDF) were not displayed

Created on 7 Feb 2021 · 3 Comments · Source: pymupdf/PyMuPDF

Describe the bug (mandatory)

I merged many PDFs into a single PDF and added metadata, including the two fields "Created" and "Modified", but these fields still show no information in the resulting file. Here is my source code:

import re
import os
import fitz
from datetime import datetime

def importMetaData(path):
    regex = r"^r20ut(\d+)ej(\d+)$"
    r_UM = re.compile(regex)
    extension = [".pdf"]
    now = datetime.now() # current date and time
    date_time = now.strftime("%m/%d/%Y %H:%M:%S %p")
    print("date and time:",date_time)
    Number = ""
    for root, dirs, files in os.walk(path):
        for file in files:
            ext = os.path.splitext(file)[-1].lower()
            f_name = os.path.splitext(file)[0]
            if ext in extension:
                if r_UM.search(f_name) is not None:
                    if root.endswith("thuan1"):
                        Number = dictRNumber["code1"]
                    elif root.endswith("thuan2"):
                        Number = dictRNumber["code2"]
                    else:
                        continue
                    inforPDF=fitz.open(os.path.join(root, file))
                    inforPDF.set_metadata({})
                    inforPDF.set_metadata(
                    {
                        "producer": "Microsoft® Word for Office 365",
                        "author": "Thuan",
                        "modDate": date_time,
                        "title": "Data Analysis",
                        "creationDate": date_time,
                        "creator": "Microsoft® Word for Office 365",
                        "subject": Number
                    })
                    inforPDF.save(os.path.join(root, f_name+".pdf"))

Expected behavior (optional)

Two fields "Created" and "Modified" will display date time.

Screenshots (optional)

Image error

I have also posted this question on Stack Overflow:
https://stackoverflow.com/questions/66027402/fields-created-and-modified-in-document-properties-pdf-were-not-displayed

not a bug

All 3 comments

You used an illegal date/time format - see the documentation:

* If the date fields contain valid data (which need not be the case at all!), they are strings in the PDF-specific timestamp format "D:<TS><TZ>", where

    - <TS> is the 14-character ISO timestamp YYYYMMDDhhmmss (YYYY - year, MM - month, DD - day, hh - hour, mm - minute, ss - second), and
    - <TZ> is a time zone value (time interval relative to GMT) containing a sign ('+' or '-'), the hour (hh), and the minute ('mm', note the apostrophes!).

* A Paraguayan value might hence look like D:20150415131602-04'00', which corresponds to the timestamp April 15, 2015, at 1:16:02 pm local time in Asuncion.
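
The format described above can also be produced with the standard library alone. The following is a minimal sketch, not part of PyMuPDF: the helper name pdf_timestamp is hypothetical, and it assumes a timezone-aware datetime whose UTC offset is a whole number of minutes.

```python
from datetime import datetime, timedelta, timezone


def pdf_timestamp(dt):
    """Format an aware datetime as D:YYYYMMDDhhmmss+hh'mm' (hypothetical helper)."""
    offset = dt.utcoffset()
    if offset is None:
        raise ValueError("expected a timezone-aware datetime")
    # Split the UTC offset into a sign plus hours and minutes.
    total_minutes = int(offset.total_seconds()) // 60
    sign = "+" if total_minutes >= 0 else "-"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"D:{dt:%Y%m%d%H%M%S}{sign}{hours:02d}'{minutes:02d}'"


# The Paraguayan example from the documentation (UTC-04:00).
asuncion = timezone(timedelta(hours=-4))
example = pdf_timestamp(datetime(2015, 4, 15, 13, 16, 2, tzinfo=asuncion))
print(example)  # D:20150415131602-04'00'
```

For the current local time, datetime.now().astimezone() returns an aware datetime that could be passed to the same helper.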

If you use a self-invented format, PDF viewer applications may or may not understand it.
Why don't you use fitz.getPDFnow() for the current timestamp?

>>> import fitz
>>> print(fitz.getPDFnow())
D:20210207070439-03'00'
>>> 
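
To see why the strftime pattern from the question is not understood, a date string can be checked against the full "D:<TS><TZ>" form before it is written. This is a rough sketch only: the name looks_like_pdf_date is hypothetical, and the check is deliberately strict, accepting only the complete form quoted from the documentation above.

```python
import re

# "D:", a 14-digit timestamp, then a signed offset written as hh'mm'.
PDF_DATE = re.compile(r"D:\d{14}[+-]\d{2}'\d{2}'")


def looks_like_pdf_date(value):
    """Return True if value has the full D:YYYYMMDDhhmmss+hh'mm' shape."""
    return PDF_DATE.fullmatch(value) is not None


# Output of the "%m/%d/%Y %H:%M:%S %p" pattern used in the question.
print(looks_like_pdf_date("02/07/2021 07:04:39 AM"))
# Output of fitz.getPDFnow() as shown in the session above.
print(looks_like_pdf_date("D:20210207070439-03'00'"))
```

The check covers only the shape of the string, not whether the digits form a real calendar date.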

BTW, you do not need to empty the metadata first via inforPDF.set_metadata({}).

Dear JorjMcKie-san,
Thank you so much for your support.
I got it.

In the future, please do not hesitate or wait to ask. I am always trying to help people using the package as soon as I can.
You may also want to post a question under "Discussions" (top menu item). Apart from myself, other people may be there to answer ... or learn from your postings.
