Docfx: Pdf generation

Created on 22 Feb 2017 · 18 Comments · Source: dotnet/docfx

For some of our apps we still use chm files for help inside the app because:

  • It is a simple standalone file.
  • It can be structured (i.e. have topics and subtopics).
  • From our application we can jump directly to a topic when opening the help file.

Does docfx provide a means to generate chm files?

Area-Pdf Area-Plugins

Most helpful comment

An initial version of PDF generation has been released.

Please give it a try and share your feedback and suggestions. 😄

Steps to generate a PDF for the docfx-seed project (https://github.com/docascode/docfx-seed):

  1. Prerequisite:
    We use wkhtmltopdf to generate the PDF. Download wkhtmltopdf and add the folder containing its executable to %PATH%.

  2. In the current design, each TOC file generates a corresponding PDF file, and the TOC is also used as the cover page of the PDF. So we create a toc.yml file specifically for the PDF under a new folder, pdf, and use TOC Include to include content from the other TOC files:

- name: Home
  href: ../index.md
- name: Articles
  href: ../articles/toc.md
  homepage: ../articles/docfx_getting_started.md
- name: API Documentation
  href: ../obj/api/toc.yml
- name: REST API
  href: ../restapi/toc.md

  3. Add a "pdf" section to docfx.json. Its parameters are similar to those of the "build" section, but it uses a different template (the built-in template is pdf.default) and a different output destination. We also exclude TOC files, because each TOC file generates a PDF file:
...
  "pdf": {
    "content": [
      {
        "files": [ "**/*.yml" ],
        "cwd": "obj/api",
        "dest": "api",
        "exclude": [
          "**/toc.yml"
        ]
      },
      {
        "files": [ "articles/**/*.md", "*.md", "toc.yml", "restapi/**" ],
        "exclude": [
          "**/toc.yml",
          "**/toc.md"
        ]
      },
      {
        "files": [ "pdf/toc.yml"]
      }
    ],
    "resource": [
      {
        "files": [ "articles/images/**"]
      }
    ],
    "overwrite": "specs/*.md",
    "dest": "_site-pdf"
  }
...
  4. Run docfx pdf to generate the PDF file.
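
The prerequisite and the final step above can be checked from a script. The following is a minimal sketch, not part of docfx itself: the helper names find_missing_tools and run_docfx_pdf are made up for illustration, and it assumes that both docfx and wkhtmltopdf are expected on PATH and that docfx.json sits in the project folder.

```python
import shutil
import subprocess
import sys


def find_missing_tools(tools=("docfx", "wkhtmltopdf")):
    """Return the names in `tools` that cannot be resolved on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def run_docfx_pdf(project_dir="."):
    """Run `docfx pdf` in project_dir after checking the prerequisites."""
    missing = find_missing_tools()
    if missing:
        sys.exit("Not found on PATH: " + ", ".join(missing))
    # `docfx pdf` is the command named in the steps above; running it from
    # the project folder is an assumption about where docfx.json lives.
    docfx = shutil.which("docfx")
    return subprocess.run([docfx, "pdf"], cwd=project_dir, check=True)


if __name__ == "__main__":
    run_docfx_pdf(sys.argv[1] if len(sys.argv) > 1 else ".")
```

Checking PATH first gives a clearer message than a failure partway through the conversion.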

Some improvements to make:

  • [ ] Better PDF file name: could specify through config or command
  • [x] Only generate the PDF file in the output folder; currently HTML files are also generated
  • [ ] And more for you to add ...

All 18 comments

Which tool did you use to generate chm before? DocFX supports post processors, which can be used to add a step that converts the generated html to chm.

For our help (both API and conceptual) we use Sandcastle Help File Builder, which among other formats can also generate chm files. I would assume that SHFB is the most commonly used (but not really loved) tool for building help documents for .NET projects. So I think it would really help, and give docfx a lot of traction, if there were a good migration path from SHFB to docfx.

So to generate chm files I would

  1. need to generate html using docfx in a form that can be consumed by a chm generator
  2. find a chm generator that can generate chm from html that docfx produced

Any idea?

Plan to support PDF in early April

I would like the implementation from docs.microsoft.com (see https://docs.microsoft.com/en-us/dotnet/articles/welcome)

Status update: resources for the PDF work are limited, so the task is extended to May.

That sounds really great! Is there already an approximate plan for when one can count on a first testable version? This feature would be incredibly important.


I can't make it work following the steps described. I've got errors during PDF generation. It seems that somehow docfx adds an extra .tmp to the path (Convert the file : "c:/Temp/docfx-seed-master/_site-pdf/.tmp/pdf/../api/CatLibrary.Tom.html" - has exception, the details: The system cannot find the file specified)

@csnemes Looks like a recursive inclusion: files inside _site-pdf are included in docfx.json... did you call docfx pdf several times?

I've managed to figure out that a trailing slash in the dest definition caused the problem. Changing "dest" : "_site-pdf/" to "dest" : "_site-pdf" fixed the problem.
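
For anyone hitting the same error, the fix can be applied to a parsed docfx.json before running docfx pdf. This is only a sketch of the workaround reported above: normalize_pdf_dest is a hypothetical helper, not a docfx feature, and it assumes the config is plain JSON with the "pdf" and "dest" keys shown in the steps.

```python
import copy
import json


def normalize_pdf_dest(config):
    """Return a copy of a parsed docfx.json with trailing slashes
    removed from pdf.dest; the input dict is left untouched."""
    fixed = copy.deepcopy(config)
    dest = fixed.get("pdf", {}).get("dest")
    if isinstance(dest, str):
        stripped = dest.rstrip("/\\")
        if stripped:  # leave a dest made only of slashes alone
            fixed["pdf"]["dest"] = stripped
    return fixed


if __name__ == "__main__":
    sample = json.loads('{"pdf": {"dest": "_site-pdf/"}}')
    print(normalize_pdf_dest(sample)["pdf"]["dest"])  # prints _site-pdf
```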

Very nice, seems to work quite well for my scenario. The template is super barebones but I imagine that is no surprise at such an early stage. Looking forward to advanced features like page and chapter numbers!

I notice that hyperlinks to articles don't always go to the right page. One that should take me to page 120 takes me to page 121 instead (the article spans multiple pages starting from 120). Some others seem to work fine, though. Hmmm will have to try to narrow it down. From first glance, it looks like most links are +1 page in my current document.

@sandersaares wkhtmltopdf jumps to the middle page of the linked topic...

Today I have looked a bit deeper into PDF generation (I know I am a bit late). As mentioned earlier, the generated PDF looks very basic, which is OK since you can change the template yourself. But there are some things you cannot do that are really needed before PDF is a viable solution for technical documentation (for us):

  1. support for footers and TOC with page numbers #2003
  2. support for a cover page #2004

I was playing around with the PDF-generation and I had two questions:

  • Is there a way to control the complete layout of the PDF? I've still got to get acquainted with the templating system in DocFX so I might be missing something here, but from what I've gathered, generating the PDFs is a bit less flexible than the site generation. For instance, for the PDF generation there is no template that enables you to style the articles themselves, or am I missing something?

  • How to debug your PDFs? It's difficult to figure out what is going wrong when you just get to see the outputted PDF. Is there a way to expose the HTML that is being used to generate the PDF from? Or are there other, better ways to debug the PDF generation?

I'm interested in the initial topic of this issue, which was the generation of .chm help files. I require these for integration into existing software that links into .chm topics. It would be great to have a modern alternative to Sandcastle for docfx output.

  • Is there any guidance or a solution for making .chm files yet?
  • Should I open a new topic for this, since this issue has been repurposed to discuss .pdfs?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs.

As the PDF feature has been shipped for a while, I'd suggest opening new issues for new feedback.

Note that responses about the PDF feature may be delayed because of the current roadmap.
