As promised, here is a write-up of the ideas I would like to see in a PDF renderer for React.
First of all, the idea of creating PDFs with the declarative syntax of React sounds amazing, as does building a PDF out of smaller composable components.
To support those two render targets we could make two separate packages, "react-pdf-dom" and "react-dom-native". We could do it in separate repos, but personally I like to keep all the relevant code in the same codebase, which can be achieved with a solution like Lerna. What is your opinion about that, @diegomura?
As an example the idea of a cross-renderer react renderer can be seen within the react-art-fiber renderer, public here.
The idea is to have two main building blocks for your pdf render target.
Those two will be Text and View. The reason I think this would be a great idea is that they are used in other React renderers, so they would feel familiar. Within this repo there is some more explanation about this reasoning.
This is something we can't currently achieve, because pdfkit has its own kind of layout rendering and/or calculations.
Things we should do before we can support this kind of magic are:
I could have missed some.
This is not because pdfkit is bad or anything; it just makes it more difficult to achieve the goals mentioned in the comment. I guess we will still be using fontkit, because text measuring is hard.
I don't even know how to implement this yet, but it should be doable once we have defined our own PDF primitives. It would be great to have, but it is not relevant yet.
Do you have some other thoughts?
First of all, thanks for your great ideas!! Some of them were in the back of my head, but you summarise them in a very neat way. Others are very original.
1- Support react-dom and react-native
This is the only point where I don't see the whole picture. Can you explain a bit more?
We are going to develop components based on the primitives of point 2, which internally will put together the PDF document, right? I assume that lives in the react-pdf dependency.
What's going to be in the other ones? Components to show the document on both platforms, or something like that? Maybe a code snippet or mock-up would be helpful to see what you have in mind 😄
About the other points, I agree with all of them. I always wanted to use Lerna, and the idea of having a Flexbox-like way of laying things out sounds great!
The react-dom and react-native bindings just provide a <Document /> component for both platforms, to which you pass your react-pdf components like <Page />, <Text /> and <View />. It would then save the file to your local system, open it in a PDF viewer, or offer a download link. Nothing too fancy.
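Roughly, the nesting could look like the sketch below. It is plain JavaScript with no React or pdfkit involved: the `el` helper and `collectText` are hypothetical stand-ins for what JSX and a real renderer would do, and only the names Document, Page, View and Text come from the idea itself.

```js
// Hypothetical stand-in for JSX: builds a plain element object.
function el(type, props, ...children) {
  return { type, props: props || {}, children };
}

// Equivalent of:
// <Document><Page><View><Text>Hello</Text><Text>World</Text></View></Page>
//           <Page><Text>Second page</Text></Page></Document>
const tree = el('Document', { title: 'Example' },
  el('Page', null,
    el('View', null,
      el('Text', null, 'Hello'),
      el('Text', null, 'World'))),
  el('Page', null,
    el('Text', null, 'Second page')));

// Walk a subtree and gather every string found inside it.
function collectText(node, out = []) {
  if (typeof node === 'string') {
    out.push(node);
    return out;
  }
  node.children.forEach((child) => collectText(child, out));
  return out;
}

// One array of text runs per Page, in document order.
const pages = tree.children
  .filter((child) => child.type === 'Page')
  .map((page) => collectText(page));

console.log(pages);
```

A platform binding would take such a tree and decide what to do with the resulting file; the tree itself stays the same on both platforms.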
We are nearly at the point where you can see the idea behind the react-dom specific packages.
I'm currently working on a component which does what I have in mind; will share later 👍
Anyway, I pushed it and it can be seen in this branch. Currently I am a little troubled by how pdfkit works, and I am now thinking about how I can build the PDF file myself using the reconciler.
I now understand what the different packages are for. Thanks for the explanation.
I agree with all your ideas! It would be great to see them working 😄
Just piggybacking this issue to also say that I am intrigued by the idea. I would be interested in contributing if my time allows it. Will be following the project closely. Bonne chance!
It's amazing that more people are attracted by the idea of creating PDFs with basic React primitives. Currently I'm just trying to get the internals working, and I will be working on this today and tomorrow. Once I get the PDF output working properly there will be lots of opportunities to contribute to the project. Maybe it's an idea to create a GitHub Project for it so we can work together towards an initial release, @diegomura.
@jbovenschen done! I'm new to using GitHub Projects, but I guess it's just like any other tool. Feel free to add any issue to it and I can handle them, or ask me for anything I can do to make your work easier. Thanks again
In this branch I got a minimal working PDF file generated without using third-party libs. Pretty stoked about the results :) https://github.com/jbovenschen/react-pdf/tree/custom-layout.
There is a lot of room for future optimizations, but the first results are there.
Things I will do before merging this branch are:
Hey guys, cool project! I'm the author of pdfkit and fontkit, and as such I have a lot of experience dealing with the PDF format and various file formats that come with it (e.g. fonts, image formats, etc.). Not just because I'm the author, I would highly recommend building this project on top of pdfkit, as opposed to reinventing it. It deals with a TON of stuff that you don't know yet that you'll have to rewrite, e.g. font embedding and text processing, image conversions, gradients, vector graphics, annotations, etc. Many of these are really hard problems that have taken years to build, and it would be a shame to reinvent the wheel here.
Several other higher-level projects like pdfmake have successfully built on top of PDFKit to provide things like layout and nicer declarative syntax. I'd like to think PDFKit is a relatively low level library for dealing with the PDF format that other abstractions can build on top of, similar to canvas or something.
If something about how PDFKit works is causing you issues, let me know how I can help.
Hi @devongovett !
First of all, thanks for reaching out to us. pdfkit is an awesome library. I've used it many times in the past. So congratulations on that, and thanks for your work 😄
You clearly have a point on this. Actually, react-pdf used pdfkit when it first started. Implementing our own PDF bindings has given us a lot of flexibility so far, but it's also true that we will have to deal with more complex stuff now than we had to before.
I have to admit that I followed @jbovenschen's suggestion of re-implementing the PDF rendering without asking many questions. It was good because I've learned so much about PDF internals, but on the other hand you've gone down this path before and clearly resolved many issues that we will face. So I'm open to considering pdfkit again, but I would like @jbovenschen to explain why he thinks we could have trouble using it.
@jbovenschen do you mind giving your opinion on this subject?
@devongovett Thanks for reaching out to us.
First of all, thank you for creating pdfkit and fontkit. They have always made it more pleasant to create PDF files in a web environment.
We will use fontkit for text measurement and font embedding; that's not something you want to rewrite yourself. Also, as @diegomura said, in the beginning this library did use pdfkit as the rendering layer, and react-pdf just provided a React-like API on top of it, much like how pdfmake is built. This all works great as long as your document doesn't receive updates after it has been created.
If the React reconciler submits an update, we have to create a new document, which is pretty heavy when the file has something like 40 pages. Still, this is not a real issue in a Node environment.
But once we add the DOM bindings, where you can create a PDF document inside your application tree, the document is open to dynamic updates caused by global state and/or internal component state. Since the web is single threaded, processing the tree and creating a new document blocks that thread, which is not really efficient and not something you want in an interactive environment. I guess this could be solved inside pdfkit? But it would take a lot of work, because there is no way to remove or update instances created by pdfkit, mostly due to the pipe interface you use to create text, images and the other possible content inside pdfkit.
Also there were some smaller issues I hit when trying to create the DOM version for react-pdf.
browserify or webpack or deliver a custom build for the web version yourself.
I'm happy to consider pdfkit again if you could elaborate on those issues and whether they could be improved. Also, don't hesitate to say so if I said something that is incorrect.
@jbovenschen
I can see your point about updates to an existing document, and this is kinda hard to do with the current streaming model. It's a tradeoff though: either you keep everything in memory so updates are possible, or you use the streaming model as PDFKit does. PDFKit used to keep everything in memory, but I found I'd hit the Node heap limit on large documents with lots of images fairly quickly. It could be possible to do something with incremental updates though: the PDF format does support them, just need to add something to PDFKit maybe. I think basically you just append updates to the document, which should work great with streaming. Needs experimentation to see if PDF readers support it though.
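As a rough illustration of the append-only idea, here is a toy model in plain JavaScript. It is not PDF syntax and not PDFKit's API: `AppendOnlyDoc` and its methods are made up for this sketch. In a real PDF, an incremental update appends the changed objects plus a new cross-reference section and trailer; the `latest` map below plays the role of that lookup table.

```js
// Toy model: output already written is never modified. An update appends
// a new version of an object and repoints the lookup table at it.
class AppendOnlyDoc {
  constructor() {
    // Kept as a string only so the example can read it back;
    // a real stream would not retain what it has already emitted.
    this.output = '';
    this.latest = new Map(); // object id -> { offset, length } of newest version
  }

  write(id, content) {
    this.latest.set(id, { offset: this.output.length, length: content.length });
    this.output += content;
  }

  // Resolve an object through the lookup table, so older versions are ignored.
  read(id) {
    const entry = this.latest.get(id);
    if (!entry) return undefined;
    return this.output.slice(entry.offset, entry.offset + entry.length);
  }
}

const doc = new AppendOnlyDoc();
doc.write('page1', '[page 1: Hello]');
doc.write('page2', '[page 2: World]');

const before = doc.output;
doc.write('page1', '[page 1: Hello again]'); // update = append, not rewrite

console.log(doc.read('page1'));
console.log(doc.output.startsWith(before));
```

The old version of `page1` is still in the output, which is the cost of this approach: the file grows with every update.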
For text, fontkit is only one piece of the puzzle. It does most of the hard work of shaping the text (telling you what glyphs to show, and where to place them), but doesn't integrate at all with PDF. There is quite a bit of additional code in PDFKit to actually subset and embed the font in a PDF compatible way (you can't just drop the whole file in there), and to actually position the glyphs correctly (again, more complex than just placing the whole string in at once due to kerning etc.). Finally, PDFKit also performs unicode linebreaking, which supports all languages (can't just split based on spaces for example).
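To show why positioning is more than placing the whole string at once, here is a small sketch with kerning. The metrics are made-up numbers, `positionGlyphs` is a hypothetical helper, and mapping one character to one glyph is a simplification; a real font provides the glyphs, advances and pair adjustments through shaping.

```js
// Made-up metrics in font units; a real font supplies these.
const advances = { A: 600, V: 600, T: 550, o: 500 };
const kerning = { AV: -80, To: -60 }; // adjustment between specific pairs

// Compute the x position of each glyph, applying pair kerning.
function positionGlyphs(text) {
  const positions = [];
  let x = 0;
  for (let i = 0; i < text.length; i++) {
    if (i > 0) {
      // Shift this glyph by the kerning for (previous, current), if any.
      x += kerning[text[i - 1] + text[i]] || 0;
    }
    positions.push({ glyph: text[i], x });
    x += advances[text[i]];
  }
  return { positions, width: x };
}

const result = positionGlyphs('AVA');
console.log(result.positions.map((p) => p.x)); // [ 0, 520, 1120 ]
console.log(result.width); // 1720, versus 1800 with no kerning
```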
I get build issues about webpack all the time now that it has become dominant, so I do need to address that anyway. Browserify should work out of the box because we can configure the transforms we need at a module level, rather than every app needing to know how to build PDFKit. I'll probably start publishing a pre-built version to npm soon to resolve this.
As for CoffeeScript, PDFKit has been around for a long time now (far before ES6 and Babel existed), so it is in need of an update. My plan is to convert it to ES6 soon. This is already done for fontkit, but I haven't gotten to PDFKit yet, but it is overdue.
Thanks for your constructive answer.
We were also thinking about using incremental PDF updates, so hopefully PDF readers do support that feature. If this can be solved within pdfkit I will be happy to use pdfkit again, because writing everything from the ground up ourselves will cause a lot of extra work. And if we do so, we will be improving the ecosystem around pdfkit as well.
@diegomura do you think we could use pdfkit again for now? We would be faster to make a stable release in the upcoming months.
@devongovett let us know if we can provide any help with the changes you would need to make within pdfkit.
@devongovett @jbovenschen really enjoy reading this discussion. Thanks for explaining these points in detail.
As I said, it's certainly true that pdfkit has already sorted out lots of complex PDF-related issues that we face now and will face in the future. Making documents render and update as fast as possible is something we should aim for, but it's not something we are dealing with now. It will probably be in v2 or v3 at least. So I think it won't be a bad idea to use pdfkit for now, release a tested and stable version sooner (stars are getting out of control, people want this 😅), and solve the updating in the future. If we then need to reimplement our own bindings, so be it. But it's comforting that @devongovett has this and the ES6 issue on his radar.
I may be able to run some tests this afternoon with pdfkit inside react-pdf, if you agree @jbovenschen.
Would it be possible to change the /Producer and /Creator info attributes of the doc in case we use pdfkit?
@devongovett thanks again for showing interest on this project!
Yeah when you create the document, you can pass in metadata:
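```
const PDFDocument = require('pdfkit');

// Document metadata is passed through the `info` option.
const doc = new PDFDocument({
  info: {
    Creator: 'react-pdf',
    // ...
  }
});
```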
@devongovett currently having some trouble in the migration. I created https://github.com/devongovett/pdfkit/issues/666 for that 😄
All the points listed in this issue are already working and published on npm, so I'm closing it 😄