_[email protected] commented:_
Which version of PhantomJS are you using? Tip: run 'phantomjs --version'.
1.4.1
What steps will reproduce the problem?
- phantomjs rasterize.js 'http://en.wikipedia.org/w/index.php?title=Jakarta&printable=yes' jakarta.pdf
- Look at result and try to select text
- Impossible to select text
What is the expected output? What do you see instead?
Output is a PDF. I expect to see a PDF with the ability to select the text.
Which operating system are you using?
osx
Did you use binary PhantomJS or did you compile it from source?
binary
Please provide any additional information below.
Are you supposed to be able to select the text for the pdf? If not then what is the purpose? Thanks!
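For anyone reproducing this, the "try to select text" step can be approximated from the command line instead of a PDF viewer. The following is a minimal sketch, assuming the `pdftotext` tool from Poppler is installed; the helper name `has_selectable_text` is hypothetical and not part of PhantomJS. It only checks whether any non-whitespace text can be extracted, which is a rough proxy for selectability.

```shell
# Hypothetical helper: reads extracted text on stdin and succeeds (exit 0)
# if at least one line contains a non-whitespace character.
# pdftotext emits form feeds between pages even for image-only PDFs,
# so a plain "is the output non-empty" check would be misleading.
has_selectable_text() {
  grep -q '[^[:space:]]'
}
```

Possible usage, with `jakarta.pdf` being the file produced by the steps above: `pdftotext jakarta.pdf - | has_selectable_text && echo "text layer found"`. A PDF whose glyphs were drawn as images or vector outlines should produce no extractable text and make the check fail.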
Disclaimer:
This issue was migrated on 2013-03-15 from the project's former issue tracker on Google Code, Issue #373.
:star2: 14 people had starred this issue at the time of migration.
_[email protected] commented:_
Same on Debian with the 1.5.0 binary. Tried it with several websites. That's a pretty big issue :/
_[email protected] commented:_
This issue seems very similar to http://code.google.com/p/wkhtmltopdf/issues/detail?id=886, where rendering with the newest static build of wkhtmltopdf leads to unselectable text, but using an older binary works well. Maybe that helps, since wkhtmltopdf also relies on QtWebKit.
_[email protected] commented:_
Using 1.6 now, this issue is still here, and doesn't appear to be one of the milestones for 1.7. Wanted to check in and see if anyone found a workaround?
_[email protected] commented:_
It turns out that phantomjs 1.6 running on Linux does output PDFs with selectable text. I can confirm that phantomjs 1.6 on Mac OS X 10.7.4 does not output pdfs with selectable text.
_[email protected] commented:_
Also an issue on phantomjs 1.7.0 on OSX 10.6.8 using the same jakarta example.
_[email protected] commented:_
I have got the same problem on phantomjs 1.8.1 on OSX 10.8.2
_[email protected] commented:_
I can confirm that text is selectable inside the PDF with phantomjs 1.8.1 on Ubuntu 12.04. It looks like it really is a Mac OSX problem.
using an old version of wkhtmltopdf seems to work (like this one) http://wkhtmltopdf.googlecode.com/files/wkhtmltopdf-OSX-0.10.0_rc2-static.tar.bz2
I can also confirm that running phantomjs 1.8.2 on OSX 10.8.5 results in text not being selectable and links not clickable.
OSX 10.7.5 here, same issue.
There is a pull request for a fix here: https://github.com/ariya/phantomjs/pull/11509
any progress on this?
I guess this will be fixed in version phantomjs v2.
It's using QT5 which shouldn't have the problem.
Backing up @andi-neck - definitely an OSX issue. Identical NPM/phantomjs versions show this working on Linux (Ubuntu 14.04), but not on OSX (Yosemite).
I'm running 2.0.0 (development) on linux and I'm seeing the issue. While trying to render an svg containing embedded fonts all the text is rasterized (it looks correct, but it isn't text).
I experience the same issue. The same code runs great on Windows 8, except that the viewport seems to be better respected on my Windows box and the text just is not selectable.
What is the resolution for this? Is it confirmed that it works on Linux but not Mac?
It's not working on Linux for me (2.0.0).
I discovered that as long as you don't embed the fonts in the SVG they render as text. I guess you need to have them registered and available within the system but that gets into some Linux font madness that I couldn't quite figure out.
I was trying to take my embedded fonts, strip them from the SVG, register them temporarily as system fonts, render the PDF, then deregister the fonts again. I think there's a work around there - hopefully someone who knows a little more about font management in Linux knows how to make this pattern work.
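The register, render, deregister pattern described above could be sketched roughly as follows. This is an untested assumption rather than a confirmed workaround: it presumes the fonts are available as `.ttf` files, that fontconfig scans the chosen font directory (for example the legacy per-user `~/.fonts`), and that `fc-cache` is present. The function names `with_temp_fonts` and `refresh_font_cache` and the `phantomjs-temp-fonts` subdirectory are hypothetical.

```shell
# Rebuild the fontconfig cache for one directory, if fc-cache is installed.
refresh_font_cache() {
  if command -v fc-cache >/dev/null 2>&1; then
    fc-cache -f "$1" >/dev/null 2>&1 || true
  fi
}

# Usage: with_temp_fonts SRC_DIR FONT_DIR COMMAND [ARGS...]
# Copies SRC_DIR/*.ttf into a staging subdirectory of FONT_DIR, runs COMMAND,
# then removes the staging subdirectory again and returns COMMAND's status.
with_temp_fonts() {
  src_dir=$1
  font_dir=$2
  shift 2
  stage_dir="$font_dir/phantomjs-temp-fonts"

  mkdir -p "$stage_dir" || return 1
  cp "$src_dir"/*.ttf "$stage_dir"/ || { rm -rf "$stage_dir"; return 1; }
  refresh_font_cache "$font_dir"

  "$@"                      # e.g. the phantomjs render command
  status=$?

  rm -rf "$stage_dir"       # deregister the fonts again
  refresh_font_cache "$font_dir"
  return "$status"
}
```

A call might look like `with_temp_fonts ./fonts "$HOME/.fonts" phantomjs rasterize.js drawing.svg drawing.pdf`, where `./fonts`, `drawing.svg` and `drawing.pdf` are placeholder names. The embedded fonts would still need to be stripped from the SVG separately, and whether PhantomJS then emits real text is exactly the open question in this comment.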
I confirm this bug too using phantomjs 2 in Ubuntu 14.04
confirmed that this issue is still present with 2.0 on Mac OS 10.10.2. Rendering a PDF results in text being rasterized and not selectable.
Hey folks, is there any version of phantomjs where text is selectable on Mac OS X?
I have 1.9.0 on a Linux box and the resulting PDF has non-selectable text.
On Mac OS X I have tried versions 2.0.0, 1.9.0, 1.9.8 and 1.9.6.
This is turning out to be a deal breaker for me.
@anantshri I'm currently getting selectable text in Mac OS using 1.9.8 with this patch applied: https://github.com/ariya/phantomjs/blob/1.x/patches/osx-pdf-selectable-text.patch (installed from source through Homebrew)
Is there any documentation on how to install this from Homebrew with the patch applied?
@MoOx Can't remember the source, but here is a Gist of the steps I use to apply the patch (and rollback to 1.9.8): https://gist.github.com/jordanandree/3e6d37ce7aa68ed6fa43
@jordanandree thanks for sharing!
@jordanandree thanks for sharing. I can confirm that it is working and text is selectable. But there are some styles which are not rendered properly in 1.9.8 but were rendered properly in 2.0.0. Is this patch possible in the 2.0.0 branch?
I also have this issue. Are there any patches to fix this on 2.0.0 (or 2.0.1)? This is a deal breaker for me. I need small PDF output files, and therefore I need PDFs with actual text and not images of text.
Thanks
@jeromeof unselectable text does not mean it would be rasterized. You can also produce text in a PDF from pure vectors. I assume the difference in file size between text and vector primitives is not huge.
I tried a simple sample file in both Safari (using Export as PDF) and PhantomJS. With Safari the PDF file was 230KB; with PhantomJS it was 12.9MB. So, for me this would add TBs to the storage requirements for a solution generating PDF files (which mainly contain text) using PhantomJS. Possibly this is only an issue on OS X, but it's definitely still an issue.
That issue is most likely related to https://bugreports.qt.io/browse/QTBUG-10094. I've applied the corresponding patch and I confirm it fixes the issue on the Mac OS 64-bit platform. It fixes both the selectable text and file size issues. Indeed, having the text outlines rendered with vector primitives does increase the file size significantly.
I've just created https://github.com/ariya/phantomjs/pull/13243 to have the fix merged as this is only available in QT 4.8+ and 5.5+.
I have the same issue with phantomJS 2.0 in MacOS.
It would be great if someone could have a look at @astefanutti's PR..
I confirm this bug too using phantomjs 2 in Mac
I can confirm this is happening when building PhantomJS 2.0.0 from source on CentOS 6
+1 same issue
+1 with PhantomJS 2.0.0 on Mac. Any workarounds?
+1 Same issue
Given the externalisation of Qt Base as a submodule, PR #13243 is superseded by Vitallium/qtbase#2.
@astefanutti do you have instructions on how this pull request could be tested? I am interested in doing the test, I just need proper instructions.
@anantshri just rebuild PhantomJS with Vitallium/qtbase#2 on Mac OS and run the usual rasterize.js example with output PDF.
@astefanutti does that also include hyperlink support (https://github.com/ariya/phantomjs/issues/10196)? Given the warning in the compilation instructions (http://phantomjs.org/build.html), any chance you would mind posting a compiled binary for Mac OS? Thanks a ton for your fixes!
@travis5555 you can get a Mac OS binary from here: https://github.com/astefanutti/decktape#install. It contains hyperlink support as well.
Thanks for the quick reply @astefanutti! I hadn't seen decktape before, it looks pretty awesome. Since it has all of the awesomeness of embedded text, hyperlink support, etc., is there any reason it can't be used for generic HTML > PDF conversion as opposed to slide decks specifically?
@travis5555 you're right. DeckTape depends on the improved version of PhantomJS that I maintain here https://github.com/astefanutti/phantomjs and that can be used for general purpose HTML to PDF conversion. Ideally, all the improvements will be integrated in PhantomJS upstream so that DeckTape value remains for slide decks specifically.
@astefanutti how do I npm i your fork?
Some progress related to that issue can be found at ariya/phantomjs#13997 (mostly Linux).
+1 same issue. Any workaround to fix this?
Wanted to post my solution to this problem. It turns out that loading a web font from a remote URL will cause PhantomJS to rasterize the font in the PDF. This creates a PDF where the text cannot be highlighted, since it is an image. This causes the PDF file size to grow 10 times.
We were using Proxima Nova, and our CSS file looked like this:
    @font-face
      font-family ProximaNovaReg
      font-style normal
      font-weight 100
      src url("/assets/fonts/ProximaNova-ThinWeb.woff") format("woff")
    body
      font-family ProximaNovaReg
To fix the issue, we installed the Proxima Nova TTF files directly onto our Ubuntu box. This means copying the TTF files to /usr/share/fonts/truetype, and running fc-cache -fv.
Now we can change our CSS to just the following:
font-family "Proxima Nova"
PhantomJS now treats Proxima Nova as a natively installed font, and renders a smaller sized PDF with selectable text. This is the right solution.
Note: I only encountered this problem on Linux. Mac OS worked fine.
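The installation step described above can be written down as a small shell function. This is a sketch under assumptions, not the commenter's actual script: the function name `install_fonts` and its arguments are hypothetical, the destination defaults to the `/usr/share/fonts/truetype` directory mentioned in the comment, and `fc-cache` is given that directory explicitly so that only it is rescanned.

```shell
# Usage: install_fonts SRC_DIR [DEST_DIR]
# Copies every .ttf file from SRC_DIR into DEST_DIR and refreshes the
# fontconfig cache so the fonts are treated as natively installed.
install_fonts() {
  src_dir=$1
  dest_dir=${2:-/usr/share/fonts/truetype}

  mkdir -p "$dest_dir" || return 1
  cp "$src_dir"/*.ttf "$dest_dir"/ || return 1

  # Rebuild the font cache if fc-cache is available; warn but do not fail otherwise.
  if command -v fc-cache >/dev/null 2>&1; then
    fc-cache -fv "$dest_dir" >/dev/null 2>&1 || echo "warning: fc-cache failed" >&2
  fi
  return 0
}
```

Writing to the default destination normally requires root privileges. After `install_fonts ./fonts` (with `./fonts` as a placeholder directory), the stylesheet has to refer to the font by its installed family name rather than by a `@font-face` URL, as in the comment above.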
@robinfhu I was going to try your solution and realised that upgrading my dependencies was enough. I think there _was_ a bug, and it got fixed at some point. Regardless of that, the case you found is a very good one to know.
FWIW, I'm using phantomjs indirectly via https://www.npmjs.com/package/markdown-pdf.
1.9.8 works for me on Centos7. 2.1.1 rasterizes the output
@robinfhu your solution works for me.
environment:
phantomjs: 2.1.1
os: Ubuntu 16.04 LTS
I installed the font used in my PDF on the machine. Now the PDF output is as normal as the output on my Mac.
But the thing is, I need to run some PDF render tests in CI on Ubuntu. I don't want to install fonts before running the tests, so that is not a good solution for me.
Maybe with some fix the Linux version of PhantomJS can resolve this issue. I am still watching this in case there are any updates.
Due to our very limited maintenance capacity (see #14541 for more details), we need to prioritize our development focus on other tasks. Therefore, this issue will be closed. In the future, if we see the need to attend to this issue again, then it will be reopened.
Thank you for your contribution!