Configuration:
Steps to reproduce the problem:
What went wrong?
Getting this:
OpenSansisahumanistsansseriftypefacedesignedbySteveMatteson.OpenSanswasdesignedwithanuprightstress,openformsandaneu-tral,yetfriendlyappearance.Itwasoptimizedforprint,web,andmobileinterfaces,andhasexcellentlegibilitycharacteristicsinitsletterforms(seefigure聛onthefollowingpage).ThisfontisavailablefromtheGoogleFontDirectory[聛]asTrueTypefileslicensedundertheApacheLicenseversion聜.聙.
Instead of this:
Open Sans is a humanist sans serif typeface designed by Steve Matteson.
Open Sans was designed with an upright stress, open forms and a neutral,
yet friendly appearance. It was optimized for print, web, and mobile
interfaces, and has excellent legibility characteristics in its letterforms (see
figure 1 on the following page). This font is available from the Google Font
Directory [1] as TrueType files licensed under the Apache License version 2.0.
This is quite a significant problem: I would estimate that about 1 in 20 scientific papers is affected.
Is this still happening for you? I tested here and couldn't reproduce this bug.
Yeah, the problem still exists. Just open the attached 1.pdf in https://mozilla.github.io/pdf.js/web/viewer.html and copy and paste the text. I tried it with the current master on Chrome on both Ubuntu and Android. I think it should behave the same on any system.
Hi,
I am facing the same issue.
Any update on this?
The space between words is removed for paragraphs with style="text-align: justify;".
Each word is rendered in a separate span. When I copy text from the PDF to Notepad, the spaces are removed.
It works fine without this style.
We are converting HTML from CKEditor to PDF, and users expect the PDF to render exactly as it is shown in CKEditor (WYSIWYG). I tried pdfjs-2.1.266-dist and the problem is consistent. Please help.
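For the case where each word ends up in its own text item, one possible workaround is to rebuild the spaces from the item geometry after extraction. The following is only a minimal sketch, not PDF.js code: `joinTextItems` and `gapFactor` are hypothetical names, the item shape (`str`, `transform`, `width`, `height`) is assumed to match the text items posted in this thread, and the thresholds are guesses that would need tuning per document.

```javascript
// Hypothetical helper, not part of PDF.js.
// Assumes each item looks like { str, transform: [a, b, c, d, x, y], width, height }
// and that items arrive in reading order.
function joinTextItems(items, gapFactor = 0.15) {
  let out = "";
  let prev = null;
  for (const item of items) {
    if (prev) {
      const prevX = prev.transform[4];
      const prevY = prev.transform[5];
      const x = item.transform[4];
      const y = item.transform[5];
      // Treat a vertical jump of more than half the previous item's height as a new line.
      const sameLine = Math.abs(y - prevY) < prev.height * 0.5;
      if (!sameLine) {
        out += "\n";
      } else {
        // Horizontal gap between the end of the previous item and the start of this one.
        const gap = x - (prevX + prev.width);
        const needsSpace =
          gap > prev.height * gapFactor &&
          !out.endsWith(" ") &&
          !item.str.startsWith(" ");
        if (needsSpace) {
          out += " ";
        }
      }
    }
    out += item.str;
    prev = item;
  }
  return out;
}
```

This only helps when the words are separate items. It cannot recover spaces inside a single item whose `str` is already one long run of letters.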
Still facing this issue.
Any updates/workarounds on this?
I should mention that this issue exists right in the default viewer.
There's no space between lines and
is pasted as the systemidentifieshot(
How do Chrome and Acrobat handle this?
Any solution for that?
I have to open some PDFs in Okular (Linux), which copies the text just fine, because the same text is missing spaces in Firefox :(
Running 76.0.1 (64-bit) on macOS and it still happens. If I download the PDF and open it in Preview, copy-paste includes spaces, as expected.
Possible hotfix that _may_ help here https://github.com/mozilla/pdf.js/issues/7310#issuecomment-530713483 ? YMMV
I think I've found an issue in core/evaluator.js:2043-2056.
if (spaceWidth) {
  textContentItem.spaceWidth = spaceWidth;
  textContentItem.fakeSpaceMin = spaceWidth * SPACE_FACTOR;
  textContentItem.fakeMultiSpaceMin = spaceWidth * MULTI_SPACE_FACTOR;
  textContentItem.fakeMultiSpaceMax = spaceWidth * MULTI_SPACE_FACTOR_MAX;
  // It's okay for monospace fonts to fake as much space as needed.
  textContentItem.textRunBreakAllowed = !font.isMonospace;
} else {
  textContentItem.spaceWidth = 0;
  textContentItem.fakeSpaceMin = Infinity;
  textContentItem.fakeMultiSpaceMin = Infinity;
  textContentItem.fakeMultiSpaceMax = 0;
  textContentItem.textRunBreakAllowed = false;
}
I think this if-else is not correct: spaceWidth should always be greater than zero by definition. Setting a fallback value equal to the minimum glyph width of the current font solves the issue for the test PDF.
The fix that worked for me:
var fontMinWidth = Math.min.apply(null, font.widths.filter(w => !!w));
var spaceWidth = (fontMinWidth / 1000) * textState.fontSize;
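To make the idea easier to try outside the evaluator, here is a stand-alone sketch of the same fallback. It is an illustration, not the actual PDF.js code: `fallbackSpaceWidth` is a hypothetical name, `font.widths` is assumed to be an array or map of glyph widths in 1/1000 em units (as the division by 1000 above implies), and returning 0 when no usable width exists is my own choice.

```javascript
// Hypothetical stand-alone version of the fallback described above.
// Assumes font.widths holds glyph widths in 1/1000 em units; entries may be
// missing, zero, or undefined.
function fallbackSpaceWidth(font, fontSize) {
  // Object.values works for dense arrays, sparse arrays, and plain objects.
  const nonZero = Object.values(font.widths).filter(w => w > 0);
  if (nonZero.length === 0) {
    // No usable width: report 0 so the caller can keep its existing else-branch.
    return 0;
  }
  // Use the narrowest glyph as a stand-in for the space width.
  return (Math.min(...nonZero) / 1000) * fontSize;
}
```

Compared with the two-line fix, the only difference is the guard for fonts without any non-zero width, where `Math.min` over an empty list would otherwise yield `Infinity`.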
Facing similar issue:
{
  str: 'wykonanaprzezBibliotekęNarodowązegzemplarzapochodzącegozezbiorówBN.',
  dir: 'ltr',
  width: 250.77697646469932,
  height: 8.012811977655979,
  transform: [Array],
  fontName: 'g_d0_f1'
},
any ideas?