Attach (recommended) or Link to PDF file here:
test4.pdf
Configuration:
Steps to reproduce the problem:
What is the expected behavior? (add screenshot)
No errors.
What went wrong? (add screenshot)
One path in the output got split into 2 parts.
The initial "M" is in one node and the following "L" is in the next node.
These should be in the same node.
Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):
This may be related to bug #9167.
I think the culprit is here:
https://github.com/mozilla/pdf.js/blob/a045a00af34b764edda5991d2bcd18541ed60536/src/core/operator_list.js#L533-L534
I'm not very familiar with the workings of OperatorList but it looks like operator lists are split into chunks of about 1000 operators. Sometimes the chunk boundary is placed in the middle of a PDF path definition. This produces two OPS.constructPath operators and the latter one doesn't start with a moveTo.
Does this sound plausible?
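To make the suspected mechanism concrete, here is a toy model of the chunking. It is not the pdf.js implementation: `buildChunks`, `PATH_SEGMENTS`, the record shape `{ fn, args }`, and the flush-after-N-operators rule are all assumptions made for illustration.

```javascript
// Toy model only: NOT the actual pdf.js OperatorList code.
// Assumption: the list is flushed after every `chunkSize` processed operators.
const PATH_SEGMENTS = new Set(["moveTo", "lineTo", "curveTo", "closePath"]);

function buildChunks(inputOps, chunkSize) {
  const chunks = [];
  let current = []; // operators collected for the chunk being built
  let processed = 0;
  for (const fn of inputOps) {
    const last = current[current.length - 1];
    if (PATH_SEGMENTS.has(fn)) {
      if (last && last.fn === "constructPath") {
        // Extend the path that is still open in this chunk.
        last.args.push(fn);
      } else {
        // No open path in this chunk, so a new constructPath starts here,
        // even if the segment is not a moveTo.
        current.push({ fn: "constructPath", args: [fn] });
      }
    } else {
      current.push({ fn, args: [] });
    }
    processed++;
    if (processed % chunkSize === 0) {
      // Chunk boundary: the open path (if any) is cut off here.
      chunks.push(current);
      current = [];
    }
  }
  if (current.length > 0) {
    chunks.push(current);
  }
  return chunks;
}

// A boundary after two operators falls in the middle of the path.
const splitChunks = buildChunks(["moveTo", "lineTo", "lineTo", "stroke"], 2);
```

In this model the second chunk begins with a constructPath whose first segment is a lineTo, which is the symptom described above. With a chunk size larger than the input, the whole path stays in one constructPath.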
Instead of modifying OperatorList and its constant CHUNK_SIZE, svg.js should be fixed. Maybe like this: Consecutive OPS.constructPath operators should be combined into one <svg:path> node if there is no intervening path painting operator...
I'm not very familiar with the workings of OperatorList but it looks like operator lists are split into chunks of about 1000 operators.
Correct; this is to allow rendering to begin before the entire OperatorList (i.e. the page) has been parsed, thus reducing overall time needed from the page loading to it being fully rendered.
Does this sound plausible?
Yes, and it should be easy to verify (just increase the value a lot, effectively disabling this chunking).
Instead of modifying OperatorList and its constant CHUNK_SIZE, svg.js should be fixed.
Agreed, modifying the constant is definitely not an acceptable solution. First of all, it would do nothing more than move the error elsewhere. Second of all, and much more importantly, changing it could have far-reaching implications for the general rendering performance in the canvas back-end.
Maybe like this: Consecutive OPS.constructPath operators should be combined into one <svg:path> node if there is no intervening path painting operator...
Again, that sounds totally reasonable.
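A minimal sketch of the proposed merging step, not the actual svg.js code: `mergeConstructPaths`, `PAINT_OPS`, and the record shape `{ fn, args }` are hypothetical, and the operator names are written as plain strings that only resemble the PDF painting operators.

```javascript
// Hypothetical sketch: NOT the actual svg.js implementation.
// Assumption: each operator is a record { fn: string, args: array }, and the
// names below stand in for the path painting operators.
const PAINT_OPS = new Set([
  "stroke", "closeStroke", "fill", "eoFill", "fillStroke",
  "eoFillStroke", "closeFillStroke", "closeEOFillStroke", "endPath",
]);

function mergeConstructPaths(ops) {
  const merged = [];
  let openPath = null; // constructPath that has not been painted yet
  for (const op of ops) {
    if (op.fn === "constructPath") {
      if (openPath) {
        // No painting operator since the last constructPath: same path.
        openPath.args.push(...op.args);
      } else {
        // Copy so that the input records are left untouched.
        openPath = { fn: op.fn, args: [...op.args] };
        merged.push(openPath);
      }
      continue;
    }
    if (PAINT_OPS.has(op.fn)) {
      // The path has been painted; the next constructPath starts a new one.
      openPath = null;
    }
    merged.push(op);
  }
  return merged;
}

// Two chunks whose boundary fell in the middle of a path.
const chunkA = [{ fn: "constructPath", args: ["moveTo", "lineTo"] }];
const chunkB = [
  { fn: "constructPath", args: ["lineTo"] },
  { fn: "stroke", args: [] },
  { fn: "constructPath", args: ["moveTo"] },
];
const mergedOps = mergeConstructPaths([...chunkA, ...chunkB]);
```

Here the first two constructPath records collapse into one, so a single <svg:path> node could be emitted for them, while the path after the stroke stays separate. Note that this sketch also merges across intervening non-painting operators, which follows the wording of the proposal; merging only directly adjacent operators would be a stricter alternative.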
I can confirm that if I increase CHUNK_SIZE to 10000000 then the problem goes away.
(and I agree that this isn't a proper solution)
Thanks for your help.