I'm having issue #409 with weasyprint 44. That issue is closed and I don't see how I could reopen it. I've described what's happening in that issue for now.
Unfortunately, I can't reproduce your error, I get a beautiful PDF. Do you have the problem when setting other fonts in your CSS file?
That's interesting. The crash only triggers with white-space:nowrap and with either font-family:"Trebuchet MS" or font-family:Arial for me. The Trebuchet MS and Arial fonts are from Microsoft's old TrueType core fonts for the Web from a long time ago. Microsoft doesn't offer these for download anymore, but the Wayback Machine links still work. Extracted the exe files with cabextract.
It might be that the fonts themselves are broken, but even if so, that still shouldn't cause WeasyPrint to crash.
Even with the fonts I can't reproduce. I've set font-family to Trebuchet Ms or Arial, but everything works as expected, and the fonts embedded in the PDF files are the Microsoft fonts.
Is it possible for you to edit the /usr/local/lib/python3.7/site-packages/weasyprint/layout/tables.py file and add the line print(guess, assignable_width) on line 588 (before if upper_guess == lower_guess:, with the same number of spaces before)? Relaunching WeasyPrint should give you numbers that will help me to understand what's wrong.
Sure, here's what I'm getting immediately before the traceback:
[20.73333333333333, 21.9947265625, 141.26666666666665, 31.93333333333333, 114.06666666666666, 143.9333333333333, 20.2, 20.2, 20.2, 20.2, 25.53333333333333, 18.066666666666666, 18.066666666666666, 18.066666666666666, 18.066666666666666, 23.4] 662.5947265625002
Ooops, I mean print(guesses, assignable_width).
([20.73333333333333, 21.9947265625, 141.26666666666665, 31.93333333333333, 114.06666666666666, 143.9333333333333, 20.2, 20.2, 20.2, 20.2, 25.53333333333333, 18.066666666666666, 18.066666666666666, 18.066666666666666, 18.066666666666666, 23.4], [20.73333333333333, 21.9947265625, 141.26666666666665, 31.93333333333333, 114.06666666666666, 143.9333333333333, 20.2, 20.2, 20.2, 20.2, 25.53333333333333, 18.066666666666666, 18.066666666666666, 18.066666666666666, 18.066666666666666, 23.4], [20.73333333333333, 21.9947265625, 141.26666666666665, 31.93333333333333, 114.06666666666666, 143.9333333333333, 20.2, 20.2, 20.2, 20.2, 25.53333333333333, 18.066666666666666, 18.066666666666666, 18.066666666666666, 18.066666666666666, 23.4], [20.73333333333333, 21.9947265625, 141.26666666666665, 31.93333333333333, 100.73333333333333, 143.9333333333333, 20.2, 20.2, 20.2, 20.2, 25.53333333333333, 18.066666666666666, 18.066666666666666, 18.066666666666666, 18.066666666666666, 23.4]) 662.5947265625002
Thank you.
OK, we don't have a floating point rounding error, which is good news. Minimum content width is larger than maximum content width, I have to understand why.
I'm afraid it's a duplicate of #628. You only get the problem with white-space:nowrap because it makes the string ' A' (space + A) appear in some cells when calculating minimum size of cells, instead of putting the words starting with 'A' at the beginning of a new line.
There IS something wrong with some MS fonts and Pango.
I can reproduce #628 with font-family: "Trebuchet MS" as well, but that doesn't crash WeasyPrint. So why does WeasyPrint crash in this case?
Minimum content width is larger than maximum content width, I have to understand why.
Who calculates those widthes? Is this another outcome of a dissent between PangoLayout and PangoLogAttr like in #614?
BTW: Cannot reproduce the crash, either.
First of all, I'm not sure yet it's a duplicate of #628.
So why does WeasyPrint crash in this case?
Because this use case makes the table layout crash, but #628 doesn't involve the table layout.
Who calculates those widthes?
I have to check that.
I'd love to understand what's wrong. I can't reproduce #628 on my Gentoo Linux computer but can on my Ubuntu one. I'm not able to reproduce #770 anywhere.
@andreas-gruenbacher could you please tell what your Linux distribution is (with the version)?
Fedora 29. WeasyPrint has been installed with pip3 install WeasyPrint.
I'd love to understand what's wrong.
Looks like it's something special in the 5th column, but I don't see nothing special in the 5th column
Packages in Fedora 29 seem to be up to date, just as on my Gentoo computer. Why can't I reproduce this problem?
Looks like it's something special in the 5th column, but I don't see nothing special in the 5th column
Neither do I.
@andreas-gruenbacher One (last :innocent:) question: could you provide the whole traceback with the prints, so that we can know which table has the problem?
Sure, I've already posted that in #409.
Sure, I've already posted that in #409.
With the print(…) line added in tables.py, it would be perfect :wink:.
Can anyone explain what the code below # We have to work around floating point rounding errors here in auto_table_layout is trying to achieve? This looks rather misguided.
Can anyone explain what the code below
# We have to work around floating point rounding errors hereinauto_table_layoutis trying to achieve
We have a tuple like ([1, 2], [3, 4], [5, 6], [7, 8]), corresponding to possible column widths found according to various variables. We also have an "assignatable width", let's say 8. We want to guess the list whose sum is the greatest but lower than 8 (it's [3, 4]), and the list whose sum is the lowest but greater than 8 (it's [5, 6]).
The code only works when the tuple is sorted (sum([1, 2]) <= sum([3, 4]) <= …) and when assignatable_width is between the first and last lists. It should logically always be the case: if it's not, there's a bug somewhere.
We could of course silently ignore this bug (for example, setting an original value for lower_guess and upper_guess), but that wouldn't solve the root of our problem and only make further debugging even harder.
YEAH. I can reproduce the bug with Trebuchet.
Minimal HTML sample:
<table>
<th style="font-family: Trebuchet MS; white-space: nowrap">
Fahrzeug
</th>
</table>
The famous mystery of the 5th column :fearful:…
OK, I understand what's going on. The problem appears when two letters have a special kerning value ("Fa" here, but also crashes with famous couples like "LV", "FA", etc).
AND OF COURSE " A".
...and the solution is: DONT USE UPPERCASE, especially when there is a 5th column
We could of course silently ignore this bug
It seems like you're close to finding out what's causing the bug and so it can hopefully be fixed. Maybe the algorithm could be made a bit more robust to perhaps only warn when this kind of problem occurs in the future?
Good luck @liZe! Me myself get brain damages when reading the layout source code :smirk:
It's difficult to find where the bug comes from, but I'm now 99% sure it's not a bug in WeasyPrint. The same layout created twice with the same parameters doesn't give the same result. Setting this code twice in text.py (line 987) solves the problem:
layout = create_layout(
text, style, context, max_width, justification_spacing)
first_line, index = layout.get_first_line()
It's probably a bug in Cairo / Pango / FontConfig / Freetype / etc. @andreas-gruenbacher Is it possible for you to give the version of these libraries on your system?
Maybe the algorithm could be made a bit more robust to perhaps only warn when this kind of problem occurs in the future?
Yes, if there's a bug in an external library, we'll have to add a workaround.
@liZe the versions of which packages do you want to know exactly? Isn't it good enough for now that you can reproduce?
I can reproduce your text.py hack from above. It also seems a bit weird that split_first_line is called six times in the minimal HTML example: with text = 'Fahrzeug ', ' Fahrzeug ', ' Fahrzeug', and then again 'Fahrzeug ', ' Fahrzeug ', ' Fahrzeug'. If I remove all the whitespace around 'Fahrzeug' in the minimal HTML example, that reduces to two calls with text = 'Fahrzeug'.
The same layout created twice with the same parameters doesn't give the same result
... first result being erroneous ... minimum column widths being calculated first ...
I vote for a workaround not along the lines of if pango-or-whatever-library-version == ???: but rather sth. like if sum(min-values) > sum(max-values): or an original value for lower_guess / upper_guess
the versions of which packages do you want to know exactly? Isn't it good enough for now that you can reproduce?
Cairo, Pango, FontConfig and Freetype. I'd like to compare with the ones on my systems to find where the bug comes from.
It also seems a bit weird that split_first_line is called six times in the minimal HTML example […]. If I remove all the whitespace around 'Fahrzeug' in the minimal HTML example, that reduces to two calls
You can't imagine the incredible amount of work needed to find the size of cells in a table (a hint). You at least need the min-content width and the max-content width of the cell, that's why you need at least two calls.
Handling white spaces also requires some work, that's why you get extra calls when there are spaces and empty lines. Of course, there are lots of "useless" function calls in WeasyPrint, but it's often hard to remove them :smile:.
I vote for a workaround not along the lines of
if pango-or-whatever-library-version == ???:but rather sth. likeif sum(min-values) > sum(max-values):or an original value forlower_guess/upper_guess
Of course.
cairo-1.16.0-3.fc29.x86_64
cairo-gobject-1.16.0-3.fc29.x86_64
cairomm-1.12.0-9.fc29.x86_64
fontconfig-2.13.1-3.fc29.x86_64
freetype-2.9.1-6.fc29.x86_64
pango-1.42.4-1.fc29.x86_64
pangomm-2.40.1-6.fc29.x86_64
python3-cairo-1.17.1-2.fc29.x86_64
python3-cairocffi-0.7.2-15.fc29.noarch
python3-cairosvg-1.0.20-9.fc29.noarch
/usr/local/lib/python3.7/site-packages/ contains the following versioned directories:
cairocffi-0.9.0-py3.7.egg-info
cssselect2-0.2.1.dist-info
tinycss2-0.6.1.dist-info
WeasyPrint-44.dist-info
@andreas-gruenbacher thanks a lot.
I have exactly the same versions on my Gentoo box that works. Options left are font configuration and different compilation flags…
It should be fixed now, but I really need to know the real reason for this (and fix #628).
Almost, but the assertion below if upper_guess == lower_guess in tables.py triggers now. Once removed, WeasyPrint no longer crashes.
Almost, but the assertion below
if upper_guess == lower_guessin tables.py triggers now. Once removed, WeasyPrint no longer crashes.
Fixed in 89041e8, thanks.
This is working for now, thanks.
Most helpful comment
...and the solution is: DONT USE UPPERCASE, especially when there is a 5th column