Kolibri: Khmer(km) font (and other scripts with ascenders + descenders) render badly broken in Firefox

Created on 14 Dec 2020 · 18 comments · Source: learningequality/kolibri

Observed behavior


Screenshot from 2020-12-14 21-36-58

Khmer text does not render correctly in the latest Firefox on Ubuntu GNU/Linux. Google Chrome handles it much better.

The developer tools console in Firefox reports the following (not sure if this is related):

[WARN: kolibri/core/assets/src/utils/setupAndLoadFonts.js] Could not load full font for 'km' setupAndLoadFonts.js:78:14
In Firefox, Noto Khmer font seems to be sanitized. downloadable font: rejected by sanitizer (font-family: "noto-full" style:normal weight:400 stretch:100 src index:0) source: http://127.0.0.1:8009/static/assets/fonts/noto-full.NotoSansKhmer.400.woff

…

Expected behavior


This is how the font is rendered in latest Google-Chrome.
Screenshot from 2020-12-14 21-51-04

…

User-facing consequences

Khmer users cannot use Kolibri in Firefox.
…

Errors and logs

…

Steps to reproduce

…

Context

…

P0 - critical regression


All 18 comments

Seeing the same errors in Firefox on Windows.

2020-12-14_16-12-58

cc @jonboiser @nucleogenesis

@radinamatic This problem also affects other South Asian languages: https://kolibri-demo.learningequality.org/hi-in/learn/#/topics

After testing all the locales on the demo server in Firefox, I'm not sure the "downloadable font: rejected by sanitizer" error will help, as it also appears for non-South Asian languages. Sometimes it appears on the first load of a new locale, and sometimes only after reloading the page. So far the only locales where I didn't see the error are Chinese and Korean; do they use the Noto font at all?

Also worth noting that the same console message appears in Chrome too, but is treated as a warning, instead of an error as in Firefox.

However, the only locales where I noticed visibly broken fonts in Firefox (not technically tofu artifacts, but rather a weird dotted circle) are:

  • MR
  • HI-IN
  • BN-BD
  • MY
  • KM

I've been comparing 0.14.3 and 0.14.5 and here are some big differences I'm noticing.

0.14.3

  • The woff files for Khmer, Devanagari (used for HI locale), etc. are fully sized coming over the wire (e.g. around 150KB)
  • In Chrome, the UI messages all use the same font. For example, inspect an element in devtools, go to the "Computed" tab and the "Rendered fonts" section on the bottom will say "Noto Sans Khmer UI" only.

0.14.5

  • The woff files for Khmer, Devanagari (used for HI locale), etc. are undersized coming over the wire (e.g. less than 1KB) and are reported as corrupted by font validators and other font tools like FontForge.
  • In Chrome, the UI messages no longer all use the same font. For example, inspect an element in devtools, go to the "Computed" tab and the "Rendered fonts" section on the bottom will say "Noto Sans Khmer UI" as well as my macOS system font for Khmer.

image

So Chrome looks like it's trying to mix and match fonts to render the message. For native readers, the message will be legible, but the "copy-paste" quality is noticeable. Firefox might just be using a less aggressive strategy to salvage the message.

On Firefox, the analogous font tool shows a similar difference between messages that look "broken" (they attempt to use the system font) and those that do not (they use only the Noto font).

image

In a nutshell, the app is broken for multiple locales in both Chrome and Firefox; it's just more obvious in Firefox.

I looked at the code diff between 0.14.3 and 0.14.5 and there are no changes to our font code, so my hunch now is that Google or someone might have moved some of the online resources we use to build our fonts.

@indirectlylit

  1. First feedback from the user:

    On Wed, Dec 23, 2020 at 2:01 AM:

    we noticed something weird: Khmer fonts are displayed wrong. We didn't notice it before because it is ok in Kolibri studio and in previous hardware, now we are testing in anything but old tablet and is displayed wrong, as in this following example:

    Right font from Kolibri studio
    immagine

    wrong from Kolibri server
    immagine2

    I know you don't read khmer but I guess you can see that + symbols should not be there.

  2. My reply:

    On Mon, Jan 4, 2021 at 12:47 PM Radina Matic radina@learningequality.org wrote:

    We have been tracking the font rendering issue on Kolibri GitHub code repository, but thought it limited to the Firefox browser. Would you confirm that you are using the Firefox browser on the tablets that you mention, and if the fonts are rendered correctly on Chrome? That information would help us debug the issue.
    We don't have in-house Khmer speakers, and that makes it more challenging to notice problems in the localized version. For example, if we look at the image in attachment, we can see several instances of the '+' symbol that you mention in Firefox, but also some in Chrome. Are those also incorrect? It would be of great help if you could provide as many details as possible about your setup, like make and model of the tablet, version of the browsers used, token of the channel you are testing on, etc.

    Khmer-on-Firefox-Chrome

  3. Their last reply:

    Date: Mon, Jan 4, 2021 at 8:25 AM

    The problem is quite strange as the rendering in firefox is wrong in all platforms, however is correct in some instances of Chrome and some not. The one in your snapshot is wrong, the + symbol doesn't exist in Khmer. Debug is not easy as we have two different tablets with the same Chrome version, but only one of the two is showing the fonts properly. Thanks for the GitHub link, I will forward it to our team to follow up the issue.

> I looked at the code diff between 0.14.3 and 0.14.5 and there are no changes to our font code, so my hunch now is that Google or someone might have moved some of the online resources we use to build our fonts.

The font files are pinned in this file:

https://github.com/learningequality/kolibri/blob/release-v0.14.x/build_tools/i18n/noto_source/manifest.json

The Khmer source files still download fine, and it shouldn't be possible for them to have changed:

    "NotoSansKhmer": {
      "bold_url": "https://raw.githubusercontent.com/googlei18n/noto-fonts/c30307083469f0c05e216ac75216fd454a517858/phaseIII_only/hinted/ttf/NotoSansKhmerUI/NotoSansKhmerUI-Bold.ttf",
      "reg_url": "https://raw.githubusercontent.com/googlei18n/noto-fonts/c30307083469f0c05e216ac75216fd454a517858/phaseIII_only/hinted/ttf/NotoSansKhmerUI/NotoSansKhmerUI-Regular.ttf"
    },
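The claim that the pinned sources cannot have changed rests on each URL containing a full commit hash rather than a branch name. A minimal sketch of that check follows, using only the fragment quoted above. It assumes every manifest entry maps keys to URL strings, which the excerpt suggests but the full file may not guarantee; `unpinned_urls` is a hypothetical helper, not part of Kolibri's build tools.

```python
import json
import re

# Fragment copied from the manifest excerpt quoted above, wrapped in braces
# so that it parses as standalone JSON.
MANIFEST_FRAGMENT = """
{
  "NotoSansKhmer": {
    "bold_url": "https://raw.githubusercontent.com/googlei18n/noto-fonts/c30307083469f0c05e216ac75216fd454a517858/phaseIII_only/hinted/ttf/NotoSansKhmerUI/NotoSansKhmerUI-Bold.ttf",
    "reg_url": "https://raw.githubusercontent.com/googlei18n/noto-fonts/c30307083469f0c05e216ac75216fd454a517858/phaseIII_only/hinted/ttf/NotoSansKhmerUI/NotoSansKhmerUI-Regular.ttf"
  }
}
"""

# A full Git commit SHA-1 is 40 lowercase hex characters. A branch name such
# as "master" in the same path position would not match.
COMMIT_SEGMENT = re.compile(r"/([0-9a-f]{40})/")


def unpinned_urls(manifest):
    """Return (font_name, key, url) for every URL that lacks a commit hash.

    Assumption: each manifest entry is a dict of key -> URL string.
    """
    problems = []
    for font_name, entry in manifest.items():
        for key, url in entry.items():
            if not COMMIT_SEGMENT.search(url):
                problems.append((font_name, key, url))
    return problems


manifest = json.loads(MANIFEST_FRAGMENT)
print(unpinned_urls(manifest))  # prints [] because both URLs are pinned
```

An empty result only shows that the source TTFs are pinned; it says nothing about the WOFF files produced from them later in the build.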

One thing worth comparing between the cases that work and those that don't is which of the following files are being loaded:

  • noto-full.NotoSansKhmer.400.woff
  • noto-full.km.modern.css
  • noto-full.km.basic.css
  • noto-subset.km.css

I can't explain why this would be a regression, but one hypothesis: in broken cases, it's using only the subset variant. Early on, I had trouble getting Hindi ligatures to display correctly for subset variants, which resulted in similar behavior to this:

image

> The Khmer source files still download fine, and it shouldn't be possible for them to have changed:

Where is the TTF to WOFF conversion happening? Between the versions, the WOFF files seem to have changed. The 0.14.5 WOFF files look corrupted because they aren't the right size.

Compare https://kolibri-training.learningequality.org/static/assets/fonts/noto-full.NotoSansKhmer.400.woff (200 B) with the full font file that you get when running 0.14.3.

FYI, the dotted circle is Unicode U+25CC, and is used as a replacement character when a ligature or diacritic-modified character cannot be fully rendered.

> Where is the TTF to WOFF conversion happening?
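To see which characters are involved, the standard-library `unicodedata` module can list the code point, name and general category of each character. The sketch below inspects U+25CC and a short Khmer sequence chosen here purely for illustration (letter KA, the COENG sign, letter KA); `describe` is a hypothetical helper. Characters in the mark categories (those starting with "M") have to be attached to a base character, and a dotted circle is what a renderer can show when it cannot do that.

```python
import unicodedata

DOTTED_CIRCLE = "\u25cc"


def describe(text):
    """Return (code point, Unicode name, general category) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.category(ch))
        for ch in text
    ]


print(describe(DOTTED_CIRCLE))

# Illustrative Khmer sequence: KA + COENG + KA. The COENG sign is a
# combining mark, so it cannot be drawn on its own.
for row in describe("\u1780\u17d2\u1780"):
    print(row)
```

Note that the dotted circle is added at rendering time; it is not present in the translated strings themselves, so searching the message files for U+25CC would not find it.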

It happens in two stages using the fontTools library.

The TTF is loaded here:

https://github.com/learningequality/kolibri/blob/7d49bed730aa7e43cf01d61739ad1f6a3dc621e3/build_tools/i18n/fonts.py#L108

and the full woff is written here:

https://github.com/learningequality/kolibri/blob/7d49bed730aa7e43cf01d61739ad1f6a3dc621e3/build_tools/i18n/fonts.py#L288-L292
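For readers unfamiliar with fontTools, the two stages can be sketched roughly as below. This is not the code in `fonts.py`: the function names and the output naming scheme are assumptions made for illustration, and any processing Kolibri does between loading and saving is left out. Only `TTFont`, its `flavor` attribute and `save` are actual fontTools API.

```python
from pathlib import Path


def woff_path_for(ttf_path, out_dir):
    """Hypothetical naming helper: <out_dir>/<stem of the TTF file>.woff."""
    return Path(out_dir) / (Path(ttf_path).stem + ".woff")


def ttf_to_woff(ttf_path, out_dir):
    """Load a TTF and write it back out as WOFF; returns the output path."""
    # Imported here so the naming helper stays usable without fontTools installed.
    from fontTools.ttLib import TTFont

    out_path = woff_path_for(ttf_path, out_dir)
    font = TTFont(str(ttf_path))  # stage 1: load the source TTF
    font.flavor = "woff"          # write the tables into a WOFF container on save
    font.save(str(out_path))      # stage 2: write the full WOFF
    return out_path


print(woff_path_for("NotoSansKhmerUI-Regular.ttf", "fonts"))
```

If the input handed to `TTFont` is not a real font file, the load step should fail, which is one reason to check what the build actually has on disk before conversion.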

We suspect that this might be caused by an issue in the build system related to git LFS and the font files. We've made a temporary change to our build system that should address the issue, and I will tag a new 0.14.6 pre-release right now to do some testing.
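One way to test the Git LFS hypothesis on a built asset is to look at the first bytes of the file. A Git LFS pointer is a small text file that begins with a `version https://git-lfs.github.com/spec/v1` line, whereas a WOFF file begins with the signature `wOFF` and records its own total length in the header. The sketch below uses only the standard library; `classify_font_bytes` and `classify_font_file` are hypothetical helpers, and the thread does not confirm what the undersized files actually contained.

```python
import struct

WOFF_MAGIC = b"wOFF"
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"
WOFF_HEADER_SIZE = 44  # bytes in the WOFF 1.0 header


def classify_font_bytes(data):
    """Classify raw contents as 'woff', 'lfs-pointer', 'invalid-woff' or 'unknown'."""
    if data.startswith(LFS_POINTER_PREFIX):
        return "lfs-pointer"
    if data[:4] == WOFF_MAGIC:
        if len(data) < WOFF_HEADER_SIZE:
            return "invalid-woff"
        # The header starts with signature, flavor and total file length,
        # each stored as a big-endian 32-bit value.
        _signature, _flavor, declared_length = struct.unpack(">4sII", data[:12])
        return "woff" if declared_length == len(data) else "invalid-woff"
    return "unknown"


def classify_font_file(path):
    """Read a file from disk and classify its contents."""
    with open(path, "rb") as handle:
        return classify_font_bytes(handle.read())


# Example with an in-memory LFS pointer (the oid and size are made up).
example_pointer = (
    b"version https://git-lfs.github.com/spec/v1\n"
    b"oid sha256:" + b"0" * 64 + b"\n"
    b"size 150000\n"
)
print(classify_font_bytes(example_pointer))  # prints lfs-pointer
```

This only checks the container header, so a file classified as "woff" can still be rejected by a browser's font sanitizer for other reasons.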

Thanks for working on this, @jonboiser. Let me know when this fix is released so I can do some user testing.

CC @TukTuk-Charity

@arky Please download and test the latest 0.14.6alpha release, and let us know if it improves the issue you've seen.

@radinamatic I have tested the 0.14.6alpha release on my development machine. It does fix the font issue for Khmer and Hindi.

Once the PPA repo packages are created for this release, I will push a build so our team can do further testing.

Hi @arky, I'm doing the PPA release. Which Ubuntu version do you need supported?

Focal, Xenial, Bionic and Trusty are built and released. If you need another one, please let me know.

Thanks @jredrejo. Our deployments run on Raspberry Pi OS Lite (based on Debian GNU/Linux Buster).

https://downloads.raspberrypi.org/raspios_lite_armhf/images/raspios_lite_armhf-2020-12-04/2020-12-02-raspios-buster-armhf-lite.zip

With Buster you can choose between different PPA series. I hope the ones I released work for you; if not, I can add the one you need.

This should be fixed in 0.14.6
