Weblate: Weblate and `lupdate` fighting over XML entities/non-breaking spaces.

Created on 25 Apr 2021 · 6 Comments · Source: WeblateOrg/weblate

Describe the issue

It looks like Qt Linguist's lupdate command prefers XML entities and Weblate prefers the literal UTF-8 character.

I regularly run lupdate qml/ -ts translations/*.ts on my project (in CI), and use Weblate to translate.
Weblate regularly sends me diffs of the form:

-        <translation>Whisperfish neu starten&#xa0;…</translation>
+        <translation>Whisperfish neu starten …</translation>

and lupdate reverts that:

-        <translation>Whisperfish neu starten …</translation>
+        <translation>Whisperfish neu starten&#xa0;…</translation>

Since I'm running lupdate in CI (for line number tracking, mostly), this is getting quite noisy.
Especially since French also seems to be affected for some entities.
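For illustration, the two spellings in the diffs above should parse to the same string, which is why the churn is pure serialization noise. This is a minimal Python sketch using the standard library XML parser; it assumes the raw character Weblate writes is U+00A0 (NO-BREAK SPACE), and `text_of` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The two serializations from the diffs above.
# Assumption: the raw character written by Weblate is U+00A0.
with_entity = "<translation>Whisperfish neu starten&#xa0;…</translation>"
with_raw_char = "<translation>Whisperfish neu starten\u00a0…</translation>"


def text_of(xml_source: str) -> str:
    """Return the parsed text content of a single XML element."""
    return ET.fromstring(xml_source).text


# Both forms decode to the same Unicode string once parsed.
print(text_of(with_entity) == text_of(with_raw_char))
```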

I already tried

Describe the steps you tried to solve the problem yourself.

  • [x] I've read and searched the docs and did not find the answer there.
    If you didn’t try already, try to search there what you wrote above.

To Reproduce the issue

Steps to reproduce the behavior:

  1. Have a project with German and a  
  2. Run lupdate manually
  3. Change any unrelated string in German
  4. Have Weblate trigger a commit

Expected behavior

One of the following:

  • lupdate shouldn't alter the entities
  • Weblate should only trigger an update in the commit for the relevant string if the string has actually changed.
  • Weblate can be configured to prefer encoding the entity instead of the raw UTF-8 character.
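Until one of these is in place, a CI job could at least detect commits that only change the serialization. The following is a rough Python sketch, not part of Weblate or lupdate; it assumes the usual `.ts` layout of `context` / `message` / `source` / `translation` elements, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def messages(ts_source: str):
    """Collect (context name, source, translation) triples from a .ts document."""
    root = ET.fromstring(ts_source)
    return [
        (ctx.findtext("name"), msg.findtext("source"), msg.findtext("translation"))
        for ctx in root.iter("context")
        for msg in ctx.iter("message")
    ]


def only_serialization_differs(old: str, new: str) -> bool:
    """True if the files differ as text but carry identical parsed messages."""
    return old != new and messages(old) == messages(new)
```

A CI step could skip or squash a commit when this returns true for every touched `.ts` file. Note that the comparison ignores attributes such as line numbers, so it would need extending if those matter.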

Screenshots

n.a., I think the diffs above should be enough

Exception traceback

n.a.

Server configuration and status

weblate.org

Additional context

Public and open source project, here are the links:

enhancement translate-toolkit

All 6 comments

This issue looks more like a support question than an issue. We strive to answer these reasonably fast, but purchasing the support subscription is not only more responsible and faster for your business but also makes Weblate stronger. In case your question is already answered, making a donation is the right way to say thank you!

The file formats are handled by https://github.com/translate/translate/. For most formats, the whole file is re-serialized on every change, so these kinds of differences can happen. In this particular case I'd prefer to fix translate-toolkit to save the strings in the same way as lupdate does. The question is what the rules are here: which characters should be stored as entities...

whisperfish/translations
❯ rg '&..?.?.?;' -oI --no-heading | sort | uniq 
&apos;
&quot;
&#xa0;

Honestly, given that my files have a UTF-8 XML header, I would expect Qt Linguist to be a good citizen and print them as UTF-8 too, but I don't think I'd be able to pull that off with Qt.
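As a stopgap, the three entities found above could be restored by a post-processing step after Weblate commits. This is only a sketch in Python: it assumes those three are the only characters lupdate writes differently (the actual rules in lupdate may be broader), it works on the raw file text rather than a parsed tree, and it does not handle comments or CDATA sections. `escape_like_lupdate` is a hypothetical name.

```python
import re

# Assumption: only the entities seen in the rg output above need restoring.
LUPDATE_STYLE = {"\u00a0": "&#xa0;", '"': "&quot;", "'": "&apos;"}

# Text content only: runs that start right after '>' and end right before '<'.
# Quotes inside tags (attributes, the XML declaration) are left untouched.
TEXT_BETWEEN_TAGS = re.compile(r"(?<=>)[^<>]+(?=<)")


def escape_like_lupdate(ts_source: str) -> str:
    """Rewrite raw characters in element text as the entities lupdate emits."""

    def fix(match: re.Match) -> str:
        text = match.group(0)
        for char, entity in LUPDATE_STYLE.items():
            text = text.replace(char, entity)
        return text

    return TEXT_BETWEEN_TAGS.sub(fix, ts_source)
```

Running it twice gives the same result as running it once, since the entities themselves contain none of the replaced characters.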

The issue you've reported needs to be addressed in the translate-toolkit. Please file the issue there, and include links to any relevant specifications about the formats (if applicable).

Thank you for your report; the issue you have reported has just been fixed.

  • In case you see a problem with the fix, please comment on this issue.
  • In case you see a similar problem, please open a separate issue.
  • If you are happy with the outcome, don’t hesitate to support Weblate by making a donation.