Weblate: Provide one unified JSON format [$31 awarded]

Created on 27 Jul 2018 · 19 Comments · Source: WeblateOrg/weblate

The JSON format is not really useful on its own: simple JSON files work just as well with JSON nested, and it only confuses users into choosing JSON when they actually want the nested variant.

Things to do:

  • [ ] Remove existing JSON format
  • [ ] Rename JSON nested to JSON
  • [ ] Add migration to change setting for existing components
bounty enhancement

All 19 comments

As an input to this change:

We are using JSON format instead of JSON nested because JSON nested will transform keys with a dot into an object structure. To really remove the old JSON format it needs to more strictly obey the source files. So when a key in the source file is not an object it should never explode the dots and create an object.

{
  "foo.bar": "translation text"
}

vs.

{
  "foo": {
    "bar": "translation text"
  }
}
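
The collision between these two files can be reproduced with a minimal sketch. The function `flatten_dots` below is hypothetical and is not the translate-toolkit implementation; it only assumes that nested keys are joined with a dot to form the text key:

```python
def flatten_dots(data, prefix=""):
    """Flatten nested dicts into {"a.b": value}, joining keys with '.'."""
    flat = {}
    for key, value in data.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_dots(value, path))
        else:
            flat[path] = value
    return flat


flat_file = {"foo.bar": "translation text"}
nested_file = {"foo": {"bar": "translation text"}}

# Both files produce the same text key, so the original structure
# can no longer be told apart once only the key is kept.
print(flatten_dots(flat_file) == flatten_dots(nested_file))
```

Because both inputs map to the single key `foo.bar`, any code that rebuilds the file from that key alone has to guess which structure the source used.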

Good point, that's a reason to keep both formats for now. On the other hand, there should be just one JSON format that is able to represent even files like this:

{
  "foo": {
     "bar.baz": "text"
  }
}

So in the end this boils down to fixing translate-toolkit: https://github.com/translate/translate/issues/3819

The other issue with JSON is that it parses lists as key[0]..., but this can also be a literal key in a JSON file, so the following will not round trip:

{
   "day[0]": "Monday",
   "day[1]": "Tuesday"
}

And on saves it becomes:

{
   "day": [
      "Monday",
      "Tuesday"
   ]
}
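
A rough sketch of why this does not round trip, assuming list items are addressed as `key[index]` in the text key. The helper `flatten_with_lists` is hypothetical and only illustrates the ambiguity; it is not the translate-toolkit code:

```python
def flatten_with_lists(data, prefix=""):
    """Flatten dicts with '.' and lists with '[index]' into text keys."""
    flat = {}
    if isinstance(data, dict):
        for key, value in data.items():
            path = f"{prefix}.{key}" if prefix else key
            flat.update(flatten_with_lists(value, path))
    elif isinstance(data, list):
        for index, value in enumerate(data):
            flat.update(flatten_with_lists(value, f"{prefix}[{index}]"))
    else:
        flat[prefix] = data
    return flat


literal_keys = {"day[0]": "Monday", "day[1]": "Tuesday"}
real_list = {"day": ["Monday", "Tuesday"]}

# A literal "day[0]" key and the first item of a "day" list end up
# with the same text key, so the writer cannot know which one it was.
print(flatten_with_lists(literal_keys) == flatten_with_lists(real_list))
```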

One possibility of handling this better in translate-toolkit: https://github.com/translate/translate/pull/3835

@nijel I just observed the mentioned list behavior: Input key[0], Output "key": [ ... and wanted to file a new issue.

I guess this will be solved as well when implementing this issue?

@rhofer That might as well be https://github.com/translate/translate/pull/3836 if you are not on translate-toolkit 2.4.0. Are you using nested json or not?

@nijel , yes, it's nested json.

@nijel this problem is becoming more and more cumbersome, since I have a continuously growing number of HTML5 projects onboarding for translation. Do you see a way to push the resolution of this "bug"?

I would also love to have proper support for nested JSON. Currently, I have to patch translate-toolkit myself to perform proper deflattening of deeply nested structures.

@sr258 If you have a patch, why not submit it upstream via a pull request on translate-toolkit directly?

The way it un-flattens the JSON would break the cases mentioned above (dots in property names would become nested objects, and brackets with numbers in property names would become arrays). It only fixes the broken un-flattening of objects, but doesn't resolve the problem of escaping characters. I'm also not 100% sure whether it might break things in other uses I'm not aware of. Further, I don't really know Python and just hacked a few things together to get things working for my use case, so I don't think it's PR ready. If I have the time, I'll try revisiting the issue and write a proper PR. Then I wouldn't have to worry about the patch...

If you're interested in the patch, by the way, you can find it here: https://github.com/sr258/translate/tree/2.4.0-unflatten-json

Unfortunately, I found another nasty side effect:

First source file update

At first, the developer added the days of the week to the source file as follows:

"BASIC_TEXTS": {
   "WEEK[0]": "Sunday",
   "WEEK[1]": "Monday",
   ...
}

As we know, the JSON export wrongly changes this in the target translation to:

"BASIC_TEXTS": {
   "WEEK": [
      "Sonntag",
      "Montag",
      ...
   ]
}

Second source file update

Later on, the developer saw a need to add the string "Week" to his UI and did so in the source file as follows:

"BASIC_TEXTS": {
   "WEEK[0]": "Sunday",
   "WEEK[1]": "Monday",
   ... ,
   "WEEK": "Week"
}

For the target translation, this resulted in Weblate removing the days of the week, so that only "Week" remained. The result looks as follows:

"BASIC_TEXTS": {
   "WEEK": "Woche"
}

I guess that, from an export perspective, "WEEK": [Sonntag, Montag, ...] and "WEEK": "Woche" both deliver a value for the same key, and some kind of last-wins principle is applied.

@nijel can you confirm my assumption?

@rhofer Yes, your assumption is correct. This all comes from mapping structured data into a text field and then reconstructing it again. The translate-toolkit itself contains at least three different implementations of this (see https://github.com/translate/translate/issues/3819), and each of them suffers from similar problems.
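
To illustrate the last-wins effect, here is a simplified sketch of rebuilding a structure from text keys alone. The function `unflatten` and the regular expression are assumptions made for illustration and do not mirror the actual translate-toolkit code:

```python
import re

# Hypothetical pattern: a key such as "WEEK[0]" is treated as a list item.
LIST_KEY = re.compile(r"^(?P<name>.+)\[(?P<index>\d+)\]$")


def unflatten(units):
    """Rebuild a dict from (text key, translation) pairs, in file order."""
    result = {}
    for key, text in units:
        match = LIST_KEY.match(key)
        if match:
            name = match.group("name")
            items = result.get(name)
            if not isinstance(items, list):
                items = result[name] = []
            # Simplification: items are appended in the order they arrive.
            items.append(text)
        else:
            # A plain key silently replaces whatever was stored before.
            result[key] = text
    return result


units = [("WEEK[0]", "Sonntag"), ("WEEK[1]", "Montag"), ("WEEK", "Woche")]
print(unflatten(units))
```

With the plain `WEEK` entry coming last, the list built from `WEEK[0]` and `WEEK[1]` is overwritten, which matches the behavior described above.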

The proper solution is mostly needed on the translate-toolkit side:

  • Extend the translate-toolkit API to include rich metadata on the structure in addition to the text key (described in https://github.com/translate/translate/issues/3819#issuecomment-449043811)
  • Utilize this information on saving round trip in affected formats

The approach in https://github.com/translate/translate/pull/3835 was wrong, as it changed the existing API, which makes it too hard to use.

Additionally, we might need to store this in Weblate, but that would only be needed to handle quite unlikely conflicts in the text representation of the keys. Building on your example, it would be needed to handle this:

"BASIC_TEXTS": {
   "WEEK[0]": "Sunday",
   "WEEK": [
      "Sonntag",
   ]
}
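
One way to picture the structural metadata idea is a path stored next to each text key, where strings stand for object keys and integers for list indices. This is only a sketch under that assumption; `set_by_path` and the path layout are hypothetical and are not the API proposed in the linked issue:

```python
def set_by_path(root, path, text):
    """Store text in root following path (str = object key, int = list index)."""
    node = root
    for element, following in zip(path, path[1:]):
        # The type of the next path element decides which container to create.
        default = [] if isinstance(following, int) else {}
        if isinstance(element, int):
            while len(node) <= element:
                node.append(None)
            if node[element] is None:
                node[element] = default
            node = node[element]
        else:
            node = node.setdefault(element, default)
    last = path[-1]
    if isinstance(last, int):
        while len(node) <= last:
            node.append(None)
    node[last] = text
    return root


# Both units would share the text key "BASIC_TEXTS.WEEK[0]",
# but their structural paths differ.
units = [
    (["BASIC_TEXTS", "WEEK[0]"], "Sunday"),
    (["BASIC_TEXTS", "WEEK", 0], "Sonntag"),
]
rebuilt = {}
for path, text in units:
    set_by_path(rebuilt, path, text)
print(rebuilt)
```

With the path kept alongside the text key, the literal `WEEK[0]` key and the first item of the `WEEK` list stay distinct when the file is written back.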

The most important bit for this is https://github.com/translate/translate/pull/4091. Once that is merged and released in the translate-toolkit, Weblate can depend on the updated version.

I've released translate-toolkit 3.1.0 fixing this issue today. Weblate 4.2 users should be able to benefit from this after upgrading translate-toolkit, Weblate 4.3 will unify the formats into one.

Thank you for your report, the issue you have reported has just been fixed.

  • In case you see a problem with the fix, please comment on this issue.
  • In case you see a similar problem, please open a separate issue.
  • If you are happy with the outcome, consider supporting Weblate by donating.