Weblate: Adding Arabic Tatweel and related Tatweel check

Created on 15 Oct 2018 · 4 Comments · Source: WeblateOrg/weblate

Hi,
Arabic Tatweel (U+0640) and the related Tatweel letters are decorative characters. They usually do not pass spell checking, are useless in a translation, and pollute translation memories (TMs). Can we add a check for them to tell users to avoid using them?

http://graphemica.com/%D9%80
Something like the checks listed at https://hosted.weblate.org/checks/
Regards,

Labels: enhancement, good first issue, hacktoberfest, help wanted

All 4 comments

So this char should not be used at all? Can you please post some reference on that?

I've just quickly checked the translations on the Hosted Weblate service, and it's used in 314 translations right now.

Hi,
In this document, please search for Kasheeda or Tatweel:
http://thesai.org/Downloads/Volume8No2/Paper_37-Sentiment_Analysis_Challenges_of_Informal_Arabic_Language.pdf

See also RFC 5564, _section 2.1.2. Kasheeda or Tatweel (Horizontal Character Size Extension)_, which covers its use in domain names.

I think this section of the RFC should be applied on all translation platforms. Users can still use Tatweel or Kasheeda in Arabic text in a word processor such as LibreOffice or in LaTeX, but if the words are extracted to be normalized, it is recommended to remove U+0640 and its variant ligatures:
{U+FCF2, U+FCF3, U+FCF4, U+FE71, U+FE77, U+FE79, U+FE7B, U+FE7D, U+FE7F};
Regards,
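The normalization described above could be sketched roughly as follows. This is only an illustration: the names `TATWEEL_CHARS` and `strip_tatweel` are hypothetical, and the sketch assumes that "remove" means deleting each listed code point outright, which may not be what every normalizer wants for the ligature forms.

```python
# Code points listed in the comment above: U+0640 plus the variant ligatures.
# The set name and the function below are hypothetical, for illustration only.
TATWEEL_CHARS = frozenset(
    "\u0640\uFCF2\uFCF3\uFCF4\uFE71\uFE77\uFE79\uFE7B\uFE7D\uFE7F"
)

# str.translate deletes every code point that maps to None.
_STRIP_TABLE = {ord(char): None for char in TATWEEL_CHARS}


def strip_tatweel(text: str) -> str:
    """Return text with all listed Tatweel code points deleted."""
    return text.translate(_STRIP_TABLE)
```

For example, `strip_tatweel("\u0645\u0640\u0640\u0631")` drops the two U+0640 characters and keeps the surrounding letters.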

Thanks for adding more detailed information! I think such a check certainly makes sense and it should be easy to implement. There is a similar check for zero-width space which could be used as inspiration:

https://github.com/WeblateOrg/weblate/blob/de272e64cf5ab4bfe80c02df0fca9e4d238581e4/weblate/checks/chars.py#L336-L346
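As a rough idea of what such a check might test, here is a standalone sketch. It does not use Weblate's check classes and is not the code that was eventually merged; the names `TATWEEL_CHARS` and `has_unexpected_tatweel` are hypothetical. It assumes the check should only fire when the translation contains one of the listed characters that the source string does not.

```python
# Code points from the list earlier in this thread (U+0640 and its variants).
# Both names in this sketch are hypothetical and not part of Weblate.
TATWEEL_CHARS = frozenset(
    "\u0640\uFCF2\uFCF3\uFCF4\uFE71\uFE77\uFE79\uFE7B\uFE7D\uFE7F"
)


def has_unexpected_tatweel(source: str, target: str) -> bool:
    """Return True if target has a Tatweel character that source lacks."""
    in_target = TATWEEL_CHARS.intersection(target)
    in_source = TATWEEL_CHARS.intersection(source)
    # Characters already present in the source are tolerated in the target.
    return bool(in_target - in_source)
```

Comparing against the source is a design choice: a stricter variant could ignore the source and flag any occurrence in the translation.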

Thank you for your report, the issue you have reported has just been fixed.

  • In case you see a problem with the fix, please comment on this issue.
  • In case you see a similar problem, please open a separate issue.
  • If you are happy with the outcome, consider supporting Weblate by donating.