Mastodon: Wrong character counting on Emojis

Created on 6 Nov 2019  ·  3 comments  ·  Source: tootsuite/mastodon

Some emojis (e.g. 🏴󠁧󠁢󠁳󠁣󠁴󠁿 :flag-scotland: ) are counted as multiple characters (7 in this case) when they should be counted as 1.
(omg GitHub paints them all with black ink :-( )

The stringz library does not seem to support Emoji 5.0 (2017). Perhaps we should use another one for character counting to support the latest Unicode specification.
AFAIK, grapheme-splitter will do that properly.

Expected behaviour

All emojis are counted as 1 character.

Actual behaviour

Some flag emoji are counted as 2 or more characters.

Steps to reproduce the problem

Input :flag-england: , :flag-scotland: , or :flag-wales: .

Specifications

Mastodon: 3.0.1
Browser: Firefox 70.0.1 (Windows)

All 3 comments

Emojis are, in fact, several characters, and they are counted the same way on Twitter et al. A lot of emojis are combinations of emojis that happen to be displayed as a single image, but also, some browsers might not display emojis at all, or display them as several glyphs.

The bottom line is, while it might seem counter-intuitive, the character count is correct and should stay this way.

Yes, I understand that emoji are constructed from several codepoints. The problem is that some emoji are counted by codepoints while others are counted by grapheme clusters. My point is the inconsistency.

  • 👨‍👨‍👧‍👦 consists of 7 codepoints, 1 grapheme cluster (so counted as 1)
  • :flag-scotland: also consists of 7 codepoints, 1 grapheme cluster (but counted as 7)

By the way, the backend always counts post length by grapheme cluster.
Would simply replacing stringz with grapheme-splitter not work?
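
To make the two counts above concrete, here is a minimal sketch that compares codepoints with grapheme clusters for both emoji. It is not Mastodon's code and does not use stringz or grapheme-splitter; it assumes a runtime with the built-in Intl.Segmenter (for example Node.js 16 or later), and the helper names countCodePoints and countGraphemes are made up for illustration.

```javascript
// Hypothetical helpers for illustration only; not taken from Mastodon, stringz, or grapheme-splitter.

// Count Unicode codepoints: spreading a string iterates by codepoint, not by UTF-16 unit.
function countCodePoints(text) {
  return [...text].length;
}

// Count extended grapheme clusters with the built-in Intl.Segmenter
// (assumes the runtime ships it, e.g. Node.js 16+ or a current browser).
function countGraphemes(text) {
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  let count = 0;
  for (const _segment of segmenter.segment(text)) {
    count += 1;
  }
  return count;
}

// Family emoji: man, ZWJ, man, ZWJ, girl, ZWJ, boy.
const family = "\u{1F468}\u200D\u{1F468}\u200D\u{1F467}\u200D\u{1F466}";

// Scotland flag: black flag, tag characters g b s c t, cancel tag.
const flagScotland =
  "\u{1F3F4}\u{E0067}\u{E0062}\u{E0073}\u{E0063}\u{E0074}\u{E007F}";

for (const [name, text] of [["family", family], ["flag-scotland", flagScotland]]) {
  console.log(name, "codepoints:", countCodePoints(text), "graphemes:", countGraphemes(text));
}
// family codepoints: 7 graphemes: 1
// flag-scotland codepoints: 7 graphemes: 1
```

Both strings give the same pair of numbers, so a counter that reports 1 for the family emoji and 7 for the flag is switching between the two measures rather than applying one consistently.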

Sorry, I read your comment too generally. If there is a discrepancy between backend and frontend, then you are raising a valid point.
