Gitea: Replace emojify.js with openmoji set

Created on 27 Nov 2019 · 39 Comments · Source: go-gitea/gitea

emojify.js is not maintained anymore.

At some point we would need to switch to another lib to manage emoji (or maintain a fork).
The emojify.js lib does two things:

  • replace :emoji: with a tag
  • provide a set of emoji

I suggest moving to openmoji for the emoji set.

It seems to offer a large selection of good quality: https://openmoji.org/library/

kind/proposal kind/ui

All 39 comments

Not bad, but we need to write our own code: openmoji is not a drop-in solution, it is only a collection of emojis.

@6543 Yes, that's why I distinguish the two roles of emojify.

I wanted to leave the first point up for discussion, as I don't know if there is a lib that does it. Otherwise, I would plan on forking emojify based on the openmoji set, plus the capacity to add specific emoji like the gitea one at "runtime".

I think a custom implementation is the way to go, backed by a minimal name-to-unicode map like https://github.com/muan/emojilib. Most modern browsers support unicode emoji, so we could get away with just supporting unicode (+custom graphics).

If image fallbacks are still desirable, maybe https://github.com/iamcal/js-emoji.

On a sidenote, emojilib is probably too big to be included:

https://bundlephobia.com/[email protected]

I guess we'd need something more trimmed down with just name to unicode maps without the extra fluff.
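
A minimal sketch of what such a trimmed-down lookup could look like, written in Python purely for illustration. The three-entry `EMOJI` map and the `replace_tokens` helper are hypothetical and not taken from emojilib or Gitea; a real map would be generated from an emoji data set.

```python
import re

# Hypothetical hand-trimmed map: names and characters only, no keyword lists.
EMOJI = {
    "smile": "\U0001F604",
    "rocket": "\U0001F680",
    "+1": "\U0001F44D",
}

# A token is a run of name characters wrapped in colons, e.g. :rocket:
TOKEN = re.compile(r":([a-z0-9_+-]+):")


def replace_tokens(text: str) -> str:
    """Replace known :name: tokens with Unicode; leave unknown ones as typed."""
    return TOKEN.sub(lambda m: EMOJI.get(m.group(1), m.group(0)), text)


print(replace_tokens("ship it :rocket: at 10:30:45"))
```

Unknown names fall through unchanged, so text such as a timestamp that happens to contain colons is not mangled.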

I don't think we can use openmoji, as it is licensed under CC BY-SA 4.0, which is compatible only with GPLv3 if I understand it correctly :(

@lafriks They mention that it just needs an attribution somewhere: https://github.com/hfg-gmuend/openmoji/blob/master/README.md#attribution-requirements
I understand that it is not compatible with MIT at the source-code level, but it would be compatible with being distributed alongside. It just can't be under the MIT license and inside the gitea code.

I don't have any opinions about which emoji library to use, but regarding how to implement it, I'd advise against parsing the emoji on the server (when comments are submitted or when the comments are rendered), i.e. I'm in favor of on-the-fly emoji detection that the user can control and nullify (e.g. using backspace). My reasons:

  • I never hit the right name for an emoji except the most basic 😁. Relying on the submitted message to check whether I've typed the right name would be very frustrating and lead to a lot of edits.
  • The user will have to be extra careful to avoid typing anything that will become an emoji without realizing it. Furthermore, as we won't have control over the actual names of the emojis (they will come from a lib), we'll end up with awkward hacks to circumvent some annoying cases.
  • The escaping to avoid emojization (e.g. \:flags: vs. :flags:) is yet one more syntax to learn and it's not really Markdown. We should support escaping if we force emojization.
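
To make the escaping point concrete, here is a small sketch of a replacer that honours a leading backslash. It is an illustration only: the one-entry `EMOJI` map and the `render` function are hypothetical, and the rule that an escaped token loses its backslash even when the name is unknown is an assumption, not existing Gitea behaviour.

```python
import re

# Hypothetical one-entry map, just enough to show the escape rule.
EMOJI = {"flags": "\U0001F38F"}

# Optional backslash, then a :name: token.
TOKEN = re.compile(r"(\\?):([a-z0-9_+-]+):")


def render(text: str) -> str:
    def sub(match: re.Match) -> str:
        escaped, name = match.group(1), match.group(2)
        if escaped:
            # Escaped token: drop the backslash and keep the literal text.
            return f":{name}:"
        # Unescaped token: replace if known, otherwise leave untouched.
        return EMOJI.get(name, match.group(0))

    return TOKEN.sub(sub, text)


print(render(r"\:flags: vs. :flags:"))
```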

Another question we have to ask is whether we want to store emoji as unicode in the database. I think GitHub does it that way and it would make data more portable at the expense of requiring the user to have Unicode support in the DB which may not be given on older MySQL installations.
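
For context on the MySQL concern: as far as I know, the legacy `utf8` character set in MySQL stores at most three bytes per character, while most emoji need four bytes in UTF-8 and therefore `utf8mb4`. The helper below is a hypothetical sketch (the name `needs_utf8mb4` is made up here) that checks whether a string contains such characters.

```python
def needs_utf8mb4(text: str) -> bool:
    """Return True if any character lies outside the Basic Multilingual Plane.

    Those characters take four bytes in UTF-8, which a three-byte-per-character
    column charset cannot store.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


print(needs_utf8mb4("nice \U0001F44D"))  # thumbs up is U+1F44D, outside the BMP
print(needs_utf8mb4(":+1:"))             # the keyword form is plain ASCII
```

Storing the `:keyword:` form sidesteps the issue entirely, which is one argument for keeping keywords in the database and converting only at render time.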

@silverwind I would prefer the :keyword: being stored in the DB as it is now

We could support both if we used the openmoji font (or any other emoji font) and replaced :keyword: with unicode in the UI. The :keyword: format is also commonly used in commit messages. This would also ensure backward compatibility. It could also be implemented in the markdown rendering.

Sorry, I linked the wrong issue

Just a note: there's also https://twemoji.twitter.com/ which I find much prettier than openmoji. The current emoji set is definitely not very attractive IMO. Would be great to upgrade to a nice new set eventually.

I would actually prefer Google Noto Color Emoji - https://emojipedia.org/google/ https://github.com/googlefonts/noto-emoji
And it's Apache 2.0 licensed

We could maybe use (or develop, if none exists) a lib that replaces :smile: with the corresponding character when posting. That way the administrator can import whichever emoji font they want (or use the OS default).

@sapk I already proposed to use something like https://github.com/muan/emojilib to convert tokens to unicode.

I think we should just use the system-provided emoji font, as modern OSes come with an emoji font. GitHub for example just uses Apple Color Emoji,Segoe UI Emoji (which may exclude Linux, though).

> I think we should just use the system-provided emoji font, as modern OSes come with an emoji font. GitHub for example just uses Apple Color Emoji,Segoe UI Emoji (which may exclude Linux, though).

Yes, Linux needs a custom font.

It looks like Ubuntu comes with Noto Color Emoji according to this so a font stack of Apple Color Emoji,Segoe UI Emoji,Noto Color Emoji might cover all major platforms.

System fonts are the way to go in my opinion. Loading a custom web font just for emoji would impact performance, and I guess users will expect to see the emojis in their platform-native form.

> System fonts are the way to go in my opinion. Loading a custom web font just for emoji would impact performance, and I guess users will expect to see the emojis in their platform-native form.

Then why does every major website provide custom fonts? I think there are many cases for a decent fallback, not just Linux that isn't Ubuntu (which, by the way, isn't a vast majority of Linux users either).

> every major website

I'd strongly doubt that. GitHub, for example, does not rely on a single web font for emoji or regular text, which is ideal for performance and eliminates issues like FOUT.

Yes, GitHub is using PNG images instead. I didn't mean that it has to be a webfont, but that you don't want to rely on native emoji fonts only.

> GitHub is using PNG

They use Unicode like they should. PNG is only a fallback used on legacy platforms (I think Windows XP and earlier):

<g-emoji alias="+1" fallback-src="https://github.githubassets.com/images/icons/emoji/unicode/1f44d.png" class="emoji mr-1">👍</g-emoji>

I'm thinking we just need a config option like USE_WEBFONTS which would enable webfonts for both text and emojis. When disabled, fall back to a system font stack like this one.

I'm thinking about implementing a function and registering it with the router ...
so you could call it in the template and it would produce HTML like @silverwind mentioned?

{{emojihtml $reaction}} -> {{emojihtml +1}} -> <g-emoji alias="+1" fallback-src="https://github.githubassets.com/images/icons/emoji/unicode/1f44d.png" class="emoji mr-1">👍</g-emoji> ?
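
The logic of such a helper can be sketched in a few lines. This is Python for illustration only, not the Go template function proposed here: `emoji_html`, the two-entry `EMOJI` map and `FALLBACK_BASE` are hypothetical, the URL prefix is simply copied from the GitHub markup quoted in this thread, and joining multiple code points with "-" is an assumption about the file naming.

```python
from html import escape

# Assumption: prefix taken from the GitHub markup quoted in this thread;
# a Gitea instance would serve fallback images from its own path.
FALLBACK_BASE = "https://github.githubassets.com/images/icons/emoji/unicode/"

# Hypothetical alias map.
EMOJI = {"+1": "\U0001F44D", "rocket": "\U0001F680"}


def emoji_html(alias: str) -> str:
    """Render an alias as a <g-emoji> element with a PNG fallback URL."""
    char = EMOJI.get(alias)
    if char is None:
        # Unknown alias: emit the escaped token text instead of markup.
        return escape(f":{alias}:")
    # File name: lowercase hex code point(s), joined with "-" (assumed).
    codepoints = "-".join(f"{ord(c):x}" for c in char)
    return (
        f'<g-emoji alias="{escape(alias)}" '
        f'fallback-src="{FALLBACK_BASE}{codepoints}.png" '
        f'class="emoji">{char}</g-emoji>'
    )


print(emoji_html("+1"))
```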

EDIT: oh right, this only solves reaction emojis ... but in general, what do you think of a Go solution as a plugin to goldmark ...?

> what do you think

If it can be done server-side, that would be ideal. No flashes of wrong content and best performance. I guess you only need a module to map tokens to unicode codepoints.

@silverwind but we can't change all emojis to utf8? -> :rocket: , :gitea:, ...

Yeah, custom ones would need an image. Also have to keep in mind that there are places outside of Markdown that have them, so the replacement would need to run at a higher level, maybe as some sort of post-processing step after template rendering.
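
A rough sketch of that split between Unicode and custom images, in Python for illustration. The `postprocess` function, both maps and the image path `/img/emoji/gitea.png` are hypothetical. The pass is deliberately naive: it runs a regex over the whole rendered string, so a real implementation would have to skip code blocks and attribute values.

```python
import re
from html import escape

# Hypothetical maps: standard names resolve to Unicode, custom names to images.
UNICODE = {"rocket": "\U0001F680"}
CUSTOM = {"gitea": "/img/emoji/gitea.png"}

TOKEN = re.compile(r":([a-z0-9_+-]+):")


def postprocess(rendered: str) -> str:
    """Replace emoji tokens in already-rendered output."""

    def sub(match: re.Match) -> str:
        name = match.group(1)
        if name in UNICODE:
            return UNICODE[name]
        if name in CUSTOM:
            # Custom emoji have no code point, so they need an image.
            # The alt text keeps the token so copy-paste still round-trips.
            return f'<img class="emoji" alt=":{name}:" src="{escape(CUSTOM[name])}">'
        return match.group(0)

    return TOKEN.sub(sub, rendered)


print(postprocess("<p>:rocket: :gitea: :nope:</p>"))
```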

> They use Unicode like they should. PNG is only a fallback used on legacy platforms (I think Windows XP and earlier):

I'm on the very latest Arch Linux release with modern GNOME 3, which also has a native emoji font. I'm getting PNG images on GitHub and basically every other major website. And I also find them nicer to look at than my system's own font.

Edit: custom emojis are also a great use case for providing the fallback. Check out e.g. the custom emoji feature in Mastodon to see what I mean. Would be awesome to be able to configure your own additional emojis on every Gitea instance.

I think GitHub does User-Agent sniffing for the decision on whether they serve unicode emoji or fallback images. Try changing to a recent Windows/Mac UA and see if you get Unicode.

A major benefit of unicode is that one can copy-paste the text with emojis included (sometimes, they are important to the actual discussion).

> Try changing to a recent Windows/Mac UA and see if you get Unicode.

I do not have any other machine with me (while traveling) that would run those operating systems and I also wouldn't buy a Windows license just to test if GitHub delivers emojis in a different way to Windows users.

I just checked what Twitter does, and they're using SVG images with the unicode emoji as alt attribute, which I can also copy and paste as unicode emoji.

Example: This tweet contains emoji 🤯 with image https://abs-0.twimg.com/emoji/v2/svg/1f92f.svg

Edit: for inline emojis (as opposed to comment reactions e.g.), GitHub uses a Twemoji webfont.

@silverwind was talking about changing your User-Agent, which is a value sent by your browser (not your computer) that can be changed so you appear to be on a different OS, browser, architecture, ...

Ah, stupid me. Didn't have coffee yet. :)

The alt attribute also lets screen readers understand the emojis.

Facebook uses PNG images with no alt attribute, but a hidden plaintext fallback:

<span class="_47e3 _5mfr" title="unsure emoticon">
  <img class="img" role="presentation" src="https://static.xx.fbcdn.net/images/emoji.php/v9/t83/1.5/30/1f615.png" alt="" width="30" height="30">
  <span aria-hidden="true" class="_7oe">:/</span>
</span>

Edit: they also use an actual unicode fallback for other emojis, so they're not all the same:

<span style="height: 32px; width: 32px; font-size: 32px;
  background-image: url(https://static.xx.fbcdn.net/images/emoji.php/v9/tf9/1.5/32/2764.png);"
  class="_6qdm">
  ❤️
</span>

> The alt attribute also lets screen readers

That could be solved by a wrapper tag <span alt="camel">🐪</span>, ideally done in the template code. Though I guess screen readers ought to have some brains to recognize the code points.
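
One way to generate such a wrapper is sketched below, with two assumptions stated up front: `alt` is not a defined attribute on `span`, so the sketch uses `role="img"` with `aria-label` instead, and the label is derived from the Unicode character name through Python's `unicodedata` module. The function name `labelled_emoji` is hypothetical, and it only handles single-code-point emoji.

```python
import unicodedata
from html import escape


def labelled_emoji(char: str) -> str:
    """Wrap a single-code-point emoji in a span with an accessible name."""
    # unicodedata.name returns the given default when the character has no name.
    label = unicodedata.name(char, "emoji").lower()
    return f'<span role="img" aria-label="{escape(label)}">{char}</span>'


print(labelled_emoji("\U0001F42A"))
```

Official Unicode names are often clumsy as spoken labels, so a curated name map may read better than this automatic approach.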

I still think we can do Unicode without fallbacks (except for custom images). Checking http://caniemoji.com/, the only relevant browser targets lost when not serving fallbacks are:

  • Chrome on Windows 8 or earlier
  • Edge on Windows 8 or earlier

From what I know, screen readers understand unicode in the alt attribute (even if it makes for weird definitions/translations), and that's why Twitter puts it there.

> I still think we can do Unicode without fallbacks (except for custom images). Checking http://caniemoji.com/, the only relevant browser targets lost when not serving fallbacks are:

Sure, but at least ship an emoji webfont then, like GitHub does.

> ship an emoji webfont then, like GitHub

GitHub does not ship any webfonts. Where are you getting this from?

Ah, sorry. I hadn't inspected it well enough. Mozilla is shipping the Twemoji font with Firefox apparently.
