Mastodon: bug: unicode hashtags don't work?

Created on 25 Nov 2016  ·  5 Comments  ·  Source: tootsuite/mastodon

Not sure if it's Unicode-specific or what, but this hashtag query doesn't show any results:

https://mastodon.social/web/timelines/tag/%EF%BD%81%EF%BD%85%EF%BD%93%EF%BD%94%EF%BD%88%EF%BD%85%EF%BD%94%EF%BD%89%EF%BD%83

All 5 comments

You're searching for the lowercase full-width text--the uppercase variant (which people actually used) works just fine. And case-insensitivity for full-width characters is probably not a useful thing to be messing with. (Probably?)
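To see which characters the tag in the URL above actually contains, it can be percent-decoded. This is only a minimal sketch using Ruby's standard library; the variable names are illustrative and not taken from the Mastodon code base.

```ruby
require "uri"

# Tag portion of the URL from the report, still percent-encoded.
encoded = "%EF%BD%81%EF%BD%85%EF%BD%93%EF%BD%94%EF%BD%88%EF%BD%85%EF%BD%94%EF%BD%89%EF%BD%83"

# Decode the UTF-8 bytes back into a string.
tag = URI.decode_www_form_component(encoded)

# Format each code point so the character block is visible.
codepoints = tag.codepoints.map { |cp| format("U+%04X", cp) }

puts tag
puts codepoints.join(" ")
```

Every code point falls in the range U+FF41 to U+FF5A, the full-width lowercase Latin letters, which matches the observation that the query uses the lowercase full-width form.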

Generally all hashtags are lowercased in the DB, but I have no idea what "lowercase" actually means when talking about non-alphanumeric characters.

In that case, I guess the bug is that the hashtag links in toots link to the lowercased version instead of the working one.

It looks like if you put the uppercase version (ＡＥＳＴＨＥＴＩＣ) into the URL, it displays the posts with that hashtag, but if you click on a hashtag, it opens the lowercase version, which contains nothing. It looks to me like Ruby does not lowercase that string while JS does; hence on the server the lowercase of ＡＥＳＴＨＥＴＩＣ is still ＡＥＳＴＨＥＴＩＣ, while in the web UI it's ａｅｓｔｈｅｔｉｃ.

This is the case; JavaScript uses Unicode mappings for uppercasing and lowercasing while Ruby only handles ASCII. Note that this will likely change in Ruby 2.4. One solution in the meantime is to use mb_chars.downcase instead of just downcase; see http://api.rubyonrails.org/classes/ActiveSupport/Multibyte/Chars.html#method-i-downcase
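The difference between ASCII-only and Unicode-aware lowercasing can be reproduced in plain Ruby. The sketch below assumes Ruby 2.4 or later, where `String#downcase` accepts an `:ascii` option that restricts case mapping to A-Z; it is used here only to imitate the older behavior described above. With ActiveSupport on an older Ruby, `mb_chars.downcase` would take the place of the second call. The variable names are illustrative.

```ruby
# Full-width "AESTHETIC" in upper and lower case, written with escapes
# so the code points are explicit.
upper = "\uFF21\uFF25\uFF33\uFF34\uFF28\uFF25\uFF34\uFF29\uFF23"
lower = "\uFF41\uFF45\uFF53\uFF54\uFF48\uFF45\uFF54\uFF49\uFF43"

# :ascii limits case mapping to A-Z, imitating Ruby before 2.4:
# full-width letters are left untouched.
ascii_only = upper.downcase(:ascii)

# Without options, Ruby 2.4+ applies Unicode case mapping.
unicode_aware = upper.downcase

puts ascii_only == upper      # true: the string is unchanged
puts unicode_aware == lower   # true on Ruby 2.4+
```

If the server stores the ASCII-only result while the web UI links to the Unicode-lowercased form, the two tags never match, which is consistent with the empty timeline in the report.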
