Thelounge: URL preview for pages with non-Unicode encoding needs conversion logic

Created on 4 Feb 2019 · 4 Comments · Source: thelounge/thelounge

URL preview does not convert the charset for pages with a non-Unicode encoding.

Sample URL that shows the problem: https://freak.no/forum/forumdisplay.php?f=45

The error shows up in the "description" and/or "og:title" meta tags. One of these is used for the URL preview, which comes out garbled.

Bug help wanted

All 4 comments

That particular site sends content-type: text/html; charset=ISO-8859-1

So the fix would be converting it to UTF-8.
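A minimal sketch of that conversion, assuming the raw response body is available as a Buffer and the Content-Type header as a string. The helper names `charsetFromContentType` and `decodeBody` are hypothetical and not part of thelounge; the sketch uses Node's built-in `TextDecoder` rather than any particular conversion library:

```javascript
// Hypothetical helper: pull the charset label out of a Content-Type header value.
function charsetFromContentType(contentType) {
  const match = /charset\s*=\s*"?([^\s;"]+)/i.exec(contentType || "");
  return match ? match[1].toLowerCase() : null;
}

// Hypothetical helper: decode a raw response body into a JavaScript string.
// Assumes UTF-8 when the header does not declare a charset.
function decodeBody(buffer, contentType) {
  const charset = charsetFromContentType(contentType) || "utf-8";
  try {
    return new TextDecoder(charset).decode(buffer);
  } catch (err) {
    // TextDecoder throws a RangeError for labels it does not know;
    // fall back to UTF-8 in that case.
    return new TextDecoder("utf-8").decode(buffer);
  }
}

// Bytes of "blåbær" as an ISO-8859-1 page would send them.
const sample = Buffer.from([0x62, 0x6c, 0xe5, 0x62, 0xe6, 0x72]);
console.log(decodeBody(sample, "text/html; charset=ISO-8859-1"));
```

Which encodings `TextDecoder` accepts depends on how Node was built (ICU support), so a real fix might still need a dedicated conversion library.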

Hi,
I looked into this issue, and the only fix I can see is the following:

  1. Get the page encoding from the HTML head (should be no problem with the already used cheerio lib), and maybe use the already added is-utf8 lib.
  2. Use iconv to convert if the page is not UTF-8.
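The detection step could look roughly like the sketch below. It is only an illustration: it uses a plain regular expression instead of cheerio, the helper name `charsetFromMeta` is hypothetical, and the 1024-byte scan window is an arbitrary assumption.

```javascript
// Hypothetical helper: look for a charset declaration near the start of the page.
// Handles both <meta charset="..."> and
// <meta http-equiv="Content-Type" content="text/html; charset=...">.
function charsetFromMeta(buffer) {
  // latin1 maps every byte to exactly one character, so the ASCII markup
  // stays readable whatever the real encoding of the page is.
  const head = buffer.subarray(0, 1024).toString("latin1");
  const match = /<meta[^>]+charset\s*=\s*["']?([^\s"'/>;]+)/i.exec(head);
  return match ? match[1].toLowerCase() : null;
}

const page = Buffer.from(
  '<html><head><meta http-equiv="Content-Type" ' +
    'content="text/html; charset=ISO-8859-1"></head></html>',
  "latin1"
);
console.log(charsetFromMeta(page));
```

When the result is anything other than UTF-8, the body would then be passed to the conversion step before the meta tags are read for the preview.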

This would, however, add some overhead.
I looked at some popular Norwegian sites and they all use UTF-8. This particular example seems to be an older site.
@xPaw what is your opinion?

The way I see it, there are two options:

  1. Implement this using the already added iconv lib.
  2. Ignore this. It may be an edge case not worth the trouble.

I lean towards the edge case too.

Unfortunately, a bunch of Japanese sites are still not using UTF-8, so I see garbled text a bit more frequently than I'd like.

Those are just the places that I either frequent or just happen to have the tabs open.
