Irssi: Remove support for non-UTF8 terminals

Created on 11 Mar 2017 · 17 Comments · Source: irssi/irssi

We are planning to remove 8-bit and Chinese support from Irssi.

Interaction with legacy IRC channels would still be provided through /recode, as it is currently.

However, Irssi would stop working on non-UTF-8 terminals (or at least appear heavily glitched).

We're especially interested to learn about people who are still using the 8-bit support and why you would not be able to move to Unicode.

(I will moderate this thread and delete comments I deem not contributing to the discussion)

Mentioned so far:

Issue #55 incomplete recode support
Refusal to use UTF-8 terminal emulator

WIP waiting for feedback


ailin-nemui

👍16 🎉4 👎1


I think that using only Unicode sounds like a great idea in 2017. 🎉

For the record (while I wouldn't use it), shouldn't it still be possible to use _screen_ to translate to a legacy encoding, even if irssi only implements UTF-8? For the one or two people who have some legacy terminal.

joneskoo on 12 Mar 2017

It seems like a good idea. The world seems to be moving to UTF-8 by default/only. Plus, if this lets the codebase be simplified, that's great! Deleted code is debugged code!

horgh on 13 Mar 2017

👍1

If supporting many encodings is harder, it is the correct decision.

timur-enikeev on 21 Mar 2017

That is very sad news. I prefer to use irssi with a single-byte locale (KOI8-R), which fits my framebuffer terminal perfectly. Singlebyte encodings have their own advantages such as low memory usage, fast processing and strong classic API (fgetc(), fputc(), printf(),... , "str + N" to move the pointer by N chars,... ).

saahriktu on 21 Mar 2017

👍1

@saahriktu Nice, glad you've found this issue. Can you provide some more details about your setup?

1. What distro?
2. How do you set up a KOI8-R framebuffer terminal? Is it console-setup stuff or some third party software?
3. Are there any IRC networks that primarily use KOI8-R?
4. Do you use the "recode" feature?
5. Please post the output of the 'locale' command

Singlebyte encodings have their own advantages such as low memory usage, fast processing and strong classic API (fgetc(), fputc(), printf(),... , "str + N" to move the pointer by N chars,... ).

This isn't really correct, as we're talking about UTF-8, not UTF-16. UTF-8 is designed to share most of the advantages of single byte encodings and the same API is used. The memory usage and performance differences are negligible in practice, even with weak hardware. Well, uh, make that a 6th question: what sort of hardware?
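
The ASCII-compatibility point can be checked directly. What follows is a minimal Python sketch, assuming an invented sample string and two hypothetical helpers (`high_bit_bytes`, `split_fields`); none of it is taken from irssi. It shows that ASCII characters are the same bytes in KOI8-R and UTF-8, and that a byte-wise search for an ASCII delimiter still works on UTF-8 data.

```python
# Sample text is invented for this illustration.
text = "привет, irssi: /recode"

koi8 = text.encode("koi8_r")
utf8 = text.encode("utf-8")

def high_bit_bytes(data):
    """Count bytes outside the 7-bit ASCII range."""
    return sum(1 for b in data if b >= 0x80)

def split_fields(data):
    """Byte-wise split on an ASCII delimiter, as byte-oriented C code would do."""
    return data.split(b":")

# ASCII characters are the same single bytes in both encodings.
assert "irssi: /recode".encode("koi8_r") == "irssi: /recode".encode("utf-8")

# Each Cyrillic letter takes one byte in KOI8-R and two in UTF-8.
print(len(koi8), len(utf8))

# Every byte of a multi-byte UTF-8 sequence has the high bit set,
# so the ASCII ":" can only match a real colon.
print([f.decode("utf-8") for f in split_fields(utf8)])
```

The size difference is real but limited to the non-ASCII letters, which is why it rarely matters for chat text.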

dequis on 21 Mar 2017

1. Own distro based on LFS.
2. $ cat /etc/sysconfig/console
KEYMAP="ru6"
FONT="ter-u30b -m koi8-r"
LOGLEVEL="1"
UNICODE="0"

$ grep utf8 /etc/inittab
r2::wait:/bin/echo 0 > /sys/module/vt/parameters/default_utf8

$ grep KOI8-R ~/.bashrc
export LANG="ru_RU.KOI8-R"
$ grep -e ^def ~/.screenrc
defutf8 off
defencoding KOI8-R
$ grep koi8 ~/.vimrc
set encoding=koi8-r

3. I don't know, but irssi also works with BitlBee, and BitlBee can be built with libpurple support. This brings a huge number of protocols to irssi, such as Facebook, Matrix, Mail.Ru Agent, Telegram and vk.com, with KOI8-R support on the BitlBee side.
4. No. From time to time I go to the RusNet network, which has a special port for KOI8-R. With BitlBee I have native KOI8-R support on the BitlBee side.
5. $ locale
LANG=ru_RU.KOI8-R
LC_CTYPE="ru_RU.KOI8-R"
LC_NUMERIC="ru_RU.KOI8-R"
LC_TIME="ru_RU.KOI8-R"
LC_COLLATE="ru_RU.KOI8-R"
LC_MONETARY="ru_RU.KOI8-R"
LC_MESSAGES="ru_RU.KOI8-R"
LC_PAPER="ru_RU.KOI8-R"
LC_NAME="ru_RU.KOI8-R"
LC_ADDRESS="ru_RU.KOI8-R"
LC_TELEPHONE="ru_RU.KOI8-R"
LC_MEASUREMENT="ru_RU.KOI8-R"
LC_IDENTIFICATION="ru_RU.KOI8-R"
LC_ALL=
6. I run the same environment on different machines, from a Raspberry Pi 1 with 512 MB RAM to an i7-2600K with 8 GB RAM.

The evaluation of advantages and flaws may differ depending on the point of view. When mostly the ASCII range is used, there is no big difference between KOI8-R and UTF-8. But Cyrillic characters in UTF-8 are two bytes long, so in this case "char *strptr; int N; ... strptr + N" becomes incorrect: moving by N characters requires extra calculation of each character's size.
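
The "extra math" can be made concrete. The following is a rough Python sketch, assuming valid UTF-8 input and a start offset on a character boundary; `is_continuation` and `advance_chars` are hypothetical names for this illustration. It counts code points, which is what "N chars" means here, by skipping UTF-8 continuation bytes.

```python
def is_continuation(byte):
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    return (byte & 0xC0) == 0x80

def advance_chars(data, start, n):
    """Return the byte offset reached after moving n code points forward."""
    pos = start
    for _ in range(n):
        if pos >= len(data):
            break
        pos += 1  # step over the lead byte
        while pos < len(data) and is_continuation(data[pos]):
            pos += 1  # step over the rest of the sequence
    return pos

word = "привет irssi"
koi8 = word.encode("koi8_r")
utf8 = word.encode("utf-8")

# KOI8-R: byte offset 3 is also character offset 3.
print(koi8[3:].decode("koi8_r"))

# UTF-8: the same three characters occupy six bytes, found by scanning.
offset = advance_chars(utf8, 0, 3)
print(offset, utf8[offset:].decode("utf-8"))
```

The scan is linear in the number of bytes skipped, so the cost is small for line-length strings, but it is indeed work that a single-byte encoding does not need.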

saahriktu on 21 Mar 2017

Libpurple only deals with utf-8 internally. You're using a bitlbee feature, the "charset" setting

Can irssi recode from UTF-8 channels to KOI8-R terminals, and will it keep this ability?
If I set the BitlBee charset to UTF-8, the channel encoding becomes UTF-8, which does not fit a KOI8-R terminal.
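
What such a conversion involves can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `recode` is a hypothetical helper and the sample message is invented; it is not how irssi's /recode is implemented. The point is that Cyrillic text survives a UTF-8 to KOI8-R conversion, while characters outside the KOI8-R repertoire have to be substituted.

```python
def recode(data, src, dst):
    """Decode bytes from src and re-encode them for dst.

    Characters that dst cannot represent are replaced with "?".
    """
    return data.decode(src).encode(dst, errors="replace")

# Hypothetical message arriving from a UTF-8 channel.
incoming = "привет ☺".encode("utf-8")

# Converted for display on a KOI8-R terminal.
for_terminal = recode(incoming, "utf-8", "koi8_r")
print(for_terminal.decode("koi8_r"))

# The opposite direction loses nothing, since every KOI8-R character exists in Unicode.
typed = "привет".encode("koi8_r")
outgoing = recode(typed, "koi8_r", "utf-8")
print(outgoing.decode("utf-8"))
```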

saahriktu on 21 Mar 2017

can you switch to utf8 terminal? why not? this issue is about removing support for non-utf8-terminal from irssi. You can also check if GNU screen can emulate utf8 inside and KOI8R outside (suggested by joneskoo up there)

ailin-nemui on 21 Mar 2017

can you switch to utf8 terminal? why not?

No. Because I need ability of working with local text files in KOI8-R encoding. The primary area for my arguments is local low-level tasks. The Internet, with its protocols and their clients, comes last for me. Therefore network tasks don't determine the local encoding for me.

By the way, my font has only 256 chars and I don't want to see squares in place of missing chars.

saahriktu on 21 Mar 2017

Because I need ability of working with local text files in KOI8-R encoding.

Why can't you recode those files into utf8 once and forever?

dexpl on 23 Mar 2017

👎1

Why can't you recode those files into utf8 once and forever?

This is a system. Although I have no files in KOI8-R, I will make new ones. Ability of working with local text files in KOI8-R is a priceless treasure.

saahriktu on 23 Mar 2017

👍1

Although I have no files in KOI8-R, I will make new ones.

For what reason?

dexpl on 24 Mar 2017

Currently I'm against dropping non-UTF8 terminal support because of #55. I cannot join non-UTF8 networks because channel names are not currently recoded. For example, irc.juggler.jp uses iso-2022-jp encoding for everything (including the channels), so my terminal has to match that encoding to be able to join any non-ascii channels properly until recode support is expanded (I just opened a bounty for that issue).
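
Why the terminal encoding matters for channel names can be seen from the raw bytes. Here is a minimal Python sketch, assuming the channel #ロビー mentioned later in this thread and a hypothetical helper `join_command`; it is not irssi code. Channel names are compared as byte strings on the server, so a JOIN sent in UTF-8 names a different channel than the same text sent in iso-2022-jp.

```python
def join_command(channel, charset):
    """Build the raw bytes of an IRC JOIN line in the given charset."""
    return b"JOIN " + channel.encode(charset) + b"\r\n"

channel = "#ロビー"

utf8_line = join_command(channel, "utf-8")
jis_line = join_command(channel, "iso2022_jp")

# Same text on screen, different bytes on the wire.
print(utf8_line)
print(jis_line)

# ISO-2022-JP is a 7-bit encoding that switches character sets
# with escape sequences instead of using high-bit bytes.
print(all(b < 0x80 for b in jis_line))
```

Until channel names are recoded as well, the only way to send the second form is to type it from a terminal that already uses that encoding.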

bparker06 on 24 Mar 2017

For what reason?

Ability of working with local text files in KOI8-R is a priceless treasure.

saahriktu on 24 Mar 2017

@bparker06 nice, thanks for the feedback, totally forgot about that particular use case. I agree that fixing recode is the best solution here. Also thanks for the concrete network example, that helps a lot. Also first time I see iso-2022-jp being used in the wild, I thought it was all shift-jis and euc-jp.

Everyone else: let's not make this thread exclusively about @saahriktu's use case, we'd like to hear from more people.

dequis on 24 Mar 2017

@dequis Welcome. If you need a generic channel to test on that network, #ロビー (lobby) works. All Japanese IRC networks I know of use the 2022 encoding. Traditionally *nix users would use euc-jp as their terminal encoding while Windows would use CP932 (shift-jis w/ extensions). UTF-8 adoption in Japan is still very slow.

bparker06 on 24 Mar 2017

And in the meantime #55 sits there.