Irssi: Remove support for non-UTF8 terminals

Created on 11 Mar 2017 · 17 Comments · Source: irssi/irssi

We are planning to remove 8-bit and Chinese support from Irssi.

Interaction with legacy IRC channels would still be provided through /recode, as it is currently.

However, Irssi would stop working on non-UTF-8 terminals (or at least appear heavily glitched).

We're especially interested to learn about people who are still using the 8-bit support and why you would not be able to move to Unicode.

(I will moderate this thread and delete comments that I deem not to contribute to the discussion.)


Mentioned so far:

  • Issue #55 incomplete recode support
  • Refusal to use UTF-8 terminal emulator
WIP waiting for feedback

All 17 comments

I think that using only Unicode sounds like a great idea in 2017. 🎉

For the record (while I wouldn't use it), shouldn't it still be possible to use _screen_ to translate to a legacy encoding, even if irssi only implements UTF-8? For the one or two people who have some legacy terminal.

It seems like a good idea. The world seems to be moving to UTF-8 by default/only. Plus, if this lets the codebase be simplified, that's great! Deleted code is debugged code!

If supporting many encodings is harder, this is the correct decision.

Very grievous. I prefer to use irssi with a single-byte locale (KOI8-R), which fits my framebuffer terminal perfectly. Singlebyte encodings have their own advantages such as low memory usage, fast processing and strong classic API (fgetc(), fputc(), printf(),... , "str + N" to move the pointer by N chars,... ).

@saahriktu Nice, glad you've found this issue. Can you provide some more details about your setup?

  1. What distro?
  2. How do you set up a KOI8-R framebuffer terminal? Is it console-setup stuff or some third party software?
  3. Are there any IRC networks that primarily use KOI8-R?
  4. Do you use the "recode" feature?
  5. Please post the output of the 'locale' command

Singlebyte encodings have their own advantages such as low memory usage, fast processing and strong classic API (fgetc(), fputc(), printf(),... , "str + N" to move the pointer by N chars,... ).

This isn't really correct, as we're talking about UTF-8, not UTF-16. UTF-8 is designed to share most of the advantages of single byte encodings and the same API is used. The memory usage and performance differences are negligible in practice, even with weak hardware. Well, uh, make that a 6th question: what sort of hardware?
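As a rough illustration of the "same API" point, here is a minimal C sketch (not Irssi code; the helper names `find_offset` and `after_first_space` are made up for this example). It assumes the input is valid UTF-8. Because no byte of a multibyte UTF-8 sequence falls in the ASCII range, byte-oriented functions such as `strstr()` and `strchr()` can search UTF-8 text without decoding it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: byte offset of needle inside haystack, or -1 if absent.
 * strstr() compares raw bytes, which is enough for valid UTF-8 input. */
static long find_offset(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    const char *hit = strstr(haystack, needle);
    return hit ? (long)(hit - haystack) : -1;
}

/* Hypothetical helper: pointer to the text after the first ASCII space,
 * or NULL if there is none. The space byte (0x20) never occurs inside a
 * multibyte UTF-8 sequence, so strchr() cannot produce a false match. */
static const char *after_first_space(const char *s)
{
    assert(s != NULL);
    const char *sp = strchr(s, ' ');
    return sp ? sp + 1 : NULL;
}
```

For example, with the 12-byte string "irssi " followed by a three-letter Cyrillic word (six bytes in UTF-8), searching for that word with `find_offset()` gives the byte offset 6, the same position `after_first_space()` points to. Offsets are in bytes here, not characters.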

  1. Own distro based on LFS.
  2. $ cat /etc/sysconfig/console
    KEYMAP="ru6"
    FONT="ter-u30b -m koi8-r"
    LOGLEVEL="1"
    UNICODE="0"

$ grep utf8 /etc/inittab
r2::wait:/bin/echo 0 > /sys/module/vt/parameters/default_utf8

$ grep KOI8-R ~/.bashrc
export LANG="ru_RU.KOI8-R"
$ grep -e ^def ~/.screenrc
defutf8 off
defencoding KOI8-R
$ grep koi8 ~/.vimrc
set encoding=koi8-r

  3. I don't know, but irssi also works with BitlBee, and BitlBee can be built with libpurple support. This brings a huge number of protocols to irssi, such as facebook, matrix, mail.ru agent, telegram and vk.com, with KOI8-R support on the BitlBee side.
  4. No. From time to time I go to the Rusnet network, which has a special port for KOI8-R. With BitlBee I have native KOI8-R support on the BitlBee side.
  5. $ locale
    LANG=ru_RU.KOI8-R
    LC_CTYPE="ru_RU.KOI8-R"
    LC_NUMERIC="ru_RU.KOI8-R"
    LC_TIME="ru_RU.KOI8-R"
    LC_COLLATE="ru_RU.KOI8-R"
    LC_MONETARY="ru_RU.KOI8-R"
    LC_MESSAGES="ru_RU.KOI8-R"
    LC_PAPER="ru_RU.KOI8-R"
    LC_NAME="ru_RU.KOI8-R"
    LC_ADDRESS="ru_RU.KOI8-R"
    LC_TELEPHONE="ru_RU.KOI8-R"
    LC_MEASUREMENT="ru_RU.KOI8-R"
    LC_IDENTIFICATION="ru_RU.KOI8-R"
    LC_ALL=
  6. I run the same environment on different machines, from a Raspberry Pi 1 with 512 MB RAM to an i7-2600K with 8 GB RAM.

The evaluation of advantages and flaws may differ from one point of view to another. When text stays mostly in the ASCII range, there is no big difference between KOI8-R and UTF-8. But Cyrillic chars take two bytes each in UTF-8. Therefore, in this case "char *strptr; int N; ... strptr + N" becomes incorrect: moving by N chars needs an extra size calculation for each char.
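The "extra math" mentioned above can be sketched in C as follows. This is only an illustration, not Irssi code: the names `utf8_advance` and `utf8_length` are hypothetical, and the input is assumed to be valid, NUL-terminated UTF-8. The idea is to skip continuation bytes (those of the form 10xxxxxx) instead of adding N to the pointer directly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Nonzero when b is a UTF-8 continuation byte (bit pattern 10xxxxxx). */
static int utf8_is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical replacement for "str + N": advance over n code points.
 * Stops early at the terminating NUL if the string is shorter. */
static const char *utf8_advance(const char *s, size_t n)
{
    assert(s != NULL);
    while (n > 0 && *s != '\0') {
        s++;                                  /* step over the lead byte */
        while (utf8_is_continuation((unsigned char)*s))
            s++;                              /* step over its continuation bytes */
        n--;
    }
    return s;
}

/* Number of code points: every byte that is not a continuation byte
 * starts a new code point. */
static size_t utf8_length(const char *s)
{
    assert(s != NULL);
    size_t count = 0;
    for (; *s != '\0'; s++)
        if (!utf8_is_continuation((unsigned char)*s))
            count++;
    return count;
}
```

With a six-letter Cyrillic word, `strlen()` reports 12 bytes while `utf8_length()` reports 6 code points, and `utf8_advance(s, 2)` lands 4 bytes in. For pure ASCII text the result is the same as plain `str + N`. Note that this counts code points, which is not always the same as on-screen columns.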

Libpurple only deals with UTF-8 internally. You're using a BitlBee feature, the "charset" setting.

Can irssi recode from UTF-8 channels to KOI8-R terminals, and will it keep this ability?
If I set the BitlBee charset to UTF-8, the channel encoding becomes UTF-8, which does not fit a KOI8-R terminal.

can you switch to utf8 terminal? why not? this issue is about removing support for non-utf8-terminal from irssi. You can also check if GNU screen can emulate utf8 inside and KOI8R outside (suggested by joneskoo up there)

can you switch to utf8 terminal? why not?

No. Because I need ability of working with local text files in KOI8-R encoding. The primary area for my arguments is local low-level tasks. The Internet, with its protocols and their clients, comes last for me. Therefore network tasks don't determine the local encoding for me.

By the way, my font has only 256 chars, and I don't want to see squares in place of missing chars.

Because I need ability of working with local text files in KOI8-R encoding.

Why can't you recode those files into utf8 once and forever?

Why can't you recode those files into utf8 once and forever?

This is a system. Although I have no files in KOI8-R, I will make new ones. Ability of working with local text files in KOI8-R is a priceless treasure.

Although I have no files in KOI8-R, I will make new ones.

For what reason?

Currently I'm against dropping non-UTF8 terminal support because of #55. I cannot join non-UTF8 networks because channel names are not currently recoded. For example, irc.juggler.jp uses iso-2022-jp encoding for everything (including the channels), so my terminal has to match that encoding to be able to join any non-ascii channels properly until recode support is expanded (I just opened a bounty for that issue).

For what reason?

Ability of working with local text files in KOI8-R is a priceless treasure.

@bparker06 nice, thanks for the feedback, I totally forgot about that particular use case. I agree that fixing recode is the best solution here. Also thanks for the concrete network example, that helps a lot. This is also the first time I've seen iso-2022-jp used in the wild; I thought it was all shift-jis and euc-jp.

Everyone else: let's not make this thread exclusively about @saahriktu's use case, we'd like to hear from more people.

@dequis Welcome. If you need a generic channel to test on that network, #ロビー (lobby) works. All Japanese IRC networks I know of use the 2022 encoding. Traditionally *nix users would use euc-jp as their terminal encoding while Windows would use CP932 (shift-jis w/ extensions). UTF-8 adoption in Japan is still very slow.

And in the meantime #55 sits there.
