Crystal: Why do Crystal Strings use iconv?

Created on 20 Nov 2019 · 7 Comments · Source: crystal-lang/crystal

I was reading "Coordinate porting to Windows" and saw a comment about porting iconv. I see we are using iconv, and I was wondering why. It seems like Ruby no longer uses iconv. Are there benefits to using iconv?

Here is a link to the history of iconv in Ruby:
https://medium.com/@farsi_mehdi/the-evolution-of-ruby-strings-from-1-8-to-2-5-8b2ed8f39fad

All 7 comments

So, iconv is used behind the scenes when you convert data from one encoding to another, for example from LATIN-1 to UTF-8 or from UTF-16 to UTF-8. For instance, if you do an HTTP::Client.get to a page that uses an encoding other than UTF-8, the body is automatically converted to UTF-8.
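To make the conversion step concrete, here is a minimal sketch in Python (not Crystal, and not how Crystal or iconv implement it). The helper name `to_utf8` is hypothetical; it only illustrates the "decode from the source encoding, re-encode as UTF-8" idea described above.

```python
def to_utf8(body: bytes, source_encoding: str) -> bytes:
    """Decode bytes from source_encoding and re-encode them as UTF-8."""
    text = body.decode(source_encoding)
    return text.encode("utf-8")


# A LATIN-1 response body containing "café": the "é" is the single byte 0xE9.
latin1_body = b"caf\xe9"

# After conversion, "é" becomes the two-byte UTF-8 sequence 0xC3 0xA9.
utf8_body = to_utf8(latin1_body, "latin-1")
```

An HTTP client doing this automatically would presumably take the source encoding from the response's declared charset, but that part is outside this sketch.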

I think Ruby stopped using iconv and now uses Onigmo's encoding tables for that, or maybe something else. Either way, we need the encoding logic to be there, be it iconv or something else.

Are you suggesting removing iconv or encoding things in general?

Are there benefits to using iconv?

The benefit of iconv is that it has been conveniently available on all supported platforms so far. There seem to be some issues with musl-libc, but in general it has worked pretty well without much effort on our part.

Unfortunately, we won't be able to use iconv as easily on Windows. And it probably won't work for other future targets such as WASM.

Moving away from iconv would be great, but the question is what to use instead.

Ruby seems to maintain its own internal encoding conversions in https://github.com/ruby/ruby/tree/master/enc

Having a native Crystal replacement would be great. But that's going to be a tremendous effort, and is probably not feasible in the near future.

Having a native Crystal replacement would be great. But that's going to be a tremendous effort, and is probably not feasible in the near future.

I also can't imagine compiling all of that every time you compile a "hello world" program.

Doing it in C and linking is an option, though.

Maybe we can cheat and have a script that steals the tables from some other project (possibly iconv or Ruby) and automatically generates Crystal source files that provide them to our own implementation. Then we can run that script on each release of the library we steal the tables from.

But I agree this is not a pressing issue and is more of a mid- to long-term goal.
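As a rough illustration of the table-generation idea, the Python sketch below derives a byte-to-code-point table for a single-byte encoding and renders it as source text. Everything here is an assumption for illustration: the function names `build_table` and `render_table`, the constant name, and the output format are hypothetical, and the tables come from Python's bundled codecs rather than from iconv or Ruby.

```python
def build_table(encoding: str) -> list:
    """Map each byte 0x00-0xFF to a Unicode code point (U+FFFD if unmapped).

    Only meaningful for single-byte encodings, where every byte decodes
    to exactly one character.
    """
    table = []
    for byte in range(256):
        char = bytes([byte]).decode(encoding, errors="replace")
        table.append(ord(char))
    return table


def render_table(name: str, table: list) -> str:
    """Render the table as an array literal, eight entries per line."""
    rows = []
    for start in range(0, len(table), 8):
        row = ", ".join(f"0x{cp:04X}" for cp in table[start:start + 8])
        rows.append(f"  {row},")
    return f"{name} = [\n" + "\n".join(rows) + "\n]\n"


# Generated text that a build step could write out to a source file.
source = render_table("ISO_8859_2_TO_UNICODE", build_table("iso8859-2"))
```

Rerunning such a script against each upstream release would regenerate the files. Multi-byte encodings would need a different table shape than the flat 256-entry list used here.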

I didn't realize Crystal made their own. I was just reading the story and trying to figure out our reasoning. It would be awesome to have our own, but that seems like something to do in the future.

@wontruefree, maybe you should reopen the issue so that the discussion can continue?

We could use WideCharToMultiByte on Windows, though this means that character set conversion would have to go via UTF-16. For an initial implementation (with no additional dependencies), this performance tradeoff seems well worth it.
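The "via UTF-16" shape can be sketched as a two-step conversion. This is a Python illustration using Python's own codecs, not the Win32 API; the function name `convert_via_utf16` is hypothetical. On Windows, the first step would presumably correspond to MultiByteToWideChar and the second to WideCharToMultiByte.

```python
def convert_via_utf16(data: bytes, source: str, target: str) -> bytes:
    """Convert between two encodings using UTF-16 as the intermediate form."""
    # Step 1: source encoding -> UTF-16 (the pivot representation).
    utf16 = data.decode(source).encode("utf-16-le")
    # Step 2: UTF-16 pivot -> target encoding.
    return utf16.decode("utf-16-le").encode(target)


# LATIN-1 "café" to UTF-8, passing through UTF-16 on the way.
converted = convert_via_utf16(b"caf\xe9", "latin-1", "utf-8")
```

The extra pass through the pivot is the performance cost mentioned above: every conversion does two transcoding steps, even when a direct table between the two encodings could do it in one.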
