:unicode.characters_to_binary, which String.from_char_list uses, accepts many more types of input than String.from_char_list declares in its spec. Some libraries, like https://github.com/meh/jazz/, already rely on this implicit functionality, which is fine unless one wants to use dialyzer, as it will complain about it.
Would it be an idea to change String.from_char_list/1 (and String.from_char_list!/1) into String.from_characters/2 in order to explicitly allow other types of input? The extra argument would be needed for setting the encoding of the input, since it would also allow a list of binaries.
If the answer is yes, I would be happy to provide a PR for this.
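To illustrate the gap described above, here is a minimal sketch of inputs that :unicode.characters_to_binary accepts beyond a flat list of code points, including the optional second argument that names the encoding of the input. The variable names are hypothetical and chosen only for this example.

```elixir
# A flat list of code points: what the from_char_list spec describes.
flat = [?h, ?i]

# A nested list mixing code points and UTF-8 binaries: also accepted.
mixed = [?h, "el", [?l, ?o]]

flat_result = :unicode.characters_to_binary(flat)
mixed_result = :unicode.characters_to_binary(mixed)

# The second argument is the encoding of the *input*. Here the binary
# holds the single Latin-1 byte 233 (é), which is converted to UTF-8.
latin1_input = [<<233>>]
latin1_result = :unicode.characters_to_binary(latin1_input, :latin1)

# With the default (UTF-8) input encoding the same byte is not a complete
# character, so a tuple is returned instead of a binary.
default_result = :unicode.characters_to_binary(latin1_input)

IO.inspect({flat_result, mixed_result, latin1_result, default_result})
```

The last call shows why an encoding argument would matter once binaries are allowed as input: the same bytes are valid under one input encoding and rejected under another.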
We have two options: add a guard or generalize the input. I am more inclined to the former, as someone can always fall back to the unicode one. The question is how much havoc it would cause. Can you please give it a try in Elixir itself?
I did a quick make test with the following implementation:

    @spec from_char_list(char_list) :: { :ok, String.t } | { :error, :badarg } | { :error, binary, binary } | { :incomplete, binary, binary }
    def from_char_list(list) when is_list(list) do
      # Only accept flat lists of integers; reject io lists and other input.
      if char_list?(list) do
        case :unicode.characters_to_binary(list) do
          result when is_binary(result) ->
            { :ok, result }
          { :error, _, _ } = error ->
            error
          { :incomplete, _, _ } = incomplete ->
            incomplete
        end
      else
        { :error, :badarg }
      end
    end

    @spec from_char_list!(char_list) :: String.t | no_return
    def from_char_list!(list) when is_list(list) do
      # inspect/1 returns a string; IO.inspect/1 would print and return the list.
      unless char_list?(list), do: raise(ArgumentError, message: inspect(list))
      case :unicode.characters_to_binary(list) do
        result when is_binary(result) ->
          result
        { :error, encoded, rest } ->
          raise UnicodeConversionError, encoded: encoded, rest: rest, kind: :invalid
        { :incomplete, encoded, rest } ->
          raise UnicodeConversionError, encoded: encoded, rest: rest, kind: :incomplete
      end
    end

    # True only when every element is an integer (no nested lists or binaries).
    defp char_list?(list) do
      Enum.all?(list, &is_integer/1)
    end

Unfortunately, that caused quite a lot of havoc, to the point that ExUnit itself crashes and isn't able to finish all tests. Without having looked into it, it seems there are quite a few places where the data type is not actually a real char_list.
Ahhh, I see what you mean. I thought we were passing non-list inputs, but that's not the issue: we are passing io lists.
I need to think about this more because I don't like "from_characters". I also need to check other places where we may accept similar input and act accordingly.
(Sent by email in reply to Vincent Siliakus's comment above, dated Thursday, February 20, 2014.)
_José Valim, www.plataformatec.com.br, Founder and Lead Developer_
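The io list problem identified in this reply can be reproduced with a small sketch. The module name CharListCheck and the function flat_char_list?/1 are hypothetical; the check itself mirrors the char_list?/1 helper from the experiment above.

```elixir
defmodule CharListCheck do
  # Mirrors the experimental helper: true only for a flat list of integers.
  def flat_char_list?(list) when is_list(list) do
    Enum.all?(list, &is_integer/1)
  end
end

# A plain char list passes the check.
plain = [?f, ?o, ?o]

# An io list nests lists and binaries, so the flat check rejects it...
io_list = ["foo", ?\s, [?b, ?a, ?r]]

plain_ok = CharListCheck.flat_char_list?(plain)
io_list_ok = CharListCheck.flat_char_list?(io_list)

# ...even though :unicode.characters_to_binary converts it without trouble.
converted = :unicode.characters_to_binary(io_list)

IO.inspect({plain_ok, io_list_ok, converted})
```

Any caller that built its argument by nesting strings and lists would start getting { :error, :badarg } under the stricter check, which is consistent with the widespread test failures reported above.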
@zambal I would like to introduce the concept of char_data, which is basically a binary or a list made of character data. The idea is to contrast it with iodata, which is also made of binaries and lists, but does not represent characters.
Note to self: we should also rename iolist_to_binary/1 and iolist_size/1 to iodata_to_binary/1 and iodata_size/1.
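The contrast between char_data and iodata can be sketched as follows. This assumes a current Elixir release, where the two concepts are exposed as IO.chardata_to_string/1 and IO.iodata_to_binary/1; those exact names are not taken from this thread.

```elixir
# The same list, read two different ways.
data = [?c, ?a, ?f, 233]

# As character data, each integer is a Unicode code point, so 233 (é)
# is encoded as two UTF-8 bytes.
as_chars = IO.chardata_to_string(data)

# As iodata, each integer is a raw byte, so 233 stays a single byte.
as_bytes = IO.iodata_to_binary(data)

IO.inspect(byte_size(as_chars))
IO.inspect(byte_size(as_bytes))

# The raw-byte result is not valid UTF-8, which is why the two notions
# need separate names and separate conversion functions.
IO.inspect(String.valid?(as_bytes))
```

For pure ASCII input the two readings produce the same binary, which may explain why the difference went unnoticed at many call sites.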
Fixed on master.
Just for the deprecation record: String.from_char_list/1 was replaced by String.from_char_data/1, which was later replaced by List.to_string/1.
This is for anybody coming here looking for deprecation replacements
after seeing the following warnings:
warning: String.from_char_list!/1 is deprecated, please use String.from_char_data!/1 instead
and
warning: String.from_char_data/1 is deprecated, use List.to_string instead
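As a rough migration sketch, a call that used String.from_char_list!/1 or String.from_char_data!/1 can be written with List.to_string/1. The variable names are hypothetical.

```elixir
# Old: String.from_char_list!(chars) or String.from_char_data!(chars)
chars = [?a, ?b, ?c]
from_flat = List.to_string(chars)

# List.to_string/1 also accepts nested lists and binaries.
from_mixed = List.to_string([?a, "bc", [?d]])

IO.inspect({from_flat, from_mixed})
```

Code that relied on the tuple-returning variant (String.from_char_list/1) and matched on { :ok, string } would need a different approach, for example calling :unicode.characters_to_binary directly and matching on its result.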