Erlang/OTP 20 [erts-9.2] [source] [64-bit] [smp:8:8] [ds:8:8:10] [async-threads:10] [hipe] [kernel-poll:false]
Elixir 1.6.0 (compiled with OTP 20)
iex(10)> "\"\\x\"" |> Code.string_to_quoted
** (ArgumentError) missing hex sequence after \x, expected \xHH
(elixir) src/elixir_interpolation.erl:155: :elixir_interpolation.unescape_hex/3
(elixir) src/elixir_interpolation.erl:81: :elixir_interpolation."-unescape_tokens/2-lc$^0/1-0-"/2
(elixir) src/elixir_tokenizer.erl:580: :elixir_tokenizer.handle_strings/6
(elixir) lib/code.ex:568: Code.string_to_quoted/2
Other syntax errors return {:error, _} instead of raising.
Very similarly:
** (ArgumentError) invalid Unicode sequence after \u, expected \uHHHH or \u{H*}
(elixir) src/elixir_interpolation.erl:182: :elixir_interpolation.unescape_unicode/3
(elixir) src/elixir_interpolation.erl:81: :elixir_interpolation."-unescape_tokens/2-lc$^0/1-0-"/2
(elixir) src/elixir_tokenizer.erl:580: :elixir_tokenizer.handle_strings/6
(elixir) lib/code.ex:568: Code.string_to_quoted/2
I'm fuzzing code with StreamData and rescued those two cases. It looks like there is nothing more coming.
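For anyone reproducing this, rescuing these cases can be sketched as a small wrapper that turns a raise into the same {:error, _} shape that other syntax errors produce. `SafeParse`, `parse/1` and the `:raised` tag are hypothetical names for illustration, not part of Elixir.

```elixir
defmodule SafeParse do
  # Hypothetical helper (not part of Elixir): normalizes any exception
  # raised by the tokenizer/parser into an {:error, _} tuple.
  def parse(source) when is_binary(source) do
    Code.string_to_quoted(source)
  rescue
    exception -> {:error, {:raised, exception}}
  end
end
```

With this, a fuzzing property only has to match on `{:ok, _}` or `{:error, _}`, whether or not the underlying call raised.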
Here is another one:
iex(23)> ".\"\#{}\"" |> Code.string_to_quoted
** (CaseClauseError) no case clause matching: {1, 7, [{{1, 3, 1}, []}], []}
(elixir) src/elixir_tokenizer.erl:636: :elixir_tokenizer.handle_dot/6
(elixir) lib/code.ex:568: Code.string_to_quoted/2
I have seen at least two more now (it's a bit tricky because I have to reduce by hand). When I find one, I filter it out by the top stacktrace function (to protect my own code against failures when fuzzing).
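A rough sketch of that kind of filter, assuming the crash sites seen so far are kept as {module, function} pairs. `FuzzFilter`, `classify/1` and the contents of `@known` are made up for this example; `__STACKTRACE__` requires Elixir 1.7 or later, so on 1.6 `System.stacktrace()` would be needed instead.

```elixir
defmodule FuzzFilter do
  # Hypothetical list of already-reported crash sites, as {module, function}.
  @known [
    {:elixir_interpolation, :unescape_hex},
    {:elixir_interpolation, :unescape_unicode},
    {:elixir_tokenizer, :handle_dot}
  ]

  # Runs fun and returns :ok, :known_crash, or {:new_crash, top_frame}.
  def classify(fun) when is_function(fun, 0) do
    fun.()
    :ok
  rescue
    _exception ->
      # Only the topmost stack frame is used to identify the crash.
      top =
        case __STACKTRACE__ do
          [{mod, name, _arity_or_args, _location} | _] -> {mod, name}
          _ -> :unknown
        end

      if top in @known, do: :known_crash, else: {:new_crash, top}
  end
end
```

A property would then treat `:known_crash` like a pass and fail only on `{:new_crash, _}`, so already-reported bugs stop masking new ones.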
Shall I continue posting examples?
@schnittchen yes, please post here, we will organize it somehow later. :)
I seriously bumped up the max_runs and the test timeout and only once found a problem.
Here's the stacktrace:
** (SystemLimitError) a system limit has been reached
code: check all snippet <- string(:ascii), max_runs: 80000 do
stacktrace:
:erlang.binary_to_atom("@GR{+z]`_XrNla!d<GTZ]iw[s'l2N<5hGD0(.xh&}>0ptDp(amr.oS&<q(FA)5T3=},^{=JnwIOE*DPOslKV KF-kb7NF&Y#Lp3D7l/!s],^hnz1iB |E8~Y'-Rp&*E(O}|zoB#xsE.S/~~'=%H'2HOZu0PCfz6j=eHq5:yk{7&|}zeRONM+KWBCAUKWFw(tv9vkHTu#Ek$&]Q:~>,UbT}v$L|rHHXGV{;W!>avHbD[T-G5xrzR6m?rQPot-37B@", :utf8)
(elixir) src/elixir_parser.yrl:876: :elixir_parser.build_quoted_atom/3
(elixir) src/elixir_parser.yrl:271: :elixir_parser.yeccpars2_50/7
(elixir) /usr/lib/erlang/lib/parsetools-2.1.6/include/yeccpre.hrl:57: :elixir_parser.yeccpars0/5
(elixir) src/elixir.erl:284: :elixir.tokens_to_quoted/3
However,
iex(57)> ":" <> String.duplicate("foo", 100) |> Code.string_to_quoted
{:error,
{1, "atom length must be less than system limit: ",
":foofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoofoo"}}
The other case that I thought I had seen must have come either from a copy-paste error or from the fact that
string |> inspect |> Code.string_to_quoted! == string
is not always true.
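One way the round trip can fail (not necessarily the case hit above) is a binary that is not valid UTF-8: `inspect/1` prints it with bitstring syntax, which parses to an AST tuple rather than to the original binary. `RoundTrip` and `holds?/1` are hypothetical names.

```elixir
defmodule RoundTrip do
  # Hypothetical helper: checks string |> inspect |> Code.string_to_quoted!
  def holds?(string) when is_binary(string) do
    Code.string_to_quoted!(inspect(string)) == string
  end
end

# Printable string: inspect/1 emits a string literal, so the round trip holds.
RoundTrip.holds?("abc")
# → true

# Invalid UTF-8: inspect/1 emits "<<255>>", which parses to an AST tuple.
RoundTrip.holds?(<<255>>)
# → false
```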
@josevalim I looked into this, and at least one error comes from the fact that we raise directly in elixir_interpolation:unescape_tokens and unescape_chars. The tokenizer returns errors as tuples, yet it calls these functions, which raise. We also call them in some other places (Kernel and Macro), and I am not sure whether we should stop raising there. Any suggestions on the approach?
Yeah, we will have to make the Erlang code not raise and move the error
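To illustrate the general shape of that idea (this is not the actual Elixir implementation or the eventual fix), here is a toy Elixir sketch: the core function returns tagged tuples, and a separate bang function raises for callers that still want an exception. `UnescapeSketch` and all of its functions are hypothetical, and it only understands \xHH.

```elixir
defmodule UnescapeSketch do
  # Hypothetical, simplified stand-in for the unescaping code.
  defguardp is_hex(c) when c in ?0..?9 or c in ?a..?f or c in ?A..?F

  # Non-raising core: returns {:ok, binary} or {:error, message}.
  def unescape(binary) when is_binary(binary), do: do_unescape(binary, <<>>)

  # Raising variant for callers that prefer an exception.
  def unescape!(binary) do
    case unescape(binary) do
      {:ok, result} -> result
      {:error, message} -> raise ArgumentError, message
    end
  end

  defp do_unescape(<<>>, acc), do: {:ok, acc}

  # \xHH with two hex digits becomes the corresponding byte.
  defp do_unescape(<<?\\, ?x, a, b, rest::binary>>, acc)
       when is_hex(a) and is_hex(b) do
    do_unescape(rest, <<acc::binary, List.to_integer([a, b], 16)>>)
  end

  # \x without two hex digits is reported as a value, not raised.
  defp do_unescape(<<?\\, ?x, _::binary>>, _acc) do
    {:error, "missing hex sequence after \\x, expected \\xHH"}
  end

  # Any other byte is copied through unchanged.
  defp do_unescape(<<char, rest::binary>>, acc) do
    do_unescape(rest, <<acc::binary, char>>)
  end
end
```

In this sketch a tokenizer-like caller would use `unescape/1` and pass the tuple along, while other call sites could keep raising via `unescape!/1`.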
Is anyone doing it? I can do it if not.
@kelvinst please go ahead!
@kelvinst ping :)
Hello @kelvinst @schnittchen @whatyouhide
I am just following this issue and am not sure how much I can help, but this is what I understand.
The problem shows up when I run "\x" |> Code.string_to_quoted in iex:
iex(19)> "\x" |> Code.string_to_quoted
** (ArgumentError) missing hex sequence after \x, expected \xHH
By contrast, "\d" |> Code.string_to_quoted returns a proper error tuple:
iex(19)> "\d" |> Code.string_to_quoted
{:error, {1, "unexpected token: ", "\"\d\" (column 1, codepoint U+007F)"}}
So a change would need to be made in the file
lib/elixir/src/elixir_interpolation.erl
in a case statement like this one:
unescape_chars(<<$\\, $x, Rest/binary>>, Map, Acc) ->
  case Map(hex) of
    true -> unescape_hex(Rest, Map, Acc);
    false -> unescape_chars(Rest, Map, <<Acc/binary, $\\, $x>>)
  end;
Any thoughts or feedback?
The "\x" |> Code.string_to_quoted error is actually not happening inside the string_to_quoted call: the literal "\x" itself fails first, since it is not valid string syntax.
Sorry guys! I haven't had time to work on that yet; I have only started looking at the code and how it could be done. If anyone has an idea and wants to try it out, just go ahead! :)
Greetings everyone! I've been taking a look at this issue and mainly been focusing on the Hex issues as of now. I believe I've gotten that part squared away (#7809). If that goes in, I can do more follow-ups on the other issues with unicode and dot (hopefully)! 👍
All of those cases have been tackled! Thank you @schnittchen and @drincruz!