Crystal: incomplete multibyte sequence (ArgumentError)

Created on 10 Sep 2016 · 16 Comments · Source: crystal-lang/crystal

require "base64"
str = Base64.decode_string("w+Xt5fDg9uj/IPHq8Ojv8uAgMC4wMTI2IPHl6vPt5Ao=")
m = MemoryIO.new(str)
m.set_encoding("UTF-8", invalid: :skip)
p m.gets_to_end

This crashes on OS X with Crystal 0.19.1 (2016-09-09), but it works here: https://play.crystal-lang.org/#/r/192u

incomplete multibyte sequence (ArgumentError)
[4424733042] *CallStack::unwind:Array(Pointer(Void)) +82
[4424732945] *CallStack#initialize:Array(Pointer(Void)) +17
[4424732904] *CallStack::new:CallStack +40
[4424705641] *raise<ArgumentError>:NoReturn +25
[4424806739] *IO::Decoder#read<MemoryIO>:(Int32 | Nil) +947
[4424799925] *MemoryIO@IO#gets_to_end:String +133
[4424799677] *MemoryIO#gets_to_end:String +45
[4424684099] __crystal_main +1203
[4424717832] main +40
bug stdlib

All 16 comments

Hmm, it's strange that iconv returns different values depending on the platform...

Who am I kidding? This is probably the norm when it's about supporting multiple platforms :-P

This fix actually adds another bug, but on Linux only: one character is lost at the end of the buffer. I'm trying to reduce the example.

https://play.crystal-lang.org/#/r/19b3
On master this gives 4275.

дарил им игрушки vs дари им игрушки (the "л" is missing in the second string)

I don't know how to reduce it further, but the problem occurs around the position equal to the buffer size.

If I just remove some leading characters from str, the problem simply moves so that it stays at the same position.

If I change the line to this, it is fixed:

  when Errno::EILSEQ # , Errno::EINVAL

The bug also occurs on OS X.

Is there any way you can reduce it? With 4275 bytes I have no idea what I have to test...

I'll try.

require "base64"
str = Base64.decode_string "77u/CgkJCgkJItCw0LHQstCz0LQi"
m = MemoryIO.new(str)
m.set_encoding("UTF-8", invalid: :skip)
res = m.gets_to_end
puts res.bytesize

You also need to change the buffer size:

  # :nodoc:
  class Decoder
    BUFFER_SIZE     = 15
    OUT_BUFFER_SIZE = 15

output is: 19

If I change it back to:

  when Errno::EILSEQ # , Errno::EINVAL

output is: 21
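
For reference, a small sketch of why a 15-byte buffer matters for this input. It only inspects the bytes and does not call the decoder; the variable names are illustrative. POSIX iconv reports an incomplete multibyte sequence at the end of its input as EINVAL and an invalid sequence as EILSEQ, which is why the two errno values in the line above behave differently here. That the split character is the one being dropped is an inference from the outputs above, not something stated in the thread.

```crystal
require "base64"

# Same payload as the reproduction above.
bytes = Base64.decode("77u/CgkJCgkJItCw0LHQstCz0LQi")
buffer_size = 15 # assumed to match the patched BUFFER_SIZE

puts bytes.size # 21

# The first chunk a 15-byte input buffer would hold.
first_chunk = bytes[0, buffer_size]

# Its last byte is the lead byte of a two-byte UTF-8 character,
# so the chunk on its own ends in an incomplete multibyte sequence.
puts first_chunk[buffer_size - 1].to_s(16)   # d0
puts String.new(first_chunk).valid_encoding? # false

# The whole input is valid UTF-8, so nothing should be skipped.
puts String.new(bytes).valid_encoding? # true
```

The difference between the two outputs above (21 vs 19 bytes) is exactly the size of one two-byte character.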

By the way, why can't I change constants like in Ruby?

module IO
  # :nodoc:
  class Decoder
    BUFFER_SIZE     = 15
    OUT_BUFFER_SIZE = 15
  end
end

change constants

Isn't that kind of an oxymoron? I mean, you're _changing_ a _constant_...

Ruby allows that, and it's nice when you need to hack some code like this.

Dynamically changing and overriding, two different things.

@kostya I tried to understand what was wrong with the current code and _I think_ this time I got it right. But of course if you find more incorrect cases let me know. At least now there are two specs with the cases that failed with the old fix and without the old fix :-)

Hm, there is still a bug :), but a different one. Trying to reproduce it.

Now the bug occurs with a content size of 2 * buffer size:
https://play.crystal-lang.org/#/r/19m5
On master this returns 8974.

Small example:

require "base64"
str = Base64.decode_string "77u/CgkJCgkJItCw0LHQstCz0LQi77u/CgkJCgkJItCw0LHQstCz0LQi"
m = MemoryIO.new(str)
m.set_encoding("UTF-8", invalid: :skip)
res = m.gets_to_end
p res
puts res.bytesize

  # :nodoc:
  class Decoder
    BUFFER_SIZE     = 15
    OUT_BUFFER_SIZE = 15

crystal 10.cr
"\n\t\t\n\t\t\"абвгд\"\n\t\t\n\t\t\"абвгд\""
42

./bin/crystal 10.cr
"\n\t\t\n\t\t\"абгд\"\n\t\t\n\t\t\"абвгд\""
37

@kostya I think I found the bug in my previous commit. Can you try again? :-)

Thanks, it all seems fixed.
