From the conversation https://gitter.im/crystal-lang/crystal?at=5d07a49b2313502d385b7a3c
https://play.crystal-lang.org/#/r/730d
require "uuid"
def convert(value : ::UUID)
value.to_slice
end
uuid = UUID.new "24b408d0-5f08-43d3-9027-c0d4ec097e3a"
uuid.to_slice # => Bytes[36, 180, 8, 208, 95, 8, 67, 211, 144, 39, 192, 212, 236, 9, 126, 58]
convert uuid # => Bytes[36, 180, 8, 208, 95, 8, 67, 211, 0, 0, 0, 0, 16, 0, 0, 0]
I can't imagine this being the desired behavior; but if it is, it should probably be documented somewhere.
oof
Isolated behavior is moving a stack allocated StaticArray
, causing it to point to invalid memory when accessing its data.
def convert(v : StaticArray(UInt8, 16))
v.to_slice
end
a = StaticArray(UInt8, 16).new(0)
b = convert(a)
p!(a)
p!(b)
a # => StaticArray[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
b # => Bytes[22, 192, 119, 204, 115, 85, 0, 0, 136, 166, 125, 204, 115, 85, 0, 0]
This can be fixed by changing @bytes
to Slice(UInt8)
/Bytes
instead of a stack allocated StaticArray
, or making UUID
a class
. Don't know if that's appropriate here. Maybe this is something wrong with StaticArray
, but I don't know its implementation details.
If this is important for you, you can just copy/paste the UUID implementation as your own (its a single file) and make either change I mentioned above.
Yup, StaticArray#to_slice
can't be returned from a method... or well, it can with undefined behavior.
Some questions and thoughs:
UUID
?UUID
should be implemented with a backing read-only Slice(UInt8)
. Then that would be heap-allocated and it would be fine (though it wouldn't be that efficient)to_slice
can just return a copy of the static array... it's less efficient, but it works and there's really no other way around itThere was some talk about efficiency during implementation. Before UUID was added to the stdlib I was creating them and mutating them but I am not sure that should be a feature in the stdlib implementation of UUID.
Slice
instances must point to heap memory not stack memory or they're unsafe. So I'd say StaticArray#to_slice
must copy.
It'd be useful to audit users of that method and see what they need to convert to a slice for. If it's just slice's extra API then StaticArray has an API problem.
The to_slice method is used all over the place in the internals of the stdlib: allocate a stack buffer, read to it (via that method), etc. It's just that it's incorrect to return that slice from a method.
@asterite Maybe it should be named #to_unsafe
then?
Maybe.
#to_unsafe
typically returns a pointer. #to_slice
should copy, #to_unsafe_slice
could create a slice of the stack.
Actually, let's fix UUID. Let's document to_slice and move on. This problem is not common at all. It's not common to implement a data structure banked by a static array and wanting to return a slice or pointer to it (I still don't know what's the use case for Uuid is)
Unsafe methods should be obviously unsafe. Segfaults are surprising behaviour. Compromising memory safety is to me not an option.
StaticArray is kind of low level and unsafe.
I agree with @RX14 on this, however I'm not sure about doing a copy with to_slice
.
Usually when you call to_slice
you sort of know it's an inexpensive call, and doing an allocation there would not be "inexpensive"..
I would prefer to force the user to be explicit about the copy with the need to call .to_unsafe_slice.dup
.
@asterite the UUID type is meant to create/read a UUID of varying flavors (random, md5, sha1, ...), to query it (there are many variants and versions), all that encoded as a 16-bytes / 128-bit struct, instead of having an external 32-bytes String allocation. UUIDs can also be represented in different formats (bytes, string with or without dashes, urn, ...). Having a type that can generate, parse, validate, format and compare all these forms transparently is a great addition to the stdlib.
@ysbaddaden I understand this. I think avoiding memory allocation is good.
My thoughts: Let's remove to_slice
. Let's instead introduce a method to return a StaticArray
if needed. Then the user can call that method (gets a copy), call to_slice
on it and then it can be passed to an IO. Then there's no problem.
We should document that to_slice
on StaticArray
returns a Slice
that points to the underlying StaticArray
and that one shouldn't return that from methods, and show an example of how to use that when reading/writing IO.
The thing is, let's not change the API every time someones stumbles upon a problem.
Specially if the problem is on our side in the std and it's easy to fix.
Most helpful comment
@asterite Maybe it should be named
#to_unsafe
then?