struct Test
def initialize()
@test = StaticArray(UInt32, 4)
p pointerof(@test)
end
def array_ptr()
p pointerof(@test)
end
end
test = Test.new
test.array_ptr
results in
Pointer(StaticArray(UInt32, 4):Class)@0xbfa6b750
Pointer(StaticArray(UInt32, 4):Class)@0xbfa6b8c8
From what I can see from the LL dump, this is a result of the struct being constructed in one memory location, and then relocated elsewhere. My current workaround for this is calling a function after the initialiser, which works as expected.
Fixing this may require changes in how variables are constructed, and thus may not be feasible. I suggest that a warning be emitted for the use of pointerof on an instance variable in a constructor; this would expose any latent issues.
This also happens with self (just got bit by this bug):
class Foo
def initialize
self2 = self
puts "pointerof(self) = #{pointerof(self2)}"
end
end
foo = Foo.new
puts "pointerof(foo) = #{pointerof(foo)}"
prints:
pointerof(self) = Pointer(Foo)@0x7fff5d3114c8
pointerof(foo) = Pointer(Foo)@0x7fff5d311748
…14c8 ≠…1748.
The first might be a bug, but it would be interesting to know your use case. Since a struct is passed by value, I wouldn't rely on pointers to it, because they can quickly become invalid.
The second case is not a bug: self and self2 are two different variables, their memory address is different.
My actual code looks something like
struct Test
def initialize()
@test = StaticArray(UInt32, 4)
external_fn_taking_pointer(@test.to_unsafe)
end
end
struct Test2
def initialize()
@test = Test.new
end
end
test2 = Test2.new
which I would expect to produce a consistent memory address (as the array is a member variable of another member variable)
For comparison, the equivalent C++ works as expected:
#include <cstdio>
#include <cstdint>
struct Test
{
uint32_t test[4];
Test()
{
printf("%x\n", test);
}
};
struct Test2
{
Test test;
};
int main()
{
Test2 test;
printf("%x\n", test.test.test);
return 0;
}
results in
bfab0e40
Sounds a bit strange that structures of any kind should be "inited in one place" and then copied - if that happens for every type instantiation it would be a performance drag.
As of 0.24.2 this use-case is not valid. We now enforce the type of the instance variable when initializing.
Error in test.cr:12: instantiating 'Test:Class#new()'
test = Test.new
^~~
in test.cr:3: Can't infer the type of instance variable '@test' of Test
The type of a instance variable, if not declared explicitly with
`@test : Type`, is inferred from assignments to it across
the whole program.
The assignments must look like this:
1. `@test = 1` (or other literals), inferred to the literal's type
2. `@test = Type.new`, type is inferred to be Type
3. `@test = Type.method`, where `method` has a return type
annotation, type is inferred from it
4. `@test = arg`, with 'arg' being a method argument with a
type restriction 'Type', type is inferred to be Type
5. `@test = arg`, with 'arg' being a method argument with a
default value, type is inferred using rules 1, 2 and 3 from it
6. `@test = uninitialized Type`, type is inferred to be Type
7. `@test = LibSome.func`, and `LibSome` is a `lib`, type
is inferred from that fun.
8. `LibSome.func(out @test)`, and `LibSome` is a `lib`, type
is inferred from that fun argument.
Other assignments have no effect on its type.
Can't infer the type of instance variable '@test' of Test
@test = StaticArray(UInt32, 4)
^~~~~
@sdogruyol Issue is still valid, see code below:
struct Test
def initialize
@test = StaticArray(UInt32, 4).new(10_u32)
p pointerof(@test)
end
def array_ptr
p pointerof(@test)
end
end
test = Test.new
test.array_ptr
I would change this from bug to feature.
agreed, this isn't entirely unexpected given struct semantics
I don't understand why copy is required at returning from #new. I think it should be able to handle #new like following on generating code because #initialize guarantees that all instance variables are set in some way.
struct A
@foo : Int32
@bar : Pointer(Int32)
def initialize(@foo)
@bar = pointerof(@foo)
end
end
a = uninitialized A
a.initialize(100)
Which extra process is there before or after calling #initialize?
Which kind of #initialize process does need a copy?
Is it just because there is =?
I would like to have an option to disallow any copy operation to a defined struct, like C++ can do this with putting the specific constructor declaration into private.
I (and some people) expect that struct is fixed location and fixed duration of lifetime. This semantic is useful to generate efficient code and sometimes required to call underlying C library, without loosing the experience of the user of wrapped library. For example, considering that the author of C library suggests to their function foo to pass local resource:
void user_func(...)
{
struct X data;
Y *ptr;
ptr = foo(&data, ...);
/* play with `ptr`, but actually read or write to `data` */
bar(ptr, ...);
baz(ptr, ...);
}
Although the content of data is public, no C programmer wants to copy data because it does not make sense.
To implement this in Crystal we need to write:
lib LibFoo
struct X
# omit
end
alias Y = Void
fun foo(X*, ...) : Y*
fun bar(Y*, ...) : Void
fun baz(Y*, ...) : Void
end
def user_func(...)
data = uninitialized LibFoo::X
ptr = LibFoo.foo(pointerof(data), ...)
# play with `ptr`
LibFoo.bar(ptr, ...)
LibFoo.baz(ptr, ...)
# Is it guaranteed to `data` exists until here?
end
Developers who use Crystal consider this is unsafe and unwant to do. So I have to put this safe without loosing flexibility on calling bar, baz and so on.
pointerof to use a C libraryBefore putting data into a struct, considering this:
struct Y
@ptr : Pointer(LibFoo::Y)
def initialize(@ptr)
end
def bar(...)
LibFoo.bar(@ptr, ...)
end
def baz(...)
LibFoo.baz(@ptr, ...)
end
end
def bind_to_Y(..., &)
data = uninitialized LibFoo::X
ptr = Y.new(LibFoo.foo(pointerof(data), ...)
ret = yield ptr
data
ret
end
# with this, a user can:
def user_func(...)
bind_to_Y(...) do |y|
# play with `y`.
y.bar(...)
y.baz(...)
end
end
This looks safe but actually not safe.
|Operation|Safety|
|-------|------|
|Passing to an argument of another method|Safe if the calling method does not do any of followings|
|Copying to another local variable in the block|Safe|
|Passing to nested block|Unsafe unless executed inside this "scope"|
|Copying to a captured variable|Can be Unsafe|
|Returning y from block|Can be Unsafe|
|Passing to raise|Unsafe (because one can implement this with longjmp)|
|...|...|
Moreover, we have to nest to use multiple objects at once. This is inconvenient.
def user_func(...)
bind_to_Y(...) do |y|
bind_to_Y(...) do |z|
# play with `y` and `z`, `y` gets copied (right?).
y.bar(...)
z.baz(...)
y.baz(...)
end
end
end
data to an instance variable of struct to manage in pairBecause data and ptr is paired together, I may put it to a single struct:
struct Y
@data : LibFoo::X
@ptr : Pointer(LibFoo::Y)
@refdata : ...
def initialize(@refdata)
@data = uninitialized LibFoo::X
@ptr = LibFoo.foo(pointerof(@data), @refdata)
end
def bar(...)
LibFoo.bar(@ptr, ...)
end
def baz(...)
LibFoo.baz(@ptr, ...)
end
end
# with this, a user can:
def user_func(...)
y = Y.new(...)
# play with `y`.
y.bar(...)
y.baz(...)
end
This is not work as expected because it is copied when return from #new.
This is the best semantic of providing the experience of LibFoo, however, it is still unsafe:
|Operation|Safety|
|-------|------|
|Passing to an argument of another method|Safe but the passed value still points the original variable|
|Copying to another local variable in the block|Safe but it still points the original variable|
|Passing to nested block|Unsafe unless executed inside this "scope", also it still points the original variable|
|Copying to a captured variable|Can be Unsafe, also it still points the original variable|
|Returning y from method|Unsafe|
|Passing to raise|Unsafe (because one can implement this with longjmp)|
|...|...|
Since above does not work, we can make it to do delayed initialization:
struct Y
@data : LibFoo::X
@ptr : Pointer(LibFoo::Y)
@refdata : ...
def initialize(@refdata)
@data = uninitialized LibFoo::X
@ptr = Pointer(LibFoo::Y).null
end
def bar(...)
LibFoo.bar(self, ...)
end
def baz(...)
LibFoo.baz(self, ...)
end
def to_unsafe
if @ptr.null?
@ptr = LibFoo.foo(pointerof(@data), @refdata)
end
@ptr
end
end
# with this, a user can:
def user_func(...)
y = Y.new(...)
# play with `y`.
y.bar(...)
y.baz(...)
end
In this case, LibFoo.foo will be called multiple times, if it has been copied before the calling one of these method.
y = Y.new(...)
z = y
z.bar(...) # LibFoo.foo will be called for `z`
y.baz(...) # LibFoo.foo will be called for `y`
Any copying before calling its methods is safe, but copying after calling any of its methods is possibly unsafe in same condition described above. This is too complicated.
We need a temporary object to bridge the user of binding library and underlying C library, but we don't want to fill the sea of heap memory with the garbage of temporary object, which can be used many many times. However, currently we can not handle a pointer from C library with a struct gracefully. Because it is freely and implicitly copied one to another. Disallowing copy can let the user of binding library to use the struct only in safe way, without loosing efficiency.
If you want a real world use case, here it is.
I have ideas of the methods to disallow copying:
#dupReference. (see below)Value#dup becomes an "special method" which makes a copya = b where b is a Value should be interpreted as a = b.dupa = A.new where A is a Struct class can be interpreted as a = A.new.dup#initialize(another : self)a = b where b is a Value should be interpreted as a = b.class.new(b)a = A.new where A is a Struct class never involves copy.struct can never "monkey patch" to reenable copy operation of the struct.To protect Reference from copying is easy. Because it requires #dup to make a copy, so just put #dup to protected or private:
class A
protected def dup
self
end
end
a = A.new
b = a.dup
Showing last frame. Use --error-trace for full trace.
error in line 8
Error: protected method 'dup' called for A
@asterite suggested to disallow to use pointerof to instance variable of struct in #9700. The language reference says using pointerof is (always) unsafe. If so, just disallowing it does not make sense, I think. Any invalid usage of pointerof should not be responsible to Crystal's developers, however you should provide the detailed specification such as "when returned pointer goes invalid". This should be guaranteed by language in some way.
(ex.)
|Source of pointer|Guaranteed Lifetime|
|-|-|
|Any Reference type|Until no variable references it|
|Any instance variables of Value type in Reference (class)|Until no variable references the container of it|
|Any instance variables of Value type in #initialize of Value (struct)|Until return from the method|
|Any instance variables of Value type in any other method of Value (struct)|Until return from the caller's scope|
|Any local variable of Value type|Until return from the current scope|
|Any captured variable of Value type|Until return from the block|
|Any constant variable of Value type|Forever|
|(Non Interpolated) String literal|Forever|
(This table is not true. This depends how the language guarantees (i.e., decided by Crystal's developer))
It works like that because of how new is defined, and how structs are passed by value.
@waj was working on improving this situation but it's probably something hard to change
Most helpful comment
@sdogruyol Issue is still valid, see code below: