Crystal: Array.new(Int, &block, Int32 -> T) returns wrong class if subclassed

Created on 17 Sep 2020  路  21Comments  路  Source: crystal-lang/crystal

This new method ignores its subclass and always returns the Array type.

def self.new(size : Int, &block : Int32 -> T)
  Array(T).build(size) do |buffer|
    size.to_i.times do |i|
       buffer[i] = yield i
    end
    size
  end
end

All 21 comments

Isn't that expected since you're explicitly using Array(T).build and not the subclass' name?

I think he means when subclassed it should create the subclass. But then I think it's a bit impossible to implement this (if someone can think of a way, please send a PR).

OK, I'm naive. I just wrote this in my subclass. Why would it not work in Array(T)#new(size : Int, &block : Int32 -> T)?

class A < Array(Int32)
  def self.new(size : Int32, &block : Int32 -> T)
    a = self.new(size)
    i = 0
    while i < size
      a << yield(i)
      i += 1
    end
    a
  end
end
a = A.new(10) { 1 }
p a.class.name
p a

Oh, I mean, try doing it by modifying the new method in Array, not defining a new one in A. Also make sure it works with a subclass that's also generic.

To expand a bit more, the compiler sees that T is unbound in the type definition and it's a return type of the block, so to compute it it must type the block assuming Int32 is passed to it. From there it can deduce the type of T. And then we do Array(T).build to say "create an array of T". If we change it to just build or self.build then the T is never used and build just tries to create an Array(T) without T having a type and it fails.

One way to fix this would be for the compiler to assume that if there's a free variable T defined and we call a class method on an uninstantiated generic type, assume it's on an instantiated generic type using that T. But maybe it's more confusing than good.

Thank you. I'm confused about why T is unbound, since my subclass is defined this way:

class Subclass < Array(Int32)
end

I would have expected that Array::T would be known to be Int32, and that the compiler thus would know that T is Int32 wherever it is used, including as the return type of the block in:

class Array(T)
  def self.new(size : Int32, &block : Int32 -> T)
  end
end
class Subclass < Array(Int32)
end

There is no forall and type inference here, the type is explicit.

Assuming I get by this hurdle, there should be one of two coventions for overloading clone().

class Subclass < Array(Int32)
  def clone
     super
     # do subclass work here
  end
end

or

class Array(T)
  def initialize(that : Array(T))
    # Copy the array here.
  end
  def clone
    self.class.new(self) # Subclasses implement the copy constructor.
  end
end

The second makes more sense to me.

Oh, sorry for the confusion. I'm talking about making that new that you mention create an instance of a generic subclass if called on a generic subclass. Let's try it:

class MyArray(T) < Array(T)
end

my_ary = MyArray.new(10) { |i| i }
p typeof(my_ary) # => Array(Int32)

That doesn't work as you expect it to work. The reason is that the definition of that new method has Array(T) hardcoded into it:

def self.new(size : Int, &block : Int32 -> T)
  Array(T).build(size) do |buffer|
    size.to_i.times do |i|
       buffer[i] = yield i
    end
    size
  end
end

So the question for you is: can you find another way to define that method (without redefining it in the subclass!) that works with MyArray(T) and with your Subclass that you mentioned in your comment?

Note: to try out things you can simply reopen Array and redefine that method.

Oh, maybe I see what you mean. When that T is deduced to be whatever type, that needs to stick into the T of the current generic type. And I think that's basically what I'm proposing here: https://github.com/crystal-lang/crystal/issues/9755#issuecomment-694180552

So, ideally the new method would be defined like this:

def self.new(size : Int, &block : Int32 -> T)
  build(size) do |buffer|
    size.to_i.times do |i|
       buffer[i] = yield i
    end
    size
  end
end

Note that we are just calling build now. However, when we call MyArray.new(10) { 1 } that T gets to be Int32 and self goes from MyArray(T) to being MyArray(Int32). But that's not something that is happening right now. self can't change its type, and I'm not sure it's possible.

This is how I would expect Array#clone to work, and my prototype here appears to do what I want.

class XArray(T)
  @buffer : Pointer(T)
  getter size : Int32

  def initialize(that : XArray(T))
    puts "Superclass copy constructor called, self is a #{self.class.name}"
    @size = that.size
    @buffer = GC.malloc(sizeof(T) * that.size).as(Pointer(T))
    that.size.times do |step|
      @buffer[step] = that[step]
    end
  end

  def initialize(@size : Int32)
    @buffer = GC.malloc(sizeof(T) * @size).as(Pointer(T))
  end

  def [](index)
    @buffer[index]
  end

  def []=(index, value)
    @buffer[index] = value
  end

  def clone
    self.class.new(self)
  end
end

class Subclass < XArray(Int32)
  def initialize(that : Subclass)
    puts "Subclass new called."
    super
  end

  def initialize(size)
    super
  end

  def initialize(that : Subclass)
    puts "Subclass copy constructor called."
    super
  end
end

a = Subclass.new(1)
b = a.clone
puts "Type of b is #{b.class.name}"

Yes, that works. But I thought this was about Array.new, not clone.

Oops, I have so many bugs I'm confusing them. More coming.

OK, I added def initialize(size : Int, &block : Int32 -> T) to my example, and XArray.new works as I would expect. So, maybe my complaint is that Array#new is implemented when it should be Array#initialize.
````crystal
class XArray(T)
@buffer : Pointer(T)
getter size : Int32

def initialize(size : Int, &block : Int32 -> T)
@size = that.size
@buffer = GC.malloc(sizeof(T) * that.size).as(Pointer(T))
that.size.times do |step|
@buffer[step] = yield
end
end

def initialize(that : XArray(T))
puts "Superclass copy constructor called, self is a #{self.class.name}"
@size = that.size
@buffer = GC.malloc(sizeof(T) * that.size).as(Pointer(T))
that.size.times do |step|
@buffer[step] = that[step]
end
end

def initialize(@size : Int32)
@buffer = GC.malloc(sizeof(T) * @size).as(Pointer(T))
end

def
@buffer[index]
end

def []=(index, value)
@buffer[index] = value
end

def clone
self.class.new(self)
end
end

class Subclass < XArray(Int32)
def initialize(size : Int, &block : Int32 -> T)
super
end

def initialize(that : Subclass)
puts "Subclass new called."
super
end

def initialize(size)
super
end

def initialize(that : Subclass)
puts "Subclass copy constructor called."
super
end

end

a = Subclass.new(1)
b = Subclass.new(10) { 1 }
p b.class.name
```

Cool, that's a good solution: using initialize instead of new. But that means this is only possible for that method. Array has a class method build, and that trick isn't possible there. But we could definitely doing for at least the methods that are called new.

Array#build can easily be replaced by an initialize that yields the buffer. That would be cleaner. Then build could get a deprecation notice. Something like def initialize(&, unsafe_size : Int32, &block)

OK, I see what you were getting at:

If we initialize Array with this style:

Array.build(unsafe_capacity: 10)  { |index| SomethingReturningT }

We can't just call self.new to create the array, because nothing tells Array that it contains T.

As I'm still a relative newbie, I haven't used that style of creating Generics, and perhaps should not get into the habit.

Sigh. I can't fix this bug by replacing self.new(size : Int, &block : Int32 -> T) with initialize(size : Int, &block : Int32 -> T) because the compiler does not catch the return type of the block and bind T to that when I use initialize instead of self.new. Is that a bug?

When you define an initialize like this:

def initialize(arg, &block : -> T)
  yield
end

The compiler automatically generates a self.new method from it:

def self.new(arg, &block : -> T)
  x = allocate
  x.initialize do
    yield
  end
end

As you can see, the T isn't mentioned anywhere in that expansion. Hence, it doesn't work.

One way to solve it is to make allocate check for any free variables with the generic's type argument and use that to create the instance. I'm going to give that a try.

Well, actually, I don't have time, sorry. But yeah, that's what I meant with "this can't be easily fixed". That's the bug, or the limitation.

Sure, I respect your time. Should I file a bug? It seems asymmetrical enough to be a bug.

Sounds good.

I just don't think this can be fixed or changed.

Was this page helpful?
0 / 5 - 0 ratings