Crystal: Unify structs and C structs

Created on 6 Apr 2016 · 3 Comments · Source: crystal-lang/crystal

_Follow-up to Allow adding methods to C structs_

Wrappers of C libraries are currently hindered because C/lib structs are not full-featured and have very strange, buggy behavior when it comes to constructors (see #557). To make a proper interface for the library, one has to define another type that represents the struct and convert to and from it. These small structs are often used abundantly in libraries, and things like a huge array of structs are not rare. Copying and converting such an array N times per second is bad for performance, as are the ubiquitous conversions between the two types for function arguments and return values.

lib L
  struct S
    a : Int32
    b : LibC::Char*
  end
end

struct S
  def initialize(@a : Int32, @b : LibC::Char*)
  end

  protected def initialize(s : L::S)
    @a, @b = s.a, s.b
  end
  protected def to_c : L::S
    L::S.new(a: @a, b: @b)
  end
end

I would like to note that currently structs and lib structs behave mostly the same, and have the same field layout, as this example shows:

str = "abc"
arr = [S.new(5, str.to_unsafe), S.new(7, str.to_unsafe)]

larr = [] of L::S  # needed because "can't execute `larr[1]` - Array(L::S) in `larr` was never instantiated"
larr = pointerof(arr).as(Pointer(Array(L::S))).value

p larr[1] #=> L::S(@a=7, @b=Pointer(UInt8)@0x435274)

Obviously, the example is vile, but my point is that if the layout is the same, we should merge these two types into one.
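A less hacky way to check the same claim is to compare sizes and reinterpret a single value instead of a whole array. This is only a sketch: it relies on the two layouts matching, which is the observation made here and not a documented guarantee. The names mirror the earlier example, and the getters are added purely for illustration.

```crystal
lib L
  struct S
    a : Int32
    b : LibC::Char*
  end
end

struct S
  getter a : Int32
  getter b : LibC::Char*

  def initialize(@a : Int32, @b : LibC::Char*)
  end
end

str = "abc"
s = S.new(7, str.to_unsafe)

# Both types hold an Int32 followed by a pointer.
p sizeof(S) == sizeof(L::S)

# Reinterpret the bits of the Crystal struct as the lib struct.
# Only meaningful while the two layouts agree (an assumption here).
c = s.unsafe_as(L::S)
p c.a
p c.b == s.b
```

If the sizes ever differed, the reinterpretation would read garbage, which is exactly why a single unified type would be safer than this kind of cast.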

The only problem is the ability to add fields anywhere in the code, which breaks everything.

struct S
  @c = ":("
end

So my suggestion is to make structs defined inside lib behave just like normal structs, but without the ability to add more fields to them. To avoid breaking backwards compatibility, the syntax would still create the usual members in that struct: an initialize with 0 args that zeroes all fields, an initialize with N args that sets all members, and a property for each member. It is also important to be able to override all of these (maybe zeroing all bytes is not wanted, or some special behavior for setters is wanted).
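To make the proposal concrete, this is roughly what the generated members could look like if written by hand. It is a sketch, not compiler output: the plain struct `S` stands in for a lib struct, and the zeroing initializer is one reading of "zeroes all fields".

```crystal
struct S
  # One property (getter and setter) per member.
  property a : Int32
  property b : LibC::Char*

  # Zero-argument initializer: every field zeroed.
  def initialize
    @a = 0
    @b = Pointer(LibC::Char).null
  end

  # N-argument initializer: one argument per member, in declaration order.
  def initialize(@a : Int32, @b : LibC::Char*)
  end
end

zeroed = S.new
p zeroed.a
p zeroed.b.null?

filled = S.new(5, "abc".to_unsafe)
filled.a = 6
p filled.a
```

Because these are ordinary methods, overriding any of them (a non-zeroing initializer, a validating setter) would just be a normal redefinition.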

Sure, the inability to add fields still makes the structs "not full-featured" but how often does one really need to add fields to an existing type? This is a bad practice anyway.

The next step would be discussing what happens with inheritance. I think it should be possible to inherit from these structs and even add fields, as that doesn't affect interoperability. It would also be nice to have downcasting that drops the extra fields...

Labels: accepted, draft, compiler

All 3 comments

This is something we will definitely consider. I at least know @waj doesn't like it that there are two different struct types, one for C and another for Crystal, and the cost of copying between them, both in terms of more code written and executed, is real.

My main doubt is: what happens with C unions? Maybe they, too, could be like Crystal structs, except that reading or writing instance variables would read from or write to the union.

If we go this way maybe it's better if such structs are marked with an attribute. For example @waj suggested "extern". Maybe something like this:

@[Extern]
struct Foo
  @x : Int32
  @y : Int32
end

This would tell the compiler to maybe generate the "zero" initializer (zero all fields), generate properties (although these will still be kind of magical, because to_unsafe can be implicitly used there), and disable checking that @x and @y are initialized in all of the initialize methods. Then these structs can live in Crystal's side (not inside a lib) and they can appear in docs, etc.

Some issues to implement this (though I think it's still possible):

  1. When setting a C struct field, to_unsafe is automatically invoked if needed, while that doesn't happen in regular structs.
  2. When storing a Proc inside a C struct field, it is represented as a single pointer, and a check is done so that a closure can't be passed. In regular Crystal structs it is stored as a normal Proc (two pointers).

But I think all of that can be solved: 1 by doing something special in the semantic phase (the same thing we are doing with C structs) and 2 by changing this logic for structs marked as C structs.
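The first difference can be seen in a small sketch. It assumes the behavior described in item 1 (implicit `to_unsafe` on lib struct field assignment); the types `L::S` and `S` are illustrative only.

```crystal
lib L
  struct S
    b : LibC::Char*
  end
end

struct S
  property b : LibC::Char*

  def initialize(@b : LibC::Char*)
  end
end

str = "abc"

# Lib struct: the String is converted through to_unsafe on assignment.
c = L::S.new
c.b = str

# Regular struct: the conversion has to be written out explicitly;
# `S.new(str)` would not compile because a String is not a LibC::Char*.
s = S.new(str.to_unsafe)

p c.b == s.b
```

A unified struct type would need to decide which of these two behaviors its setters follow.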

I still think this is a good way to do it, syntactically:

@[Extern]
struct Foo
  @x : Int32
  @y : Int32
end

We could even have this:

@[Extern(:union)]
struct Foo
  @x : Int32
  @y : Float64
end

Which means the fields overlap, and of course some type safety is lost, but one doesn't have to create two separate definitions and a wrapper.
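For comparison, the `Extern` annotation that later Crystal releases ship spells the union form `@[Extern(union: true)]` rather than `@[Extern(:union)]`, as far as I know from the language reference. The sketch below assumes such a Crystal version; the type name `IntOrChar` is made up for illustration.

```crystal
@[Extern(union: true)]
struct IntOrChar
  # Both properties share the same storage.
  property int = 0
  property char = '\0'
end

v = IntOrChar.new
v.char = 'A'
p v.int # => 65, the code point of 'A' read back through the Int32 view
```

This shows the lost type safety mentioned above: writing through one field silently changes what the other field reads.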

I would still keep C structs and unions in lib declarations, because one might not need to expose these to a user.

I will need to check this syntax and semantics with @waj and @bcardiff (might take a few weeks) and then, if agreed, try to implement it.

There is some uncertainty about this particular approach, so I would like to continue exploring possible solutions.

I am starting to doubt the approach of confining C constructs inside a special lib namespace. After all, the only things present there that actually communicate with an external library are the functions and the globals. Who's to say that the enums that the external library defines aren't also perfectly usable in its object-oriented interface? Nothing wrong with constants and tidy structs either.

I don't know any other language that insists on isolating external libraries this much. Julia only has a construct to call a C function and a construct to get a C global. Nim has a construct to mark a function as a C function, otherwise it looks normal. I'm not aware of any language with a special syntax for "C struct".

Just to prove that functions and globals are all that's needed out of lib, they are all I use in one of my libraries.

If I were designing this, I would take the exact same approach as the languages mentioned: completely ditch lib and give freedom to the developer.

The only concern here is that it may not be so easy to hide these functions anymore. But the recently added private module immediately solves this problem.
