for reference see #130, #1214
I was inspired after reading http://pling.jondgoodwin.com/post/favor-composition-over-inheritance/ to write this proposal. Please see the following code example for how this could work for Zig.
// interfaces are just structs without fields
const Dump = struct {
// declare an undefined (unimplemented) method
// invoking an undefined method is a compile error
// self is specified with type `Dump` but is just a placeholder
pub fn dump(self: @This()) anyerror!void = undefined;
};
// another interface
const QuantifySize = struct {
pub fn quantifySize(self: @This()) Size;
pub const Size = enum { small, medium, big };
};
// this is an implementation of QuantifySize interface
const Small = struct {
// inherit QuantifySize
pub inherit QuantifySize;
// this particular impl is static and does not require any fields
pub fn quantifySize(self: Small) Size {
return Size.small;
};
};
// this struct will be inherited
const Vechicle = struct {
pub mass: f64,
pub volume: f64,
pub color: []const u8,
fn init() {
return Vehicle{
.mass = 0,
.volume = 0,
.color = "blue",
};
}
};
const Spaceship = struct {
// inherit all pub members: { dump }
// field name not required: Dump has no fields
// `*Spaceship` is implicit-castable to `*Dump`
pub inherit Dump,
// implement Dump interface
// self is now typed
pub fn dump(self: *Spaceship) anyerror!void {
std.debug.warn("this ship's size is {}\n", self.quantifySize());
}
// inherit all pub members: { QuantifySize, quantifySize, Size }
// field name not required: Small has no fields
pub inherit Small,
// inherit all pub members except: { shunned }
// `*Spaceship` will not implicit-cast to `*Vehicle` because we did NOT inherit all members
pub inherit ship: Vehicle { mass, value },
// another possible inherit but not public
inherit engine: ImpulseDrive { thrust, engage },
crew_max: u32,
fn init() {
return Spaceship {
.ship = Vehicle.init(),
.engine = ImpulseDrive.init(),
.crew_max = 4,
};
}
};
pub fn main() !void {
var roci = Spaceship.init();
try call_dump(roci); // implicit-cast to `Dump`
std.debug.warn("ship mass is {}\n", roci.mass);
std.debug.warn("ship color is {}\n", roci.color); // compile error: `color` not inherited
std.debug.warn("ship color is {}\n", roci.ship.color); // ok: explicit access as aggregate
try roci.engage();
try roci.thrust();
}
fn call_dump(obj: Dump) !void {
try object.dump();
}
Maybe usingnamespace would be used here?
const S1 = struct{
x: u32;
fn s1Func(self: S1) void{
// do stuff with x
}
}
const S2 = struct{
usingnamespace S1;
y : u32;
fn s2Func() void{
// do stuff with y
}
}
But what about the division between namespace members and instance members. Embed both? Embed only instance members..?
s2instance.s1Func( ..expects S1..) // example of issue
Usually when you are dealing with inheritance in OOP languages, it's all about instances.
Maybe
usingnamespacewould be used here?
Overloading that keyword which has the word "namespace" implies that it does not include instance members.
But what about the division between namespace members and instance members. Embed both?
I'm inclined that the default inherit is to bring in all (pub) fields, methods and defs.
But if differentiation between { fields, methods } and { defs } is to be made, we could allow defs syntax for inherit:
pub inherit QuantifySize; // inherit defs: { Size }
this is just bikeshedding, but would rather have usingnamespace changed to using and have it import defs (including functions/methods) and then have an embed keyword which import fields. this way data and behavior are not so tightly coupled.
If you take the "be explicit" goal of Zig, perhaps you should just use methods in unions. Then you have a very explicit set of code that shows exactly what you mean.
const AUnion = union {
typeA: AStruct,
typeB: BStruct,
...
fn foo(self: AUnion) anyerror!bool { .... }
}
Comptime code can probably help here as well as tagged unions to sweeten up the coding a bit. The nice part about this is that AStruct and BStruct do not necessarily have anything to do with each other.
Inheritance fails in some fundamental ways. Consider points. A struct with x and y, right? What about polar coordinates? They are points. They have a similar interface, but their interior data does not look at all similar and their methods' implementations are quite different. Perfect use for unions if I want to be able to pass around a generic point.
DRY and similar ideas like normalization in relational databases sound good on the surface, but when you actually maintain a code base for a few years it gets a lot more nuanced and you start regretting being too tricky. Note how heavily performance-oriented code bases such as game code use things like ECS rather than "normal" OO decomposition. Code spends a lot more time being extended and maintained than it does being created. Designing Zig to be maintainable first is where the win is IMHO.
Sorry for the rant. We now return you to your regularly scheduled programming.
@kyle-github The issue with the union idea is how do you introduce a new type to that union when you didn't create the union? Whereas with composition, it doesn't matter what the thing is that you're embedding and where it came from.
const ExtendedDuration = struct {
embed time.Duration;
// ... new fields etc here //
}
// ... //
fn sleep(d: time.Duration) void { ... }
sleep(ExtendedDuration.init(...));
@Tetralux that is precisely what I am talking about. So much of C, C++ and Java language platforms and tooling are designed to help break your code into small separate pieces that everyone assumes that this is a good idea. It isn't IMHO.
In your example, you can just use time.Duration as a normal field in the struct and you can provide the sleep function taking an ExtendedDuration. Your implementation (which could call the underlying one). Easy to find. Explicit encapsulation. Hard to break.
What happens in your example when the owner of time.Duration adds a field that conflicts with a field you added? What happens in your example when the owner of time.Duration adds a method that conflicts with a method you added? This is the fragile base class problem.
I get that this sounds a lot like I am just putting off the issue, but it is a very real consideration for large scale projects. I do not see a reason for Zig to support and encourage patterns that are anti-patterns for long term software maintenance.
Sorry for the rant :-(
The astute reader will note that I may be flip-flopping on this issue. I remember commenting over a year ago on the issue of struct embedding and how that was used fairly often in C. That is still true, but only needed for C interop and now there is a whole set of attributes that can be applied on a struct that will be shared with C.
Since then I have been looking more info things like the fragile base class and specific classes of maintenance problems we've been running into that cause us (at work) to do stupid things like having library owners and downstream library users run the same (copied) set of API tests so that the library users can figure out when the API subtly changed and broke something downstream.
@kyle-github
I'd expect trying to embed a struct with fields that have the same name as fields in the struct you are embedding into, to be a compile error, since the whole point is that you can deref fields of the embedded struct on the enclosing struct.
Though, you could also have some sort of annotation you could apply to the embed time.Duration declaration which would cause the embedded fields to become shadowed by the ones in the struct instead, in cases where you explicitly know that is what you want.
You can also name the embedded struct: embed base: time.Duration so that you could refer to the embedded part explicitly.
Indeed, maybe in order for you to add the annotation for shadowing, you _have_ to name the embedded struct.
Perhaps you should also be able to set default values on embedded fields.
const Duration = struct {
seconds: usize,
// ...
};
const ExtendedDuration = struct {
embed base: time.Duration;
seconds = 60; // `f=V` not `f:T` or `f:T=v`
};
You could also require that you name the embedded struct and do base.seconds = 60 for explicitness.
This got long, sorry :-(
@Tetralux I think we are talking past each other here a bit. The fragile base class problem is well known, and well understood and the methods that are proposed here provide the ability to code the fragile base class problem in Zig.
The question is not whether or not you can embed, but whether you should. Based on my experience with this exact problem, to me this is an area where Zig has an opportunity to fall back to its goal of explicit, clear behavior and make it much more difficult to create the fragile base class situation.
More points:
Systems languages need to work for the long term. There are many code bases in C today that are 20+ years old and are regularly used: Linux, Apache, Gnome, X Windows, macOS kernel... These are the kinds of things that Zig is aiming at. Initial coding should not be painful, but it also isn't the place you spend most of your time.
I think it was Dennis Ritchie who said that debugging is much harder that writing code, so that if you spend all your brain power writing the code, by definition you cannot debug it. I am paraphrasing a bit. The key point is that clarity and simplicity _matter_ when the code is going to live for decades. Tricks and little things that helped you save 5% of your lines of code when writing the project usually are things you regret later.
I have high hopes for Zig. It seems like it takes so many of the lessons that we have learned over the last five decades of systems programming and uses the best practices and careful, but not intrusive, means to enforce those.
I think we should probably take this offline. I cannot tell if anyone else cares or is just eating popcorn :-)
I think it was Dennis Ritchie who said that debugging is much harder that writing code, so that if you spend all your brain power writing the code, by definition you cannot debug it. I am paraphrasing a bit.
Kernighan's Law:
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
@kyle-github
It'll take me a while to take all that in, but the part about the implementation.
The compiler knows statically where the embedded field is and just uses that as a comptime offset to access the fields. It's no different than if you dereference a field that literally in the struct.
Also, the laws you are quoting sound like you should spend little effort thinking about how to write the code because you need to spend half your effort on debugging. Which I agree with - just to be clear.
Further, all I'm looking for here is that I have some common functionality about several types and I want it to be easy to just say "use the same impl as this other struct."
Or do you have a better suggestion to make as to how Zig could fix or otherwise address the need for that @kyle-github?
Kernighan's Law:
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
Thanks @daurnimator! Apparently my memory for quotes is not that good before my morning caffeine :-)
@tetralux, Where you say:
Further, all I'm looking for here is that I have some common functionality about several types and I want it to be easy to just say "use the same impl as this other struct."
Then, yes, there is a better (IMHO) way to do it. First, this is inheritance, not composition, that you are talking about. You are trying to inherit implementation. That is exactly what is now understood to be relatively fragile.
The way I would normally do this is something like:
const StructA = struct {
... some fields ...
fn foo(self:StructA) niftyError!i32 { ... }
... more methods ...
}
const StructB = struct {
... some fields ...
fieldA: StructA,
... some methods ...
fn foo(self:StructB) niftyError!i32 { return self.fieldA.foo(self.fieldA); }
... other stuff ...
}
You make it explicit. If someone adds fields or methods to StructA, _nothing_ happens to StructB users. This seems like a violation of DRY, but it really isn't. You are encapsulating StructA in StructB.
Could there be some syntactic sugar at some point for this? Perhaps, but languages are moving in the direction of making things a lot more explicit (see all the @override stuff from Java).
Thanks for the comments and pushing me on this. It has helped me refine why I am not fond of this idea. The core of what I object to in the examples is that the exporting of the functions of StructA (in the example) is implicit. If there was some way to make them explicit so that you did not have to write the boilerplate above, but still had complete control over which functions you supported, then I could get behind the idea.
Firstly thank you everyone for input and replies. It is truly appreciated.
Summary changes from original proposal:
inheritdelegateimplementimplement and does keyword become unclear at that point?TERM | DESCRIPTION
--- | ---
base | delegation subject
derived | @This()
field | struct data member
struct method | regular struct method with param[0] self
method | const struct method
member | field or method
This proposal was inspired by article Favor Composition Over Inheritance. Some shared goals are to reduce fragile inheritance issues, and be more explicit. With that said, while I intended to keep this proposal on point as much as possible, if there are ancillary changes to Zig I will include them and mark as such.
delegate.If base has any fields they are aggregated just as with regular struct aggregation. The key difference is that delegation fields/methods are also directly resolvable to derived namespace. The type and address of delegated members are delegated to the aggregate at comptime.
To combat the fragile nature of namespace collision, either full or explicit delegation may be performed. When full delegation is performed, all base fields/methods are delegated and is probably appropriate for only very simple cases where multiple bases are not involved. Explicit delegation is more verbose and locks down exactly what is delegated and is recommended for more complex composition cases. Also, a powerful consequence of full delegation is that implicit casting to base type is gained.
implement.Multiple interfaces with the same method names may be implemented in the same derived. However, the method signatures must exactly match.
A derived struct which implements an interface gains implicit casting to interface.
[ANCILLARY] #1717 (an accepted proposal) syntax has been incorporated into this proposal.
delegation = 'delegate' name ':' base [member_list] ',' ;
delegation = 'delegate' base [member_list] ';' ;
name = identifier ;
base = identifier ;
member_list = '{' member [',' member]... [','] '}' ;
member = identifier ['=' default] ;
delegate is a new keyword for delegation.
name must be specified if base has fields or omitted if base has no fields. If base has fields, derived will aggregate base. A colon is required after name for consistency with regular aggregation syntax.
base is the source struct for member delegation.
member_list is optional and specifies which members are to be delegated, otherwise all members are delegated.
member may optionally provide a default value that is available to derived or descendants of derived.
const A = struct {
one: u32,
};
const B = struct {
padding: [8]u8 = undefined,
delegate a: A,
};
fn foo() void {
var b = B{ .one = 1 };
_ = b.one; // 1
_ = &b; // 0x1000
_ = &b.a; // 0x1008
_ = &b.one; // 0x1008
}
implementation = 'implement' interface ';' ;
interface = identifier ;
implement is a new keyword for which declares this struct implements an interface.
interface must be a struct which does not contain fields, and contains undefined methods. It may contain other namespace defs but only undefined methods are candidates for implementation.
const Format = struct {
const Error = error{FormatError};
const format = fn(self: @This(), out: *Stream) Error!void;
};
const Data = struct {
implement Format;
one: u32,
two: u32,
const format = fn(self: @This(), out: *Stream) Error!void { ... };
};
A struct which delegates all members from base can implicitly cast to base.
const A = struct {
one: u32,
two: u32,
};
const B = struct {
delegate a: A,
};
const C = struct {
delegate a: A { .two },
};
const bar = fn(a: A) void { ... }
const foo = fn() void {
var b = B{...};
var c = C{...};
bar(b); // ok
bar(b.a); // ok
bar(c); // error
bar(c.a); // ok
}
const Format = struct {
const Error = error{FormatError};
const format = fn(self: @This(), out: *Stream) Error!void;
};
const Cursor = struct {
implement Format;
pos: Size = 0,
const Error = error{SeekError};
const seek = fn(self: *@This(), where: Where) Error!void;
const Where = union(enum) {
pos: Size,
forward: Size,
backward: Size,
begin: {},
end: {},
};
const Size = u64;
const format = fn(self: @This(), out: *Stream) Error!void {
out.print("position={}", self.pos) catch return error.FormatError;
};
};
const Read = struct {
implement Format;
delegate cursor: Cursor, // error: Cursor.format collision
delegate cursor: Cursor { .pos, seek },
const Where = Cursor.Where;
const Error = error{ReadError};
const read = fn(self: *@This(), buffer: []u8) Error!void { ... };
const format = fn(self: @This(), out: *Stream) Error!void {
out.print("read@{*} { ", &self) catch return error.FormatError;
try self.cursor.format(out);
out.print(" }") catch return error.FormatError;
};
};
With this example we see that both Cursor and Read make use of a format interface. At first it might seem disappointing that Read cannot fully delegate Cursor but this is actually a good thing. In object parlance, a "reader" is not a "cursor" but it has a cursor which works very well for aggregation and delegation. In fact, we can see that through delegation the following is now possible:
const foo = fn() !void {
var read: Read = ...
out.warn("read position before {}\n", read.pos);
try read.seek(Read.Where{ .begin = {} });
out.warn("read position after {}\n", read.pos);
dump(read);
}
const dump = fn(subject: Format) !void {
try out.print("DUMP:\n");
try out.print(" ");
try subject.format(out);
try out.print("\n");
}
A method return type cannot be inferred. An error or value type must be explicit.
Undefined methods may only exist in structs without fields. They are essentially interfaces.
Namespace collision resulting from keyword delegate is a compile error.
Namespace collision resulting from keyword implement is a compile error.
old_syntax = name [':' type] ['=' default] ',' ;
new_syntax = '.' name [':' type] ['=' default] ';' ;
struct member decls of fields vs. namespace defs is currently resolved by using a comma terminator for fields. I think it makes sense to switch to new_syntax to differentiate.
delegation syntax terminates with either a comma or semicolon to stay consistent with old_syntax. However, this can be simplified to always terminate with semicolon if new_syntax is adopted.
const Foo = struct {
.some_flag;
.one: u32 = 1;
delegate .truck: Truck;
delegate Format;
const init = fn(backing: *std.mem.Allocator) Foo {
return Foo{
.some_flag = true,
.truck = Truck.init(),
};
};
};
@mikdusan While this is a lot nicer, I still have the same objections. By making explicit sharing/delegation optional and making the easy option one that allows the fragile base class case to exist more easily, it seems like the proposal actively encourages coders learning Zig to immediately use the bad habits they picked up in other languages.
Using one of @andrewrk's favorite expressions, if we look at the null hypothesis here, what happens if you do your examples in Zig of today? What do you lose? Is that loss important?
What is the problem that this solves? What needs struct inheritance and substitution but cannot be solved in a way that uses Zig today?
My perspective from several decades of coding is that time saved writing the code the first time is usually lost many times over in time you spend debugging and maintaining code because you used short cuts when writing the code.
I really like that approach on interfaces, both semantic-wise and syntax-wise.
But a few things:
with the new function syntax, this would still be a function type assignment:
const Format = struct {
const format = fn(self: @This(), out: *Stream) Error!void; // format is type, not function
};
i like the approach of defining an interface function as "undefined":
const Format = struct {
const format : fn(self: @This(), out: *Stream) Error!void = undefined;
};
But: this is already valid Zig code which defines an undefined function pointer named format.
As we impose several restrictions on interfaces ("only undefined functions, nothing else"), i propose to introduce a new type interface:
const Format = interface {
fn format(self: @This(), out: *Stream) Error!void;
};
pro:
cons:
We need a way to pass interfaces to functions or store references to them. Imho it does not make sense to pass interfaces by-value, but only by-reference. Thus, an interface object could be implemented as a pointer tuple of a type-erased pointer to the structure as well as a pointer to a static v-table.
As struct that implement a certain interface must implement all of the members and no "real" inheritance exists, the type erasure should not be a problem, if all interface functions take the interface by-reference
I'll note for future reference, to avoid the Rustian problem of things like
Peekable(Rewinder(Reader(T))), which is obviously messy, leads to code bloat, but also requires the order to be that which you gave.
You should be able to compose interfaces together:
const Stream = Reader + Rewinder + Peekable;
// or...
fn Stream(comptime T: type) type {
return Reader(T) + Rewinder(T) + Peekable(T);
}
fn read(s: *Stream) !usize { ... }
Something like that.
I just read the proposal again and found another problem:
A struct which delegates all members from base can implicitly cast to base.
This would imply we accept slicing which is usually a problem and/or programming error.
I would recommend to allow implicit casting only on pointer types and not objects itself. If we actually want to slice to object, we can just access the derived object field and make it explicit.
closed: I have some more ideas and they diverge enough from this proposal to warrant closing.
Most helpful comment
If you take the "be explicit" goal of Zig, perhaps you should just use methods in unions. Then you have a very explicit set of code that shows exactly what you mean.
const AUnion = union { typeA: AStruct, typeB: BStruct, ... fn foo(self: AUnion) anyerror!bool { .... } }Comptime code can probably help here as well as tagged unions to sweeten up the coding a bit. The nice part about this is that AStruct and BStruct do not necessarily have anything to do with each other.
Inheritance fails in some fundamental ways. Consider points. A struct with x and y, right? What about polar coordinates? They are points. They have a similar interface, but their interior data does not look at all similar and their methods' implementations are quite different. Perfect use for unions if I want to be able to pass around a generic point.
DRY and similar ideas like normalization in relational databases sound good on the surface, but when you actually maintain a code base for a few years it gets a lot more nuanced and you start regretting being too tricky. Note how heavily performance-oriented code bases such as game code use things like ECS rather than "normal" OO decomposition. Code spends a lot more time being extended and maintained than it does being created. Designing Zig to be maintainable first is where the win is IMHO.
Sorry for the rant. We now return you to your regularly scheduled programming.