See #8242 for original discussion/inspiration for this proposal.
We want authors of types to permit initialization from certain types in a normal variable declaration, like:
var x : MyIntWrapper = 30;
But we don't want to simply invoke a regular initializer in this case (e.g. new MyIntWrapper(30)). For example, a "MyVector" type's initializer could accept an integer to represent length, but probably doesn't want to allow initialization from a statement like:
var x : MyVector = 20;
In this proposal a normal user initializer ("proc init(...)") would only be invoked with a new expression.
This proposal also would introduce a new language initialization construct: init=. init= would allow for initialization with or without a new expression:
record R {
  var x : int;
  proc init=(x : int, y = 1) {
    this.x = x * y;
  }
}
var x = new R(10); // init=
var y : R = 10; // init=
var z : R = new R(10, 2); // init=
Furthermore, regular init methods would no longer be considered for copy-initialization:
record Q {
  var x : int;
  proc init(x:int) {
    this.x = x;
  }
  proc init(r:Q) {
    writeln("init(Q)");
    this.x = r.x;
  }
  proc init=(r:Q) {
    writeln("init=(Q)");
    this.x = r.x;
  }
}
var A = new Q(10);
var B = A; // prints "init=(Q)"
This means that the compiler would no longer emit a warning for completely-generic init methods. If the user did not write an init= method, the compiler would generate one for them (like current copy-initializers).
The compiler will consider init to be 'more specific' than init= for the following statements:
// using record 'Q' from previous example
var A : Q; // declaration without initialization expression
new Q(10); // new-expr
new Q(someOtherQ); // NB: possibly confusing, may seem like a copy-constructor call
this.init(...); // sibling initializer call
We think this proposal would help to eliminate special-case code for wrapper types like 'bigint', and could also enable elegant initialization of atomic variables:
var x : atomic int = 5; // x.write(5);
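For comparison, here is a minimal sketch of the two-step pattern next to initialization via `=`. It assumes a Chapel release in which init= is implemented for atomic types (the thread notes later that init= was implemented); the variable names are only illustrative.

```chapel
// Two-step pattern: default-initialize the atomic, then store a value.
var a: atomic int;
a.write(5);

// Initialization via `=` (assumes init= support for atomics).
var b: atomic int = 5;

// Both variables should now hold the same value.
writeln(a.read(), " ", b.read());
```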
1) Should we allow a completely generic formal for init= ?
I continue to really like this proposal (barring lurking surprises that we haven't anticipated), but I proposed it, so that's potentially not very interesting.
Should we allow a completely generic formal for init= ?
I don't see any particular reason to disallow it (do you?). I expect that authors of classes intended to be bulletproof would probably avoid doing so (unless, say, the class itself had a completely generic field that they were planning to initialize in this way?)
Maybe just a compiler warning? I think it could be useful in a case like:
record R {
  var x;
  proc init=(x) {
    this.x = x;
  }
}
var r = new R(10);
var z = r; // init=(r) , recursive record!
Perhaps the more-problematic case is a single init= with a fully-generic formal, whereas having at least one init= with a type-expr is enough.
@benharsh: Belatedly, I think we should generate a warning or error (which could just be a normal ambiguity error) if you have both init() and init=() overloads that match for a given new expression. So, specifically, I think that the following case from your example:
var A = new Q(O);
should trigger an ambiguity similar to how calling:
proc foo(x: int) { ... }
proc foo(x: int, y=0) { ... }
foo(0);
would. At least, I don't see any benefit in adding a tie-breaker for this case and can imagine it would cause confusion (the user has created an initializer that they probably think will be called in some cases but never will).
Quoting from record Q at the top:
var A = new Q(10); // calls init(int)
var B = A; // prints "init=(Q)"
var X = new Q(10);
var Y = new Q(X); // init or init=? compiler error!
If I understand everything correctly, this feature is unambiguous w.r.t. whether init or init= is intended for all argument types except the same type as this, i.e., Q.
var q = new Q(5); // init(int)
var q: Q = 5; // init=(int)
Is there merit for special-casing it such that:
var Y = new Q(X); // init(Q): no longer ambiguous
var B = A; // init=(Q): effectively the copy-initializer
It seems odd that init[=](Q) is the only ambiguous case in this design (effectively meaning it is prohibited as a compiler error).
Edit: I prefer a defensively designed language, and I can't think of a case where init(Q) and init=(Q) do different things, so having the user pick one signature and not both is okay with me.
Edit2: Does var B = A; still work if copy-init is disabled when the init= feature is introduced? When would a user ever want init(Q) instead of init=(Q)?
It seems odd that init= is the only ambiguous case in this design (effectively meaning it is prohibited as a compiler error).
I may be misunderstanding you or missing something that you're seeing, but I think there could be other ambiguities beyond the "copy initialize" / "initialize with my own type" case. For example, given:
record R {
  proc init(x: int) { // or even `proc init(x: int, y: real = 4.2) {`
    ...
  }
  proc init=(x: int) {
    ...
  }
}
In this proposal, the call var r: R = 3; would unambiguously call init= but the call var r = new R(3); might be ambiguous between the two initializers. So we'd need to make a call between:
- making this an ambiguity error
- preferring the init= case
- preferring the init case
where my current intuition is the ambiguity error, but maybe the fact that the init() case could have additional defaulted arguments suggests that we should choose one of the other options.
In this proposal, the call var r: R = 3; would unambiguously call init= but the call var r = new R(3); might be ambiguous between the two initializers.
Ah I see, that's where my confusion was. I thought what was proposed with the second case of var r = new R(3) would unambiguously call init(int) because that's the state of affairs today.
I don't think I like it being ambiguous because I could no longer do the following:
record R {
  proc init(x: int, y: real = 4.2) { ... }
  proc init=(x: int) {
    // Dispatch to other initializer.
    this.init(x);
  }
}
But maybe that's a different (but related) feature request.
I don't think I like it being ambiguous because I could no longer do the following
I started to get cold feet about the ambiguity as well once I realized that it wouldn't only apply when the signatures matched precisely... (but was hoping someone would point out something I missed... :) ).
In terms of breaking the ambiguity, my intuition would be to have the new R(42) case prefer the vanilla init() over init=() since it's what is typically used in the situation of new expressions.
Thanks for all of the great feedback. I'll update the proposal to indicate that init is favored in new expressions.
I've added another example that clarifies that new Q(otherQ) would still prefer init over init=.
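To make the "init is favored in new expressions" rule concrete, here is a small sketch written against Chapel as eventually implemented, where a new-expression resolves to init and a typed declaration with `=` resolves to init=. The record `R` and the companion `=` and `:` overloads are illustrative; including the companions is an assumption about what recent compilers expect alongside a mixed-type init=, not something settled in this thread.

```chapel
record R {
  var x: int;

  // Regular initializer: the candidate for `new R(...)`.
  proc init(x: int) {
    this.x = x;
    writeln("init(int)");
  }

  // Initialization via `=` from an int.
  proc init=(x: int) {
    this.x = x;
    writeln("init=(int)");
  }
}

// Assumption: companion overloads for the mixed-type init=.
operator =(ref lhs: R, rhs: int) {
  lhs.x = rhs;
}

operator :(from: int, type t: R) {
  return new R(from);
}

var a = new R(42);  // new-expression: expected to select init(int)
var b: R = 42;      // typed declaration with `=`: expected to select init=(int)
```

Under the tie-breaking rule discussed above, the two declarations stay distinguishable even though both initializers accept a single int.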
The compiler will currently print a warning for a user initializer that can be resolved to something that looks like the copy initializer. Is the rule init > init= simple enough that we could do away with that warning?
Is the rule init > init= simple enough that we could do away with that warning?
That's my current thinking:
- ... = then only init=() will be considered.
- ... init=() ...
- ... new t() then both t.init() and t.init=() will be considered with the tie going to the former
- ... t.init=(), the compiler will create a default one for you
(Then I think we've still got a bit of a question about the conditions under which creating an init=() without an assignment overload, or vice-versa, will generate a warning...?)
Two alternative proposals for consideration. A concern was expressed about the maintainability of code with an "ambiguous" new t() that could call either init or init= (the current "Proposal 0" dictates that the compiler will prefer init over init= on ambiguity).
Proposal 1:
- ... init= except for the copy-initializer. (And even then, maybe?) There's too much guess-work when creating a class with one mutable field member x as to whether the user intends init=(x).
- For new t(), always initialize with init. No ambiguity because init= is never considered.
- For =, always initialize with init=. No ambiguity because init is never considered.
This proposal is much more explicit and my personal favorite. There is a clean separation between the two initializing expressions. However, it does require more boilerplate code, especially when you want, e.g., your init= to delegate to init.
Proposal 2:
Revert to the original plan of an ambiguity causing a compiler error. The rationale is the same as the original: it's not clear which the user intends.
- For =, always initialize with init=.
- For new t(), both init and init= are candidates. If ambiguous, generate a compiler error. The question of a delegating initializer is still a problem here.
class R {
  var a;
  // Both initializers not allowed if init and init= are ambiguous.
  proc init(a) { this.a = a; }
  proc init=(a) { this.init(a); }
}
But maybe this isn't a problem if the user is forced to remove one of the conflicting initializers (because the other functionally serves the same purpose).
Question: Is init= allowed to have default arguments? This may affect which proposal is the best to choose.
// Same number of arguments.
proc init(a, x, y) { ... }
proc init=(a, u = 1, v = 2.0) { ... }
// Different number of arguments.
proc init(a, x, y, z) { ... }
proc init=(a, u = 1, v = 2.0) { ... }
I don't have an answer at the moment, but have been thinking about default-args as well. I think the final proposal and spec should include examples involving default args for completeness/clarity.
I think the answer depends in part on how we handle generics for init=. Some of us have been discussing different ways to handle type aliases and fully/partially-instantiated types with initializers. One idea that has been thrown around is to concatenate components of the type-expression when invoking init or init=, but we're still wrestling with that idea and its potential consequences. E.g.,
var r : R(int) = 5; // R.init=(int, 5);
I just noticed that currently a generic 1-arg initializer needs to have a where clause to avoid being a copy-initializer. It'd be nice if we didn't need where clauses in this case, so that seems a nice property of init= (although I haven't read all of the above stuff yet; just noting I ran into this).
Should init= have to set type fields? My intuition is no.
record GR {
  type t;
  var x: t;
  // Does init= have to set the t field?
  proc init=(other: GR) {
    this.t = other.t;
    this.x = other.x;
  }
  // Or, is the type/instantiation of GR already known when init= runs?
  proc init=(other: GR) {
    this.x = other.x;
  }
}
var a = new GR(int, 1);
var x = a; // x.init=(a) -- compiler needs to know x's type to call init=
var y:GR(int) = a; // y.init=(a)
Should init= have to set type fields? My intuition is no.
I felt similarly for a moment, but I do think they should set type and param fields. If they cannot, then init= would not be allowed to invoke other regular initializers, and I think we want to support that pattern.
Furthermore, I think it would be interesting to allow users to write their own init= that can infer type information from the RHS. For example:
record Vector {
  type eltType;
  var dom : domain(1);
  var data : [dom] eltType;
  proc init=(other: []) {
    // Infer the element type from the array on the right-hand side.
    this.eltType = other.eltType;
    this.dom = other.domain;
    this.data = other;
  }
}
var v : Vector = [1, 2, 3]; // v.init=([1,2,3]);
writeln(v.type:string); // Vector(int);
I will include other examples in the proposal I'm working on.
Catching up on this thread... I tend to agree that init= should have to explicitly set the generic fields. If we were using it only for the copy initializer I could imagine relaxing that, but since we're thinking of this as a general way to do "initialization via =" I think we need to be able to handle more general patterns.
(I also think that it'd be easier to keep it explicit and see where that gets us and then consider relaxing things to potentially make it more implicit later than vice-versa).
Oh, I meant to mention that I'm receptive to Bryant's proposal 1 and proposal 2 but, like Ben, want to make sure we've got the generic story firmed up better before making any hard and fast calls there.
Thinking about the notion of user-defined coercions on the way home yesterday (#5054), I wondered whether the init= proposal could obviate the need for a distinct user-defined coercion concept.
Specifically, I think of such coercions as being most useful for initializations and argument passing for copy-in contexts:
record MyInt { ... }
proc +(x: MyInt, y: MyInt): MyInt { ... }
var i: MyInt = 42; // would like to have "42" become a `MyInt`
writeln(i + 2); // would like to have "2" become a `MyInt` in order to call +(MyInt, MyInt)
With init=, we'd no longer need user-defined coercions for the first case. What I was realizing is that maybe it would suffice for the second case as well, particularly given recent discussions like this one relating the in intent to variable initialization.
@mppf, @benharsh, thoughts on this?
@mppf, @benharsh, thoughts on this?
It seems like an interesting idea but I'm not really caught up enough on init= to have more of an opinion yet. It reminds me of C++'s implicit coercions with 1-argument constructors (noting of course that we have a different name, so avoid the worst of the C++ choice).
Just a quick note that PR #10953 modified a large number of tests whose initializers could be cleaned up if they didn't need to assert "I'm not a copy initializer" via their where clauses.
Specifically, I think of such coercions as being most useful for initializations and argument passing for copy-in contexts:
record MyInt { ... }
proc +(x: MyInt, y: MyInt): MyInt { ... }
var i: MyInt = 42; // would like to have "42" become a `MyInt`
writeln(i + 2); // would like to have "2" become a `MyInt` in order to call +(MyInt, MyInt)
With init=, we'd no longer need user-defined coercions for the first case. What I was realizing is that maybe it would suffice for the second case as well, particularly given recent discussions like this one relating the in intent to variable initialization.
In today's deep-dive, it was pointed out that there was a flaw in my reasoning for the i + 2 case above, which is the following: Since the default argument intent for records is const ref, the +() operator I wrote above on the MyInt type wouldn't cause the init= overload to fire by my "in intents are essentially equivalent to variable initialization" argument. If I were to define proc +(in x: MyInt, y: MyInt) { ... } then my argument would be more likely to fire, but then I'd end up invoking copy initializers for all arguments, which is probably not what I'd like.
In spite of that, I continue to think that treating in and initialization as symmetrically as possible is desirable and continue to think that init= is probably a better road to user-defined coercions than what we'd been thinking about before.
BTW @mppf this was the comment I was referring to today, and am happy to remember that you did in fact see it even though you didn't remember that today. :)
We have init= implemented and are discussing if init= should enable automatic conversions in #16576 / #16582 / #16554.
Closing.