Chapel: Should "post-initialize" methods support arguments?

Created on 24 Jan 2018 · 9 Comments · Source: chapel-lang/chapel

In the core proposal for new initializers (issue #8283), the assumption is that
"post-initialization" methods don't / can't take arguments, such that only one
post-initialize method can be created per class / record type. This issue asks
whether we should support a means of creating post-initialize overloads that
take arguments to increase power / flexibility (and possibly confusion).

The current leading proposal is to have a call myC = new C(...args...);
translate into pseudo-code as follows:

myC = allocateMemoryForObject(C);
myC.init(...args...);
if canResolve(myC, "postInit", ...args...) then
  myC.postInit(...args...);   // pass the arguments from `new` to the post-initializer if valid
else
  myC.postInit();             // otherwise, call the 0-argument version

Other approaches to supporting arguments on post-initialization routines could also be
considered. (One that has mostly been panned was to have the formals list
"match" that of the corresponding init() routine, but this felt onerous in that multiple
init()s couldn't share a single postInit(), and it also felt fragile w.r.t. minor adjustments
in formal argument lists, e.g., the renaming of an argument or a change to a default
argument value.)

This issue is designed to capture thoughts and discussion around this potential
extension.

Tradeoffs with this proposal include:

Advantages:

- gives the post-initializer access to the same arguments as the initializer itself, making its role in implementing new... expressions a bit more first-class.

Disadvantages:

- somewhat fragile; if a user got something slightly wrong in their post-initializer formals, they'd get an auto-call to the 0-argument version (assuming it existed), which could be hard to understand and debug
- may not be necessary...
  - the class author could always tuck init() arguments away in fields of the object for use by the post-initializer if needed
  - we don't have enough experience with post-initializers to have a sense of this feature's importance
  - if method calls were supported in a class as in issue #8289, postInit()s would likely have reduced need for additional arguments because more could be done in the init() routine
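The first workaround in the list above (tucking init() arguments away in fields) might look roughly like the sketch below. The class name Simulation and its fields are hypothetical, and the lowercase spelling postinit() is an assumption about how current Chapel releases name the 0-argument post-initializer; this issue spells it postInit().

```chapel
// Hypothetical class, for illustration only.
class Simulation {
  var initialTemp: real;  // kept only so that postinit() can read it
  var state: real;

  proc init(initialTemp: real = 20.0) {
    // Stash the argument in a field; 'state' is default-initialized.
    this.initialTemp = initialTemp;
  }

  // 0-argument post-initializer: it cannot see init()'s formals,
  // so it reads the stashed field instead.
  proc postinit() {
    state = 2.0 * initialTemp;
  }
}

var s = new Simulation(5.0);
writeln(s.state);
```

The cost of this approach is that initialTemp stays in every instance for its whole lifetime, even if only the post-initializer ever needs it.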

Language Design


bradcray

Most helpful comment

If we accept this proposal, I suggest defining that the values of the actual arguments are cached at the point of invoking init() and reused as the actuals for postInit(), instead of being recomputed.

vasslitvinov on 24 Jan 2018

👍2

All 9 comments

vasslitvinov on 24 Jan 2018

👍2

the class author could always tuck init() arguments away in fields of the object for use by the post-initializer if needed

I can think of a few cases where the type author can't ferry the arguments from init() to postInit() through object fields.

If postInit() requires a type argument that the object can't be generic on. For example, if the class needs to be concrete, but its postInit() needs to use a type argument provided to init(). The class can't store the type without becoming generic. Objects created with different type arguments wouldn't be compatible, even though the author wants them to be.
Similarly, if the postInit() needs to make use of a param argument to init() as a param, but the class doesn't want its objects to be of different types due to it.
Less "can't" than "shouldn't", and more of a stretch: if the values to be passed to postInit() aren't fundamentally object properties, stashing them in the object inflates the memory usage of the object.

For an example (of bullet 3 above, though GitHub isn't formatting it as such), consider a class that maintains some mutating state, where postInit() needs to set it up according to some initial conditions provided to init(). The object may have no need to remember its initial conditions throughout its lifetime, but requiring object fields to pass them to postInit() would seem to require that (without gymnastics on the part of the author).

Or, for another example, the arguments might control the behavior of postInit(), like a verboseInitialization argument or a filename for postInit() to open and consume, while the object has no need of the value once initialization is complete. It then seems wasteful to require the object to store it forever for no reason other than to make it available to postInit().

(Particularly if the arguments to be passed through are comparable in size to the object itself, and the app needs lots of the objects.)
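To make the first case above concrete, here is a minimal sketch of what happens when a type argument to init() is stored in a field. The class name Wrapper and the field name eltType are hypothetical; the point is only that keeping the type around makes the class generic.

```chapel
// Hypothetical class: storing the type argument in a field
// is what makes Wrapper a generic class.
class Wrapper {
  type eltType;

  proc init(type eltType) {
    this.eltType = eltType;
  }
}

var a = new Wrapper(int);
var b = new Wrapper(real);

// Compare the two instantiations' types; they are distinct,
// which is the incompatibility the class author wanted to avoid.
writeln(a.type == b.type);
```

If a post-initializer could receive the type argument directly, the class would not need the eltType field and could stay concrete.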

cassella on 25 Jan 2018

👍1

If we accept this proposal, I suggest defining that the values of the actual arguments are cached at the point of invoking init() and reused as the actuals for postInit(), instead of being recomputed.

I'm well down the widely-panned "matching formals" path, but I think that:

- postInit() shouldn't allow its arguments to be specified with default values. I think it would be confusing to follow if proc init(var x = 42) { ...} and proc postInit(var x = 54) { ... }. Plus, it would mean updating two places if you wanted to change the default (and wanted it to be the same default in both places). I think postInit() should always get what init() got, even if init() got it from itself.
- I don't know how to write it instead of the ...args... formulation above, but I think you'd want postInit() to be invoked not necessarily with the actual arguments to init(), but with the formal arguments received by the particular init() that was ultimately resolved and called.

class C {
  proc init(x: int = 42) { ... }
  proc postInit() { writeln("nada"); }
  proc postInit(x: int) { writeln("tada: ", x); }
}

var c = new C();

I think you should end up getting tada: 42, not nada, even though the new statement provided no arguments to the initializer.

But I also think you're explicitly saying you don't want this "mostly panned" approach.

In my mind, ideally init()'s author would be able to communicate to the compiler which postInit() to invoke, and what args to pass it. I can't think of any good syntax for it. E.g., terrible syntax would be something like

proc init(x, y, z) {
  ...
  **postInit(x);
}

(On the plus side, the post-init routine could be named anything the author wants.)

cassella on 25 Jan 2018

Or maybe postInit() is the syntax, but not the function that gets called after init(). That is, instead of the syntax above, a class author could put their post-init work in a method callLater(), and init() could say

postInit(callLater(x));

or

postInit(callLater, x);

(The latter form looks less like callLater(x) is being executed in init(). And between postInit(callLater(foo())) vs postInit(callLater, foo()), the latter seems easier to intuit an unambiguous evaluation time for foo().)

The compiler/runtime would then string together all the calls registered this way and run them after init() completes.

Hmm. An init() routine could even register more than one post-init routine to be called. Or have different branches register different post-inits. Or even repeat in a loop.

cassella on 25 Jan 2018

A few other thoughts on multiple postInit()s.

How do multiple postInit()s interact with this.init() calls? E.g., in the case of

class C {
  proc init(x: int) { ... }
  proc postInit(x: int) { ... }
  proc init(x: real) { this.init(x:int); }
  proc postInit(x: real) { ... }
}

var c = new C(1.1);

Should both postInit()s be called, since both init()s are? If not, which one should? This is only an issue if the super.postInit() calls are made implicitly. If C.postInit() is responsible for calling super.postInit() explicitly, then it could call this.postInit() instead, if appropriate.

Throwiness

Here's another case I don't see immediately how to handle with just one postInit():

class C {
  proc init(x: int) { ... }
  proc init(x: real) throws { ... }
  proc postInit() /* throws? */ { ... }
}

If I vaguely understand the error handling model enough, a throwing function where one isn't expected can cause problems in certain strictness levels. So postInit() shouldn't be marked throws when the non-throwing init() is called. However, the throwing init() may need to be able to throw from its postInit(), so the postInit() needs to be marked throws. But if there's only one postInit(), it has to be one or the other.

cassella on 29 Jan 2018

I'm having trouble visualizing a situation where one would want to continue onto the postInit() if the init() function throws an error. This is an excellent point though - we'd probably want the structure of the AST w.r.t. error handling and initializers to be something like this in the case of classes:

try {
  init() call
  postInit() call // only occurs if the init() call returned without throwing an error
}

But it is probably worth opening a separate issue for the new initializers story and error handling.

lydia-duncan on 29 Jan 2018

Sorry, I didn't explain that well. This is based on Buffers again. One buffer.init(out error:syserr) doesn't throw. But buffer.init() /*throws*/ has the intention of throwing once throwing from initializers is supported. But the place it will throw will be in postInit() (in the absence of initDone() (or further initializer design changes)). So at that point, if there's one postInit(), it must be marked throws. But then calling the non-throwing init(error) will call that same postInit() throws.

Even if there's state in the object such that postInit() doesn't go down the path that can throw, won't it seem to the compiler that calling that non-throwing init() could still throw? E.g. even the non-throwing init() would need to be in a try, in certain levels of strictness?

cassella on 29 Jan 2018

There are a couple of things going on here:
1) I think you're viewing postInit() as called from init(), meaning that the init() functions must be marked as throws. I don't think that's something to worry about, since the postInit() call occurs strictly after init() calls.
2) You are correct that having a postInit() marked throws will impact the creation of all instances for that type if we only allow a single postInit() function. It would probably be reasonable to want a path where the new would not potentially throw an error, but it might also be possible to refactor the contents of init() and postInit() so that the thrown error only occurs in one init() function. In the case you are describing, it is likely that there would instead be two init() functions: one that throws an error and one that takes an out error. I think we would need experience with error handling and initializers to know how likely these situations are, though.

lydia-duncan on 29 Jan 2018

I believe that the overwhelming sentiment on this issue within the team is to not have postinit() support arguments, at least until such time as we have a real-world case that necessitates it. In that light, I'm going to close this issue and we can reopen/reconsider it if/when we have such a case in hand.

bradcray on 21 Mar 2018
