As a Chapel Programmer, I want to be able to use noinit to squash default initialization of certain types, particularly large arrays, and to be able to define my own initializers that respond to noinit such that fields within them can be left uninitialized as well because sometimes the speed of avoiding initialization trumps the safety benefits that initialization brings.
Acceptance criteria:
noinit is supported for arrays and users can write their own initializers that respond to noinit
For reference, CHIP 10 and CHIP 12 have a bit of information on the vision for this feature, though I think we've also worked up some more detailed examples at times that don't seem to be reflected there.
A handful of existing noinit tests fail under --force-initializers:
I think I'd be OK with futurizing noinit tests while switching to initializers given that (IIRC), we never had it working for arrays to begin with which is the important case, and will likely need/want to reimplement it from scratch in an initializers world.
While designing this, we should keep in mind that if we're using a comm layer that can allocate memory and need to register that memory for best performance (currently just ugni), we want the registration to follow the first-touch so that we get the desired NUMA locality. As of this writing the first touch is always our default initialization, so we can do the registration call right after that. If in the future we don't do the default initialization then we'll want to delay the registration call until after whatever does the first touch. For use cases where the declaration includes a user-written initial value that obviates the need for default init it seems like it ought to be straightforward for the compiler to insert (however indirectly) the registration call. But for cases where the first touch is done elsewhere we'll need to think about what to do. Note that there's no functional effect from not registering, but when program performance is driven by the cost of remote references the difference between registering and not can be quite dramatic.
I think we have recently been thinking a record author would write proc noinit() to implement support for noinit, rather than having a param argument on the default initializer.
An interesting alternative (that might make sense to the compiler but maybe not to humans) would be to express the noinit as copy-initialization from a special type, e.g. proc SomeType.init=(rhs: noinit) ).
Besides the exact naming for the initializer run on var x: MyRecord = noinit;, I have the following questions:
Supposing a record is initialized with var x: MyRecord = noinit; and that runs MyRecord.noinit(). Should a MyRecord.postinit() function be run immediately after MyRecord.noinit()? Or should records initialized in this manner always skip postinit ?
What happens with var x: MyRecord = noinit; when proc MyRecord.noinit() does not exist? I can see these options:
xx completely uninitializedWhat does var y: int = noinit do? In some ways this is similar to the above case (because int isn't a record, in a strict sense it doesn't (yet) have initializers, so it can't have a proc noinit()).
yy completely uninitializedWhat happens when a no-inited variable reaches end of scope? Here I understand the options to be:
proc deinit that operates accordingly.What happens when a no-inited variable reaches end of scope? ... It could be that the compiler doesn't add a deinit call for no-inited variables at all
The problem that I see with this is that the compiler could easily lose track of a variable being noinited when it is passed by in intent. Consider for example
{
var x: MyRecord = noinit;
acceptInIntent(x); // copy-elision applies
}
proc acceptInIntent(in arg: MyRecord) {
// compiler runs arg.deinit() here at the end of this function,
// but that might not make sense... e.g. MyRecord might have some
// key fields not set.
}
This kind of pattern currently comes up in compiler-generated initializers for classes and records.
What happens with
var x: MyRecord = noinit;whenproc MyRecord.noinit()does not exist?
What doesvar y: int = noinitdo?
I think that case is generally better served by split-init. However, issue #15808 describes a case where one might wish to do something like split-init in a case where it is not currently implemented. If we do not change the split-init rule for on blocks, the situation will be that the programmer will need to opt in in some way to moveing the variable across locales. While noinit could be a way to express that, it seems to me that doing so would have noinit solve two problems rather than one. I would prefer instead that noinit be focused on the case of data structures where the data structure author writes code to support not initializing some elements. (i.e. cases similar to the array case).
This causes me to lean away from the option of leaving these values uninitialized.
I understand that we have been reserving noinit for this, but since (in my understanding anyway) noinit is really about allowing a type to do less initialization (e.g. allocate array elements but do not initialize them) - why not just expect record authors to provide initializers that do that? See also https://github.com/chapel-lang/chapel/issues/15703#issuecomment-657249927
In particular, suppose a record has a buffer that it manages and the record author would like to provide a feature where the buffer is allocated but the elements are not initialized. We have been thinking that the record author would write this:
record Container {
type eltType;
var buffer: [1..n] eltType;
proc noinit(type eltType) {
this.eltType = eltType;
this.buffer = noinit;
}
}
and then the user could write
var c : Container(int) = noinit;
But, is that really better than having the record author write this:
record Container {
type eltType;
var buffer;
proc init(type eltType, param initElts=true) {
this.eltType = eltType;
this.buffer = {1..n}.buildArray(int, initElts=initElts);
}
}
and then the user could write
var c = new Container(int, initElts=false);
?
(Either way, as I understand it, the user of the container will have to do other relatively involved things: explicitly move any non-pod elements into the storage & indicate to the data structure when all elements are initialized).
I don't really see why this feature needs language support. I think it could be implemented at the library level.
But, is that really better than having the record author write this
This may be unreasonable on my part, but I have a different reaction to exposing aspects of ordinary array initialization to the user/record author versus exposing noinit-specific created methods.
This may be unreasonable on my part, but I have a different reaction to exposing aspects of ordinary array initialization to the user/record author versus exposing noinit-specific created methods.
I'm not trying to propose that D.buildArray specifically be the interface. Merely trying to say that I don't currently see why we need a keyword to implement the noinit concept. Either way, for arrays, I think we will need some special methods that will be used along with noinit. (e.g elementInitializationComplete() in https://github.com/chapel-lang/chapel/issues/15703#issuecomment-657247606 ). Could the process of constructing a noinit array also just be done by calling a special method?
An interesting alternative (that might make sense to the compiler but maybe not to humans) would be to express the noinit as copy-initialization from a special type, e.g. proc SomeType.init=(rhs: noinit) ).
I think we (@benharsh and I) discussed this when we were introducing init=, but didn't get excited about it for whatever reason (I think some forms we kicked around included init=() (no arguments), init=(noinit), etc. I think our general sense was that either form was essentially introducing a special-case and that proc noinit() had the advantage of seeming more straightforward / self-explanatory (to someone accustomed to using it on an array of reals, say).
Supposing a record is initialized with var x: MyRecord = noinit; and that runs MyRecord.noinit(). Should a MyRecord.postinit() function be run immediately after MyRecord.noinit()? Or should records initialized in this manner always skip postinit ?
That's an interesting question. I'd expect them to skip .postinit() by default ("since I'm not initializing it, I'm not running the thing that should happen once its been initialized"). I also think postinit() is less interesting for records than for class hierarchies鈥攊n that if there's something I wanted to do in postinit(), I could do it directly within init() / noinit() for a record, but not always for a [parent] class. So I don't think there's loss of capability by adopting this behavior.
What happens with var x: MyRecord = noinit; when proc MyRecord.noinit() does not exist? I can see these options:
I would expect:
Leave x completely uninitialized
What does var y: int = noinit do? In some ways this is similar to the above case (because int isn't a record, in a strict sense it doesn't (yet) have initializers, so it can't have a proc noinit()).
I would expect:
Leave y completely uninitialized
What happens when a no-inited variable reaches end of scope? Here I understand the options to be:
I would expect:
It could be that the compiler doesn't add a deinit call for no-inited variables at all
What happens when a no-inited variable reaches end of scope? Here I understand the options to be:
I would expect:
It could be that the compiler doesn't add a deinit call for no-inited variables at all
Did you see https://github.com/chapel-lang/chapel/issues/8792#issuecomment-657242695 ? I think the problem shown in that comment makes this strategy unworkable. The reason is that a noinited variable cannot be passed by in intent until it is fully initialized - due to the way deinitialization works - even if the in intent will "move" the variable. For this reason, the prototype in PR #16064 uses the other option - of tracking at runtime with a field whether the elements need deinitialization. This seems to be needed at runtime in any case for arrays to perform resizing per your other comment .
I would expect:
Leave x completely uninitialized
Do you know of any use cases where this is useful? At the moment I feel I understand the purpose of no-init for arrays but am less confident that no-initializing a stack variable or field is useful (as compared to split initialization).
It could help with an explicit way to write #15808 - but similarly to https://github.com/chapel-lang/chapel/issues/15703#issuecomment-658741024 we have the problem that the way of eventually setting it differs depending on whether or not x had proc noinit() run or if it was completely uninitialized. (Noinit -> initialize with =; but uninitialized -> initialize with moveInitialize).
If we consider the case where somebody might want to opt-in to something like a split init pattern across an on statement - if noinit does that, they would write e.g.:
{
var A:[D] eltType = noinit;
on AnotherLocale {
forall i in D {
moveInitialize(A[i], something());
}
}
}
But what if they wanted this code to be generic on the type? Meaning they don't know it's an array? Then would they write:
{
var A: SomeType = noinit;
on AnotherLocale {
moveInitialize(A, something());
}
}
and expect moveInitialize to move initialize the elements of the array rather than completely replacing the LHS? Does that imply that we will need not only proc noinit() but also something like proc moveinit() ?
Or, should noinit be a low-level feature that just behaves differently for each type?
After a discussion with @bradcray the plan for now is to allow noinit as a feature only for arrays of POD types. Discussion for things like moveInitialize will be put off into a separate issue. These features might still be worthwhile but are not specifically connected to this effort any longer since = will do for initializing a POD type.
PR #16064 helps with array no-init. I am currently thinking that array no-init should leave the array memory totally uninitialized and if we want to have records that also support some kind of no-initialization of storage that they should do so with a param initElts or similar argument to their initializer. We are adding the = noinit syntax for arrays because it is particularly useful for array declarations. I don't think that implies that noinit needs to be (or can be) a general feature that all records are expected to support in a uniform way.