Chapel: rules for when deinit is called

Created on 31 Oct 2018  路  11Comments  路  Source: chapel-lang/chapel

To the extent the language specification says anything about it, I think it says that variables local to a block are de-initialized at the end of the block. This issue proposes that we adjust the rule.

Here is what I want the rule to be:

  • Local variables, including temporaries are deinitialized at or before the end of the enclosing block

    • the exception is values that are returned; these are instead transferred to the caller

  • Global variables (possibly including temporaries - see #11526) are deinitialized at or before program shutdown
  • What does "or before" mean in the above? The compiler can move the deinitialization from the end of the enclosing block / end of the program to an earlier point if it is valid to do so. A valid point for a variable has the following properties:

    • the variable is no longer used directly (i.e. the variable itself is dead)

    • there are no references potentially pointing to that variable

    • there are no borrows potentially from that variable

  • Because deinitializers can have user-visible side effects (such as flushing a buffer, printing something to standard out, updating a file), the exact point at which these run (as well as the program behavior) can vary across Chapel versions.

Implementing such analysis would allow the resolution of #11492.

Examples to think about:

I/O

For the case in #11492, we have a statement like f.writer().write(A); and would like the channel that is the result of writer to be deinited sooner than it is today. This approach would allow that.

Array Slices

proc makeArray() {
  var A:[1..100] int;
  return A;
}
proc main() {
  ref slice = makeArray()[1..10];
  // point 1
  writeln(slice);
  writeln();
  // point 2
}

In this example, there is a reference to a call-expression temporary storing the slice, which in turn refers to a call-expression temporary storing the whole array. As a result, the compiler may not deinitialize either the array or the slice array-view of it at point 1. It must perform this deinitialization after these are no longer used, at point 2.

Owned and Borrows

proc main() {
  var instance: borrowed C = new owned C(1);
  // point 1
  writeln(instance);
  writeln();
  // point 2
}

This case is similar to the array slice case. Here instance is a borrow from an unnamed temporary storing an owned C. The owned C may not be deinitialized until after instance is dead; that is only at point 2.

User Variables vs Temporaries

Many of the important cases to call deinit earlier apply to temporaries, but I'm proposing that this behavior also apply to user variables. E.g.:

var someRecord = new MyRecord();
... code not mentioning someRecord ...

Here someRecord could be deinitd immediately after it is created.

RAII Lock

Chapel doesn't have a C++-style RAII lock type and it's unclear if we will add that in the future. However if we did add it, the lock type could use an attribute (see #7355) to communicate to the compiler that deinit for that type should not be moved. What would a C++-style RAII lock look like in Chapel? E.g.

{
   var lock = takeLock();
   // lock deinit function releases lock, so lock released at the end of this block
}

I suspect that there are clearer ways of writing this pattern. Python with contexts come to mind as a good alternative.

Language Design

All 11 comments

Maybe instead of the array example you could use this code:

class Data {
  var x : [1..10] int;
}

record MyArr {
  var data = new owned Data();

  proc this(r : range(?,?,?)) {
    return new MySlice(r, data.borrow());
  }
}

record MySlice {
  var dom : domain(1);
  var data : borrowed Data;

  proc init(r : range(?,?,?), data : borrowed Data) {
    this.dom = {r};
    this.data = data;
  }

  proc writeThis(f) {
    f.write(data.x[dom]);
  }
}

proc arrayMaker() {
  return new MyArr();
}

proc main() {
  const ref slice = arrayMaker()[1..5];
  writeln(slice);
}

Which I think demonstrates the same issue but with less internal/special behind-the-scenes code (i.e. array views).

If we're using the rule "at or before" can you actually rely on f.writer().write(A); doing what you want it to do? It seems like we should either say that temporaries are deinit'd at the end of the statement (with the ref-to-call-expr case being an exception), or say they're deinit'd at the end of the scope.

Edit: Otherwise I think it would become difficult for users to reason about the lifetime of their temporary/user variables by reading the code.

I had the same reaction / question as @benharsh just above, but he typed it faster than I managed to.

If we're using the rule "at or before" can you actually rely on f.writer().write(A); doing what you want it to do? It seems like we should either say that temporaries are deinit'd at the end of the statement (with the ref-to-call-expr case being an exception), or say they're deinit'd at the end of the scope.

Yes, I think it could be relied upon in that case, because the compiler would do a certain minimal level of analysis once at the time we add this feature, and that would be enough to cover that case. In more complex cases, I'd expect the behavior of programs to potentially change over time.

We could certainly consider specifying in more detail beyond what I put there. In particular, one reasonable approach would be to specify that the temporaries are either deleted at the end of the statement or at the end of the block. (I.e. without necessarily saying which one the compiler will choose in a given case).

I think it'd be interesting to be able to say something like:

  • in a ref/borrow case, the a temporary is destroyed at the end of the block
  • otherwise it's destroyed at the end of the statement

but the question is, can we describe in enough detail to a user what "a ref/borrow case" actually means? I certainly am confident that we can get the compiler to detect such cases conservatively, but what about the boundary of analysis, where the compiler thinks it might be a ref/borrow case but it's actually not? It's easy to create aliasing situations, for example.

So, is there a simple conservative rule you'd use to differentiate between the two, that we can specify and explain to users? If not, I think we should allow some variability depending on the strength of the analysis in the implementation.

(One common trick from alias analysis is the "address not taken" rule. Maybe there's something similar we can do here. But, in the I/O example, the channel temporary is passed by const ref to write, so it's not going to be that simple).

Maybe instead of the array example you could use this code:

I'm happy that you posted that, but I wanted to emphasize problems users can run in to with simple use of the built-in types and functions.

I was thinking about a rule like "free temporaries at end of statement unless the statement is variable declaration". It seems like that would be enough for records*, but maybe not for owned/borrowed classes? But maybe those need to be a special case anyways? For example:

class C {
  var x : int;

  proc deinit() {
    writeln("destroy: ", x);
  }
}

proc makeOwnedC() {
  return new owned C(5);
}

proc store(ref lhs : borrowed C, rhs : borrowed C) {
  lhs = rhs;
}

proc main() {
  var x : borrowed C;
  writeln("---- before ----");
  store(x, makeOwnedC().borrow());
  writeln("---- after ----");
  writeln(x);
}

This program prints:

---- before ----
---- after ----
{x = 5}
destroy: 5

* "seems like that would be enough for records": if we can ignore passing a record to an extern function by ref...

@benharsh - your owned/borrow example is an interesting one. I'd personally lean against special-casing owned/borrowed because records can contain owned/borrowed and then we have to special-case the records and then it doesn't feel so special as much as what happens half the time :)

Anyway, for this code:

proc main() {
  var x : borrowed C;
  writeln("---- before ----");
  store(x, makeOwnedC().borrow());
  writeln("---- after ----");
  writeln(x);
}

we can observe that there was a borrow from the owned temporary. Maybe that would trigger destroying that temporary at the end of the block. Maybe the idea you have is sufficient for refs.

As far as I can tell at the moment - Brad, BenH, Vass and I are on the same page for the general direction here.

However Brad and BenH would like more specific rules that can be in the specification (and read, understood, and relied upon) than what I put in the issue description.

What remains to do is to come up with the exact specific rules.

I suggest that "valid points" must be at statement boundaries. This should implicitly include the end of the block.

So, we have several options identified:

  • deinit all variables at end of block (what we did before PR #14691)
  • deinit at last mention or end of block according to expiring value analysis (on master after PR # 14691)
  • deinit all user variables at end of block; call temps in variable declarations at end of block; call temps in other statements after the statement (https://github.com/chapel-lang/chapel/issues/11534#issuecomment-435117383)

I'd like to add another option:

  • deinit user variables at last mention. For call temporaries in a variable declaration, deinitialize just after the variable declared would be deinitialized (so e.g. for a slice captured by ref, the slice is deinitialized after the last mention of the ref). For call temporaries in other calls, deinit them after the statement.

PR #15041 implements a simpler rule for deinit points that enables this program to behave as originally expected.

A variable is dead:

  • after copy elision if it occurred (after the last mention is
    used to copy-initialize a variable or in intent argument)
  • for the results of nested call expressions not involved in initializing
    a user variable or reference after the end of the statement
  • otherwise, at the end of the scope in which it is declared
Was this page helpful?
0 / 5 - 0 ratings