Chapel: lifetime checking: specifying function return lifetime

Created on 3 Feb 2018  路  13Comments  路  Source: chapel-lang/chapel

I know of the following interesting patterns:

  • lifetime is based upon 'this'
  • lifetime is infinite
  • lifetime is based upon a minimum of the lifetimes of (certain?) actual arguments

What syntax can be used to specify these, in case the default behavior is not sufficient?

Here are a few examples where we'd like to be able to specify the lifetime of the return:

Function returning an element of a global hashtable

proc getGlobalHashtableElement( const ref key ) const ref {
  return globalHashTable[key];
}

Here, lifetime inference is likely to assume that the function could return the key argument. But, once key is used to find the hashtable element, the key is no longer used.

Method returning an array element

proc MyType.getElement( const ref key ) const ref {
  return this.array[key];
}

Function returning an element of an outer local hashtable

proc outerFunction() {

  var D:domain(string);
  var hashtable:[D] MyClass;

  proc innerFunction( key )  {
    return hashtable[key];
  }

  var element:MyClass = nil;
  {
    var key = "test";
     element = innerFunction( key );
  }
}

The annotation for innerFunction might need to indicate lifetime matches an outer variable rather than an argument.

Language Design

All 13 comments

Providing a way to say 'infinite' might be useful, but I don't think Rust-style lifetime annotations make sense for Chapel.

I think "infinite" is somewhat the same as saying "it's a raw pointer" so perhaps the way we'd say that is by using a "raw" pointer type (whatever that is).

We could call it Unsafe(T).

The case "copyObjectTree" returns an Owned thing. My understanding is that we need to concern ourselves with the lifetime only of borrowed things. (Correct?)

proc copyObjectTree(...) : owned MyTree {...}

To indicate the lifetime, I propose the syntax borrowed(variable or field reference(s)). Which indicates the shortest of those variable(s)/field(s)' lifetimes. Gives us an excuse to introduce the borrowed keyword. :) For example:

proc getGlobalHashtableElement(key) borrowed(globalHashTable) const ref {...}

proc MyType.getElement(key) borrowed(this) { ... }

It is meaningful to specify the lifetime at the variable granularity, rather than just "global" vs. "local scope". That is because even the global variables are destructed in a particular order. So, for example:

var Table1 ...;
var Table2 ...;

proc get1(key) borrowed(Table1) return Table1.get(key);
proc get2(key) borrowed(Table2) return Table2.get(key);

// illegal: adds a value that may be destructed while Table1 is still live
Table1.put(get2(some key));

// OK
Table2.put(get1(some key));

We can write the same code inside a function so all scopes are local.

BTW "infinite" is imprecise. For example, if a global variable "owns" a class pointer, it needs to delete it upon program shutdown. I assume the deletion is implicit, like for any owner.

By contrast, a global variable of an "unchecked" policy would not do that. The user needs to delete explicitly.

The case "copyObjectTree" returns an Owned thing. My understanding is that we need to concern ourselves with the lifetime only of borrowed things. (Correct?)

proc copyObjectTree(...) : owned MyTree {...}

I think this is a reasonable way to describe that pattern, yes.
If Owned is the way to say an owned pointer, it would become
chpl proc copyObjectTree(...) : Owned(MyTree) {...}

To indicate the lifetime, I propose the syntax borrowed(variable or field reference(s)). Which indicates the shortest of those variable(s)/field(s)' lifetimes. Gives us an excuse to introduce the borrowed keyword. :) For example:

I think the "excuse to introduce the keyword" is a joke that I don't get.
A reasonable alternative name would be scope or return scope (these are closer to D's terms).
Or it could be lifetime.

The other element you are describing here is to put the description of the return value lifetime near the return type. We could instead put the description of which arguments are available for lifetime inference of the return in each argument.

proc return-scope-borrowed-lifetime MyType.getElement(key) { ... }

Also, another way to write your first example might be this:

proc getGlobalHashtableElement(key) borrowed() const ref {...}

since for the lifetime inference purpose, the minimum lifetime of the empty set is "infinite" lifetime.

We could also instead of marking variables that are borrowed, mark variables that aren't:

proc getGlobalHashtableElement(cannot-return-borrow-of key) const ref {...}

(where obviously we'd come up with a better keyword name).

It is meaningful to specify the lifetime at the variable granularity, rather than just "global" vs. "local scope". That is because even the global variables are destructed in a particular order.

We could do that, but the current analysis does not. Instead it merely concerns itself with what is allocated in what block. I don't know if it would impact the analysis. I'm fairly happy at the moment with this more "coarse grain" approach, though.

BTW "infinite" is imprecise. For example, if a global variable "owns" a class pointer, it needs to delete it upon program shutdown. I assume the deletion is implicit, like for any owner.

By contrast, a global variable of an "unchecked" policy would not do that. The user needs to delete explicitly.

If a global variable "owns" a class pointer, it would only trivially participate in lifetime inference. Since it "owns" a class pointer, it will be assumed that a borrow from that global variable has at least as short a lifetime as the global variable. It's a fair point though that the global variables are destroyed in a particular order - it's similar to adjusting the checking to pay attention to the order of deletion within a block.

Good alternative options.

I am not excited about using borrow() to indicate "infinite" lifetime. Even though logically that's what it means, some users may get confused by seeing "borrow" where the returned value is "owned".

If we need to also annotate when a function does / does not "invalidate" references to a particular thing (such as an array's domain resize would invalidate references to the array - or Owned.clear() would invalidate borrows of that Owned), that might also require some sort of annotation. (This would support cases we don't currently check for as described in #8382).

proc getGlobalHashtableElement(key) borrowed(globalHashTable) const ref {...}

Two reasons this might be the right general idea:

  1. Near the return value since it's about return value lifetime
  2. The borrowed thing might not be an argument (it might be an outer or a global variable).

I think I'd prefer to use borrowing but if we end up with a borrow keyword, it might work in that environment too. Anyway, if such a keyword introduces a list of variables/arguments, the rule is that the function can only return a borrow of one of those arguments (and not some other variable / argument). That allows appropriate checking. On the other side, the compiler will assume the minimum lifetime of the variables/arguments mentioned in that list when inferring the return lifetime.

While we can quibble about the syntax, I agree that this is the ideal place to put the return lifetime. I'm still a fan of & over borrow, etc. but I'm not going to die on that hill.

There will be cases when the return value is not strictly a borrow of any variables. Array indexing is one example; returning a field of a record could be another one.

So maybe we want to present this annotation as "the returned thing has the lifetime of one of these:". In which case I would prefer the keyword to be "lifetime", not "borrow". (Even though I suggested "borrow" above.)

Or we can take the position that borrowing will be a common case. Users will not need to know about lifetimes except in weird corner cases. Even if borrowing is indirect, as in array indexing, we could still present it as borrowing-like.

In this case, a "borrow" variant makes a good sense.

Considering the three proposed variants:

  • borrow - concise, yet conveys the general idea
  • borrowing - "this is what this function is doing" - intuitively this appeals to me the most, except it is the longest
  • borrowed - "this is what the function has done", OK to use

There will be cases when the return value is not strictly a borrow of any variables. Array indexing is one example; returning a field of a record could be another one.

As far as the checker is concerned, these are "borrows" from the array/record containing the fields.

Since we've talked about this, several things have changed that might change the design direction.

  1. We how have a borrowed keyword. It might be reasonable to hang the lifetime specification syntax off of it.
  2. Instead of just specifying the lifetime of a function's return value, we probably need to be able to specify the relationships between the lifetimes of the arguments, in case one is assigned to the other. See #9853.

If we use borrowed for the lifetime annotation, that might make it harder to write a generic function that accepts/returns both borrows and also owned etc.

Was this page helpful?
0 / 5 - 0 ratings