Chapel: nilable types

Created on 15 Mar 2019 · 34 Comments · Source: chapel-lang/chapel

This issue proposes that Chapel class types include a type modifier indicating whether or not they can store nil. This would be in addition to the existing owned, shared, borrowed, or unmanaged modifier.

Why have compile-time checking against nil?

Chapel currently creates a halt on a nil pointer dereference if compiling with --checks. But, these checks are removed with --no-checks or --fast. That means that programmers who want to practice defensive coding (for example, when writing a function that should behave reasonably with any arguments) need to add nil checks to their code if they want the code to behave correctly in any compilation mode. This can be particularly important in large projects.

In C-like languages, dereferencing a nil pointer is a common error that can be difficult to debug. Sure, the debugger can easily tell you where the nil dereference occurred - but the challenge usually amounts to figuring out why the value came to be nil.

Meanwhile, there is a trend in recent languages to encode in the type system whether or not a value can be nil. Another way to view this is that it replaces nil with an option type.

  • In Swift, a nil-able pointer is an "optional reference" - meaning you have to opt in to the possibility it stores something like nil.
  • In Rust, option types are used to a similar effect.
  • Scala and Kotlin also encode nullability in the type system.
  • C# 8.0 will add nullable reference types.

Chapel includes several types of references:

  • class instance pointers, whether borrowed, owned, shared, or unmanaged
  • ref and const ref
  • c_ptr can store nil but we expect that to continue since this type works like a pointer in C and those can be NULL

Of these, this proposal is only concerned with changing the behavior of class instance pointers. These can currently store nil and nil is the default value for them. ref and const ref already cannot be nil. This proposal does not strive to extend nil/not-nil types to c_ptr since that type exists for interoperation with C.

Why use a type-based strategy?

One could imagine specifying nilable and not-nilable on a per-argument basis with something like argument intents. However this strategy falls apart in more complex cases:

  • creating an array of classes
  • creating a generic data structure where the caller indicates whether something is nilable
  • creating a generic function that returns the same type as its argument

In contrast, a strategy of attaching nil-ability to class types is demonstrated in other languages and has clear answers for the above challenging cases.

Proposal

This issue proposes that class types are assumed to be non-nil by default. A type modifier is available to indicate a class type is possibly nil. This proposal follows the Swift syntax for these (see https://developer.apple.com/documentation/swift/optional). We expect that this syntax will eventually be generalized to create optional record types (or optional int, etc.); however, the focus of this proposal is on the nilability of class instances.

var a: MyClass; // MyClass cannot be nil, so compiler error on this line
var b = new MyClass(); // b has type `MyClass` and compiler knows it cannot store nil
var c: MyClass?; // The ? indicates nilable MyClass, so this line compiles, c starts out nil
var d: owned MyClass; // MyClass cannot be nil, so compiler error on this line
var e: owned MyClass?; // The ? indicates nilable MyClass, so this line compiles, e starts out nil

proc f(fArg: MyClass) {
  // No need to do nil-checking inside of this function, since type system indicates arg cannot be nil
}

proc g(gArg: MyClass?) {
  f(gArg); // compilation error: argument is MyClass?, so might be nil, but is passed to fArg, which accepts only a not-nil MyClass
  f(gArg!); // asserts gArg is not nil at runtime, converts type to MyClass
}

Note that the runtime check for gArg! will be present even with --fast, unless the compiler can prove that the check is unnecessary. A failure here causes the program to halt (it does not raise an error).
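A minimal runnable sketch of the ? and ! behavior described above, assuming the semantics proposed in this issue. The class body, the field x, the -1 sentinel, and the explicit borrowed annotations are illustrative additions rather than part of the proposal text:

```chapel
class MyClass {
  var x: int;
}

// fArg is non-nilable, so no nil check is needed in the body.
proc f(fArg: borrowed MyClass): int {
  return fArg.x;
}

// gArg may be nil, so it is checked before being unwrapped with `!`.
proc g(gArg: borrowed MyClass?): int {
  if gArg == nil then
    return -1;          // hypothetical sentinel for the nil case
  return f(gArg!);      // `!` asserts not-nil and yields a borrowed MyClass
}

var obj = new owned MyClass(3);
writeln(g(obj.borrow()));   // non-nil borrow coerces to the nilable formal
writeln(g(nil));            // nil is accepted only because gArg is nilable
```

Calling f(nil) directly would be rejected at compile time under this design, which is the point of making non-nilable the default.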

Swift additionally has several convenient operators to help with such types:

  • if let notNil = possiblyNil { } creates a conditional with the new type
  • possiblyNil?.someMethod() returns nil if possiblyNil stores nil, and otherwise wraps the result of someMethod in an optional
  • possiblyNil ?? default supplies a default value for the case in which the left-hand side of the ?? in fact stores nil. The result is non-nil.

A reasonable alternative to the if let notNil = possiblyNil { } syntax is for the compiler to infer that within a guard like if possiblyNil { ... f(possiblyNil) ... } the value of possiblyNil cannot be nil.

Another interesting extension would be to allow one to specify nilability when inferring a type, as in var x:? = notNil;.

In any case, the implementation effort could begin with simply supporting ? on types and ! for unwrapping.

Examples

Demo

   proc fnExpectingNotNil(x) { .. }
   proc getValue(x: C) {
     return x.value; // no check needed here
   }
   getValue(nil); // compile-time error
   var a: C; // error, C can't be nil/has no default value!
   var x: C?; // ok, has nil default value
   getValue(x); // compile-time error because x is nilable
   getValue(x!); // x! adds a run-time check
                 // (unless the compiler can prove it can be skipped)
   getValue(someComplexFunctionReturningNil()); // compile-time error
   getValue(someComplexFunctionReturningNil()!); // adds run-time check

Arrays

   var A: [1..10] C?;

   // possible to create an array where the element access returns not-nil:
   var B: [1..2] C = [new C(), new C()];

   // how is array.this implemented?
   proc _array.this(index:int) : _array.eltType

Identity function:

     proc identity(arg) { return arg; }

     proc mycopy(arg) {
       var ret = arg; // local type inferred to indicate nilability
       return ret;
     }

Creating a Cycle

     class Node { var next: Node?; }
     var a = new shared Node();
     var b = new shared Node();
     a.next = b.borrow(); // coerce from notnil to nilable
     b.next = a.borrow();

Linked List

     class Node {
       var data: real;
       var next: owned Node?;
     }
     var head = new Node(0);      // : borrowed Node - i.e. head cannot be nil
     var current: Node? = head; // Forget :Node? -> compiler error below on `current =` line
     for i in 1..numLocales-1 do on Locales[i] {
       current.next = new owned Node(i);
       current = current.next;
     }
     current = head;
     while current != nil {
       fnExpectingNotNil(current!);
       current = current.next.borrow();
     }

Choosing the not-nil of two arguments:

``` chapel
    proc choose(maybeNil: C?, notNil: C): C {
      var ret:C = notNil; // needs initializer
      if maybeNil != nil {
        ret = maybeNil!;
      }
      return ret;
    }
```

Language Design

All 34 comments

I like this proposal and the proposed syntax.

I would like to discuss a couple of minor mods:

  • Failure with the ! operator - as in maybeNil! when maybeNil==nil - should throw, not halt. The reason our current nil checks halt does not apply here, iirc, because the proposal is to execute this check even with --fast. My motivation for having it throw is aesthetics. Although maybe halting offers better performance? Or we can skip this check with --fast?

  • I don't like the let clause, I would prefer the compiler to infer from the conditional in an if or while. If we do so, the ! is not needed in the Linked List above:

     while current != nil {
       fnExpectingNotNil(current!);  // OK without '!'
       current = current.next.borrow();
     }
  • I like the :? syntax to cast to the nilable type, as in var c = new C():?.

@vasslitvinov - thanks for your feedback.

> I like this proposal and the proposed syntax.
>
> I would like to discuss a couple of minor mods:
>
>   • Failure with the ! operator - as in maybeNil! when maybeNil==nil - should throw, not halt. The reason our current nil checks halt does not apply here, iirc, because the proposal is to execute this check even with --fast. My motivation for having it throw is aesthetics. Although maybe halting offers better performance? Or we can skip this check with --fast?

I proposed that it halt because that is what Swift does. Since we're drawing both error handling design and the ! operator from Swift, it's probably worth emulating.

I think it would be reasonable to have a helper function that throws if the Optional is nil and returns it otherwise.
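Such a helper might look like the following sketch. The names unwrapOrThrow and NilUnwrapError and the class C with its value field are hypothetical, and the ? and ! semantics are assumed to be those proposed in this issue:

```chapel
class C {
  var value: int;
}

// Hypothetical error type for a failed unwrap.
class NilUnwrapError : Error { }

// Returns the non-nilable borrow, or throws instead of halting on nil.
proc unwrapOrThrow(arg: borrowed C?): borrowed C throws {
  if arg == nil then
    throw new owned NilUnwrapError();
  return arg!;   // cannot halt here: arg was just checked against nil
}

var obj = new owned C(42);
try {
  writeln(unwrapOrThrow(obj.borrow()).value);
  writeln(unwrapOrThrow(nil).value);   // throws NilUnwrapError
} catch e: NilUnwrapError {
  writeln("caught NilUnwrapError");
} catch {
  writeln("unexpected error");
}
```

With a helper like this, callers that want to recover opt in to throws explicitly, while plain ! keeps the halting behavior.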

> My motivation for having it throw is aesthetics.

I would say that a nil-dereference error is not likely to be something one could respond to in calling code. Hence halting seems to make sense.

Even if we did decide that one should be able to respond to it, I'm not sure I'd want to require the calling function to be marked as throws. I worry that it would end up with most functions being throws which in some ways defeats the point of the error handling design we have. When we designed error handling, we had in mind that certain errors (such as out of memory or failure to start a task) would not require calling code to be throws or handle the errors (because that would impact almost all the code) but would rather be handle-able with those constructs even if the compiler did not insist on it being handled. But implementing such a thing will probably need to look more like C++ exceptions than Swift ones.

>   • I don't like the let clause, I would prefer the compiler to infer from the conditional in an if or while. If we do so, the ! is not needed in the Linked List above:
>
>      while current != nil {
>        fnExpectingNotNil(current!);  // OK without '!'
>        current = current.next.borrow();
>      }

I'm not really thrilled with the let clause either and it seems that we'd probably write it differently. I've also considered the inference strategy but the problem there is that the variable current will have two types, won't it? Or are you saying that - within that conditional - the variable current can be assumed to coerce from C? to C.

One issue with the inference strategy is that there might be programs that appear to have reasonable inference properties but in fact do not. For example:

ref current: C? = ...;
begin {
  someOtherTaskSettingCurrent(current);
}
if current != nil {
  waitForSomeOtherTask();
  fnExpectingNotNil(current);
}

I'm not really excited by the possibility that programmers have to think about what the body of the conditional does with the variable impacting the type of the variable.

I feel generally favorable about this proposal. Caveat: I have not yet spent much time thinking about the convenience operators.

If I'm understanding, the proposal seems to switch between C? and nilable C to indicate a C which can store nil. Is the proposal to support both of these? (or did you switch syntax at some point and not update everything? or maybe are still deciding between the two yourself?)

Thinking about these two syntaxes as possible options... On one hand nilable C is pretty self-descriptive. On the other hand, it's a lot more typing and gets into questions of whether one can type owned nilable C and/or nilable owned C (and/or owned nilable shared C... following up on issue #12639). Meanwhile C? is very nicely compact and conveniently dodges the "order of keywords" issue, but could raise potential confusion for users w.r.t. our formal query syntax (which I think is a more intuitive application of question marks). That said, maybe this is just a learning curve that's worth the trouble. Or maybe there's some other postfix symbol we could attach to indicate nilable, at the risk of confusing Swift converts to Chapel... (C@?).

At the risk of stating the obvious: To preserve backwards compatibility, we could have C mean "nilable" and the new syntax mean "non-nilable C." This would simplify the conversion of current code to the new strategy, but has the downside of being the opposite of what a defensive programmer would want (more work/annotations to get the safer mode).

In this line:

 var current: Node? = head; // Forget :Node? -> compiler error

at first I thought the comment was wrong: that since head is known to have type Node, there would be no compiler error. But reading on, I believe it's because current is later set to next, which has type Node?; is that right? If so, clarifying the reason for the error in the comment would probably help future readers not trip in the same way. Speaking of which, the .borrow() in the following isn't strictly necessary, is it?

   current = current.next.borrow();

> If I'm understanding, the proposal seems to switch between C? and nilable C to indicate a C which can store nil. Is the proposal to support both of these? (or did you switch syntax at some point and not update everything? or maybe are still deciding between the two yourself?)

I meant for this proposal to focus on C? but had switched from nilable C in fact. I'll see if I can find any more nilable in code samples and correct.

> Meanwhile C? is very nicely compact and conveniently dodges the "order of keywords" issue, but could raise potential confusion for users w.r.t. our formal query syntax

For a second I worried that the parser would get confused. But I think we are OK because they have different orderings (prefix ? vs postfix ?).

proc typeQueryFn(arg: ?t) { }

proc nilableFn(arg: C?) { }

> If so, clarifying the reason for the error in the comment would probably help future readers not trip in the same way.

I'll fix, thanks

> For a second I worried that the parser would get confused. But I think we are OK because they have different orderings (prefix ? vs postfix ?).

I agree it's unlikely to be a problem for the parser... I'm just thinking of the humans here.

An interesting question is, for a possibly nil variable, such as x:C?, what happens if you call a method on it, like x.mymethod() ?

Some ideas:
a) fail to compile (you can only call methods on not-nil)
b) add a runtime check as it does today (i.e. not under --fast)
c) add a runtime check as it does today for prototype modules, but fail to compile for non-prototype modules

We can try (c) and see how it goes. It will be interesting to see how much annoyance this will give us in our modules and perhaps in our tests. Somehow I do not like (a).

(b) vs. (c) probably depends on the direction we see for our prototype-vs-production-modules story going forward.

Swift uses automatic reference counting rather than explicit ownership annotations, so it doesn't have owned and shared.

If I have something like this:

var own:owned MyClass? = new owned MyClass();
... own! ...

I can think of the following options:

  1. own! is an error - you have to write own.borrow()! or something longer if you wanted to create a variable of type owned MyClass and do ownership transfer into it.
  2. own! is the same as own.borrow()!
  3. the own! results in transferring the ownership from the variable own (to a temporary)
  4. the compiler tries to pattern match based on how the own! is used, to avoid the ownership transfer if the result is immediately converted into a borrow?

Edit: This seems related to the question of whether or not we would allow cast versions of ! (e.g. own:owned MyClass or own:borrowed MyClass) which can differentiate more clearly and could be the "something longer" in option 1.

In Swift, you can cast C? to C with x as! C, but it gives a warning "Forced cast from 'C?' to 'C' only unwraps optionals; did you mean to use '!'?"

Re this:

var own:owned MyClass? = new owned MyClass();
... own! ...

Based on discussions with @bradcray - my current expectation is that:

  • own! will return a borrow (and halt on nil in the process)
  • cast will be available to do more complex things (like owned MyClass? -> owned MyClass) and will throw (because there is already precedent for throwing casts with string -> int conversions)
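A small sketch of the first expectation, assuming own! yields a non-nilable borrow and leaves ownership with the original variable. The field x and its default value are illustrative additions:

```chapel
class MyClass {
  var x: int = 7;
}

var own: owned MyClass? = new owned MyClass();

// Assumed behavior: `own!` halts if own is nil and otherwise produces a
// borrowed MyClass; ownership is not transferred out of `own`.
var b = own!;
writeln(b.x);

// If no ownership transfer took place, `own` still refers to the object.
writeln(own != nil);
```

Under this reading, code that really wants an owned MyClass has to say so with an explicit cast, which keeps the short ! form free of surprising ownership transfers.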

We've known that class downcast will need to change with this proposal.

Historically, if I have class Parent { } and class Child : Parent { }, I can cast using the runtime type:

var c: Parent = (new owned Child).borrow();
writeln(   c: Child   ); // this cast expression results in nil if c isn't a (runtime) subtype of Child

However this seems to be nonsensical when Child refers to a non-nilable class type.

We could support c: Child? to give nil if the runtime type was not a subtype of Child.

Should c: Child result in an error possibly being thrown or should it be a compilation error?

> Should c: Child result in an error possibly being thrown or should it be a compilation error?

With the caveat that I don't have much experience in this world yet, my instinct would be to have it throw an error in the event that c's dynamic type wasn't a subtype of Child, similar to how we're making strings that don't contain an int value throw when cast to int, for example. For someone doing this mistakenly, that might be as useful as the compilation error, since it would flag that they had to catch something (assuming they weren't already in a try block); for someone who wanted to do it, it seems strictly more powerful than the compilation error.

In that event, should we continue to have c: Child? result in nil, or should that throw an error too?

My thought was to leave that resulting in nil, to provide a path forward for people coming from C++ / using the current capability.
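The two cast behaviors discussed here can be sketched as follows, assuming the options favored in this exchange: a cast to Child? yields nil on a runtime type mismatch, and a cast to the non-nilable Child throws. The variable names are illustrative:

```chapel
class Parent { }
class Child : Parent { }

var ownParent = new owned Parent();
var ownChild  = new owned Child();

var p1: borrowed Parent = ownParent.borrow();  // runtime type is Parent
var p2: borrowed Parent = ownChild.borrow();   // runtime type is Child

// Casting to the nilable type: nil when the runtime type is not a Child.
var c1 = p1: borrowed Child?;
var c2 = p2: borrowed Child?;
writeln(c1 == nil);
writeln(c2 == nil);

// Casting to the non-nilable type: assumed here to throw on a mismatch.
try {
  var c3 = p1: borrowed Child;
  writeln("cast succeeded");
} catch e {
  writeln("cast threw");
}
```

The nilable form preserves the C++-style "test by downcast" idiom, while the non-nilable form forces the caller to acknowledge the failure case through error handling.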

Given code like

var x: borrowed MyClass = ...; // x is not-nil
... x! ...

Should this unnecessary x! be an error, a warning, or simply do nothing?

I'd lean toward nothing by default or a warning if someone opted into it since it seems like it could come up in generic contexts?

Would you lean towards it also doing nothing for a record?

Hmm, as a starting point, I'd probably be more inclined to make that an error to avoid misunderstandings. I suppose one could say that applying ! to a non-nilable class could also reflect a misunderstanding, but at least that case is the appropriate general type family (classes).

Should we allow code like this:

var x: MyClass; // not nilable, can't have default `nil`
x = (new owned MyClass()).borrow();

?

In particular, the question is whether x needs to be initialized where declared, or if it's acceptable to set it in a following statement provided that there are no intervening statements (for example).

I would personally require the initialization to avoid the slippery slope of "what if I then make this minor change...?" for as long as possible. By analogy, we don't permit this split initialization for consts, params, or refs.

> An interesting question is, for a possibly nil variable, such as x:C?, what happens if you call a method on it, like x.mymethod() ?
>
> Some ideas:
> a) fail to compile (you can only call methods on not-nil)
> b) add a runtime check as it does today (i.e. not under --fast)
> c) add a runtime check as it does today for prototype modules, but fail to compile for non-prototype modules

Related to this, right now we have forwarding from "owned" to other methods, so you can call myOwnedVariable.someClassMethod() (for example). We also have coercion to borrowed.

In the event that we have an owned? variable, just as with a borrowed? variable, these coercions / forwardings would no longer apply. Right?

I don't think I'm understanding the interaction you're seeing here (between owned? and coercion-to-borrowed, say... I would've expected owned? to coerce to borrowed?).

That's right, but the point is just that you won't be able to call a method on borrowed? or on owned? in a module without using !.

I.e. it's consistent and reasonable if we choose https://github.com/chapel-lang/chapel/issues/12614#issuecomment-485481031 to prohibit calling methods on possibly-nil types (possibly only in non-prototype modules).

But it seemed like an implication perhaps we hadn't considered, which is why I brought it up. In particular, I think it's more likely in practice that an owned variable will need to be nilable than a borrow will be.

Am I remembering correctly that Chapel has traditionally permitted methods to be called on nil values?

> Am I remembering correctly that Chapel has traditionally permitted methods to be called on nil values?

It depends on what you mean. I don't think you could ever literally do nil.foo(). However, you can write

var mynil:C;
mynil.foo();

but that results in a compile-time error today (from the nil-checker) and historically was reported as a run-time error if --checks were enabled. More complex examples are currently still a run-time error (vs compile-time).

As it stands now, we have someValue! and someType?. One might imagine that ! applied to types would get the non-nilable type and ? applied to values would get the nilable variant of that value. Does that seem like an appealing generalization, or is it in the category of "Don't allow it for now"?

> However, you can write
>
> var mynil:C;
> mynil.foo();

This is the pattern I was thinking of. I was thinking we supported it, but must've been using --fast or thinking of some far-off time in the past (I checked back to 1.14.0 and it's been an error for at least that long). In a way it's too bad, as I was going to suggest that calling methods could therefore always be legal on nilable classes and that any halts/errors could be put into the compiler-generated setter/getter routines on the fields (i.e., calling a method on a typed nil would be legal but trying to access its fields would not be since they don't exist). I think this is arguably just kicking the ball further down the field, but it seemed appealing to me somehow.

Anyway, back to your last question, I think I'm coming around to the notion that I'd have to do something like:

var myC: C? = ...;
(myC!).myMethod();

or otherwise make myC into a non-nilable C before calling a method on it because it forces the user to think about how they call the method, whether they want to halt or throw or do something else, etc. That said, this is the kind of change that I was alluding to today where I find myself very curious how painful this is going to be in practice and whether it makes me happy overall or not (speaking of which, I'd be happy to help with the conversion of tests to get more experience with the new features at a time where that seems appropriate, and I think we could enlist others to do so as well).

> As it stands now, we have someValue! and someType?. One might imagine that ! applied to types would get the non-nilable type and ? applied to values would get the nilable variant of that value. Does that seem like an appealing generalization, or is it in the category of "Don't allow it for now"?

I think I could go either way on this for now. It strikes me as reasonable and orthogonal, but it's always easier to add features like this later than to remove them... (well, once they've made it into users' hands). Of the two, the ! on types strikes me as being something I'd be more likely to use at first blush.

Related, given:

type t = C?;  // or `foo(C?);` given `proc foo(type t) ...`    

is t? legal/reasonable? (i.e., can I make a nilable nilable again without error, which might be useful in a generic programming context?)

> type t = C?;  // or `foo(C?);` given `proc foo(type t) ...`
>
> is t? legal/reasonable? (i.e., can I make a nilable nilable again without error, which might be useful in a generic programming context?)

Yes, it is. It's similar to how you can do borrowed t also.

I'm running into some questions related to owned? - see #13088

I've created #13161 to ask if MyClass should perhaps mean any nilability and MyClass? mean nilable and MyClass! mean non-nilable.

> Anyway, back to your last question, I think I'm coming around to the notion that I'd have to do something like:
>
> var myC: C? = ...;
> (myC!).myMethod();
>
> or otherwise make myC into a non-nilable C before calling a method on it because it forces the user to think about how they call the method, whether they want to halt or throw or do something else, etc. That said, this is the kind of change that I was alluding to today where I find myself very curious how painful this is going to be in practice ...

I'm okay forcefully unwrapping the nilable objects in the near term. I do think that more features are important in the long term because I'll likely have a lot of these patterns:

var myC: C? = ...;

// As is.
if myC {
  [ref|var] myActualC = myC!;
  myActualC.myMethod();
  // do more stuff with myActualC
}

// if-let with notional syntax
if [[const] ref|const|var] myActualC = myC? {
  myActualC.myMethod();
}

// optional chaining
myC?.myMethod(); // returns `<return value of myMethod()>?`

But I do understand that those features are not important for the base design.

@BryantLam -

Re

// if-let with notional syntax
if [[const] ref|const|var] myActualC = myC? {
  myActualC.myMethod();
}

I've created #13639 to discuss specifically the design of nil-checking and conditionals.

Re

var myC: C? = ...;

if myC {
  ref myActualC = myC!;
  ...
}

I don't think it's a good idea to allow aliasing references to the result of myC!.
We are discussing this over in #13621 - please see in particular my comment https://github.com/chapel-lang/chapel/issues/13621#issuecomment-517687656 and then let's talk about it there.

Nilable class types were one of the big features of 1.20. As this is implemented and released, I'm closing this issue.
