Language: Null safety feedback: Unsure about `late` with regards to static analysis

Created on 11 Jun 2020 · 5 Comments · Source: dart-lang/language

From what I understand, the late keyword says that a variable can be considered non-null while not being initialized at declaration. This amounts to a "pinky-swear" between the programmer and the compiler though, and there is no enforcement done prior to runtime. Rather, if the programmer ended up lying and tried to access the variable prior to initializing it, then the runtime will throw a LateInitializationError.
Further, it's my understanding that the primary motivation for adding the late keyword was to allow classes to have NNDB variables as members.

If any of the above is not correct, please let me know.

Now given that, I can't help but wonder why we're left in a state where static analysis cannot be performed. It seems to me that one way to ensure that a class member is initialized before being read is to declare the member final. When that is done, an initial value must be supplied either in the declaration or in the constructor (here we'll assume that not providing a default value in the constructor is just shorthand for explicitly providing a default of null).

Similarly, I wonder why there isn't an analog there for NNDB variables, a keyword which signals to the compiler that the class member is not final (that is, its value can vary), but which must be either initialized at the point of declaration, or inside of each constructor. That should be a simple thing for the compiler to check, and having done so, it should then be able to have full confidence that that class member will always have a non-null value inside of it.

Am I missing something that would make such an approach ineffectual or infeasible? Is there a reason why we have the comparatively toothless late instead? Is there some other way, with the current NNDB implementation, to ensure at compile time that an NNDB class member is initialized before being read? And if not, are there any plans (abstract or concrete) to add something, whether a keyword or annotation, which will allow the compiler to check that the class member was initialized before it had the chance of being read?

question

MichaelFenwick

Most helpful comment

cc @kwalrath @filiph @mit-mit

Thanks for the good feedback! We try to anticipate what will need explaining, but when you have been living with the feature every day it's easy to lose perspective. For the record, here's a brief summary.

You can have top-level variables and fields which are nullable, and which are not initialized. They'll get initialized to null, and you can overwrite them later (unless they're final, of course). You'll have to guard all of the uses of them appropriately to check that they're not null.

You can have top level variables and fields which are non-nullable, with initializers. They're guaranteed to never be assigned null (well, when you're running a mix of legacy and null safe code, the legacy code can violate that, but otherwise you're guaranteed). So uses of them don't have to be guarded. For fields, you can also put the initialization in constructor initializer lists, as usual.

For local variables, we also do some definite assignment so that you can declare a non-nullable local variable without an initializer, and the static analysis will guarantee that you don't use it without first initializing it. Note: this feature is not fully functional in the tech preview, but it will be ready in the beta release.
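
As a minimal sketch of definite assignment for locals (the describe function is hypothetical, made up only for illustration): the local below has a non-nullable type and no initializer, yet no late is needed, because every path assigns it before the read.

```dart
// Hypothetical helper, used only to illustrate definite assignment.
String describe(int n) {
  String label; // non-nullable, no initializer, no `late`
  if (n.isEven) {
    label = 'even';
  } else {
    label = 'odd';
  }
  // Both branches assign `label`, so this read is accepted statically.
  return label;
}

void main() {
  print(describe(3)); // prints "odd"
}
```

If one of the branches did not assign label, the read in the return statement would be rejected at compile time rather than failing at run time.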

All of this is great, but there are some patterns that you just can't fit into the above. For example, if you're trying to build up two objects that refer to each other:

class Team {
  Mascot mascot;
}

class Mascot {
  Team team;
}

void setup() {
  Mascot m = Mascot();
  Team t = Team();
  m.team = t;
  t.mascot = m;
}

You can't initialize the respective fields in the initializer list. You can just make them nullable:

class Team {
  Mascot? mascot;
}

class Mascot {
  Team? team;
}

void setup() {
  Mascot m = Mascot();
  Team t = Team();
  m.team = t;
  t.mascot = m;
}

and that works, but:

- It means that client code has to check the fields for null explicitly every time it uses them, even though you intend the field never to be reset to null.
- It means that you can't stop client code from resetting the field to null, even though you don't really intend null to ever be a valid value there.

What to do? Well, you can use setters and getters to get an ok API:

class Team {
  Mascot? _mascot = null;
  Mascot get mascot => _mascot as Mascot;
  void set mascot(Mascot m) => _mascot = m;
}

class Mascot {
  Team? _team = null;
  Team get team => _team as Team;
  void set team(Team t) => _team = t;
}

void setup() {
  Mascot m = Mascot();
  Team t = Team();
  m.team = t;
  t.mascot = m;
}

This is pretty nice. Your users get a non-nullable Mascot if they read the mascot field, and you're guaranteed that once the mascot field is initialized, it will never be reset to null. There's the unfortunate bit that if you forget to actually initialize the team or mascot field after creation, your user is going to get a runtime error. If you really wanted to, you could make the return types of the mascot and team getters nullable, but this doesn't really capture the intent, and just punishes your users.

The big downside to the above is that it's a royal pain! My one line field bloated out into a field, a getter, a setter. I have to say "Mascot" three times! Yuck!

So the first use of late is just to provide sugar for the pattern above. Instead of the above, I can just say:

class Team {
  late Mascot mascot;
}

class Mascot {
  late Team team;
}

void setup() {
  Mascot m = Mascot();
  Team t = Team();
  m.team = t;
  t.mascot = m;
}

This is much cleaner and it gives the same guarantees to the user as the big setter/getter example above. Moreover, the compiler itself can more easily take advantage of those guarantees. The compiler can easily see that once a late variable is set, it will never be reset to uninitialized, which means that if it sees two accesses (reads or writes) it can elide the runtime check on the later access. Nice!
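
To make the runtime side of that guarantee concrete, here is a small sketch (the function readBeforeInitThrows is hypothetical). It assumes only that reading an unassigned late field throws some subtype of Error, which is why it catches Error rather than naming a more specific type.

```dart
class Team {
  late Mascot mascot;
}

class Mascot {
  late Team team;
}

// Hypothetical check: returns true if reading `team` before it has been
// assigned throws, and false if the read unexpectedly succeeds.
bool readBeforeInitThrows() {
  final m = Mascot();
  try {
    final team = m.team; // read of a late field that was never assigned
    print('unexpectedly read $team');
    return false;
  } on Error catch (e) {
    print(e); // the runtime reports the late-initialization failure
    return true;
  }
}

void main() {
  print(readBeforeInitThrows());

  // Once both fields are assigned, reads need no null checks.
  final m = Mascot();
  final t = Team();
  m.team = t;
  t.mascot = m;
  print(identical(m.team.mascot, m));
}
```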

Note too, that late can be really nice for patterns where you can initialize the variable in the constructor, but you can't initialize it in the initializer list.

// Without late:
class C {
  final int count;
  int someComputation() {
    // something here
    return 0; // placeholder result
  }
  int? _x = null;
  int get x => _x as int;
  C(this.count) {
    _x = someComputation();
  }
}

// With late
class C {
  final int count;
  int someComputation() {
    // something here
    return 0; // placeholder result
  }
  late final int x;
  C(this.count) {
    x = someComputation();
  }
}

Note that I can't put the call to someComputation into the initializer list because you're not allowed to call instance methods (or otherwise access this) in initializer lists. But with late, I can just initialize it in the constructor body. Pretty neat! In fact, in this case, we can go one better:

// With late + lazy initialization
class C {
  final int count;
  int someComputation() {
    // something here
    return 0; // placeholder result
  }
  late final int x = someComputation();
  C(this.count);
}

Putting the initializer directly on the late field means that it's computed lazily, on first access, but if you're not performing side effects in the computation it's pretty much the same. And it has the benefit that now, if a user gets confused and tries to initialize x, the static analysis will tell them it's not allowed. You can't always use this pattern, but when you can, it's really nice.
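
A rough sketch of what "computed lazily, on first access" looks like in practice. The log list, the run function, and the body of someComputation (count * 2) are all made up for illustration; the point is only the order of events.

```dart
// Hypothetical event log, used only to make the evaluation order visible.
final log = <String>[];

class C {
  final int count;
  C(this.count) {
    log.add('constructor body');
  }

  int someComputation() {
    log.add('someComputation');
    return count * 2; // illustrative computation
  }

  // The initializer does not run at construction time,
  // only when `x` is first read.
  late final int x = someComputation();
}

List<String> run() {
  log.clear();
  final c = C(21);
  log.add('before first read');
  final first = c.x; // triggers someComputation()
  final second = c.x; // reuses the stored value, no second call
  log.add('x = $first, again = $second');
  return List.of(log);
}

void main() {
  run().forEach(print);
}
```

The log shows the constructor body finishing before someComputation runs, and someComputation running exactly once.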

So that's the basic idea behind late.

- You don't have to use late, and in many situations there's no reason to.
- But sometimes, no amount of static analysis can save you, so you have to fall back to runtime checking.
- You can just make your fields nullable, and push that checking onto the user (sometimes that's the right thing to do)
- But often, you can improve the overall safety of the program and give better UX by using late

leafpetersen on 11 Jun 2020

All 5 comments

A late variable can have an initializer (late var x = someExpression()), which means that its initializing expression will be evaluated when the variable is read for the first time. So there's no dynamic checking involved in this case.

It can also be uninitialized (late int x;), in which case it will be a dynamic error to read it if it has not yet been assigned a value. It's also a compile-time error to read it if it has definitely not been assigned so far, but if it's just _possible_ that it has a value then the read is allowed statically and may fail dynamically.

> allow classes to have NNDB variables as members

It is certainly possible to have a non-late instance variable whose type is non-nullable, it just has to be initialized during object creation. For instance, it can have an initializer in the declaration, it can be initialized using an initializing formal parameter in generative constructors (this.x), it can be in an initializer list (C(): x = 2 ...).

So you don't have to make an instance variable final in order to ensure that it is initialized. But if you _do_ make it late and don't initialize it then you may encounter a dynamic error.

Not providing a value is not the same thing as providing null. A variable (any kind) whose type is non-nullable cannot have the value null.

> why there isn't an analog there for NNDB variables, a keyword which signals to the compiler that the class member is not final (that is, its value can vary), but which must be either initialized at the point of declaration, or inside of each constructor

This is exactly what you'll have if the variable has a non-nullable type and is not late. Note that the variable is not initialized inside a constructor (in the constructor body), it is initialized by an element in the initializer list of the constructor, or by an initializing formal parameter, or by an initializer in the variable declaration. Having it happen a bit before the constructor bodies are executed yields a simple discipline which prevents half-done objects from being visible to user code.
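
For illustration, a minimal sketch of the three initialization routes just listed, using a hypothetical Point class. None of the fields is final or late, and all of them are statically guaranteed to be initialized before any constructor body or user code can observe the object.

```dart
// Hypothetical class showing non-late, non-final, non-nullable fields.
class Point {
  int x = 0; // initializer in the variable declaration
  int y; // set by an initializing formal parameter (this.y)
  int z; // set by an element of the initializer list

  Point(this.y, int base) : z = base + 1;
}

void main() {
  final p = Point(2, 3);
  p.x = 10; // the fields stay mutable
  print('${p.x} ${p.y} ${p.z}'); // prints "10 2 4"
}
```

Removing any one of the three initializations is a compile-time error, which is the static check the question asks for.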

eernstg on 11 Jun 2020

(FYI this is transferred to dart-lang/language as it is about language semantics)

mraleph on 11 Jun 2020

That makes some more sense. I got the impression from https://dart.dev/null-safety that the late keyword was needed for any class members which weren't initialized at the point of declaration. The only example on that page using a class is an example of the late keyword, and how class members without the late keyword act isn't really described. I'd recommend updating that documentation to make that part a bit clearer.

MichaelFenwick on 11 Jun 2020