Language: Should we provide a way to query the status of `late` variables?

Created on 23 Apr 2019 · 29Comments · Source: dart-lang/language

As proposed, the only way to query the state of a late initialized variable to see if it has been written would be to try to read it, and catch the resulting error. We explicitly discourage catching errors, so this seems non-ideal. Should we provide this functionality? And if so, how do we expose it?

For comparison, Kotlin provides a way to do this via reflection.

If we do provide this, then what is the right behavior for it on late int x = 3 + 3? Is this variable considered initialized before the first read/write (since it has an initializer)? Or is it considered uninitialized since the initializer has not yet been run (and may in principle never complete with a value and cause the variable to actually be initialized)?

nnbd

Source

leafpetersen

Most helpful comment

My gut feeling is that we are breaking an abstraction, and that is always bad. In the long run it may cause us as much anguish as the "is optional parameter passed or not" query would have.

If a late variable needs to be queried, it's no longer just a late variable, it's an optional variable. Use a nullable variable if the type is not nullable, or an Option class if it is, or use your program state, or do something else to know that it has been initialized. Don't try to fit a square peg in a round hole. This is not what late variables are for.

lrhn on 4 Nov 2019

👍2

All 29 comments

In Kotlin, you can do it without reflection
https://kotlinlang.org/api/latest/jvm/stdlib/kotlin/is-initialized.html

tatumizer on 23 Apr 2019

In Kotlin, you can do it without reflection

Not really, at least as I understand it. The .isInitialized property is defined on property references which are part of the reflective API.

leafpetersen on 23 Apr 2019

You are right, I was confused by example

if (foo::bar.isInitialized) {    
   println(foo.bar)
}

In Kotlin, there's special syntax :: for reflection which makes the operation syntactically similar to regular operator. In dart, reflection would require much more ceremony though.

tatumizer on 23 Apr 2019

My quick answer is "no", let's try to get by without it, at least for now. If you need access to that bit, you can always not use late and (more or less) implement the same behavior yourself manually with a "was initialized" field you can control:

// Instead of:
class Memo<T> {
  late T value_;

  void store(T value) {
    value_ = value;
  }

  T getOrDefault(T defaultValue) {
    if (<magic...>) return value_;
    return defaultValue;
  }
}

// Do:
class Memo<T> {
  T? value_;
  bool hasValue_;

  void store(T value) {
    value_ = value;
    hasValue_ = true;
  }

  T getOrDefault(T defaultValue) {
    // Can't check for null value_ here since T might be a nullable type.
    if (hasValue_) return value_!;
    return defaultValue;
  }
}

My hunch is that this need is rare enough that the workaround isn't too onerous. It might be nice to eventually provide access to the state somehow, but that tiptoes towards something meta-programmy around working with identifiers as storage locations instead of as the values they contain. I'm not sure if we're ready to bite that off just now. This is a feature I think we could add later without undo pain today.

Here's some more stuff that might be related:

Detecting optional parameter passage

We have an analogous issue around detecting if an optional parameter was explicitly passed or not. The language does track that extra bit of data because it needs it to determine whether the default value should be used or not. (I hoped we could eliminate that bit of magic in Dart 2.0, but alas.) Consider:

original([param = "some secret value only original knows"]) {
  print("original got $param");
}

forward([param]) {
  original(param);
}

main() {
  original();
  original(null);

  forward();
  forward(null);
}

There's no way to implement forward() such that a call to it behaves like an unwrapped direct call to original(). Direct calls to original() pass a hidden bit of "was param passed?" state that forward() has no way to capture on calls to itself.

Ancient Dart used to support this with a ?parameter operator. The idea was that this would let you forward parameters. The above then becomes:

forward([param]) {
  if (?param) {
    original(param);
  } else {
    original();
  }
}

The problem, of course, is that it's scales exponentially (!) in the number of parameters you want to forward. To forward a function with five optional parameters, you have to write out all 32 combinations.

The operator was killed but, unfortunately, the hidden state wasn't.

I bring this up because it's another place where the runtime tracks state that the user can't access. If we want to solve it for late variables, we may want to see if that solution can cover this too.

A third potential use is conditional logic mixed with definite assignment for locals. I think we want to support:

int i; // Non-nullable.

if (condition) {
  i = 1;
} else {
  i = 2;
}

i.isEven; // OK.

It's not inconceivable that a user might want to write something like:

int i;

if (condition) i = 1;
stuff();
if (another) i = 2;
moreStuff();
if (<i wasn't caught by either of the above two cases) i = 3;

i.isEven;

Capturing a parameter's name

When you throw an ArgumentError, it's useful to include the name of the argument in the error message:

sqrt(int i) {
  if (i < 0) throw ArgumentError.value(i, "i", "Must be positive.");

  // ...
}

Right now, you use a string literal, which is brittle in the face of typos and parameter renames. C# 6.0 added a nameof() notation to extract the name of the given identifier as a string:

sqrt(int i) {
  if (i < 0) throw ArgumentError.value(i, nameof(i), "Must be positive.");

  // ...
}

You can also use it to capture other named elements like classes and maybe members. It is unfortunate that the syntax is much longer than the string literal.

A C programmer would say this is a good place to use the preprocessor and the stringizing operator.

munificent on 23 Apr 2019

👍1

Another vote for "no" – worried that we'd lose the ability to optimize this in many contexts if we need to be able to query the value.

Revisit if there's a screaming user requirement.

kevmoo on 23 Apr 2019

Also "no" from me.

It smells like reflection, and it's at a local variable level, where we never did reflection before.

Giving that ability would mean that we are forced to delay evaluation of late x = 3 + 3; because the specification says that the initializer is evaluated on the first access. The compiler can totally recognize that the initializer is constant and convert it to var x = 6; with no late-ness overhead. I guess that's essentially Kevin's optimization concern again.

And if our experience with allowing you to query a parameter for being passed or not was anything to go by, this ability might cause more problems than it solves. We will likely see late bool x; used as a tri-value state (with isAssigned(x) == false being the third), rather than just doing the proper thing.
It smells just like a different kind of null, except that it stacks with that, so now you can have late bool? x; as a four-valued variable ("unknown", "yes", "no", "neither"?)

So, too scary for me.

(I do want a ##i operator that "symbolizes" an identifier in scope).

lrhn on 24 Apr 2019

👍1

It smells like reflection, and it's at a local variable level, where we never did reflection before.

It's worse than that. If it was just for locals, we could easily statically tell if the query capability was ever used and if not still optimize the lateness away when possible. But because public fields can be marked late, a modular compiler can't tell if there isn't some other code out there doing someObject.someField.isInitialized.

We will likely see late bool x; used as a tri-value state (with isAssigned(x) == false being the third)

Everyone knows the correct three states are true, false, and fileNotFound.

munificent on 24 Apr 2019

Ok, this seems like a clear no.

leafpetersen on 29 May 2019

I want to re-open this at least briefly to discuss this again in light of some experience with the migration. In migrating the core libraries, we have a number of examples of code that ends up being migrated to look like the following:

E singleWhere(bool test(E element), {E orElse()?}) {
    late E result;
    bool foundMatching = false;
    for (E element in this) {
      if (test(element)) {
        if (foundMatching) {
          throw IterableElementError.tooMany();
        }
        result = element;
        foundMatching = true;
      }
    }
    if (foundMatching) return result;
    if (orElse != null) return orElse();
    throw IterableElementError.noElement();
  }

This isn't terrible, but it does have some redundancy in that you are forced to explicitly represent the state of the late variable (which must be tracked by the compiler anyway) .

Does this example cause anyone to change their mind about this?

leafpetersen on 1 Nov 2019

Isn't this just the wrong application of late? I'd thought to write it as:

E singleWhere(bool test(E element), {E orElse()?}) {
    E? result;
    for (E element in this) {
      if (test(element)) {
        if (result != null) {
          throw IterableElementError.tooMany();
        }
        result = element;
      }
    }
    if (result != null) return result;
    if (orElse != null) return orElse();
    throw IterableElementError.noElement();
}

matanlurey on 1 Nov 2019

The tricky bit here is that E can be nullable, which means that null cannot be used as evidence for not being initialized.

eernstg on 1 Nov 2019

Agree, you would still need the extra boolean, but it's true that it's duplicate effort to have an extra "isInitialized" boolean and a late field. So:

E singleWhere(bool test(E element), {E orElse()?}) {
    E? result;
    bool foundMatching = false;
    for (E element in this) {
      if (test(element)) {
        if (foundMatching) {
          throw IterableElementError.tooMany();
        }
        result = element;
        foundMatching = true;
      }
    }
    if (foundMatching) return result as E;
    if (orElse != null) return orElse();
    throw IterableElementError.noElement();
  }

would be the corresponding non-late approach.
It contains an unnecessary E cast instead of an unnecessary late-init-check. Not sure which one is cheaper.

Another approach is to do it with two loops:

E singleWhere(bool test(E element), {E orElse()?}) {
  var it = this.iterator;
  while (it.moveNext()) {
    E result = it.current;
    if (test(result)) {
      while (it.moveNext()) {
        if (test(it.current)) {
          throw IterableElementError.tooMany();
        }
      }
      return result;
    }
    if (orElse != null) return orElse();
    throw IterableElementError.noElement();
  }

We are storing a state into a variable instead of keeping it in the control flow. The value of result depends on that state, but we don't have dependent types (and we don't want to allocate an Option like value to combine the state and value).

lrhn on 1 Nov 2019

It would be convenient to be able to use the data that the implementation must maintain for a late variable without initializer:

E singleWhere(bool test(E element), {E orElse()?}) {
    late E result;
    for (E element in this) {
      if (test(element)) {
        if (result.isInitialized) {
          throw IterableElementError.tooMany();
        }
        result = element;
      }
    }
    if (result.isInitialized) return result;
    if (orElse != null) return orElse();
    throw IterableElementError.noElement();
  }

This code is nicer than the code that we'd use today:

We avoid allocating the extra variable bool foundMatching = false; and we avoid maintaining it along the way. We would also have the added benefit that a lint could be raised at if (result.isInitialized) if static analysis could prove that result.isInitialized is always true at that point, or always false. So it _is_ tempting to consider a mechanism like this.

Given that it is semantically quite different from a member access, we might prefer a specialized syntax for it. However, no specialized operators seem to work really well, and we could also treat isInitialized as a "magic" language-defined extension getter which is only available on a late variable with no initializer, and only when no other member with that basename is available (that is, no instance members and no extension members).

This would also make it non-breaking, and we could add it at any point in the future where we have the resources to do it.

PS: I don't think it smells like reflection, it smells much more like using a resource which is guaranteed by the language semantics to be available anyway, and it's definitely not costly in a way that resembles reflection, so why not. ;-)

eernstg on 4 Nov 2019

My gut feeling is that we are breaking an abstraction, and that is always bad. In the long run it may cause us as much anguish as the "is optional parameter passed or not" query would have.

lrhn on 4 Nov 2019

👍2

breaking an abstraction

Why wouldn't it be just as reasonable to consider a late variable along with its magic isInitialized getter to _be_ an abstraction? Part of the motivation is exactly that we can use isInitialized without thinking about whether it's implemented via a special value (a variable whose type is non-nullable can use null) or a separate boolean variable (when the type is potentially nullable).

The x? query on an optional parameter was different in that there is no guarantee that a reasonable implementation of the rest of Dart will provide support for determining whether that parameter was passed on omitted, and there was no appetite for investing a non-trivial amount of overhead into supporting it (e.g., by having an extra boolean named parameter for each named parameter).

With a late variable with no initializer the language _does_ promise to be able to determine dynamically whether an initialization has taken place (such that we can throw if we're reading it too early, and we can throw if a final late is "initialized twice"), so we _can_ support isInitialized without incurring an overhead. It's actually a manually re-implemented boolean variable like foundMatch that amounts to an overhead, both in that it takes space to allocate and time to update, and also in that it could be maintained incorrectly—so we might as well use an abstraction to ensure correctness and save the time & space. ;-)

eernstg on 5 Nov 2019

It does feel weird that the runtime has to track some state that the user can't access. At the same time, I think using late should be an implementation detail of a library. If we allow any user that has access to the name of some member to also access the magic "check late bit" API, then the fact that the member uses late is now part of its API:

// foo.dart
late int i;

// main.dart
import 'foo.dart';
main() {
  print(i.isInitialized);
}

Here, if the maintainer of "foo.dart" removes late from i, they will break "main.dart". The nice thing about the old ? operator for parameters was that it was at least implicitly encapsulated because it only applied to parameters, which naturally have local scope.

We could say that you can only use the magic "is initialized" API inside the library where the member is declared, but in that case, it pushes even more towards having a different syntax instead of it looking like a getter.

But perhaps the simpler way to think about this problem is: If you don't like that the runtime has a bit you can't access, instead of adding a way to access it, you can just choose to not use late for that case and get rid of the hidden bit entirely.

munificent on 5 Nov 2019

👍1

... you can just choose to not use late for that case and get rid of the hidden bit entirely

Isn't it true that while doing so, you also have to change the type of the variable to nullable, for no logical reason? At least, that's my understanding of this paragraph in the spec:

It is an error if a potentially non-nullable local variable which has no initializer expression and is not marked late is used before it is definitely assigned

tatumizer on 5 Nov 2019

Thinking about this, an additional concern is that this potentially forces the compiler to maintain the "isInitialized" state even when it is not otherwise needed (especially in a modular setting). For example, I expect that in many cases the compiler will be able to prove that a late field is initialized before it is ever accessed, in which case it should just implement it as a normal field. If there is a possibility that some other part of the program will query the 'initialized' status of that field however, it becomes harder to eliminate that state.

leafpetersen on 5 Nov 2019

isInitialized call can be supported for any field, no matter whether it has this bit or not - it can return true for non-late ones and those late ones for which the compiler can prove initialized status, so making something "late" or removing "late" modifier won't be a breaking change.

tatumizer on 5 Nov 2019

@leafpetersen wrote:

prove that a late field is initialized before it is ever accessed

How would we prove statically that a _field_ with no initializer is always initialized before it is accessed, if that isn't because it always gets initialized in a constructor initializer list? In that case I don't think there is any reason for it to be late.

@tatumizer wrote:

isInitialized call can be supported for any field,

If anybody is worried that this could be confusing, we can always lint v.isInitialized when it is trivial (e.g., when v is not late, or when v is a local variable which is 'definitely assigned' or 'definitely not assigned' at the point where v.isInitialized occurs).

eernstg on 6 Nov 2019

Erik Ernst notifications@github.com wrote:

@leafpetersen https://github.com/leafpetersen wrote:

prove that a late field is initialized before it is ever accessed

How would we prove statically that a field with no initializer is
always initialized before it is accessed, if that isn't because it always
gets initialized in a constructor initializer list? In that case I don't
think there is any reason for it to be late.

There's also the constructor body.
If you can see that this is not leaked before the late field is
initialized in the body of the constructor, then it can also be optimized.
It still has to be late because it's not initialized in the initializer
list/parameters of the constructor.

@tatumizer https://github.com/tatumizer wrote:

isInitialized call can be supported for any field,

If anybody is worried that this could be confusing, we can always lint
v.isInitialized when it is trivial (e.g., when v is not late, or when v
is a local variable which is 'definitely assigned' or 'definitely not
assigned' at the point where v.isInitialized occurs).

I'd prefer to just add an Option type. I believe one is already in
general use. Then use a non-final optional field instead of a late field.
Perfectly represents the behavior you are after.

No matter what you do, there will be a check before accessing the value.
There are already so many ways to do that, there is no need for one more.

lrhn on 6 Nov 2019

Then use a non-final optional field instead of a late field. Perfectly represents the behavior you are after.

It certainly represents the behavior somehow, but "perfectly" is a bit of overstatement IMO :)
You can't model "late final" this way. More importantly, after introducing the whole sophisticated machinery for nnbd, it would be difficult to explain why the user needs yet another Option type. Nullable variables are colloquially called 'optional' (that's probably what you meant in the cited sentence?), and on top of that, there's an Option type? Which in turn can be optional or not, final or not, late or not? People will start complaining the language has too many ways to represent optionals! :)

tatumizer on 6 Nov 2019

If you can see that this is not leaked before the late field is
initialized in the body of the constructor, then it can also be optimized.

Sure, if you're doing it really, really early in that constructor body. ;-)

class A {
  late int i, j;
  A() {
    i = 1; j = 2;
  }
}

class B extends A {
  set i(value) {
    print(super.i);
    print(j);
    super.i = value;
  }
}

main() => B();

(The tools don't currently implement enough to handle this code, but dart --enable-experiment=non-nullable runs and prints 'null' twice; it should throw instead when the uninitialized storage location is read by the A.i getter.)

eernstg on 6 Nov 2019

From discussion this morning, we will not provide support for this for now. There are too many concerns about this becoming part of the API, and about implications for optimizability, as well has how to present this nicely to the user. Adding an "external" extension method would be one approach, but this is unappealing, since without extensive custom static error checking, the method would be applicable in nonsensical places.

If we choose to do this in the future, there is an expressed preference to instead make this state available via a "pseudo variable/field" in the syntax, something along the lines of:

  late (int x, bool isInitialized);

or perhaps

  late int x {isInitialized};

leafpetersen on 7 Nov 2019

I know this was closed a while ago, but don't the issues described go away if the feature is implemented like that?

If the library author doesn't want isInitialized to become a part of the API, then they can just declare it as _isInitialized.
As far as optimization goes, I don't see how this would be any different than a regular bool field. And if declaring isInitialized becomes optional in the first place (which the second example seems to imply), then the library author would have to explicitly declare that the value is more important than optimizing it away.
And again, the issues of how the variable can be accessed outside the class become irrelevant if isInitialized is treated as just another variable/field. The only issue I can find with this is that it adds a non-final bool field to the class (which is clearly intended), which might trip up const constructors and the @immutable annotation, but again, that's how classes with non-final fields behave anyway.

I find myself writing the code with a separate bool value that I have to maintain quite often, and if the runtime can do that for me then that's great.

Levi-Lesches on 12 Jun 2020

I do think the API based problems go away if we, say, allow late{_initX} int x; to be queried for initialization as _initX.
It doesn't remove the the optimization based problems, and I personally don't think the feature would carry its own complexity weight. It's not that important, but we'd introduce a completely new syntax which only works to expose a bool getter.

I think it's actually better, on average, to not do anything here.

Now, if we had a different syntax for getter/setters, maybe one where you could encapsulate a field or other propertie, then it could fit in more easily.
Say:

int x { // declare setter and getter with same name and type easily.
  get => _x;
  set => _x = it;
}
late int y { // Use *same kind* of block syntax for init property of late variable.
  init _init;
}

Then the syntax would not be as foreign.

Even then I still think exposing initialization is a breach of abstraction and will prevent perfectly good optimizations, and that's reason enough to not do it.

lrhn on 12 Jun 2020

👍1

FWIW I ran into sort of wanting this just now. I don't think we should have a way to actually do it.

What I had was a field that I'd only set once, to a non-null value, and later I would either use it if it was non-null or else not use it. I could express this as a nullable mutable field, but the mutability doesn't really convey that it should only be set once. I could use a late final nullable field, but then I'd have to explicitly set it to null in the constructor, which is awkward. Maybe that's the right answer though.

Hixie on 20 Aug 2020

We don't have a way to represent "initialized to null and then set once to non-null" (would that be "single promotion" rather than "single assignment"?). That's just a nullable variable.

Using late final means that it's either uninitialized or initialized, but we have no way to tell which one from the outside. You can't initalize that in a constructor, and still be able to set it to non-null later.

lrhn on 20 Aug 2020

In the case I had, if it was ever initialized, it would be done in the constructor. I worked around it by adding an "else" branch that initialized it to null explicitly. It's a bit ugly but it works with late final.

Hixie on 20 Aug 2020

Was this page helpful?

0 / 5 - 0 ratings