Chapel: Default value for the 'locale' type

Created on 1 Nov 2017 · 24 Comments · Source: chapel-lang/chapel

Summary of Problem

A variable with the locale type has the default value of nil. The language specification does not suggest what the default should be.

var x : locale;
writeln(x); // nil

Associated Future Test(s):
This future suggests that the default value should be here.
multilocale/sungeun/locale_default.chpl

Language Design

All 24 comments

Huh... I thought we'd defined this at some point for some reason. My first thought was that it should be Locales[0], since it always exists and is the "0th" thing, much like other default values. here is an interesting approach as well (guaranteed to be correct), but not as intuitive to me.

That Locales[0] always exists doesn't do much for me, as here will also always exist. But Locales[0] is probably the closest we have to the "0th" thing.

Part of the problem is that I don't have a good sense of how locale variables would be used, especially not in a way such that default initialization mattered.

Perhaps another example might change our perspective:

use BlockDist;

var D = {1..10, 1..10};
var Space = D dmapped Block(D);
var A : [Space] locale;
writeln(A);
// Should A[i] == A[i].locale ?

I agree that here also meets the "always exists" criterion. I was offering it up as a reason to choose Locales[0] over, say, Locales[1].

I think the main argument against having locale default to here is that it'd be the only case of a variable whose default value depended on where the code was running (I think?). So in a sense it goes against the "on-clauses should affect performance but not program behavior" principle (not that there aren't other cases that violate it, but...).

I think you're right that, in the common case, it's unlikely that people will write many codes that rely on the default locale value being a specific value so in a sense it doesn't matter deeply as long as it defaults to something that's a legal value. My instinct continues to think Locales[0] makes sense, but I'm not sure what caused Sung to think here would be better.

Here's a question: Are the communication requirements for having it default to Locales[0] greater than having it default to here? If so, that might be a good counterargument...

I think, but am not sure, that the Locales array is privatized. I don't think there should be communication for accessing Locales[0]. However, I think each locale is a wide class that lives on the node it represents. So something like Locales[0].id could cause communication.

As part of #15149, we should make a decision on this.

From the conversation here, it looks to me like Locales[0] is a better choice than here. I don't think the arguments for it are very strong or anything, but I can't think of any argument for here. And I agree that it probably will not matter as long as it is something not ridiculous.

And I am not sure whether there is any other alternative. Under #13526, I also proposed adding other special locale value(s) like "anywhere", but I don't think they should be the default values because they are very esoteric.

Since it has been some time since there was any discussion here: @benharsh and @bradcray, do you have any new opinions/concerns? Are you OK with Locales[0] being the default value?

I feel OK with either option, personally, and, re-reading this issue, consider my arguments in favor of Locales[0] to be as much devil's advocate / trying to understand both sides as anything. In truth I feel right on the fence; neither option is obviously preferable to me. Locales[0] might have the slight benefit of being the most trivial / least surprising / most like our other default values.

Putting on the other hat, one argument in favor of here might be that data structures like wide pointers would have their explicit locale fields automatically set up to be the right thing with no effort. I'm not sure that's a big deal, but I thought I'd mention it.

I think @mppf has chimed in on other issues that have touched up against this question, so I'm tagging him here as well.

I like here being the default value. One could write e.g.

var A = newBlockArr(0..#numLocales, locale);

and have A[here.id] point to the relevant locale. Of course that might not be super useful on its own, but imagine that the locale is stored in a record.

Also, string used to store a locale (now it does lower level stuff) and there it needs the default to be here.

We currently have an example of that in record channel and record file, where we have the field var home: locale = here, which can also be written as var home = here. I guess what I am trying to say is that the record implementer can have that behavior quite easily, and probably in a more expressive way. OTOH a record can have a locale field that doesn't need to be here, in which case it may lead to extra work and/or confusion maybe?

That being said, I am still not against here.

This behavior is easy to control with a flag, but we probably don't want program behavior like this to change based on a flag.

The problem I have with here is that while it seems more elegant, it doesn't make sense to me as a default value because it's not actually a single value...it's context sensitive. (I see Brad has mentioned that above).

Because of that I'm inclined to choose Locales[0] as the only value that makes sense to me.

I think based on Engin's argument just above, I'm leaning more toward Locales[0] by the principle of least surprise. If I was familiar enough with the language to know what the following did:

var x: int;
var b: bool;
var r: real;
var e: myEnum;

then I'd be more surprised to see code like:

var l: locale;
var l2: locale = Locales[0];

and to find out what its behavior would be than I would to see code like:

var l: locale = here;
var l2: locale;

i.e., I think the cases where I wanted a locale variable to default to here, making that abundantly clear in the code seems valuable / less surprising than relying on a default value; and the default value of Locales[0] is more consistent with what we do for other types (default to the "zero" / initial value).

Here's my two cents. I should note that I don't use multilocale code super often - I'm not sure whether that means that my opinion on the default should have more weight or less.

OTOH a record can have a locale field that doesn't need to be here, in which case it may lead to extra work and/or confusion maybe?

I don't find this super compelling - I think in the common case, you want the locale to match the locale you are currently on to avoid communication, and that having the default be here provides that. It seems more exceptional for the default to be a different locale, and I don't think having it always be locale 0 is more likely to help those situations.

Going into Brad's examples:

then I'd be more surprised to see code like:

var l: locale;
var l2: locale = Locales[0];

and to find out what its behavior would be than I would to see code like:

var l: locale = here;
var l2: locale;

I think it is important to note that during the times when these codes will behave differently today, we already have a clue nearby that they will do so - they will be wrapped in an on clause:

on Locales[1] {
  var l: locale;
  var l2: locale = Locales[0];
}
on Locales[1] {
  var l: locale = here;
  var l2: locale;
}

To me, it is much more obvious looking at the first one that it should behave differently - the way l2 is being set shows that it is explicitly choosing a different locale than the one we are currently on. In the second one, if we chose to make the default always be locale 0, looking at the code does not indicate that l2 will be Locales[0]; there's no sign of a 0 anywhere.

I think in the common case, you want the locale to match the locale you are currently on to avoid communication

Storing Locales[0] into a locale value shouldn't require communication. (I'm not sure whether or not it will today or with Engin's PR, but we could definitely make it so that it didn't).

we already have a clue nearby that they will do so - they will be wrapped in an on clause:

That on-clause could be somewhere completely different in the code, though—there's no guarantee that you'd be able to see it, syntactically. For example:

forall a in myDistributedArr {
  var l: locale;
}

looking at the code does not indicate that l2 will be Locales[0]; there's no sign of a 0 anywhere.

To me, that's like saying in:

var i: int;
var c: myColor;

that it would be surprising for i to be set to 0 or for c to be set to myColor.red since there's no sign of 0 or red in the source code.

I think in the common case, you want the locale to match the locale you are currently on to avoid communication, and that having the default be here provides that. It seems more exceptional for the default to be a different locale, and I don't think having it always be locale 0 is more likely to help those situations.

You are right, but the Locales array is a locale-private variable, so there is no comm in Locales[0] wherever you run it.

I think it is important to note that during the times when these codes will behave differently today, we already have a clue nearby that they will do so - they will be wrapped in an on clause:

I am not sure what you mean here. And I don't think we should rely on the proximity of the variable declaration to an on statement. What if there was a call in that on statement, and that function was defining a locale variable? Saying this out loud moved me one step towards Locales[0] being the default.

Storing Locales[0] into a locale value shouldn't require communication.

Sorry, I was more thinking about how the locale instance would be used, rather than the act of initializing the locale in the first place.

I was more thinking about how the locale instance would be used, rather than the act of initializing the locale in the first place.

I don't think using a remote locale value necessarily results in communication either (unless you use it in an on-clause of course). It's essentially a wide pointer that says "that address over on that locale" but accessing that wide pointer is a local operation.

That on-clause could be somewhere completely different in the code, though—there's no guarantee that you'd be able to see it, syntactically.
What if there was a call in that on statement, and that function was defining a locale variable?

I find these arguments compelling, but they make me lean more towards "no default" than "Locales[0]".

To me, that's like saying in:

var i: int;
var c: myColor;

that it would be surprising for i to be set to 0 or for c to be set to myColor.red since there's no sign of 0 or red in the source code.

I think this is a false equivalency, as neither of those variable types currently impacts how we determine the default value of its type. There's no statement in our language that changes their default value, but there are statements that change the default locale for local variables, and I think the default value of a locale instance created in scope should match the default locale for a variable created in the same scope, e.g.

var x1: int;
var l1: locale;
writeln(x1.locale == l1);

on Locales[1] {
  var x2: int;
  var l2: locale;
  writeln(x2.locale == l2);
}

I think those two writelns should print the same thing. (Note that this program does not compile on master today; I get:

foo.chpl:2: error: variable 'l1' is not initialized
note: non-nilable class type borrowed locale does not support default initialization
note: Consider using the type borrowed locale? instead
foo.chpl:2: error: split initialization is not supported for globals

)

I don't think using a remote locale value necessarily results in communication either (unless you use it in an on-clause of course). It's essentially a wide pointer that says "that address over on that locale" but accessing that wide pointer is a local operation.

Now we're getting into situations where me being less experienced probably makes the tangent less valuable, so maybe I should stop on this particular train of thought. One last thing before I do, though - couldn't you use the locale instance you've created to accidentally create a variable or part of an array or something remotely when you meant to create it locally? Or is that just not going to happen, since you'd just pass in here in those cases?

couldn't you use the locale instance you've created to accidentally create a variable or part of an array or something remotely when you meant to create it locally?

Not unless you did something pretty explicit, like:

var l: locale;
on l do {
  var x: int;
}

or

var l: locale;
var myTargetLocales = [l, l, l, l];
var D = {1..n} dmapped Block(..., targetLocales=myTargetLocales);

I think the default value of a locale instance created in scope should match the default locale for a variable created in the same scope

To me, this shouldn't be a requirement, because the two don't feel related. What does feel related to me is that writeln(x.locale == l.locale) prints the same thing in both cases, and that's obvious anyhow.

One last thing before I do, though - couldn't you use the locale instance you've created to accidentally create a variable or part of an array or something remotely when you meant to create it locally? Or is that just not going to happen, since you'd just pass in here in those cases?

With a default value of Locales[0], you can do something like:

on Locales[1] {
  var l: locale;
  on l {
    // oops I moved to locale 0
  }
}

or maybe more subtly

var myTargetLocs: [1..10] locale;
var d = {1..n} dmapped Block({1..n}, targetLocales=myTargetLocs);
// oops, everything is on locale 0

I think in both cases, the writer of this code should use explicit values and suffer if they don't. If I am being blunt :)

@bradcray beat me by a few seconds again :(

I think in both cases, the writer of this code should use explicit values and suffer if they don't. If I am being blunt :)

I agree with that - but I think that makes me more inclined to have no default value for locale, as accidentally having communication is more subtle than getting a compilation error saying you need to give a locale instance an explicit value.

But these cases seem to be more rare, so maybe it doesn't matter that much.

This somewhat alludes to https://github.com/chapel-lang/chapel/issues/13526.

So, maybe we can have a defaultLocale (or unknown, nowhere, anywhere) as the placeholder default, and give runtime errors if it is encountered in an on statement? Hmm..

So, maybe we can have a defaultLocale (or unknown, nowhere, anywhere) as the placeholder default

This was @benharsh's opinion in the performance meeting yesterday too, but I'm not sure I agree. Taking reals as an example, I feel happy that the default real is 0.0 — a value I can use and reason about and make use of — rather than NaN — a value that's still well-defined, but that I'm probably going to need to overwrite in practice most of the time before using it. Making the default value for a locale nowhere or anywhere feels more like a NaN to me, so less useful. (that said, I can also see an argument that the number of cases where you're likely to rely on the default value of a locale is much smaller than it is for reals; still, I prefer Locales[0] just to keep it a simple rule and a useful value).

For this example:

var myTargetLocs: [1..10] locale;
var d = {1..n} dmapped Block({1..n}, targetLocales=myTargetLocs);
// oops, everything is on locale 0

note that making the default here doesn't really help this code much either (if the author's goal is to make a distributed array). And even for this one:

on Locales[1] {
  var l: locale;
  on l {
    // oops I moved to locale 0
  }
}

making it here just changes the comment to "oops, I put this on-clause here for no reason." I.e., I think people writing these patterns are writing near-pointless code with either of these two default values.

I think we can wrap this up. I don't see too many strong opinions. I am moving towards Locales[0] more strongly lately:

on Locales[X] {
  foo();
}
// ...
// somewhere far far away in the code hierarchy
// ...
proc foo() {
  var l: locale;
  // I know l is here, so I don't set it
}

I think users potentially relying on here being the default value is a bit too discomforting while reading code. And having a context-independent default value would be better.

I had an offline chat with Michael, and he is fine with Locales[0] being the default. I will move forward with that unless there are any new opinions.
