Roslyn: Proposal: guard statement

Created on 25 May 2016  路  57Comments  路  Source: dotnet/roslyn

When reviewing the scoping rules for out variables and pattern variables, the LDM discovered that there is sometimes an unfortunate interaction with a pattern of coding that we call the _guard pattern_. The guard pattern is a coding pattern where you test for the exceptional conditions early, handle them, and then bypass the rest of a method by returning, the rest of a loop by continuing, etc. The idea is to not have to deeply nest the "normal" control path. You'd write code like this:

    int parsedValue;
    if (!int.TryParse(s, out parsedValue))
    {
        // handle or report the problem
        // this block does not complete normally
    }

    // use parsedValue here

To take advantage of the out var feature, and avoid having to write the type of the out variable, you'd like to write

    if (!int.TryParse(s, out var parsedValue))
    {
        // handle or report the problem
        return;
    }

    // use parsedValue here

but that doesn't work because parsedValue isn't in scope in the enclosing statement. Similar issues arise when using pattern-matching:

    Tuple<string, int> expression = ...;
    if (!(expression is @(string key, int value))
    {
        // either the tuple was null or the string was null.
        // handle the problem
        continue;
    }

    // use key, value here

(The @ here assumes that tuple patterns are preceded by an @ character)

but that doesn't work because key and value aren't in scope after the if statement.

Swift addresses this category of issue by introducing the guard statement. Translating that into the concepts we've been proposing for C# with tuples and pattern-matching, the equivalent construct would be a new kind of _statement_:

statement
    : guard_statement
    ;
guard_statement
    : 'guard' '(' expression ')' 'else' embedded_statement
    ;

Open Issue: should we require parens around the expression as part of the syntax?

A guard statement is like an inverted if statement, or an if statement without a "then" block. It has the following semantics:

  1. The _expression_ of a _guard_statement_ must be a boolean expression.
  2. The _embedded_statement_ is reached when the _expression_ is false.
  3. The statement following the _guard_statement_ is reached when the _expression_ is true.
  4. The _embedded_statement_ may not complete normally.
  5. Any _pattern variables_ and _out variables_ declared in _expression_ are in the scope of the block enclosing the _guard_statement_.

This would allow you to handle the previous examples as follows:

    guard (int.TryParse(s, out var parsedValue))
    else {
        // handle or report the problem
        return;
    }

    // use parsedValue here
    Tuple<string, int> expression = ...;
    guard (expression is @(string key, int value))
    else {
        // either the tuple was null or the key was null.
        // handle the problem
        continue;
    }

    // use key, value here

/cc @dotnet/ldm

Area-Language Design Language-C# New Language Feature - Out Variable Declaration New Language Feature - Pattern Matching New Language Feature - Tuples

Most helpful comment

Why not just:

``` c#
guard (expr)
statement

?

The 'else' reads _super_ weird to me.  

With the above i can then write:

``` c#
guard (n != null)
    throw new ArgumentNullException(n)

Or

``` c#
guard (int.TryParse(out var v)) {
return;
}

// use v
```

All 57 comments

I'm thinking the keyword 'assert' would read better here than 'guard'. I'm not sure what guard is guarding.. it seems like it is named after the jargon-y term named after the common pattern of writing code. And I transpose the u and the a when typing guard all the time. Where as 'assert' is nice because it reads like I'm making a claim about the expression being true and implies that there is nothing special to be done when it is so.

@mattwar An assertion is making a claim about the expression, but the guard doesn't make such a claim. It _tests_ whether the condition is true or not, expecting that it could be either, and provides code to handle the "not true" case and "get out of there".

Why not just:

``` c#
guard (expr)
statement

?

The 'else' reads _super_ weird to me.  

With the above i can then write:

``` c#
guard (n != null)
    throw new ArgumentNullException(n)

Or

``` c#
guard (int.TryParse(out var v)) {
return;
}

// use v
```

@CyrusNajmabadi That sounds like Perl's unless statement

@gafter It seems to work just like a Debug.Assert expression, except it not debug and I get to say what happens when it fails. That's a strong association that every programmer already has.

Guard just doesn't tell me what it does. It tells me I have to do a web search to find out.

Also, i thnk i like 'unless' as well. it reads nicely. "unless n is not null, throw this thing. unless this int parses, return from what i'm going." 'guard' doesn't immediately translate well in my head.

Plenty of discussion over on #8181 which however does not discuss the scope rules. Also related is #6400 and the let statement in the pattern-matching specification.

Wouldn't it be simpler and more consistent if out var variables _were_ in scope in the enclosing statement?

To me this makes the most sense: out parameters are always set. It's common that they are set to a useful value only under some conditions, but it shouldn't be up to the language to guess what that condition is.

@svick The same issues occur with pattern-matching, and for the same reasons. We're pretty comfortable with the tentative scope rules we have today, but this is one particular use case where they rub us the wrong way.

I want to offer a counter-proposal which I think is simpler and satisfies most of the same scenarios:

_Proposal: Lifting of out variable declarations_. _Whenever a variable is declared with the out modifier, its scope is expanded to the enclosing block._

Example: TryGet

dictionary.TryGet("key", out var value);

// easier than the existing code:
Tuple<Func<string,bool>, string, string> value;
dictionary.TryGet("key", out value);

Example: pattern matching

if (!(o is out string s)) return;
Console.WriteLine(s);

Example: guards

if (!int.TryParse(input, out int i)) throw new ArgumentException(nameof(input));
Console.WriteLine(i);

Unexample: limited scope out varaibles

if (int.TryParse(input, out int i)) { Console.WriteLine(i); }
// Under this proposal, there's no way to limit the scope of "out arguments" to just the 'then' block

if (o is string s) { Console.WriteLine(s); }
// But there is still a way to limit the scope of pattern variables to just the 'then' block

Example: loop

for (out int i=0; i<10; i++)
{
  if (stop_early) break;
}
Console.WriteLine($"We got to {i}");

Example: foreach loop

foreach (out var x in GetEnumeration(out var dummy)) { ... }
// Note that "x" isn't definitely assigned after the foreach (in case the enumeration was empty).
// I just put this example to show that the rules are uniform: the "out" modifier
// always pushes out the variable's scope, regardless if it's the loop variable
// or an out variable

Example: normal variable declaration

void f() {
  out int x = 15;
}
// Although technically meaningful, it has the same effect as just "int x = 15" here,
// and I think it should be disallowed as confusing.

The idea is that in many cases (e.g. patterns, "is" testing, "then" clauses) then there's no need to expand to the enclosing scope. But in many cases (e.g. guards) there is. This proposal gives the user control of whether the variable's scope expands out, in an at-a-glance understandable syntax.

As the "unexample" above shows, there's one corner where this proposal doesn't satisfy: out arguments with limited scope. I think this corner is small enough (and the penalties of omitting this corner small enough) that it's worth the sacrifice to get a simple feature.

This is useful for out variable declaration, but I'd prefer let if I want to match a potentially incomplete pattern.

let (string key, int value) = expression else continue;
guard (int.TryParse(str, out var parsedValue)) else continue;

In both of these examples, the end of _embedded-statement_ must be unreachable, so in my opinion guard and else should be used here. In contrast, unless does not imply such requirement.

I'll also note that guard is a potential alternative mechanism for writing method contracts, because the proposed syntax in #119 just blows up the method signature. Rel: https://github.com/dotnet/roslyn/issues/11308#issuecomment-219213073

The one thing I like about guard is that it serves as a helper for validation logic to ensure that the method does bail out:

private void Foo(string s) {
    if (s == null) {
        Log.Warn("s is null!"); 
        // oops, forgot to return or throw
    }
    int l = s.Length; // boom
}

private void Bar(string s) {
    guard (s == null) else {
        Log.Warn("s is null");
        // compiler error, must use throw, return, break or continue to exit current block
    }
    int l = s.Length;
}

This proposed behavior that pattern and out variables survive into the enclosing scope is a welcome addition.

As for the guard keyword specifically, I don't really care, but I don't think that unless conveys the point well enough that the current block is not allowed to continue if the condition does not match. This fits more into validation. "Guard clause" does seem to be the accepted CS term for this specific form of condition, although outside of Swift (which has nearly identical syntax to this proposal) it seems to relate more to pattern guards.

It seems all we need is to invert the scope of the out var, so why not:

``` C#
if not (int.TryParse(s, out var parsedValue))
{
// handle or report the problem
return;
}

// use parsedValue here

```

In this way, you don't need to introduce a completely new statement, but only an alternative of if: the compound if not. It also solves another minor issue: the ! before the expression is hard to see and usually you need an extra level of parenthesis.

@qrli You're still introducing a new statement and a new keyword. An "inverted if" also doesn't imply that it would force exiting the current block.

@qrli I have request if! syntax in the past

But somehow I think it could possible if we just use else(expression)

Normally we must use else if or else { } plainly. So we could add else() as another way to use it

I'm not native english speaker so if this sound weird I would support unless instead of if not or guard

Doesn't this encourage e.g. pushing argument validation down the call chain?

I would personally like to see method contracts #119 implemented first. The concepts are different of course, I understand that.

Alternatively, how about D-style contracts, but changed so that variables declared in the in contract block are accessible to the body?

@HaloFour
It adds only a contextual keyword instead of a keyword, which is already better.
Exiting can still be enforced by compiler, if necessary. However, I don't think it is necessary to enforce exiting. There is nothing to continue and access them, as out variables are definitively assigned anyway. e.g. I may log the invalid input but continue process with default value.

@gafter Regarding "that doesn't work because key and value aren't in scope after the if statement"... couldn't we fix this by broadening the scope of the variables declared in the if statement? I think this would work because we already have wording in the spec that the variable is only "definitely assigned" if the match returns true.

More formally:


Currently the patterns proposal contains the following wording:

If the pattern appears in the condition of an if statement, its scope is the condition and controlled statement of the if statement, but not its else clause.

I suggest that the scope of the variable is broadened to be as if the variable was declared just before the if statement. This is because we already have the following restriction:

Every _identifier_ of the pattern introduces a new local variable that is _definitely assigned_ after the is operator is true (i.e. _definitely assigned when true_).

So if you write the following, the semantics are already as desired:

if (maybeX is Some x)
{
    // x is in scope and definitely assigned
}
else
{
   // x is in scope, but not definitely assigned (unusable)
}

The way the specification is currently written precludes the possibility of useful code such as:

void Handle(Option<T> maybeX)
{
    if (!(maybeX is Some x)) return;
    // x is now in scope and definitely assigned
}

Or:

if (!(dictionaryFromTheFuture.TryGetValue(key) is Some x))
{
    x = defaultValue;
}

// x is in scope and definitely assigned

And this code from the original post is now valid:

    Tuple<string, int> expression = ...;
    if (!(expression is @(string key, int value))
    {
        continue;
    }

    // use key, value here - they are in scope and definitely assigned

These examples also suggest that an unless (if (!...)) keyword could work quite nicely :grin:

@Porges The reasons for the current scoping rules are, among other things, to help you with code such as a series of if-then-else, so you don't have to invent a new name for each branch.

@gafter hmm, I would have thought that that would be a rarer situation than things like the above. Thanks for the reasoning, though.

Perhaps you should be permitted to shadow a non-definitely-assigned variable? :grinning:

Perhaps you should be permitted to shadow a non-definitely-assigned variable?

We need to know the scoping of the variables (so we know what is being assigned) before figuring out whether they are definitely assigned or not, so that doesn't work.

Maybe we could use return if

``` C#
void Guard(string s)
{
return if(!int.TryParse(s, out var value))
{
// handle or report the problem
// auto return after this block
}

// use value

}

int GuardError(string s)
{
return if(!int.TryParse(s, out var value))
{
// handle or report the problem
// ERROR it must return something
}

// use value

}

int Guard(string s)
{
return if(!int.TryParse(s, out var value))
{
// handle or report the problem
return 0; // need to return value
}

// use value

}

int GuardReturn(string s)
{
// Maybe this syntax
return (0) if(!int.TryParse(s, out var value))
{
// handle or report the problem
}

// use value

}

int GuardReturn(string s)
{
// Or this syntax
return if(!int.TryParse(s, out var value))
{
// handle or report the problem
0; // Auto return last line
}
/* Above code would be transpiled
if(!int.TryParse(s, out var value))
{
// handle or report the problem
return 0;
}
*/

// use value

// So we could
return if(!int.TryParse(s, out var value))
{
    int.Parse(0); // Auto return last line of block
}

}
```

@Thaina I kind of like that syntax, I assume reusing existing keywords is generally preferred to inventing new ones so long as they don't make things more confusing.

Would that syntax support throw as well?

And how about a statement form if returning is all it needs to do?

``` c#
void Guard(string s)
{
return if(!int.TryParse(s, out var value));

// use value

}
```

I think this guard/returnif is just a syntax to enforce error and will be ignore at compile time so everything will work normally include throw

@Thaina
Or use break if

C# break if (!int.TryParse(s, out var value)) { return 0; // or throw new ArgumentException(); }

@qrli I think that interesting. But would it be confused in while or switch block?

Also I like return more if it would auto return on void

Agreed, break can be a valid keyword in the same locations but it is semantically different from return. That could make readability more difficult.

Couldn't this be fixed by simply declaring value before the if statement?

@Miista

That would only handle the scope issues if using out var, but that wouldn't work for type-switch or variable patterns where you are not permitted to assign the result into an existing variable and the scope of the variable is limited to a prescribed scope:

if (!(foo is Bar bar)) {
    // bar is in scope here, but not definitely assigned
}
else {
    // bar is not in scope here
}
// bar is not in scope here either

guard (foo is Bar bar) else {
    // bar is not in scope here
}
// bar is in scope here and is defined

The other feature that guard has is to not permit code to fall out of the scope of the compensating block. It would be a compiler error if the block doesn't throw, return, break or continue.

Trying ensure :eyes:

``` C#
ensure (int.TryParse(s, out var parsedValue)) else return;

ensure (ptr != null) else
{
Environment.FailFast();
return;
}

```

@omariom
I think the issue with ensure is that it reads like part of code contract, while code contract logic may be completely removed in release build.

Just to :bike::house: how about

C# when (foo == null) { return; // throw, break or continue }

That doesn't seem different enough from if to make me think it would behave differently.

@bondsbw Fair enough, I just want to avoid the else clause in some way.

Everyone seems like the idea. But naming is hard..
At least we don't have to invalidate cache :smile:

It is not only about naming.

1st: The guard statement is mostly an inverted if. Without out vars, it is more or less interchangeable with if. It means we will have two ways to do the same thing, which reminds of the headache of C++.

2nd: "Invert if" is a routinely used refactor. With guard, there will be two ways to invert when no out var, and one way to invert to guard when there is out var. This is a bit confusing.

I'd like keep it as simple if if possible. One not-so-good idea is to allow if without body:

c# if (s == null) else { return; }

Why is guards needed? Since the out variable brings that variable into scope. The user can use && or other logic operator in the expression. Which acts as a guard.

``` c#
if( Int.TryParse(s, out x) && (x > 0)

Only case is in `switch` blocks (or `match` expressions), which current does support condition switch statements. This could be provide with a `when` condition.

switch (obj)
{
case obj Is Int64 x when x > 0
// etc
```

@AdamSpeight2008

The guard statement above is not related to pattern guards.

With if and switch the scope of out variables and variable patterns are limited to within the _then-stmt_ or _switch-section_ respectively:

if (int.TryParse(s, out var x)) {
    // x in scope here
}
// x no longer in scope here

As guard is intended to be used for validation where the condition would be negated. If using if in this scenario you could not bail out of the current method while retaining the out variable in scope:

if (!int.TryParse(s, out var x)) {
    Log.Warn("The string is not a number!");
    return;
}
// x no longer in scope here

So guard changes the scoping rules to allow the variable to be used in the enclosing scope:

guard (!int.TryParse(s, out var x)) else {
    Log.Warn("The string is not a number!");
    return;
}
// x is in scope here

And since guard is intended for validation purposes the _embedded-statement_ in the else clause is required to exit the enclosing scope via return, throw, break or continue.

@HaloFour Didn't you forget to write out var x? Changes in the scoping rules apply only to the newly created variables in out argument. In your example x is already in scope.

@stepanbenes That I did, I will fix the examples.

@HaloFour Thank you for explaining the difference but I feel that my point still stands.
Since in the negated case (the proposed usage case).

It would easier to just define the variable to use in the out variable position. In the existing scope.

``` c#
int value = 0;
if( !Int.TryParse( text, out value ) )
{
// log stuff
return ;
}
// do stuff with value

The example of guards proposed assume the usage ( of **out** variables ) will be in a conditional expression like `if` or `switch` etc.

bool result = Int.TryParse( text, out var value );
// value is defined and assigned ( possibly with default(T)
```

Also having guard statements change the scoping rules in this one use case, would get confusing where guards and non guard usage of out variable in the same section.

@ljw1004

Inclusive Scope
The out variable is accessible in the enclosing scope.

``` c#
if (int.TryParse(input, out int i)) { Console.WriteLine(i); }
// Under this proposal, there's no way to limit the scope of "out arguments" to just the 'then' block

**Exclusive scope**
The `out` variable is exclusive to the next enclosed scope, and not the enclosing scope.
I propose that `put` is use to indicate this.

``` c#
if (int.TryParse(input, put int i)) { Console.WriteLine(i); }
// Under this proposal, there's no way to limit the scope of "out arguments" to just the 'then' block

@AdamSpeight2008

It would easier to just define the variable to use in the out variable position. In the existing scope.

For out variables you're right, you can just declare the variable in the parent scope. However, with variable patterns that becomes impossible as the spec explicitly forbids assigning them to existing variables.

bool result = Int.TryParse( text, out var value );
// value is defined and assigned ( possibly with default(T)

Per the out declaration spec, value would not be in scope after that statement.

Also having guard statements change the scoping rules in this one use case, would get confusing where guards and non guard usage of out variable in the same section.

Considering that's mostly the point of guard I'm not sure that would really be confusing. This same feature with very similar syntax already exists in Apple Swift and it seems to be well received.

@HaloFour
Maybe the spec should be changed?
To promoting the out into existing scope, reusing and existing variable, for the implicit case.
As it supports both use cases. Then have some optional way of designating the two explicit cases.
eg using attributes.

Explicitly Local

``` c#
if( Int.TryParse( text, [local] out var value ) )
{
// value is defined and assigned within here.
}
else
{
// value is not define here.
}

**Guard**

``` c#
if( !Int.TryParse( text, [guard] out var value ) )
{
  // value is not defined here.
}
else
{
 // value is defined and assigned

I would go against attribute syntax. It too dirty and complicate

For local scope I think maybe we could add more syntax, maybe out let ?

``` C#
if( Int.TryParse( text, out let value ) )
// value is defined and assigned within here.
else // value is not define here.

// value is not define here ? maybe
```

For guard I still stand with return if
We don't actually want to guard, we actually want it to return for sure. So why not just return

@Thaina It's not about return the value, but into scope(s) should the out var exist in afterwards.
return !if looks likes the function ends and returns there, and the code after is potentially unreachable.

Also I don't think it would go against attribute syntax, as I see the attribute is attached to the out var parameter argument.

@AdamSpeight2008 Yes I know it not about return value
but it about return out of scope so return if is doing just that. And it is a new syntax so we could explain in document that it will do work in if block before return

And about attribute, I would not like it if attribute can cause syntax error IMHO. It should be syntax keyword that could go that far

@Thaina
It's subtle change the prevailing usage or return.
I'd go with a contextual keyword.

c# if( Int.TryParse( text, local out var value ) ) if( !Int.TryParse( text, guard out var value ) )

if( Int.TryParse( text, out let value ) )

or at least if( Int.TryParse( text, out local value ) )

Shorter and cleaner

guard is another problem, it cannot do that syntax. it not parameter specific. it is scope specific

@gafter @HaloFour
I just got a new idea inspired by "Destructuring (assignment?) statement #6400"
The let else statement is very similar to this guard else, except it takes a pattern instead of a conditional expression. So it feels like the pattern expression can be tweaked.

``` c#
let bool(true) = int.TryParse(s, out var parsedValue) else return;
// scope of out var is treated the same as the returned variable, so they are all accessible here

But the pattern part looks a bit weird. I guess the pattern syntax could extended to support function call, though I do not have a concrete idea yet. The ideal imagination for me is:

``` c#
let int.TryParse(s, out var parsedValue) else return;

@qrli Your idea is neat But it not satisfied the proposal to do some work and log something before return or throw

@Thaina Perhaps

let int.TryParse(s, out var parsedValue) else { Log.Something(); return; }

I don't like this idea that variables can start escaping their scope (or what it looks like their scope should be) if certain keywords were used... I think it makes the code more difficult to read.

I'm not sure it deserves a new language keyword / construct just to avoid having to write

int parsedValue;
if (!int.TryParse(s, out parsedValue))
    return;

// Do something with parsedValue

Now that the language has Tuples, why don't we instead argue that methods with out parameters now appear to return Tuples? (FYI this is what FSharp does.) After all, the only reason out parameters exist is because the language didn't have tuples from the beginning. For example, calls to Int32.TryParse would become:

var (success, parsedValue) = int.TryParse(s);

if(!success) return;

// Do something with parsedValue

So much cleaner and easier to understand, especially with pattern matching (if I can take a guess at the syntax:

switch(int.TryParse(s))
{
    case (false, *) : return;
    case (true, parsedValue) : // Do something with parsedValue
}

@Richiban

Per #12597 the conversation is probably mostly moot. Now the scope will _sometimes_ leak out into their enclosing scope based on which constructs you use:

if (!int.TryParse(s, out int parsedValue))
    return;

// use parsedValue here

I personally don't agree with it. It's inconsistent with existing C# behavior, it's inconsistent with itself and it's inconsistent with the rest of the C-family of languages. I'd rather the team did nothing at all and if the user intended to "leak" the scope that they would declare the variable separately, just as you demonstrated.

@HaloFour that's a shame...

By the way,

I personally don't disagree with it

did you mean "I personally don't _agree_ with it"?

@Richiban

did you mean "I personally don't agree with it"?

That I did.

Closing, as this is no longer useful with the new scoping rules.

Was this page helpful?
0 / 5 - 0 ratings