Zig: add syntax to destructure array initialization lists

Created on 26 Sep 2017 · 19 Comments · Source: ziglang/zig

Latest Proposal


This proposal is an alternative to the rejected multiple expression values proposal (#83). It affects inline assembly improvements (#215). It depends on or at least is related to my comment in #346.

  • Add ability for functions to have multiple return values.

```zig
fn div(numerator: i32, denominator: i32) -> i32, i32 {
    return numerator / denominator, numerator % denominator;
}
```

  • If you want an error, it's recommended to use a struct:

```zig
error DivByZero;
const DivResult = struct { quotient: i32, remainder: i32 };
fn div(numerator: i32, denominator: i32) -> %DivResult {
    if (denominator == 0) return error.DivByZero;
    return DivResult {
        .quotient = numerator / denominator,
        .remainder = numerator % denominator,
    };
}
```

  • return statements can have multiple return values:

```zig
fn foo(condition: bool) {
    const x, const y = div(3, 1);
    const a, const b = if (condition) {
        return :this false, 1234;
    } else {
        return :this true, 5678;
    };
}
```

This is not general-purpose tuples. This is multiple assignment and multiple return values.

Real Actual Use Case: https://github.com/zig-lang/zig/blob/cba4a9ad4a149766c650e3f3d71435cef14867a3/std/os/child_process.zig#L237-L246

proposal

All 19 comments

  1. It would make sense to have named return values:

    fn foo() -> int_val : i32, float_val : f32 {
        int_val = 0;
        ...
        if (...) return 10, 3.14;
        ...
        if (...) {
            float_val = 1.0;
            // implicitly converted to return int_val, float_val; compiler makes sure both were set
            return;
        }
        ...
        if (...) {
            // compiler verifies float_val was set
            return 20, float_val;
        }
        int_val += 1;
        ...
        return int_val, float_val; // equivalent of return;
    }

The more return values the more this would help.

Nim does this.

It is similar to proposal:
https://github.com/zig-lang/zig/issues/83#issuecomment-259272218


  1. Would it be possible to use undefined as an "I do not care" return value?
fn foo() -> i32, f32 {
   if (...) return 1, undefined;
   ...
   return 1, 2.2;
}

  1. If a structure is used as the return type, it would be really handy to define it "inline". Otherwise people may place this type somewhere far away from the function definition, increasing confusion and potential for misuse.
fn my_func() -> const my_result_type = struct {foo: i32, bar: i32 } {
    return my_result_type { ...  };
}

var x : my_func.my_result_type  = my_func();

Unnamed variant:

fn my_func() -> struct {foo: i32, bar: i32 } {
    return { ...  }; // compiler knows it returns unnamed struct type
}

// result_type is contextual keyword understood by the compiler
var x : my_func.result_type  = my_func(); 

Here the function acts like a namespace for the return type.

The risk that someone mistakenly uses the type in an inappropriate context is lower.


  1. This should be possible:

```
fn foo() -> i32, f32 { ... }

var bar, baz = foo(); // type inference
```


  1. It may be handy to ignore some return values:

```
fn foo() -> i32, f32 { ... }

var bar, undefined = foo(); // type inference for bar
```

Is it just me, or does the struct + error thing violate the maxim about "Only one obvious way to do things"?

Can you give an example where it's not obvious which thing to do?

The obvious thing (previously, I suppose) if you want to return multiple values is to use a struct.

Now you have two options: struct or multiple returns.

Suppose you have a function that returns two values, and later it evolves to also support errors. Now you have to rewrite the function and all calls to it so that it uses a struct. Maybe after a few times you decide to always use a struct and never use multiple return values.

Suppose the opposite: you use a struct because the function can return errors, but later it gets simplified and there are no errors anymore. Should you refactor it to return multiple values or leave it as-is?

@hasenj: adding a struct increases the number of "high level things" in the system.

One may be tempted to reuse the return structure in different contexts, e.g. as a member in some other structure. This discourages later changes.

The struct definition could be placed far away from its function. (Project rules may require such structuring: first define all constants, then the structures, last the functions.) It gets even better when the struct gives no hint of its intended purpose.

Having a struct also requires one to invent a new name (this could be solved by allowing function_name.return_type).


On the other hand, multiple return values are a very local thing. They have no chance to affect unrelated code. They are always present where they are needed: at the function definition and function invocations, and nowhere else. One is not tempted to extend/reuse them for other purposes.

IMHO it should be preferred to structs/tuples.

> Project rules may require such structuring

The problem here is with arbitrary project rules.

I can also see another problem with multiple return values: it's not clear what is what (Just like with a regular tuple).

fn div(numerator: i32, denominator: i32) -> i32, i32

Without looking at the code, which value is the div and which is the mod?

I shall invoke other items from the zen:

> Reduce the amount one must remember.

> Favor reading code over writing code.

When you get a struct, the field name will clearly denote which item is which.

> Avoid local maximums.

It might be easier to write the function once and use it once or twice. But can you imagine a project full of such functions?

It can be tempting to litter the code with multiple-value returning functions instead of properly defining the data structures that represent the problem and solution one is trying to build.

@hasenj:

>> Project rules may require such structuring [of source file sections]

> The problem here is with arbitrary project rules.

Yes, but this happens and the negative impact could be reduced a bit.

> I can also see another problem with multiple return values: it's not clear what is what (Just like with a regular tuple).

Above I proposed optional named return values (adding the ability to manipulate individual values).

At the call site, each returned value is assigned to a named variable. If one uses the wrong name or the wrong order of names ... well, that's a mistake like any other.

> It might be easier to write the function once and use it once or twice. But can you imagine a project full of such functions?

Yes, I imagine that. Formal project rules kick in with full force:

// Mandatory project header template

//=== constants ===
...
// === types ===
...
// === functions ===
...

More seriously: "too many functions" is a problem that should be solved at a different level, by proper modularity, hiding the details as much as possible.


Multiple named values have their place: when there are only a few of them (a hard limit could be used, or some style guide or compiler check, per project) and when they make intuitive sense ( fn date() -> year : u32, month : u32, day : u32 ).

Structures are good if there's reuse, or if the data get too complex.

In C, people often prefer multiple return values: all those out parameters passed by pointer, instead of defining a return structure. Projects invent rules for where to place these out parameters, and tools are created to catch common bugs. This could happen to Zig too.
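For comparison, the out-parameter style described above can be written in Zig as well. The following is a minimal sketch, not taken from the thread: `divOut` is a hypothetical name, and the syntax assumes a recent Zig release rather than the 2017 syntax used elsewhere in this discussion.

```zig
const std = @import("std");

// Hypothetical helper: returns the quotient and writes the remainder
// through an out parameter, as is common in C APIs.
fn divOut(numerator: u32, denominator: u32, remainder: *u32) u32 {
    remainder.* = numerator % denominator;
    return numerator / denominator;
}

test "out parameter carries the second result" {
    // The caller has to declare storage up front and pass its address.
    var remainder: u32 = undefined;
    const quotient = divOut(7, 2, &remainder);
    try std.testing.expectEqual(@as(u32, 3), quotient);
    try std.testing.expectEqual(@as(u32, 1), remainder);
}
```

Nothing at the call site says which argument is an input and which is an output, which is the kind of ambiguity the project rules mentioned above try to manage.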

What's the difference between a named tuple and a struct?

@hasenj: no, I do not mean named tuple (which can be freely used in other places). I mean:

fn foo() -> ret_val1 : i32, ret_val2 : f64 { ... }

var x, y = foo();

The point is that the ret_val1 : i32, ret_val2 : f64 is tied to this function only, is predictably always at the right place, and does not require unique name.

There is yet another use case for multiple return values: comptime expressions.

Setting a value using comptime is tricky (perhaps I haven't learned enough).

This works:

const x = comptime {
  var i : i32 = 99;
  i += 1;
  i
};

It is a bit clumsy (avoid ; after the last expression, don't forget ; after the closing bracket) and, mainly, it does not allow returning more than one value. Yes, one can define a struct, but this makes the design more complicated than it needs to be.

I imagine something like:

const x, y, z = comptime {
  ...
  i, j , k
};

Some further questions to consider:

  1. Can the multiple return include an error as one of the items?

    error DivByZero;
    fn div(numerator: i32, denominator: i32) -> (i32, i32, error) {
        if (denominator == 0) return (0, 0, error.DivByZero);
        return (numerator / denominator, numerator % denominator, null);
    }
    
  2. Why can't a multiple return value also be wrapped/union-ed with an error value?

    error DivByZero;
    fn div(numerator: i32, denominator: i32) -> %(i32, i32) {
        if (denominator == 0) return error.DivByZero;
        return (numerator / denominator, numerator % denominator);
    }
    

I think the main issue I'm trying to raise is, why can't "tuples" be used outside the context of a function return? It seems like an asymmetry that can cause problems or confusion. One of which is the inability to union the return value with an error.
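For readers on a Zig release that includes tuple destructuring (0.12 or later, to my knowledge), the second question can be sketched roughly as follows. This is an illustration only, not the syntax proposed in this thread: `divRem` is a hypothetical name, and the error set, tuple return type, and `@divTrunc`/`@rem` builtins are current Zig spellings rather than the 2017 ones above.

```zig
const std = @import("std");

// Hypothetical helper: a (quotient, remainder) tuple wrapped in an error union.
fn divRem(numerator: i32, denominator: i32) error{DivByZero}!struct { i32, i32 } {
    if (denominator == 0) return error.DivByZero;
    // Signed division needs an explicit rounding choice in current Zig.
    return .{ @divTrunc(numerator, denominator), @rem(numerator, denominator) };
}

test "destructure a tuple out of an error union" {
    // `try` unwraps the error union; the tuple is then destructured.
    const quotient, const remainder = try divRem(7, 2);
    try std.testing.expectEqual(@as(i32, 3), quotient);
    try std.testing.expectEqual(@as(i32, 1), remainder);
}
```

Here the tuple is an ordinary type, so it can be combined with an error set like any other return type.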

The questions brought up by @hasenj are resolved with #632, and since that's now accepted, I'm going to accept this proposal as well.

Removing accepted label as it conflicts with #208.

This proposal depends on #208 and #287 and would allow something like this:

const S = struct {field: i32};
var s: S = undefined;
var x, const y, s.field = blk: {
    break :blk .{foo(), bar(), baz() + 1};
};
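A self-contained variant of the snippet above, assuming a Zig release with destructuring support (0.12 or later, as far as I know). The bodies of `foo`, `bar`, and `baz` are placeholders invented for this sketch, and `s` is given an initial value so the test has something definite to check.

```zig
const std = @import("std");

const S = struct { field: i32 };

// Placeholder implementations, chosen only so the example has known values.
fn foo() i32 {
    return 1;
}
fn bar() i32 {
    return 2;
}
fn baz() i32 {
    return 3;
}

test "mix new declarations and an existing field in one destructure" {
    var s: S = .{ .field = 0 };
    // Declares `x` (mutable) and `y` (constant), and assigns to `s.field`.
    var x, const y, s.field = .{ foo(), bar(), baz() + 1 };
    x += 10; // `x` is declared with `var`, so it is mutated here
    try std.testing.expectEqual(@as(i32, 11), x);
    try std.testing.expectEqual(@as(i32, 2), y);
    try std.testing.expectEqual(@as(i32, 4), s.field);
}
```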

This would surely be incompatible with externs, since they are declared but not defined (no assignment). But the grammar for variable definitions and extern decls is currently intertwined. One thing I was thinking about in light of #181 is to make extern -- and maybe export too -- its own statement entirely separate from const/var/let.

const a, b = blk: { // definition
    break :blk "hi", "bye";
};
extern y: [*:0]u8; // declaration: no const/var/let, no =

I'm guessing this thread probably isn't the right place to go into great detail about that, so I could open a separate proposal if it's worth exploring.

In further consideration of the := syntax from the comment at https://github.com/ziglang/zig/issues/5076#issuecomment-615024391: maybe I missed it, but I don't see a way to destructure and initialize _with_ coercion, so here's a thought.
[Instead of showing a function, I will just make the rhs look like a magical literal tuple __type__.]

// function syntax
{
    var i: u32 = undefined;
    var j: u64 = undefined;

    // rule: every var-init __must__ have a following `:` token
    // before next lhs destructure result location or end of lhs

    i, j, text := (u32, u64, []u8);
    text:, i, j = ([]u8, u32, u64);

    // compile error: `text` is unknown
    // compile error: `j` would shadow

    text, i, j := ([]u8, u32, u64);

    // coercion
    text: []const u8, i, j = ([]u8, u32, u64);
    i, j, text: []const u8 = (u32, u64, []u8);
}

// global syntax with symmetry
{
    i:, j:, text := (u32, u64, []u8);
    text:, i:, j := ([]u8, u32, u64);

    // coercion
    i:, j:, text: []const u8 = (u32, u64, []u8);
}

Under these semantics, would this work? I think the tuple counts as a semantic copy but I'm not sure if RLS interacts here.

var x, var y := .{ 4, 6 };
assert(x == 4 and y == 6);
x, y := .{ y, x };
assert(x == 6 and y == 4);

(Using := here for destructured assignment to disambiguate these two cases:

var x = .{ 4 }; // x is tuple
var y := .{ 4 }; // y is comptime_int

)

Nope. Result location semantics mean that this is equivalent to:

x = y;
y = x;

so they both become 6.
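The effect of that sequential assignment, and one way to get a real swap regardless of how destructuring is defined, can be sketched as below. `assignInSequence` is a hypothetical helper written for this illustration; `std.mem.swap` is the standard library function.

```zig
const std = @import("std");

// Hypothetical helper that mirrors `x = y; y = x;` from the comment above.
fn assignInSequence(x: *u32, y: *u32) void {
    x.* = y.*; // the original value of x is lost here
    y.* = x.*; // so y receives its own old value back
}

test "sequential assignment versus std.mem.swap" {
    var a: u32 = 4;
    var b: u32 = 6;
    assignInSequence(&a, &b);
    try std.testing.expect(a == 6 and b == 6);

    var c: u32 = 4;
    var d: u32 = 6;
    std.mem.swap(u32, &c, &d); // uses a temporary internally
    try std.testing.expect(c == 6 and d == 4);
}
```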

I do think that a lot has changed since this proposal was accepted though and I think it should be re-evaluated.

Based on my new understanding of Zig's direction (https://github.com/ziglang/zig/issues/3897#issuecomment-738954219), this proposal now seems far from the standard needed to justify a new feature. Without this feature, one can use named fields like this:

fn div(numerator: u32, denominator: u32) struct { quotient: u32, modulus: u32} {
    return .{.quotient = numerator / denominator, .modulus = numerator % denominator};
}

Removing the need to specify these field names is a small improvement that doesn't help as much as other proposals that have been rejected, and fixes an ugliness that is much less problematic than other parts that have been deemed acceptable (i.e. having to add field names to represent multiple values is much less of a problem than having to duplicate code).
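To show what the named-field approach looks like from the caller's side, here is the function above with a small test added. The test is an addition for illustration and assumes a recent Zig release.

```zig
const std = @import("std");

fn div(numerator: u32, denominator: u32) struct { quotient: u32, modulus: u32 } {
    return .{ .quotient = numerator / denominator, .modulus = numerator % denominator };
}

test "call site reads results by field name" {
    const result = div(7, 2);
    // The field names document which value is which at every use.
    try std.testing.expectEqual(@as(u32, 3), result.quotient);
    try std.testing.expectEqual(@as(u32, 1), result.modulus);
}
```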
