Zig: Improve block-local error handling

Created on 15 Jun 2020  路  13Comments  路  Source: ziglang/zig

Motivation

Consider the following example from my code:

fn addArgument(wl_client: ?*c.wl_client, wl_resource: ?*c.wl_resource, arg: ?[*:0]u8) callconv(.C) void {
    const self = @ptrCast(*Self, @alignCast(@alignOf(*Self), c.wl_resource_get_user_data(wl_resource)));
    const id = c.wl_resource_get_id(wl_resource);
    const allocator = self.server.allocator;

    const owned_slice = std.mem.dupe(allocator, u8, std.mem.span(arg.?)) catch {
        c.wl_client_post_no_memory(wl_client);
        return;
    };

    self.args_map.get(id).?.value.append(owned_slice) catch {
        c.wl_client_post_no_memory(wl_client);
        allocator.free(owned_slice);
        return;
    };
}

The error handling here requires code duplication if I want to keep things in one function. I can't preform the error handling at the callsite either since this function is a callback being passed to a C library. The amount of code being duplicated for error handling is not too large in this case, but it is easy to imagine a case in which it could grow larger. My only option to avoid this code duplication is to create a helper function:

fn addArgumentHelper(wl_client: ?*c.wl_client, wl_resource: ?*c.wl_resource, arg: ?[*:0]u8) !void {
    const self = @ptrCast(*Self, @alignCast(@alignOf(*Self), c.wl_resource_get_user_data(wl_resource)));
    const id = c.wl_resource_get_id(wl_resource);
    const allocator = self.server.allocator;

    const owned_slice = try std.mem.dupe(allocator, u8, std.mem.span(arg.?));
    errdefer allocator.free(owned_slice);
    try self.args_map.get(id).?.value.append(owned_slice);
}

fn addArgument(wl_client: ?*c.wl_client, wl_resource: ?*c.wl_resource, arg: ?[*:0]u8) callconv(.C) void {
    addArgumentHelper(wl_client, wl_resource, arg) catch {
        c.wl_client_post_no_memory(wl_client);
        return;
    };
}

Doing this for every callback I wish to register gets annoying and feels like unnecessary boilerplate.

Proposal

EDIT: the original proposal isn't great, see @SpexGuy's comment for the much better revised version.

To summarize the revised proposal:

  1. Fix peer type resolution so it correctly infers error union block types if a block is broken with both an error and another value.
  2. Modify the semantics of errdefer to mean "runs if the containing block yields an error to any outside block". Both returning and breaking with error count as yielding an error to an outside block, as long as the evaluated type of the outside block is an error union. (this is to stay consistent with the current behavior if a function returns an error value but not an error union).
  3. Introduce the keyword try_local, used as try_local :blk foo(); which is sugar for
    foo() catch |e| break :blk e;. As mentioned in a comment below, the try keyword could instead be overloaded to have the described behavior if given a label as in try :blk foo().

I think both 1 and 2 are unambiguously good changes. The addition of a new keyword for 3. may not be worth it, though it would certainly be welcome in the examples I've presented here.

With the revised proposal, the above example would look like this:

fn addArgument(wl_client: ?*c.wl_client, wl_resource: ?*c.wl_resource, arg: ?[*:0]u8) callconv(.C) void {
    const self = @ptrCast(*Self, @alignCast(@alignOf(*Self), c.wl_resource_get_user_data(wl_resource)));
    const id = c.wl_resource_get_id(wl_resource);
    const allocator = self.server.allocator;

    blk: {
        const owned_slice = try_local :blk std.mem.dupe(allocator, u8, std.mem.span(arg.?));
        errdefer allocator.free(owned_slice);
        try :blk self.args_map.get(id).?.value.append(owned_slice);
    } catch {
        c.wl_client_post_no_memory(wl_client);
        return;
    };
}


Original proposal

Doing this for every callback I wish to register gets annoying and feels like unnecessary boilerplate. As an alternative, I propose allowing catch to be used on blocks, which would allow expressing this control flow like so:

fn addArgument(wl_client: ?*c.wl_client, wl_resource: ?*c.wl_resource, arg: ?[*:0]u8) callconv(.C) void {
    const self = @ptrCast(*Self, @alignCast(@alignOf(*Self), c.wl_resource_get_user_data(wl_resource)));
    const id = c.wl_resource_get_id(wl_resource);
    const allocator = self.server.allocator;
    {
        const owned_slice = try std.mem.dupe(allocator, u8, std.mem.span(arg.?));
        try self.args_map.get(id).?.value.append(owned_slice);
    } catch {
        c.wl_client_post_no_memory(wl_client);
        return;
    };
}

Catch blocks have simple rules:

fn add(a: i32, b: i32) !i32 {
    return error.CantMath;
}

fn div(a: i32, b: i32) !i32 {
    if (b == 0) return error.DivZero;
    return error.CantMath;
}

fn sub1(a: i32) !i32 {
    if (a <= 0) return error.Oops;
    return a - 1;
}

pub fn main() !void {
    const a = 1;
    const b = 5;

    // The catch expression must evaluate to the same type as the block
    const out = blk: {
        const c = try add(a, b);
        break :blk try div(a, c);
    } catch |err| switch (err) {
        error.CantMath => 42,
        error.DivZero => "thing", // compile error
    };

    // Otherwise it must be noreturn
    const out = blk: {
        break :blk try add(a, b);
    } catch return;

    // catch blocks may be nested, errors handled with try are always caught
    // by the innermost catch block, or returned from the function if not inside
    // a catch block.
    {
        const aa = try sub1(a); // caught by outer handler

        const aaa = blk: {
            break :blk try sub1(aa); // caught by inner handler
        } catch |err| {
            std.debug.warn("inner handler: {}", .{err});
        };

    } catch |err| {
        std.debug.warn("outer handler: {}", .{err});
    };

    bb = try sub1(b); // error is returned from main()
}

Consider this an alternative to #5421 solving some of the same issues in a way that is in my opinion more consistent with the rest of the language.

Drawbacks

The major disadvantage to this proposal is that it changes the behavior of the try keyword based on the scope it is used in. try is currently syntactic sugar for catch |err| return err; and changing this complicates things. However I think the proposed semantics of try are straightforward enough.

proposal

Most helpful comment

It's a little more typing, but there is another option not considered here, which is the desugared form of your suggestion. Unfortunately it doesn't work. The errdefer clauses here generate no code, I guess because the block doesn't actually return.

fn addArgument(wl_client: ?*c.wl_client, wl_resource: ?*c.wl_resource, arg: ?[*:0]u8) callconv(.C) void {
    const self = @ptrCast(*Self, @alignCast(@alignOf(*Self), c.wl_resource_get_user_data(wl_resource)));
    const id = c.wl_resource_get_id(wl_resource);
    const allocator = self.server.allocator;

    @as(error{OutOfMemory}!void, blk: {
        const owned_slice = std.mem.dupe(allocator, u8, std.mem.span(arg.?)) catch |err| break :blk err;
        errdefer allocator.free(owned_slice); // never runs :( 

        const owned2 = std.mem.dupe(allocator, u8, std.mem.span(arg.?)) catch |err| break :blk err;
        errdefer allocator.free(owned2); // never runs :(

        self.args_map.get(id).?.value.append(owned_slice) catch |err| break :blk err;
    }) catch {
        c.wl_client_post_no_memory(wl_client);
        return;
    };
}

Having the try keyword default to blocks is definitely undesirable, since nesting blocks is common. Defaulting it to the nearest catch block is also undesirable, because you may want some errors to break the local block and others to break the function. I would propose the following modifications:
1) Introduce the new keyword try_local, used as try_local :blk doThing();. It desugars to doThing() catch |err| break :blk err;.
2) Fix peer type resolution so it correctly infers error{OutOfMemory}!void as the result type of blk in the above example, in order to remove the need for @as. I'm not sure exactly what this entails, there might be problems with this.
3) Modify the semantics of errdefer to mean "runs if the containing block yields an error to any outside block". Both returning and breaking with error count as yielding an error to an outside block, as long as the evaluated type of the outside block is an error union. (this is to stay consistent with the current behavior if a function returns an error value but not an error union).

With these changes, there's no longer anything special about the catch block. It's just a block that evaluates to an error union and a normal catch clause to handle the error, plus some syntactic sugar to break with an error and a guarantee that errdefers will run when doing so.

All 13 comments

I realized after posting this that my code is actually leaking memory if the append() call fails. To fix this, I first reached for errdefer but realized that it cannot be used here as an error is not being returned from the function. This means manual cleanup is necessary which is more error prone especially in more complex cases than this one.

fn addArgument(wl_client: ?*c.wl_client, wl_resource: ?*c.wl_resource, arg: ?[*:0]u8) callconv(.C) void {
    const self = @ptrCast(*Self, @alignCast(@alignOf(*Self), c.wl_resource_get_user_data(wl_resource)));
    const id = c.wl_resource_get_id(wl_resource);
    const allocator = self.server.allocator;

    const owned_slice = std.mem.dupe(allocator, u8, std.mem.span(arg.?)) catch {
        c.wl_client_post_no_memory(wl_client);
        return;
    };
    // errdefer allocator.free(owned_slice);  <-- wouldn't get called as no error is returned

    // say there was a third call that could also OOM
    const owned2 = std.mem.dupe(allocator, u8, std.mem.span(arg.?)) catch {
        c.wl_client_post_no_memory(wl_client);
        allocator.free(owned_slice); // instead we must manually clean up
        return;
    };

    self.args_map.get(id).?.value.append(owned_slice) catch {
        c.wl_client_post_no_memory(wl_client);
        allocator.free(owned_slice); // instead we must manually clean up
        allocator.free(owned2); // both slices, wish i could used errdefer here
        return;
    };
}

With the proposed catch blocks, we can use errdefer again!

fn addArgument(wl_client: ?*c.wl_client, wl_resource: ?*c.wl_resource, arg: ?[*:0]u8) callconv(.C) void {
    const self = @ptrCast(*Self, @alignCast(@alignOf(*Self), c.wl_resource_get_user_data(wl_resource)));
    const id = c.wl_resource_get_id(wl_resource);
    const allocator = self.server.allocator;

    {
        const owned_slice = try std.mem.dupe(allocator, u8, std.mem.span(arg.?));
        errdefer allocator.free(owned_slice); // runs if the block is broken with an error 

        const owned2 = try std.mem.dupe(allocator, u8, std.mem.span(arg.?));
        errdefer allocator.free(owned2); // runs if the block is broken with an error

        try self.args_map.get(id).?.value.append(owned_slice);
    } catch {
        c.wl_client_post_no_memory(wl_client);
        return;
    };
}

I think this strengthens the argument for the proposed syntax.

It's a little more typing, but there is another option not considered here, which is the desugared form of your suggestion. Unfortunately it doesn't work. The errdefer clauses here generate no code, I guess because the block doesn't actually return.

fn addArgument(wl_client: ?*c.wl_client, wl_resource: ?*c.wl_resource, arg: ?[*:0]u8) callconv(.C) void {
    const self = @ptrCast(*Self, @alignCast(@alignOf(*Self), c.wl_resource_get_user_data(wl_resource)));
    const id = c.wl_resource_get_id(wl_resource);
    const allocator = self.server.allocator;

    @as(error{OutOfMemory}!void, blk: {
        const owned_slice = std.mem.dupe(allocator, u8, std.mem.span(arg.?)) catch |err| break :blk err;
        errdefer allocator.free(owned_slice); // never runs :( 

        const owned2 = std.mem.dupe(allocator, u8, std.mem.span(arg.?)) catch |err| break :blk err;
        errdefer allocator.free(owned2); // never runs :(

        self.args_map.get(id).?.value.append(owned_slice) catch |err| break :blk err;
    }) catch {
        c.wl_client_post_no_memory(wl_client);
        return;
    };
}

Having the try keyword default to blocks is definitely undesirable, since nesting blocks is common. Defaulting it to the nearest catch block is also undesirable, because you may want some errors to break the local block and others to break the function. I would propose the following modifications:
1) Introduce the new keyword try_local, used as try_local :blk doThing();. It desugars to doThing() catch |err| break :blk err;.
2) Fix peer type resolution so it correctly infers error{OutOfMemory}!void as the result type of blk in the above example, in order to remove the need for @as. I'm not sure exactly what this entails, there might be problems with this.
3) Modify the semantics of errdefer to mean "runs if the containing block yields an error to any outside block". Both returning and breaking with error count as yielding an error to an outside block, as long as the evaluated type of the outside block is an error union. (this is to stay consistent with the current behavior if a function returns an error value but not an error union).

With these changes, there's no longer anything special about the catch block. It's just a block that evaluates to an error union and a normal catch clause to handle the error, plus some syntactic sugar to break with an error and a guarantee that errdefers will run when doing so.

I think your changes make a lot of sense, they are much more tightly defined than the original proposal and result in more explicit code without becoming cumbersome. They also stay closer to the current semantics of the language.

With your proposed amendments we get the following which I find quite nice:

fn addArgument(wl_client: ?*c.wl_client, wl_resource: ?*c.wl_resource, arg: ?[*:0]u8) callconv(.C) void {
    const self = @ptrCast(*Self, @alignCast(@alignOf(*Self), c.wl_resource_get_user_data(wl_resource)));
    const id = c.wl_resource_get_id(wl_resource);
    const allocator = self.server.allocator;

    blk: {
        const owned_slice = try_local :blk std.mem.dupe(allocator, u8, std.mem.span(arg.?));
        errdefer allocator.free(owned_slice);

        const owned2 = try_local :blk std.mem.dupe(allocator, u8, std.mem.span(arg.?));
        errdefer allocator.free(owned2);

        try_local :blk self.args_map.get(id).?.value.append(owned_slice);
    } catch {
        c.wl_client_post_no_memory(wl_client);
        return;
    };
}

How do you get case-specific error handling on one of the calls, and also invocation of the common error handler?

Or get case-specific error handling that aborts the block but doesn't invoke the common handler?

In #5421 I called for a reconsideration of Zig error handling semantics, to find ones simpler and more flexible. The above proposes an additional mode, further complicating them.

Also, a more general purpose (and perhaps simpler) path to the above feature is to allow non-local returns within nested anonymous functions. I imagine that's been considered for Zig already?

How do you get case-specific error handling on one of the calls, and also invocation of the common error handler?

Or get case-specific error handling that aborts the block but doesn't invoke the common handler?

Both of these are possible, with @SpexGuy's modifications the changes to the semantics of zig are minimal.
```zig
fn addArgument(wl_client: ?c.wl_client, wl_resource: ?c.wl_resource, arg: ?[:0]u8) callconv(.C) void {
const self = @ptrCast(
Self, @alignCast(@alignOf(*Self), c.wl_resource_get_user_data(wl_resource)));
const id = c.wl_resource_get_id(wl_resource);
const allocator = self.server.allocator;

blk: {
    const owned_slice = std.mem.dupe(allocator, u8, std.mem.span(arg.?)) catch |err| {
        // case-specific error handling
        break  :blk err; // Break the block with the err, the catch on the block is invoked.
    }
    errdefer allocator.free(owned_slice);

    const owned2 = std.mem.dupe(allocator, u8, std.mem.span(arg.?)) catch |err|
        // do something
        break  :blk; // Break the block with value void, catch on the block is not invoked.
    }
    errdefer allocator.free(owned2);

    try_local :blk self.args_map.get(id).?.value.append(owned_slice);
} catch {
    c.wl_client_post_no_memory(wl_client);
    return;
};

}

The new title isn't very readable. Maybe that should be the summary in the text?

The title is something like "let catch clause apply to a block" or "shared error handlers with try_local and catch".

If errdefer works inside blocks, then defer would as well. If there's try_local, then return_local also belongs.

defer has always been block scope. return_local already exists, but it's spelled break.

Ah, but that doesn't apply to if/else/for blocks?

If Zig gets try_local, it should use return_local instead of break in blocks.
Also try_block might be more descriptive.

Ah, but that doesn't apply to if/else/for blocks?

No, it does apply. There's nothing special about if/else/for/while blocks in zig.

As an alternative to introducing a new keyword for try_local, the try keyword could be overloaded. When used with no label it would have the current behavior and desugar to

catch |e| return e;

but when given a label as in try :blk foobar(); it would desugar to

catch |e| break :blk e;

I'm not convinced that this is a good idea as it introduces some inconsistency to try. However, it does avoid adding a new keyword and I don't think it causes any problems semantically. It's certainly worth considering.

Sorry this is off topic, but if defer is tied to a block, I can't use this common Go pattern?

fn f(i: *std.fs.file) !void {
   if (!i) {
      i = try std.fs.open(...)
      defer i.close() // same as .close() without defer here?
   }
   const len = try i.read(...)
}

Apologies for any syntax glitches.

Correct, that pattern doesn't behave the same way in Zig. This is the equivalent Zig code:

const is_default_file = i == null;
if (is_default_file) { i = try open(...); }
defer if (is_default_file) i.close();
// use i

I am all for overloading try and modifying errdefer accordingly. I'd like to point out, we already overload break and continue within loops in the same way; also, it isn't immediately obvious where an unadorned try_local would return to.

Was this page helpful?
0 / 5 - 0 ratings