Zig: audit use of %% prefix operator in standard library

Created on 1 Oct 2017 · 20Comments · Source: ziglang/zig

e.g. https://github.com/zig-lang/zig/blob/5e6fc94b7f43dda028de779a874f12591fa0a50d/std/build.zig#L89

The %% prefix operator is supposed to mean that something is impossible, not that it's an error to happen. The above link is possible at runtime if the user passes in long build_root and cache_root parameters, and then the allocator runs out of memory while concatenating them. That's not impossible; that's an error.

I'm generally uneasy about %%foo() being easier to type than %return foo(). It seems that %return foo() should be more commonly used than %%foo(), so why is it harder to type? If the author isn't certain whether an error is possible or not, then use %return (or do more research and become certain). Only when the author is certain despite the Zig type system that an error is not possible is it appropriate to use the %% prefix operator.

Should we remove %%foo() and just let the user type foo() %% unreachable instead?

bug

Source

thejoshwolfe

Most helpful comment

The other reason I used %% in std/build.zig is that, at the end of the day, if we run out of memory, we're going to abort the build and crash. Crashing earlier with %% is easier to debug when it happens.

But %% isn't a crash; it's undefined behavior. In safety mode, it's a crash, but does that mean that build.zig is designed to only work in safety mode? Does it rely on debugging facilities in normal operation? If so, it's not setting a very good example of how to write idiomatic Zig code.

Maybe you kind of want undefined behavior or a crash if printf fails.

I'll bring this up again. git log | less vs svn log | less. In both cases, git or svn can have a failure when doing a printf, because less closed the stdio stream when the user finished reading the history. git simply exits in this case, and svn prints a stacktrace assuming that something has gone wrong.

Even printf failures are something the user should think about. I don't like recommending %% for printf when it really is possible for it to fail in realistic circumstances.

thejoshwolfe on 1 Oct 2017

👍3

All 20 comments

Here's hello world with this proposal:

const io = @import("std").io;

pub fn main() -> %void {
    io.stdout.printf("Hello, world!\n") %% unreachable;
}

The code in std/build.zig is a proof of concept. The bug of a long build_root and cache_root parameter is fundamentally present until this TODO is fixed:

https://github.com/zig-lang/zig/blob/5e6fc94b7f43dda028de779a874f12591fa0a50d/std/special/build_runner.zig#L45-L47

See #511 which I just opened.

Until that is done, %%os.path.relative(allocator, build_root, cache_root) can't really be improved.

I think that your proposal of removing the operator is interesting, can you file a separate issue for that which is labeled "proposal" and not "bug"?

andrewrk on 1 Oct 2017

I don't understand how printf is unable to fail in hello world. Wouldn't the following be better:

const io = @import("std").io;

pub fn main() -> %void {
    %return io.stdout.printf("Hello, world!\n");
}

thejoshwolfe on 1 Oct 2017

👍1

Hmm, maybe that is better.

andrewrk on 1 Oct 2017

The bug of a long build_root and cache_root parameter is fundamentally present until this TODO is fixed

Are you saying the solution to allocation failure is to ensure memory allocation will always succeed? And that means effectively assuming we have infinite memory, which I'm pretty sure is one of the antipatterns Zig was designed to combat.

thejoshwolfe on 1 Oct 2017

The other reason I used %% in std/build.zig is that, at the end of the day, if we run out of memory, we're going to abort the build and crash. Crashing earlier with %% is easier to debug when it happens.

Related, I think it might be interesting to explore if we can have the ability to collect a stack trace very quickly, and at least in debug mode and maybe also in release-safe mode, collect stack traces when an error is returned from a function, and attach them to the error. Then if you had to debug an error when it was passed around via %return you'd have a stack trace for why the error was originally returned.

andrewrk on 1 Oct 2017

I remember one reason %return io.stdout.printf("Hello, world!\n"); isn't better.

In release fast mode, whereas in C you'd have a straightforward series of calls to printf, in zig you'd have a branch for each one. Maybe you kind of want undefined behavior or a crash if printf fails.

Maybe.

In practice I think printing to stdout or stderr is not a bottleneck, and the branches would be fine, especially with #84.

Also, maybe we don't aim the shotgun at one's feet automatically. Users can always make their own shotgun with:

pub fn out(comptime format: []const u8, args: ...) {
    io.stdout.printf(format, args) %% unreachable;
}

andrewrk on 1 Oct 2017

One more reason %% is useful, maybe we can solve this problem:

test "aoeu" {
    %%foo();
}

error ItBroke;

fn foo() -> %void {
    return error.ItBroke;
}

Test 1/1 aoeu...attempt to unwrap error: ItBroke
/home/andy/dev/zig/build/test.zig:2:5: 0x000000000020346b in ??? (test)
    %%foo();
    ^

test "aoeu" {
    foo() %% unreachable;
}

Test 1/1 aoeu...reached unreachable code
/home/andy/dev/zig/build/test.zig:2:14: 0x000000000020346c in ??? (test)
    foo() %% unreachable;
             ^

The former has a better debugging experience because you get the error string printed. Maybe we can look for this pattern and codegen it the same way.

andrewrk on 1 Oct 2017

The other reason I used %% in std/build.zig is that, at the end of the day, if we run out of memory, we're going to abort the build and crash. Crashing earlier with %% is easier to debug when it happens.

Maybe you kind of want undefined behavior or a crash if printf fails.

Even printf failures are something the user should think about. I don't like recommending %% for printf when it really is possible for it to fail in realistic circumstances.

thejoshwolfe on 1 Oct 2017

👍3

does that mean that build.zig is designed to only work in safety mode

Actually yes. This is a bit of a special case. build.zig is built when someone does zig build and therefore needs to build quickly more than it needs to execute quickly. So debug mode is the best fit for it.

andrewrk on 1 Oct 2017

Ok, so what this means is that we've got two different ways of writing Zig code.

Runtime-appropriate, which assumes that undefined behavior is unacceptable.
Safety-reliant, which assumes that undefined behavior is a crash.

The latter category includes build scripts and tests, where the %% prefix operator is used extensively, and where assert() calls and unreachable statements are supposed to trigger errors instead of just informing the optimizer.

Are we comfortable with this dichotomy? Isn't this a pretty big violation of one-obvious-way?

What if we change tests to return errors to signal failure instead of hitting unreachable statements? And yes, I think attaching stacktraces to errors in safety or debugging mode is a good idea too, regardless of this discussion. Would that unify the two worlds?

thejoshwolfe on 1 Oct 2017

👍2

zig test still has --release-fast where %% is actually undefined behavior. The only difference in test mode is the implementation of std.debug.assert calls @panic instead of using unreachable.

https://github.com/zig-lang/zig/blob/a458ec9998159dcfcbb013c6202bdd123969dfaf/std/debug.zig#L14-L24

andrewrk on 1 Oct 2017

So we want 2 ways to write zig code? Like a project can specify in it's readme that it relies on safety, so you gotta build with safety enabled to use the library properly?

Wouldn't it make a lot more sense for there to only be one way to write zig code?

thejoshwolfe on 1 Oct 2017

There are not 2 ways to write zig code. The %% usage as pointed out in this issue is problematic.

Incremental improvements.

I got it working at all, and now it can be improved.

andrewrk on 1 Oct 2017

I updated the hello world examples, and I made the C version do the same error checking so that the comparison is fair.

andrewrk on 1 Oct 2017

If I understand the solution correctly, tests must always use assert instead of %%, because the latter will invoke UB instead of crashing in release-fast mode?

snocl on 3 Oct 2017

That is correct.

Although maybe this ok, because presumably you already tested in debug mode, and you have actual assertions you want to test as well.

As an alternative to %%foo(), tests can use foo() %% |err| @panic("fail: {}", err).

One could implement a helper in userland:

fn ok(x: var) -> @typeOf(x).ChildType {
    return x %% |err| @panic("test failed: {}", err);
}
test "foo" {
    const cwd = ok(os.getCwd(&debug.global_allocator));
    assert(cwd[0] == '/');
}

andrewrk on 4 Oct 2017

or we change the implicit return type of test "" {} functions to %void, and returning an error signals test failure. then you use %return instead of %%, and it's both a working test and a demonstration of responsible error handling.

and then we remove the special case from assert for test mode and add another function:

fn assertOrError(b: bool) -> %void {
    if (!b) return error.AssertionFailure;
}

then tests call %return assertOrError(x == 5); instead of assert(x == 5);.

now there's no special cases for test mode, and tests look like production code wrt undefined behavior.

there is a functional improvement in this change i'm proposing, which is that it is no longer appropriate for tests to get coverage on assertion failures in the production code. In my opinion assertion failures should be impossible to hit from the api, and so it doesn't make sense to test them.

thejoshwolfe on 4 Oct 2017

Getting rid of the special case in assert I agree is the right thing to do here, so that ReleaseFast tests are testing the same code as would be run in a non-test build.

The only problem I have with %return assertOrError(x == 5) is that you don't get a nice stack trace as you would from a panic. So maybe a smaller first step until we figure out how errors can (sometimes?) have stack traces would be to have assertOrPanic.

andrewrk on 4 Oct 2017

Stack traces are generally useful things, especially in error situations. I'm a little bit nervous going in the direction of attaching stack traces to errors, because it's a slippery slope. What's next? Error messages? How about transplanting the stack trace when an error handler returns a different error?

Seems like an opportunity for innovation, but I don't have any good ideas at the moment.

thejoshwolfe on 5 Oct 2017

My two cents is that writing to a file descriptor can fail at runtime so %% should terminate execution and not be undefined or not be encouraged in the documentation for this use case.

I can see two things being mixed, wanting to ignore (yet still fail early) from a problem that should never really happen, and wanting to get performance from truly impossible situations.