const std = @import("std");
const Thing = struct {
initial_sample: ?usize,
};
pub fn main() void {
var thing = Thing {
.initial_sample = 3,
};
if (thing.initial_sample) |t| {
std.debug.warn("before: {}\n", t); // prints 3
thing.initial_sample = null;
std.debug.warn("after: {}\n", t); // prints 0
}
}
This should print 3 both times. Also,
if (thing.initial_sample) |t| {
std.debug.warn("before: {}\n", t); // prints 3
thing.initial_sample = 2;
std.debug.warn("after: {}\n", t); // prints 2
}
I think this broke some time in the past month. (It manifested as a noticeable runtime bug in my game project.)
Not a bug. After result-location/copy-elision merges |t| is no longer distinct memory from thing.initial_sample. The old behaviour of stack-alloc-and-copy is now elided.
I always thought that
if (thing.initial_sample) |t| {
// ...
}
is shorthand for:
if (thing.initial_sample != null) {
const t = thing.initial_sample.?;
// ...
}
Is that not the case? If not, it would make sense to update the documentation to call this out explicitly I think.
Intuitively, I would expect the payload contents to be immutable due to their similarity with function parameters, which have this behavior.
I think the root cause for confusion is that the syntax for writing to Optionals makes it look like a regular assignment is happening, when it's in fact a write to a pointer location.
See the following program, this time the function parameter changes:
const std = @import("std");
pub fn noSurprise(x: u32) void {
std.debug.warn("x={}\n", x); // 42
global_x = 420;
std.debug.warn("x={}\n", x); // 42
}
pub fn surprise(x: ?u32) void {
std.debug.warn("x={}\n", x); // 42
global_x_opt = 420;
std.debug.warn("x={}\n", x); // 420 (???)
}
var global_x: u32 = 42;
var global_x_opt: ?u32 = 42;
pub fn main() void {
noSurprise(global_x);
surprise(global_x_opt);
}
@zimmi how is that really any different from:
const std = @import("std");
// just some struct that has more than primitive size
const Foo = struct {
one: u64,
two: [10]u64,
};
var global_foo: Foo = Foo{ .one = 1, .two = undefined };
fn doit(foo: Foo) void {
std.debug.warn("one={}\n", foo.one);
global_foo.one = 99;
std.debug.warn("one={}\n", foo.one);
}
pub fn main() void {
doit(global_foo);
}
How can t (in my example) not be distinct memory? It has type usize as far as I can tell, how can a non-pointer be aliasing something?
Also, in my first example I set the original optional to null and now the unwrapped value was showing 0. This seems to be exposing some arbitrary implementation detail about how optionals are modelled.
If zig had perfect runtime safety, then this line would crash:
global_x_opt = 420;
With something like "illegal aliasing detected".
There are a couple aliasing issues open to investigate if this safety can be provided.
If the current behavior is desired one must make the parameter a const pointer. If a copy is desired one must make a copy explicitly. As is, the code is illegal.
I was responding to @zimmi , but the same thing applies to the OP.
There is a separate thing to consider here which is whether or not an optional can be a type that is always copied for parameters and capture values. A proposal for this would be reasonable.
@dbandstra, currently:
?usize is codegen to llvm-ir as a struct { i64, i1 } where the first element[0] is usize (the payload) and element[1] is flag indicating if optional has value or null.
t at the source level is a lexical binding to the payload. They are one and the same... an alias. Compare addresses and this is true: &t == &thing.initial_sample.
The reason (which Andrew mentioned is illegal but not yet detected by compiler) setting optional to null results in changing t to 0 is because the struct is re-initialized to null by copying from a global constant. This global constant { undef, false } represents the struct in its null form -- undef is initialized to zero in bss segment, so zero is what you get, and false means it has no value and is semantically null. Technically, if semantically null, the payload should never be examined but anyways that's why it is zero.
@mikdusan
You are right, it's the same behavior. Without knowing the greater context of the function, the field access in your example makes it at least clear that the assignment might have side effects. One has to know the type of global_x_opt to make that connection.
@andrewrk
Makes sense, thanks. Alias detection (if possible) would also solve the issue of OP I think.
In the OP, it might be reasonable for zig to make a copy of the integer. But imagine if it was ?T where T was a very large struct. This is the thing where Zig has "by-value-or-maybe-reference" parameter passing. It might be reasonable to add a semantic rule that, if the payload type of an optional is a primitive, then if unwrapping it will make the capture value a copy.
I haven't used Zig for very long and this might be heresy, but: Why not have call by value as the default and require opt-in from the user when call by reference is desired? That's a simpler rule than having to remember the rules for each data type.
Why not have call by value as the default and require opt-in from the user when call by reference is desired? That's a simpler rule than having to remember the rules for each data type.
I'd say it's fair to call that the null hypothesis. Zig is still experimenting with this other thing which is meant to improve semantics & performance by having parameter passing make fewer guarantees. Whether or not this experiment is considered a success will be determined by the results of the aliasing research (issues #1108 and #476), how often issues like this one come up, and how much the performance is actually impacted (yet to be measured). Already taken into account is how people will make parameters pointers in C out of performance paranoia.
I still want to consider this issue but I'm not going to introduce any changes based on it for 0.5.0.
I ran into the issue of the unwrapped optional getting mutated yesterday and wondered why it crashed:
var headOfLinkedList : ?*Node = …;
…
while(headOfLinkedList) |node| {
headOfLinkedList = node.next;
allocator.destroy(node); // this will actually try to destroy the new `headOfLinkedList`
}
I think it's better to have node be a copy here, as it will prevent any aliasing-related errors and writing back to the original optional will not affect a const variable. But i understand that this may be problematic with result-location-semantics...
I ran into this again[1], realizing it only after spending half an hour trying to create a minimal test case thinking my problem was coming from some unrelated compiler bug :man_facepalming:
I think, maybe even if it "makes sense" being an alias the way it is now, you have to consider the experience of newbies coming from other languages. I don't know exactly why, but I don't intuitively expect this behaviour[2] - it came as a complete surprise again. If this trips up a lot of people it could harm language adoption.
[1] [Here](https://github.com/dbandstra/oxid/commit/715c4ccedab7dff87611f96b76a330aa7be7a7bb#diff-fb8ea9ae37155951f62d18833972ee2bR333-R345). Left side is my workaround after not figuring it out earlier, right side is my fix. I had previously been trying self.rebinding = null; at the top of the block.
[2] Then again I don't have an intuitive sense of how optionals work at all in Zig. Neither do I understand the result location thing much. So take my comment with a grain of salt.
Most helpful comment
I'd say it's fair to call that the null hypothesis. Zig is still experimenting with this other thing which is meant to improve semantics & performance by having parameter passing make fewer guarantees. Whether or not this experiment is considered a success will be determined by the results of the aliasing research (issues #1108 and #476), how often issues like this one come up, and how much the performance is actually impacted (yet to be measured). Already taken into account is how people will make parameters pointers in C out of performance paranoia.