Zig: Proposal: Integer-backed packed struct

Created on 15 Apr 2020  ·  6 Comments  ·  Source: ziglang/zig

So far, Zig doesn't have a good solution for flags/bitfields. Packed structs are the current recommended solution, and they provide a lot of improvements over the standard C approach of defining integer constants. But they are still not quite sufficient, even with #3133. For me, the requirements of a good flags/bitfield type are:
1) Deterministic packing
2) Can be cast to and from an integer type, to support bulk operations (&, |, etc)
3) Well-defined conversion between little-endian and big-endian
4) Guarantees the size of its loads and stores in a predictable way, to be compatible with MMIO registers
5) Behaves as an integer when used as a parameter to an extern function, so as to be compatible with a C ABI that uses flags

Requirements (1) and (2) are already satisfied by packed structs. Requirements (3) and (4) are not satisfied by packed structs yet, but they are something of a package deal: if you define the load/store size, you get deterministic endianness behavior. Ways to do this have been suggested, such as putting a u0 aligned to the required word size at the beginning of a packed struct. (I can't find that suggestion anymore; if anyone knows where it is, please let me know and I'll link it here.) But assuming packed structs will always behave like structs from an ABI standpoint, they may not be able to satisfy requirement (5) for all calling conventions. This can lead to some pretty ridiculous workarounds that are not always feasible.

To solve this, I propose a variant of packed structs which are backed by a specified integer type, similar to enums backed by a specified integer type. Just like enums are integers at the ABI level, integer-backed packed structs are also integers at the ABI level. They are passed in registers across function call boundaries whenever their backing integer type would be, and they are allowed on extern boundaries only when their backing integer type would be. They can be used in atomic operations just like their integer type would be (with the exception of RMW operations like add, sub, etc, similar to enums). Their default alignment matches the alignment of their backing integer type.

My proposed syntax is this:

const ExampleFlags = packed struct(u32) {
    int_flag: u1 = 0,
    bool_flag: bool = false,
    small_value: u3 = 3,
    // Default added so that `ExampleFlags{}` is a valid initializer.
    aligned_value: u8 align(1) = 0,

    const Self = @This();
    // Named `merge` because `union` is a reserved word in Zig.
    pub fn merge(self: Self, other: Self) Self {
        return @bitCast(Self, @bitCast(u32, self) | @bitCast(u32, other));
    }
};
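The size and alignment claims above can be checked directly. The following is a minimal sketch, assuming a Zig release that accepts the `packed struct(u32)` form; the `Flags` type and its field names are hypothetical. Released compilers, as far as I know, require the field widths to add up exactly to the backing width, so the sketch names its reserved bits explicitly.

```zig
const std = @import("std");

// Hypothetical flags type; every bit of the u32 is accounted for.
const Flags = packed struct(u32) {
    ready: bool = false,
    err: bool = false,
    level: u3 = 0,
    _reserved: u27 = 0,
};

test "integer-backed packed struct has the size and alignment of its integer" {
    try std.testing.expect(@sizeOf(Flags) == @sizeOf(u32));
    try std.testing.expect(@alignOf(Flags) == @alignOf(u32));
    try std.testing.expect(@bitSizeOf(Flags) == 32);
}
```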

Integer operators (+, -, &, etc) are not defined for integer packed structs, but @bitCast is allowed to convert to the integer type if you really want to do bulk operations. We could also make a more specialized cast, analogous to @enumToInt. Like normal packed structs or enums, bindable functions can be specified in the struct namespace to facilitate bulk operations at the API level.
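A bulk operation routed through @bitCast might look like the sketch below. It assumes a newer Zig release in which @bitCast takes a single argument and infers its result type (0.11 or later, to my knowledge); the `Flags` type, its fields, and the `merge` helper are illustrative names, not part of the proposal.

```zig
const std = @import("std");

// Hypothetical flags type for illustration; reserved bits are named
// so that no bit of the u8 is left unspecified.
const Flags = packed struct(u8) {
    read: bool = false,
    write: bool = false,
    exec: bool = false,
    _reserved: u5 = 0,

    // Bitwise OR of two flag sets, computed on the backing integer.
    pub fn merge(self: Flags, other: Flags) Flags {
        const a: u8 = @bitCast(self);
        const b: u8 = @bitCast(other);
        return @bitCast(a | b);
    }
};

test "merge combines flags through the backing integer" {
    const rw = (Flags{ .read = true }).merge(.{ .write = true });
    try std.testing.expect(rw.read and rw.write and !rw.exec);
    try std.testing.expect(@as(u8, @bitCast(rw)) == 0b011);
}
```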

Fields are specified from the LSB of the backing integer to the MSB. This definition will always mean that endianness conversion for the flags field is the same as endianness conversion for the backing integer type. If the number of specified bits is fewer than the number available in the backing type, or if there are padding bits due to aligned fields, the value of the extra bits is undefined. If you require these values to be zeroed (e.g. for interop with a C api, MMIO, or sort ordering), you must explicitly add padding fields that are set to zero. If more bits are specified than are available in the backing type, that's a compile error.
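To make the LSB-first rule concrete, here is a small sketch under the same assumptions (single-argument @bitCast and @byteSwap, as in Zig 0.11 or later). `Header` and its fields are hypothetical. It shows that the first field lands in the lowest bits and that an endianness conversion is just a byte swap of the backing integer.

```zig
const std = @import("std");

// Hypothetical 16-bit header; fields are listed from LSB to MSB.
const Header = packed struct(u16) {
    version: u4 = 0, // bits 0..3
    kind: u4 = 0, // bits 4..7
    length: u8 = 0, // bits 8..15
};

test "fields are laid out from LSB to MSB and swap like the integer" {
    const h = Header{ .version = 0x1, .kind = 0x2, .length = 0xAB };
    const raw: u16 = @bitCast(h);
    try std.testing.expect(raw == 0xAB21);
    try std.testing.expect(@bitOffsetOf(Header, "kind") == 4);

    // Endianness conversion is a byte swap of the backing integer.
    const swapped: u16 = @byteSwap(raw);
    try std.testing.expect(swapped == 0x21AB);

    // Swapping back recovers the original field values.
    const back: Header = @bitCast(@byteSwap(swapped));
    try std.testing.expect(back.length == 0xAB and back.kind == 0x2);
}
```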

Integer-backed packed structs are allowed to contain anything that a packed struct can contain, and they follow all of the layout rules of packed structs, as long as the contained elements fit inside the specified number of bits. The backing type must be an actual integer. It cannot be another integer-backed struct or an extensible enum. When necessary, conversions between those types can instead be performed with @bitCast.

When an integer packed struct is nested into another integer packed struct, it behaves like its integer type and is bit-packed, unless an alignment is explicitly specified.
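A nested case could be sketched as follows, again assuming Zig 0.11 or later; `Inner` and `Outer` are hypothetical names. Each `Inner` occupies exactly four bits of the outer integer, with no padding between them.

```zig
const std = @import("std");

// Hypothetical 4-bit inner struct.
const Inner = packed struct(u4) {
    a: u2 = 0, // bits 0..1
    b: u2 = 0, // bits 2..3
};

// Two Inner values bit-packed into one byte.
const Outer = packed struct(u8) {
    lo: Inner = .{}, // bits 0..3
    hi: Inner = .{}, // bits 4..7
};

test "nested integer-backed structs are bit-packed" {
    const v = Outer{ .lo = .{ .a = 1, .b = 2 }, .hi = .{ .a = 3 } };
    // lo = 0b1001 (0x9), hi = 0b0011 (0x3)
    try std.testing.expect(@as(u8, @bitCast(v)) == 0x39);
    try std.testing.expect(@bitOffsetOf(Outer, "hi") == 4);
}
```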

Writing the entire value of a packed struct is guaranteed to be implemented with the same semantics as writing to a value of the integer type. Writing to a field in an integer-backed packed struct is semantically a read-modify-write operation on the entire integer. Whenever the write size is semantically observable (e.g. through atomic or volatile operations), the size of the read and write instructions for an integer packed struct is guaranteed to be the same as that of the backing integer type. As always, if the read/write is not observable, the optimizer is free to do whatever it wants (load/store the smallest size it can, split into multiple stores, combine loads/stores together, ignore all of them and use a register, etc).
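The read-modify-write reading of a field store can be spelled out by hand. The sketch below (Zig 0.11 or later assumed; `Reg` and `setModeManually` are hypothetical) performs one full-width load, a mask and merge, and one full-width store, then checks that the result matches a plain field assignment. For an MMIO register the pointer would be volatile, which this sketch leaves out.

```zig
const std = @import("std");

// Hypothetical 32-bit register layout.
const Reg = packed struct(u32) {
    enable: bool = false, // bit 0
    mode: u3 = 0, // bits 1..3
    _pad: u28 = 0,
};

// Manual equivalent of `reg.mode = new_mode`:
// one full-width load, mask and merge, one full-width store.
fn setModeManually(word: *u32, new_mode: u3) void {
    const mask: u32 = @as(u32, 0b111) << 1;
    const old = word.*;
    word.* = (old & ~mask) | (@as(u32, new_mode) << 1);
}

test "a field write matches a read-modify-write of the whole integer" {
    var reg = Reg{ .enable = true };
    reg.mode = 5;

    var word: u32 = 1; // same starting bits as Reg{ .enable = true }
    setModeManually(&word, 5);

    try std.testing.expect(@as(u32, @bitCast(reg)) == word);
}
```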

Consequently, a pointer to a field of an integer-backed packed struct has the type *align(<parent struct alignment>:<bit offset in integer>:<byte size of integer>) <field type>. So
&(ExampleFlags{}).small_value is of type *align(4:2:4) u3, and
&(ExampleFlags{}).aligned_value is of type *align(4:8:4) u8.

This proposal would provide an explicit solution for #1834, #4056, and #4185, without limiting the capabilities of the more general packed struct. With some extra work, it could also be applied to solve #1761 and #3472.

All 6 comments

The only issue with this is that if your packed struct contains several fields that are structs, or you just have a lot of fields, or you care more that it's packed than about its final size, you have to calculate the bits manually, which would be a bit of a PITA.

> if your packed struct contains several fields that are structs, or you just have a lot of fields, or indeed, you care more that's it's packed than the final size of it

This proposal doesn't remove the current packed struct from the language, so those use cases should be handled by it. This is specifically for when you need guaranteed integer behavior, in order to be compatible with a C api, MMIO, or a particular byte swapping strategy.

Byte swapping a normal packed struct is a bridge we'll have to figure out how to cross eventually, since networking is one of its intended use cases. But the rules for that could be allowed to be quite complicated, since a standard packed struct can be much larger than a single integer. (note though, complicated does not necessarily imply slow in this case.)

@SpexGuy , I see that you referred to #4185. What are your thoughts on it?

I assume there is something you believe this proposal provides that #4185 doesn't?

I think the enumflagset proposal also handles a similar set of use cases, so I think doing either one or the other makes sense. I first considered something like enumflagset, but there are a couple reasons why I prefer this solution:

  • Packed structs can contain multi-bit fields, e.g. a packed enum(u2). This is mostly useful for the MMIO use case but it is also sometimes a thing that C APIs do.

  • Giving a bit more control over an existing construct seems better than adding a new one. This proposal wouldn't need a new union member in @typeInfo(..), and code that uses reflection to inspect fields will "just work" in most cases. You want to make a UI that reflects over your fields and generates editable fields for all your values? Just make it work for packed structs and you're done. Flags aren't a special case.

  • I think the struct-like form is much cleaner than the bitmasking form in many cases:

// struct
foo: Flags = .{};
foo: Flags = .{ .x=true };
foo: Flags = .{ .x=true, .y=true, .z=bool_var };
foo.x = true;
foo.x = false;
foo.x = bool_var;
if (foo.x) { ... }
if (foo.x and foo.y) { ... }

// masking
foo: Flags = 0;
foo: Flags = .X;
foo: Flags = .X | .Y | (if (bool_var) .Z else 0);
foo |= .X;
foo &= ~.X;
foo = (foo & ~.X) | (if (bool_var) .X else 0);
if (foo & .X != 0) { ... }
if (foo & (.X | .Y) == (.X | .Y)) { ... }

I think it might also make sense to make a FlagsMixin class in the standard library that supplies functions for bulk operations (|, &, ~), because these are a bit cumbersome with integer packed structs and it might be better to have one standard set of function names that everyone uses instead of having every library that has a flags struct pick their own names.
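One possible shape for such a helper is sketched below. Everything here is hypothetical: `FlagsMixin`, its function names, and the `Perms` type are illustrations, not an existing standard library API. It assumes Zig 0.11 or later and derives the backing integer from the struct's bit size with `std.meta.Int`.

```zig
const std = @import("std");

// Hypothetical helper: bulk operations for any integer-backed packed struct T.
pub fn FlagsMixin(comptime T: type) type {
    // Unsigned integer with the same bit width as T.
    const Int = std.meta.Int(.unsigned, @bitSizeOf(T));
    return struct {
        pub fn toInt(self: T) Int {
            return @bitCast(self);
        }
        pub fn fromInt(v: Int) T {
            return @bitCast(v);
        }
        pub fn unionWith(a: T, b: T) T {
            return fromInt(toInt(a) | toInt(b));
        }
        pub fn intersect(a: T, b: T) T {
            return fromInt(toInt(a) & toInt(b));
        }
        // Note: this also flips any reserved or padding bits in T.
        pub fn complement(a: T) T {
            return fromInt(~toInt(a));
        }
        // True when every bit set in b is also set in a.
        pub fn contains(a: T, b: T) bool {
            return (toInt(a) & toInt(b)) == toInt(b);
        }
    };
}

// Hypothetical flags type used to exercise the helper.
const Perms = packed struct(u8) {
    read: bool = false,
    write: bool = false,
    exec: bool = false,
    _reserved: u5 = 0,
};
const Ops = FlagsMixin(Perms);

test "mixin bulk operations" {
    const rw = Ops.unionWith(.{ .read = true }, .{ .write = true });
    try std.testing.expect(Ops.contains(rw, .{ .read = true }));
    try std.testing.expect(!Ops.contains(rw, .{ .exec = true }));
    const w = Ops.intersect(rw, .{ .write = true, .exec = true });
    try std.testing.expect(Ops.toInt(w) == 0b010);
}
```

Calling the functions through a named namespace such as `Ops` keeps the sketch independent of how a real mixin would be attached to the struct.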

In this project I'm using volatile only on aligned word pointers (to u32 or to packed structs of 32 bits).

https://github.com/markfirmware/zig-vector-table/blob/c69f04caf843d9b7977980a5230f3ced415003dc/main.zig#L38-L56

@vegecode

I agree with this, although I would make one change to the syntax:

const ExampleFlags = packed(u32) struct { // ...

because struct(u32) on its own does not make sense.
