Zig: syntax: require all fields to be together

Created on 17 Apr 2020  路  15Comments  路  Source: ziglang/zig

Accepted with this ammendment


struct {
    a: i32,
    const b = "foo";  // error: declaration before the last field
    b: i32, // note: next field here
};
struct {
    a: i32,
    comptime { // error: comptime block before the last field
        assert(true);
    }
    b: i32, // note: next field here
};

Whether zig fmt would report a compile error or auto-fix the code is to be determined.

accepted breaking proposal

Most helpful comment

I like the idea of preventing fields from being spread around in a struct. In all languages, spreading fields out makes it hard to tell how big a struct instance is or how heavyweight it might be to create. But especially in Zig, it makes it harder to figure out what is needed in an initializer list. So I definitely think a rule like this would be good. It also feels strange that you can have a field declaration with a comma, like it's in a list, followed by a constant declaration with a semicolon, followed by another entry in the list.

But thinking about this more, there are cases where I think being able to declare globals before all fields may make code more readable. Especially when writing generic structs. For example,

fn GenericType(comptime InType: type) type {
    return struct {
        // Derived types declared first, pub so other generic code can use them
        pub const DataType = InType;
        pub const AType = ComputeAType(InType);
        pub const BType = ComputeBType(InType);

        // Fields next, using the derived types
        data: DataType,
        a: AType,
        b: BType,

        // Then functions, using the types and fields
        pub fn doStuff(self: *@This()) void {
            // ...
        }
    };
}

This would still compile if I moved the derived types to after the fields, but I think it becomes a lot less clear.
Because of this, I think the rule 'disallow non-fields between fields' might fit better than 'disallow non-fields before fields'. This still ensures that all fields are together, but allows the programmer to decide where in the struct they go.

All 15 comments

I like the idea of preventing fields from being spread around in a struct. In all languages, spreading fields out makes it hard to tell how big a struct instance is or how heavyweight it might be to create. But especially in Zig, it makes it harder to figure out what is needed in an initializer list. So I definitely think a rule like this would be good. It also feels strange that you can have a field declaration with a comma, like it's in a list, followed by a constant declaration with a semicolon, followed by another entry in the list.

But thinking about this more, there are cases where I think being able to declare globals before all fields may make code more readable. Especially when writing generic structs. For example,

fn GenericType(comptime InType: type) type {
    return struct {
        // Derived types declared first, pub so other generic code can use them
        pub const DataType = InType;
        pub const AType = ComputeAType(InType);
        pub const BType = ComputeBType(InType);

        // Fields next, using the derived types
        data: DataType,
        a: AType,
        b: BType,

        // Then functions, using the types and fields
        pub fn doStuff(self: *@This()) void {
            // ...
        }
    };
}

This would still compile if I moved the derived types to after the fields, but I think it becomes a lot less clear.
Because of this, I think the rule 'disallow non-fields between fields' might fit better than 'disallow non-fields before fields'. This still ensures that all fields are together, but allows the programmer to decide where in the struct they go.

I dunno, I'm not really sure why this would be a syntax error, there doesn't really seem to be a benefit to it. zig fmt has business cleaning it up, but I can't honestly see a reason the compiler should enforce it.

There's a very good reason the compiler should enforce it. If you know that a given piece of code is successfully being compiled, then you know that all the fields are together. This means when you are reading code, and you see the end of the field list, you know that there are no fields hidden elsewhere in the struct.

The Zig language asks code writers to stick to some rules that limit expressiveness, but in exchange - readers of code are able to hold fewer possibilities in their minds.

2873 should be considered again in this case, but SpexGuy's generic struct example might be an argument against it.

The good thing about #2873 though, is that top level declaration scopes would become simpler since they do not need to concern themselves with the fields of the various containers.

While programming in c++, it's awful to find fields in a large class, especially in templates. So I like this idea.

I support the idea in general and especially the change that @SpexGuy proposed. I (and probably most other people) usually read code top-to-bottom and i follow that in declaration style:
I usually declare types before i use them. Even if it's not necessary in zig, it supports reading the code.

But yes, struct fields should be in one block without other declarations inbetween. Comments and whitespace should still be allowed though, to add more visual structure to the code.

OK this is accepted with @SpexGuy's amendment https://github.com/ziglang/zig/issues/5077#issuecomment-615035629.

Currently what is accepted is that it is a compile error, rejected and not fixed by zig fmt. The reasoning here is that the rearranging this would do is severe and high chance of being different than the re-arranging one would do manually when finding out fields must be together.

I though it was already the case but glad it will be enforced by the compiler

I don't think this was a good idea, e.g.

pub fn GzipInStream(comptime InStreamType: type) type {
    return struct {
        const Self = @This();

        crc: Crc32 = Crc32.init(),
        bytesAccumulated: usize = 0,
        didReadFooter: bool = false,

        const RawDeflateReader = RawDeflateInStream(InStreamType);
        rawDeflateReader: RawDeflateReader,

        ....
    };
}

In this example I declare a type just before it is used, which seems reasonable IMO.

With this proposal in place, there are two options:

You could declare the type inline. This is the preferred solution if the type isn't overly long and isn't referenced. The reader now knows that they don't need to scan past globals to see if there are more fields, and the namespace doesn't end up with an extra symbol.

pub fn GzipInStream(comptime InStreamType: type) type {
    return struct {
        const Self = @This();

        crc: Crc32 = Crc32.init(),
        bytesAccumulated: usize = 0,
        didReadFooter: bool = false,

        rawDeflateReader: RawDeflateInStream(InStreamType),

        ....
    };
}

If the type is longer or needs to be referenced by name elsewhere in the struct, it could be pre-declared:

pub fn GzipInStream(comptime InStreamType: type) type {
    return struct {
        const Self = @This();
        const RawDeflateReader = RawDeflateInStream(InStreamType);

        crc: Crc32 = Crc32.init(),
        bytesAccumulated: usize = 0,
        didReadFooter: bool = false,

        rawDeflateReader: RawDeflateReader,

        ....
    };
}

Neither of these does a great job of handling the case where a default needs to be specified from a comptime method call within the type's namespace:

pub fn GzipInStream(comptime InStreamType: type, comptime defaultArg: u32) type {
    return struct {
        const Self = @This();
        // RawDeflateReader needs to be predeclared because we want to declare and init a field.
        const RawDeflateReader = RawDeflateInStream(InStreamType);

        crc: Crc32 = Crc32.init(),
        bytesAccumulated: usize = 0,
        didReadFooter: bool = false,

        rawDeflateReader: RawDeflateReader = RawDeflateReader.init(defaultArg),

        // RawDeflateReader not used later in this struct.
        ....
    };
}

But this case seems pretty rare, so maybe moving the type into the type list is fine? It's unfortunate that the declaration needs to be moved away from its usage, but I think it's not a huge cost to make braced initialization easier.

field: Type = .{ ... }, doesn't work yet, and this decision kind of rubs salt in that wound by turning all struct defaults into this predeclare case. But I think once that's fixed it will be quite rare to have to reference the generated type in both the field type and the field initializer.

needs to be moved away from its usage, but I think it's not a huge cost to make braced initialization easier.

@SpexGuy

Can you clarify how this is related to making braced init easier?

If I'm using a library and I want to initialize a type that it declares, I need to be able to look at the type definition in the library and quickly see the list of all the fields. Having all the fields together guarantees that once I find one field I know that I've found the entire list, and I don't need to go searching through the rest of the type declaration (which may be quite long due to function declarations).

@SpexGuy Does the error you get for not initializing all fields tell you which ones are missing?

@Tetralux it does, but even in languages with a language server, viewing the declaration usually makes it way easier to find out which fields are missing than searching through the error message. This could be an issue of bad error messages though and would be solvable with better errors.

There are also common cases where it would be nice to be able to see the requirement without compiling, for example:

  • if the project does a significant amount of comptime execution and compiles are slow
  • if I'm in the middle of a large refactor and other compile errors would halt the compile first
  • if I haven't written the code that calls the function I'm writing yet, so it wouldn't be compiled
Was this page helpful?
0 / 5 - 0 ratings