Zig: Convention, deinit doesn't invalidate

Created on 11 Sep 2020  ·  4 Comments  ·  Source: ziglang/zig

This is a variation of https://github.com/ziglang/zig/issues/6299 that @andrewrk suggested be a separate proposal.

Proposal

This issue proposes that by convention, deinit never invalidates memory and always takes self as a value. So instead of this:

pub fn deinit(self: *@This()) void {
    // release resources
    self.* = undefined;
}

we use this:

pub fn deinit(self: @This()) void {
    // release resources
}

Furthermore, in cases where invalidation is more than just self.* = undefined, it's proposed that a type can define a function named invalidate to contain type-specific invalidation logic, e.g.

pub fn invalidate(self: *@This()) void {
    self.file_open = false;
}

Why?

The reasoning for this convention appeals to the Single Responsibility Principle. Currently in the standard library there is a convention that deinit performs 2 operations:

  1. release resources
  2. invalidate memory

pub fn deinit(self: *@This()) void {
    self.foo.deinit(); // example of releasing resources
    // ...
    self.* = undefined; // invalidate memory
}

However, there are use cases where code may want to release resources but not invalidate memory. Since deinit always does both, programs may not be able to use the struct in the way that makes sense in their situation. For example, this convention prevents programs from calling deinit on const instances. Say we have a Page type that wraps a page address:

const Page = struct {
    ptr: *u8,
    pub fn deinit(self: *@This()) void {
        std.os.unmap(self.ptr);
        self.* = undefined;
    }
};

Because deinit invalidates memory, it must take a mutable reference. This prevents a program from being able to call deinit on a const version of Page:

const page = try Page.init();
defer page.deinit(); // compile error
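
For comparison, here is a minimal sketch of the same pattern under the proposed convention. It assumes a recent Zig release where std.mem.Allocator is passed by value. This Page is a hypothetical stand-in that owns one heap byte instead of a mapped page, and its init signature is an assumption made only to keep the example self-contained.

```zig
const std = @import("std");

// Hypothetical stand-in for the Page type above: it owns one heap byte
// rather than a mapped page so the sketch stays portable.
const Page = struct {
    allocator: std.mem.Allocator,
    ptr: *u8,

    pub fn init(allocator: std.mem.Allocator) !Page {
        return Page{ .allocator = allocator, .ptr = try allocator.create(u8) };
    }

    // Takes self by value: releases the resource and never writes to self.
    pub fn deinit(self: Page) void {
        self.allocator.destroy(self.ptr);
    }
};

test "deinit by value works on a const instance" {
    const page = try Page.init(std.testing.allocator);
    defer page.deinit(); // accepted: no mutable pointer is required
    page.ptr.* = 42;
    try std.testing.expect(page.ptr.* == 42);
}
```

Because deinit no longer needs a mutable pointer, the instance can stay const for its whole lifetime.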

For nested types, this can also cause the same memory to be invalidated multiple times, because a type calls deinit on its members and then also invalidates its own memory with self.* = undefined. For example, if Foo contains Bar, which contains Baz, which contains Buz, and each of them calls deinit on all its members and then invalidates itself, the memory backing Buz is invalidated 4 times.

You will also find that throughout the standard library, some types choose to invalidate memory inside deinit while others do not. These variations are likely caused by differing programmer sensibilities and different cost/benefit analyses for each case. The issue here is that the type definition itself doesn't have the full picture; only the user does. Establishing that types never force invalidation in deinit puts the "responsibility" for deciding whether to invalidate on the party with all the information: the user. This is the same reasoning that makes Zig's allocators work so well: they put the responsibility for deciding an allocation strategy on the user, not on the functions that only need to allocate memory.
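
As a rough illustration of the user making that choice, the sketch below shows two callers of the same type. It assumes a recent Zig release where std.mem.Allocator is passed by value; Buffer and its init signature are hypothetical names introduced only for this example.

```zig
const std = @import("std");

// Hypothetical resource owner following the proposed convention.
const Buffer = struct {
    allocator: std.mem.Allocator,
    bytes: []u8,

    pub fn init(allocator: std.mem.Allocator, len: usize) !Buffer {
        return Buffer{ .allocator = allocator, .bytes = try allocator.alloc(u8, len) };
    }

    // Releases the allocation only; self is left untouched.
    pub fn deinit(self: Buffer) void {
        self.allocator.free(self.bytes);
    }
};

test "the caller decides whether to invalidate" {
    // Caller A: a const instance that is released and never touched again.
    const a = try Buffer.init(std.testing.allocator, 16);
    a.deinit();

    // Caller B: a longer-lived variable, so it opts into invalidation itself.
    var b = try Buffer.init(std.testing.allocator, 16);
    b.deinit();
    b = undefined;
}
```

The type only knows how to release its resources; each call site decides whether writing undefined afterwards is worth it.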

Note that some types in the standard library have already addressed these issues. For example, HashMap defines the deallocate function and StreamServer defines the close function, both of which release resources without invalidating. This proposal aims to help developers do the "right" thing by default, by allowing the user to make this decision instead of forcing one on them. By establishing that deinit should always take self as a value type, it keeps invalidation and resource deallocation separate.

What about Invalidation?

In the common case where a user wants to both release resources and invalidate memory, a common function can be implemented to do both, e.g.

pub fn destroy(o: anytype) void {
    // o must be a mutable pointer to the instance
    o.deinit();
    if (@hasDecl(@TypeOf(o.*), "invalidate")) {
        o.invalidate();
    } else {
        o.* = undefined;
    }
}

Usage:

var foo = Foo.init();
defer std.destroy(&foo);
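
To make the dispatch concrete, here is a self-contained sketch of such a helper together with a type that defines invalidate. It is written as a free function rather than as part of std, it assumes a recent Zig release (anytype parameters), and Handle with its released counter is a hypothetical type added only so the behaviour can be observed.

```zig
const std = @import("std");

// Hypothetical free-standing helper: release first, then invalidate.
pub fn destroy(o: anytype) void {
    const T = @TypeOf(o.*); // o is expected to be a mutable pointer
    o.deinit();
    if (@hasDecl(T, "invalidate")) {
        o.invalidate(); // type-specific invalidation logic
    } else {
        o.* = undefined; // default invalidation
    }
}

// Hypothetical type that supplies its own invalidation logic.
const Handle = struct {
    file_open: bool = true,
    released: *u32, // counts releases so the sketch is observable

    pub fn deinit(self: Handle) void {
        self.released.* += 1;
    }

    pub fn invalidate(self: *Handle) void {
        self.file_open = false;
    }
};

test "destroy releases and then runs the type's invalidate" {
    var released: u32 = 0;
    var handle = Handle{ .released = &released };
    destroy(&handle);
    try std.testing.expect(released == 1);
    try std.testing.expect(!handle.file_open);
}
```

The @hasDecl check is resolved at compile time, so a type without invalidate simply takes the default branch.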

Some may not like that this function is no longer a member function. If this is the case, we could also define it like this:

pub fn defineDestroy(comptime T: type) type {
    return struct {
        pub fn destroy(self: *T) void {
            self.deinit();
            if (@hasDecl(T, "invalidate")) {
                self.invalidate();
            } else {
                self.* = undefined;
            }
        }
    };
}


pub const Foo = struct {
    pub usingnamespace defineDestroy(@This());
};

Usage:

var foo = Foo.init();
defer foo.destroy();

All 4 comments

It's worth noting that init acquires those resources, and thus, you might expect deinit to release them.
I'd also consider destroy over deinitAndInvalidate.

@Tetralux I like the term destroy. I'm going to amend the proposal to use that instead.

should a const reference to a resource be able to release it?

This proposal suggests that all deinit functions (that release resources) take self by value, which is functionally the same as a const reference. The only real difference between the two is that a const reference forces the compiler to pass it by reference, whereas passing it by value allows the compiler to decide whether it should be passed by value or reference under the hood.
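
A small sketch of that equivalence, assuming a recent Zig release: both spellings can be called on a const instance. Resource, deinitConstPtr, and the released counter are hypothetical names used only for this comparison.

```zig
const std = @import("std");

// Hypothetical type with both spellings side by side.
const Resource = struct {
    released: *u32, // counts releases so the sketch is observable

    // By value: the compiler chooses how to pass self under the hood.
    pub fn deinit(self: Resource) void {
        self.released.* += 1;
    }

    // By const pointer: self is always passed as an address.
    pub fn deinitConstPtr(self: *const Resource) void {
        self.released.* += 1;
    }
};

test "both forms accept a const instance" {
    var released: u32 = 0;
    const r = Resource{ .released = &released };
    r.deinit();
    r.deinitConstPtr();
    try std.testing.expect(released == 2);
}
```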
