Zig: ability to annotate functions which allocate resources, with a way to deallocate the returned resources

Created on 22 Feb 2018 · 37 Comments · Source: ziglang/zig

This pattern is extremely common in zig code:

    const err_pipe = try makePipe();
    errdefer destroyPipe(err_pipe);

    var in_file = try os.File.openRead(allocator, source_path);
    defer in_file.close();

    var atomic_file = try AtomicFile.init(allocator, dest_path, mode);
    defer atomic_file.deinit();

    var direct_allocator = std.heap.DirectAllocator.init();
    defer direct_allocator.deinit();

    var arena = std.heap.ArenaAllocator.init(&direct_allocator.allocator);
    defer arena.deinit();

Generally:

    const resource = allocateResource();
    defer deallocateResource(resource); // or errdefer

This proposal is to:

- make it harder to forget to clean up a resource
- make it easier to clean up resources

Strategy:

- Functions which allocate resources are annotated with the corresponding cleanup function.
- New keywords, corresponding to the defer keywords:
  - clean corresponding to defer theCleanupFunction(resource);
  - errclean corresponding to errdefer theCleanupFunction(resource);
- If you want to handle resources manually, you must use noclean to indicate that you accept responsibility for the resource. Otherwise you get error: must specify resource cleanup strategy.

The above code example becomes:

    const err_pipe = errclean try makePipe();
    var in_file = clean try os.File.openRead(allocator, source_path);
    var atomic_file = clean try AtomicFile.init(allocator, dest_path, mode);
    var direct_allocator = clean std.heap.DirectAllocator.init();
    var arena = clean std.heap.ArenaAllocator.init(&direct_allocator.allocator);

How to annotate cleanup functions:

    // std.mem.Allocator
    fn create(self: &Allocator, comptime T: type) !&T
        cleanup self.destroy(_)
    {
        const slice = try self.alloc(T, 1);
        return &slice[0];
    }

    // function pointer field of struct
    allocFn: fn (self: &Allocator, byte_count: usize, alignment: u29) Error![]u8
                cleanup self.freeFn(self, _),

    // std.os.File
    pub fn openRead(allocator: &mem.Allocator, path: []const u8) OpenError!File
        cleanup _.close()

Having functions which allocate a resource mention their cleanup functions will make generated documentation more consistent and helpful.

proposal

andrewrk

👍18 👎1

Most helpful comment

What if this was a feature of the returned value itself (like an error union), rather than described as a part of the 'calling convention' of the function? One of the choices that Zig made (that I think was very good) with errors was making errors values rather than a part of function signatures, as they are in languages with exceptions like Java/C++. What if we tried that for cleanup-obligations?

Something like: Val#Obligation is the type of an _obligation tuple_. It holds a value, and something which must be done (called) eventually (i.e., it represents an obligation [that a resource is cleaned up]). Like errors have special syntax like try and catch and errdefer, obligations can have special syntax:

nocleanup obligation_tuple gets the value, discarding the obligation.

cleanup obligation_tuple is the same as defer obligation_tuple.obligation.fulfill(obligation_tuple.value); obligation_tuple.value

This doesn't fix the verbosity of calls like cleanup try allocate(), but I think it's simpler than adding arbitrary expressions to the signature of functions. Checking that resources aren't missed is handled simply by not allowing raw access to the .value except through nocleanup, together with the existing requirement that non-void values aren't discarded.

CurtisFenner on 28 Jul 2020

👍10 ❤2 👀1 🚀1
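
The obligation-tuple idea can be approximated in userland, which may help when weighing it. The following is a minimal sketch, assuming a recent Zig release (0.11 or later). `Obligation`, `closeHandle`, `openHandle`, and `closed_count` are hypothetical names invented for illustration, and nothing here is checked by the compiler the way the proposed `Val#Obligation` type would be: the caller still has to write the `defer` by hand.

```zig
const std = @import("std");

/// Hypothetical userland stand-in for the proposed `Val#Obligation` type.
/// It only pairs a value with the function that must eventually be called
/// on it; the compiler enforces nothing.
fn Obligation(comptime T: type, comptime fulfill: fn (T) void) type {
    return struct {
        value: T,

        const Self = @This();

        /// Plays the role of `cleanup`: the caller writes `defer o.cleanup();`.
        pub fn cleanup(self: Self) void {
            fulfill(self.value);
        }

        /// Plays the role of `nocleanup`: take the value, drop the obligation.
        pub fn nocleanup(self: Self) T {
            return self.value;
        }
    };
}

// Hypothetical resource: a counter stands in for a real close() call.
var closed_count: u32 = 0;

fn closeHandle(handle: u32) void {
    _ = handle;
    closed_count += 1;
}

const HandleObligation = Obligation(u32, closeHandle);

fn openHandle() HandleObligation {
    return .{ .value = 7 };
}

test "obligation is fulfilled at scope exit" {
    closed_count = 0;
    {
        const o = openHandle();
        defer o.cleanup();
        try std.testing.expect(o.value == 7);
    }
    try std.testing.expect(closed_count == 1);
}
```

The gap between this sketch and the proposal is the enforcement: here `o.value` is freely accessible and a forgotten `defer` compiles without complaint, which is exactly what the proposed syntax is meant to rule out.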

All 37 comments

Er... wasn't that something I proposed a few months ago in the discussion on resources and the use of "#" etc? I cannot seem to find the issue :-(

kyle-github on 22 Feb 2018

@kyle-github I think it was #494

Ilariel on 22 Feb 2018

@Ilariel, Ah, right. Thanks! I looked back, but not that far.

I would like to see something like this proposal combined with some of the ideas in #494. I think (not carefully thought through!) that it might be possible to come close to Rust's ownership/borrow checker in power. Perhaps it is too easy to allow escapes for it to be workable, but even 90% coverage would catch a huge number of cases. Determining lifetime is not that simple, however.

kyle-github on 22 Feb 2018

@andrewrk I was imagining something like this though with different keywords, but these are good keywords, too. Perhaps also allow a default standard (like your common self.deinit) so all you have to do is say clean(up) on the function header if you conform?

@kyle-github I also have some ideas about the "90% coverage" for borrow checking kind of thing, too, but I'd rather just see 1.0 first.

tjpalmer on 23 Feb 2018

I know this is bike shedding, but it reads kind of weird:

    clean try get_something();

Sounds like "cleanly attempt to get_something()".

Maybe something more like auto_close reads more naturally.

What exactly does errclean do? It sounds like "if no function call here annotated with try returns an error, then don't auto close this resource at the end of this function", which sounds like you're implicitly taking ownership of the resource without explicitly saying so.

Maybe the default thing should be: if the function "throws" and the resource has not been assigned to any object that lives outside this scope, it's automatically closed, and there's no need to add any annotation for that.

If you want the resource to not autoclose (even on errors), that sounds like it needs a special syntax, for example: own or take.

Another idea is to mention the cleaning strategy after:

    try get_something() auto_clean;

And if desired maybe it can be customized:

    try get_something() auto_clean(cleanup_function)

And if no cleaning is needed:

    try get_something() without_clean;

But at this point it feels like the language is getting too complicated.

hasenj on 27 Feb 2018

👍1

@hasenj Interesting. My keyword plan for autoclean was own, which you intuitively feel would mean the opposite. And my noclean was disown. So many different ways to take implications (and I guess that's why bikeshedding).

Personally, I still think the keywords clean, noclean, and so on from the proposal are clear. And I'm much happier with the prefix syntax, too. The object of the keyword is clearer to me. I read clean try as "autoclean" (or even "be clean with") "the thing I tried and succeeded on" and errclean as "autoclean this on error" and I'm quite happy with the "errdefer" symmetry.

tjpalmer on 27 Feb 2018

On the other hand, with this proposal in place, you could possibly drop the ad hoc defer and friends entirely.

tjpalmer on 27 Feb 2018

> This proposal is to
>
> - make it harder to forget to clean up a resource
> - make it easier to clean up resources
>
> Strategy:
>
> Functions which allocate resources are annotated with the corresponding cleanup function.

that is pretty much RAII so why reinvent the wheel?

and if you basically add RAII to the language you probably need copy vs move semantics as well

ghost on 12 Jul 2018

So the main critique of RAII is that it is type based. That means you need to define a type for each lock/unlock, open/close, allocate/free, etc. Then again, idk a better alternative or how this is any different.

isaachier on 12 Jul 2018

IMO you want ctor/ dtor and with that RAII semantics sometimes and also want defer keyword other times.

Maybe just do it like rust does RAII, which is easier than cpp.

ghost on 12 Jul 2018

@monouser7dig
RAII requires constructors, destructors, move semantics, overridable copy semantics, and wrapping everything in wrapper types (unique_ptr).

This solution does not require any of these features, because either:

- You clean on scope exit. Aka you own this resource and are not going to pass it up.
- You errclean on scope exit. Aka you own this resource when an error occurs.
- You noclean, and are expected to pass on the ownership.

Hejsil on 12 Jul 2018

👍3
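
For comparison, the three cases map fairly directly onto what Zig already offers: `defer`, `errdefer`, and a plain return. The following is a minimal sketch, assuming a recent Zig release (0.11 or later) where `std.mem.Allocator` is passed by value. The function names `countSpaces`, `dupeNonEmpty`, and `dupeRaw` are hypothetical and exist only to illustrate the three ownership strategies.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

/// "clean": the resource never leaves this scope, so `defer` releases it.
fn countSpaces(allocator: Allocator, text: []const u8) !usize {
    const copy = try allocator.dupe(u8, text);
    defer allocator.free(copy);
    return std.mem.count(u8, copy, " ");
}

/// "errclean": ownership passes to the caller on success, so `errdefer`
/// releases the resource only when a later step fails.
fn dupeNonEmpty(allocator: Allocator, text: []const u8) ![]u8 {
    const copy = try allocator.dupe(u8, text);
    errdefer allocator.free(copy);
    if (copy.len == 0) return error.Empty;
    return copy;
}

/// "noclean": the function hands the resource straight to its caller.
fn dupeRaw(allocator: Allocator, text: []const u8) ![]u8 {
    return allocator.dupe(u8, text);
}

test "three ownership strategies" {
    const a = std.testing.allocator;

    try std.testing.expect((try countSpaces(a, "a b c")) == 2);

    // The failing path frees its copy; std.testing.allocator would report a leak otherwise.
    try std.testing.expectError(error.Empty, dupeNonEmpty(a, ""));

    const owned = try dupeRaw(a, "zig");
    defer a.free(owned);
    try std.testing.expectEqualStrings("zig", owned);
}
```

What the proposal would add on top of this is not new behaviour but a compile error when none of the three strategies is chosen.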

Also, Rust RAII is easy, because Rust keeps track of ownership for you. Zig does not, so it would have to be as involved as C++.

Hejsil on 12 Jul 2018

Well that is just a stripped down version of RAII

- You clean on scope exit. Aka you own this resource and are not going to pass it up. (normal behavior of a ctor dtor)
- You errclean on scope exit. Aka you own this resource when an error occurs. (normal behavior of a ctor dtor, that is where RAII comes from)
- You noclean, and are expected to pass on the ownership. (you leave out the dtor, in which case you can just not use RAII in the first place and do it as you currently do)

so the first two cases would be covered by the traditional RAII approach and the proposed syntax is just another syntax for doing it as far as I can see.

I don't see why you would not just call it what it is.

ghost on 12 Jul 2018

@monouser7dig
Well, if this is just about the name, then sure, we can call it RAII. One should just be careful that people don't confuse it with ctor/dtor, move, copy, implicit dtor calls, wrapper types and all that.

Hejsil on 12 Jul 2018

I argue what andrew is proposing already is a ctor/dtor wrapper type, and soon it will also need copy and move.
That is just how it is / what you need.
All those functions return values and those are the wrapper types.
The "make**" function is the ctor of that type and the deferred / clean function is the dtor.

Now as soon as you copy such a type that was returned from "make**" you need copy and move semantics as well, otherwise this example won't hold for anything but trivial code.
....or rename it to noclean which may cover part of the use cases but it's still reinventing the wheel as far as I can tell.

Concerning Rust:
What you say is true but does not mean Zig could not do the same or a variation of it. Zig does not control your memory safety either, so it could just not control your moved-from values and be fine, just at a different safety level than Rust.

ghost on 12 Jul 2018

I very much agree with @monouser7dig. As long as Zig aims to be an applicable alternative to C, I feel that this feature is too high level to be in good taste. It just feels like unnecessary sugaring to me. The way Zig does resource acquisition/destruction now is nice and elegant, and trying to imitate Rust and C++ here feels like a stab in the back to the C-style simplicity of Zig.

> make it harder to forget to clean up a resource

Is this actually a problem for anyone? This is a valid concern, but I feel that this problem should only be addressed if it is a real-life problem, not just a hypothetical.

> make it easier to clean up resources

...therefore locking programmers into a single form of deallocation. There are many ways to have a "constructor", and depending on the problem, there may be many ways to have a "destructor" too. It is not the place of Zig (or any sane language) to force one form of resource destruction on the programmer.

Zig as a language tries very hard to not hide allocations behind a programmer's back. It must also not hide deallocations either.

bjornpagen on 14 Jul 2018

👍8

Too complicated

andrewrk on 14 Jul 2018

👍6

Not sure that is the correct final answer to the problem

Language design might be complicated if it makes the programmer's life less complicated in the end.

But maybe it’s best to think about it more and start a new proposal in the future.

ghost on 14 Jul 2018

😕1

I think it would be especially worthwhile to investigate https://github.com/ziglang/zig/issues/782#issuecomment-404502930 further, because of https://github.com/ziglang/zig/issues/782#issuecomment-404502081

ghost on 14 Jul 2018

Here's some real actual C code that wants to document ownership semantics for an array of strings returned by a user-supplied function: https://github.com/thejoshwolfe/consoline/blob/2c5e773442f89860f9ee82e13978b5ef3972ca99/consoline.h#L29

if this api were rewritten in zig, would it be possible to encode the desired ownership semantics with this proposal?

thejoshwolfe on 15 Jul 2018

@thejoshwolfe presumably it would by providing a default cleanup wherever caller deallocation is necessary. I guess the assumption is that no caller should free anything provided by a function unless it has a specified clean function.

isaachier on 15 Jul 2018

So turns out Jai got ctors and now wants to rip them out because they're not happy with it, I've not looked into the details, just found it interesting enough to add it in here.

ghost on 1 Sep 2018

Re-opening in light of #2377. Functions which provide a way for the compiler to automatically generate cleanup will make cancel work for non-async functions, without having to generate those functions specially. It also allows defers of async functions to run before tail resuming the awaiter, which is slightly more efficient. So now we have these reasons for investigating this feature:

- it adds semantics-based documentation to functions which allocate resources
- it could potentially lead to compile errors / tools detecting resource leaks statically
- IDEs could auto-generate cleanup code
- async functions can be more efficient. non-async functions which get canceled can be significantly simpler and more efficient.

I do think we need a better syntax/semantics proposal for how to annotate functions that allocate resources. There are a lot of issues with the syntax proposed above.

andrewrk on 13 Aug 2019

👍4 🚀1

I don't know how to make this work, and I'm not convinced it's a path that will be fruitful.

andrewrk on 15 Aug 2019

😕1

https://nim-lang.org/araq/ownedrefs.html

https://github.com/nim-lang/Nim/blob/devel/doc/destructors.rst

komuw on 22 Aug 2019

"Ownership You Can Count On" won't work for Zig, because that still requires reference counting everything. Unless Andrew wants Zig to track those in debug builds only ...

As for automating defer x.deinit() by convention, I don't see at all why it should be so hard, but I don't want to push it anymore if Andrew's done with the topic. (Working on my own language again these days, anyway. Though I've never gotten far on such efforts.)

tjpalmer on 22 Aug 2019

Re-opening in light of https://github.com/ziglang/zig/issues/3164#issuecomment-527504887. This would be required in order to implement useful cancel semantics into async functions.

andrewrk on 24 Nov 2019

Doesn't this imply hidden function calls much like operator overloading? I find the explicit defer to be more clear at the callsite.

frmdstryr on 10 Jan 2020

In my own project I noticed I had some initialization functions which create multiple resources and don't actually clean up properly if one of them fails.
It's so easy to just do

    try ...
    try ...
    try ...

possibly with some code in between.

I also found a few cases in Zig std.

One is here:
https://github.com/ziglang/zig/blob/eb4d313dbc406b37f6bfdd98988c88c3b8ed542e/lib/std/build.zig#L120-L125
If the second try fails the BufMap is never cleaned up.

Another is here:
https://github.com/ziglang/zig/blob/eb4d313dbc406b37f6bfdd98988c88c3b8ed542e/lib/std/debug.zig#L480-L488
mod.symbols and mod.subsect_info are never cleaned up if an error occurs.

I haven't looked further. It's a bit hard to search for. And that's my main point. It's hard to find these bugs. It looks like the error is handled, so it's all fine, right? But actually no. After acquiring a resource you have to clean up if you don't intend to hold on to it for longer.

Now maybe in most cases you don't actually care too much, because if there's an error you don't really want to handle it, you just want to give up. Does that mean it shouldn't be try acquire_some_resource();, but rather acquire_some_resource() catch unreachable;? Or some similar way to just exit? Or maybe Zig can have some syntax which requires a clean-up block to be written by default? Such as a try and defer in one.

I'm not really sure, but I did want to say that I think that currently it's quite easy to just try everything and forget about cleaning up.

BarabasGitHub on 15 Mar 2020

@BarabasGitHub that's what errdefer is for

frmdstryr on 15 Mar 2020

@frmdstryr yes I know about errdefer, but my point is that especially errdefer is very easy to forget and hard to test in general. Harder than things you need defer for. And I suggest that something which isn't totally separate from try/catching errors could help people not to forget about cleaning up (writing the errdefer part).

BarabasGitHub on 15 Mar 2020
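
On the point that a missing errdefer is hard to test: the standard library has a helper aimed at exactly this, which re-runs a function while failing each allocation in turn and reports leaks. The following is a minimal sketch, assuming a recent Zig release (0.11 or later) in which `std.testing.checkAllAllocationFailures` takes a backing allocator, a test function, and a tuple of extra arguments. `Buffers` and `initAndDeinit` are hypothetical names used only for illustration.

```zig
const std = @import("std");
const Allocator = std.mem.Allocator;

/// Hypothetical type that acquires two resources in its init function.
const Buffers = struct {
    first: []u8,
    second: []u8,

    fn init(allocator: Allocator) !Buffers {
        const first = try allocator.alloc(u8, 16);
        // Without this line, a failure of the second allocation leaks `first`.
        errdefer allocator.free(first);

        const second = try allocator.alloc(u8, 16);
        return Buffers{ .first = first, .second = second };
    }

    fn deinit(self: Buffers, allocator: Allocator) void {
        allocator.free(self.second);
        allocator.free(self.first);
    }
};

/// The function under test: the allocator must be its first parameter.
fn initAndDeinit(allocator: Allocator) !void {
    const buffers = try Buffers.init(allocator);
    defer buffers.deinit(allocator);
}

test "every allocation failure path is leak-free" {
    // Runs initAndDeinit repeatedly, inducing a failure at each allocation index.
    try std.testing.checkAllAllocationFailures(
        std.testing.allocator,
        initAndDeinit,
        .{},
    );
}
```

Deleting the `errdefer` line should make this test report a leak, so the forgotten case becomes visible. It only covers allocator failures, though, not other error sources such as file or pipe creation.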

Why not just a simple extension to defer, and get on with it?
Use 'defer to say: defer the execution to the next scope. And then you can put the 'defer inside the function that allocates.

One can extend this to any number of scopes: ''defer to jump 2 scopes, and so on.
This will be an easy extension to the language and will probably cover most use cases.

shumy on 13 Jun 2020

👍1

How about adding an annotation that marks a function as "resource making", and forcing the compiler to require defer, errdefer, or some other keyword like safe after a call to this function?

    const err_pipe = try makePipe() safe;

Would just ignore resource acquisition.

    const err_pipe = try makePipe();

Would look for either defer or errdefer called on err_pipe.

The resource is still user managed, as the function just says that it needs cleanup but doesn't enforce one way to do it on the user, while still providing safety after such calls (after all, the user will be forced to do something).

It doesn't address making resource management easier and less repetitive, but I'm not sure if that's what we really need. Zig as of now is trying to be readable at first glance; the RAII way would only add another layer the user would need to check, not to mention it comes close to OOP.

Sashiri on 23 Jul 2020

> This doesn't fix the verbosity of calls like cleanup try allocate()

Why does it need to be a one-liner? Because the Obligation is now part of the type, and you need to downcast the value before using it.
The two-line version (that we don't want to write):

    var x_with_obligation : Value#Obligation = try allocate();
    var x : Value = cleanup x_with_obligation;

I think showing too much compile-time information in the type of objects, which is supposed to represent a memory layout, is not a very good idea (too close to C++).
The Obligation has no consequence on how you can use the value, so I'm not convinced that the type is the right place to store it.
And AFAIU it will have a runtime cost unless the compiler inlines the function.

I'm suggesting that the Obligation should be __alongside__ the type (this may sound crazy, but bear with me), as a new kind of compile-time metadata.

Then the compiler has two orthogonal jobs:

- check that functions are called with values of the correct type
- check that obligations are fulfilled

Then you can write:

    fn init(n: u32, allocator: Allocator) HashMap#deinit {
      var map = ...;
      return @obligation(map, HashMap.deinit);
    }

    var x: Value = try init();
    defer x.deinit();
    // alternatively: `cleanup x` or `defer cleanup x`;

Most of the time the Obligation isn't visible in the caller code, only in the callee code and signature.
If we want to make the "Obligation" visible, we can force the use of a cleanup keyword,
but otherwise we can keep idiomatic Zig code with init/deinit.

If a user forgets to call the cleanup method, they will receive a compile-time error, which can have a dedicated error message.

Pros:

- separation of concerns, don't overload type for a new mechanism
- dedicated error messages
- no need for cleanup keyword, the caller code stays similar

Cons:

- add a new task to the compiler
- may require more language support (equivalents of @typeInfo, ...)

gwenzek on 8 Sep 2020

I have been following Zig from afar and unfortunately have not had the time to really try it. However, I'd like to add a bit of input to this discussion; hopefully this is not too off-topic:

I think as soon as you decide to have implicit or checked (such as with @gwenzek's obligations) cleanups, you will essentially tie behaviour to the lifetime of objects. In effect you will ensure the cleanup logic is done when the object dies (in the implicit defer case) or that it has to be explicitly done before that happens (in the obligation case).

"When the object dies" here means when the scope that created this object exits. If you admit that this cleanup is tied to the object lifetime, then another question naturally arises: what about objects whose lifetime is not neatly enclosed by a scope? What if we have an ArrayList of File: can we somehow fulfill that obligation to close the files?

I am not sure this "obligation" can be tracked by the compiler, as the ArrayList might be returned, passed around, copied... Only through some complex set of rules enforced at compile time, similar to Rust's borrow checker, would you be able to guarantee that.

What can be done without additional constraints on the language's expressive power is to enable ArrayList to perform the cleanup on its contained values. This seems only possible without runtime overhead if the cleanup logic is a property of the type, not a property of the function that created the object (as there can be many of those).

This cleanup logic associated with a type is commonly called a destructor, and I believe it is the cleanest solution to resource management. Please note that destructors are not necessarily called implicitly; Zig could still require some opt-in syntax at scope level to make an object's destructor automatically called at scope exit. Having destructors (which could be an arbitrary method with some well-defined annotation, easily identifiable through reflection) means cleanups can be nested: calling deinit on an ArrayList of ArrayList of File would correctly clean up all the files and all the allocated memory.

Hope this helps, keep up the good work with Zig, it is definitely one of the most interesting new languages in my view.

mawww on 20 Oct 2020
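
The idea of cleanup as a property of the type, discoverable through reflection, can be tried out with what Zig has today. The following is a minimal sketch, assuming a recent Zig release (0.11 or later) and using the `@hasDecl` builtin to detect a `deinit` declaration at compile time. `Resource`, `Plain`, and `deinitItems` are hypothetical names, and the sketch only handles struct element types, since `@hasDecl` expects a container type.

```zig
const std = @import("std");

/// Hypothetical resource whose cleanup is a property of the type.
const Resource = struct {
    released: *u32,

    pub fn deinit(self: *Resource) void {
        self.released.* += 1;
    }
};

/// A type with nothing to clean up.
const Plain = struct { id: u8 };

/// Calls `deinit` on every element, but only when the element type declares
/// one. The check is resolved at compile time, so there is no runtime dispatch.
fn deinitItems(comptime T: type, items: []T) void {
    if (comptime @hasDecl(T, "deinit")) {
        for (items) |*item| item.deinit();
    }
}

test "container cleans up elements that declare deinit" {
    var released: u32 = 0;
    var resources = [_]Resource{
        .{ .released = &released },
        .{ .released = &released },
    };
    deinitItems(Resource, &resources);
    try std.testing.expect(released == 2);

    var plain = [_]Plain{.{ .id = 1 }};
    deinitItems(Plain, &plain); // no deinit declared, so nothing happens
    try std.testing.expect(plain[0].id == 1);
}
```

This relies on a naming convention rather than an annotation, so it says nothing about whether the caller remembered to trigger the cleanup in the first place. That part still needs language support of the kind discussed in this thread.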

As a newcomer to this language I already made the mistake @BarabasGitHub pointed out, not adding errdefer between a pair of trys. While (I thought) I'd thought about the problems of ownership and releasing resources, I'd only done so for the happy path; as soon as I read his note I went back to my code and fixed it. I see this as being a very easy mistake to make. I also do not like just documenting ownership responsibility in a comment. Of all the proposals I think I like @CurtisFenner's the best: reflect the ownership obligation in the type system, paired with cleanup/errcleanup/nocleanup keywords. This avoids ctor/dtor and move semantics while still providing some significant additional safety benefit, and I think it pairs nicely with the existing error union functionality and the feel of Zig.