Zig: RFC: Make function definitions expressions

Created on 13 Nov 2018 · 64 Comments · Source: ziglang/zig

Overview

This is a proposal based on #1048 (thank you to everyone discussing in that thread). I opened this because I believe that conversation contains important ideas but addresses too many features at once.

Goals

Provide syntactic consistency among all statements which bind something to an identifier
Provide syntactic foundation for a few features: functions-in-functions (#229), passing anonymous functions as arguments (#1048)

Non-goals

Closures

Motivation

Almost all statements which assign a type or value to an identifier use the same syntax. Taken from today's grammar (omitting a few decorations like align for brevity):

VariableDeclaration = ("var" | "const") Symbol option(":" TypeExpr) "=" Expression

The only construct which breaks this format is a function definition. It could be argued that a normal function definition consists of:

1. an address where the function instructions begin;
2. the type information (signature, calling convention) of the function;
3. a symbol binding the above to a constant or variable.

Ideally, number 3 could be decoupled from the other two.

Proposal

Make the following true:

A function definition is an expression
All functions are anonymous
Binding a function to a name is accomplished with assignment syntax

const f = fn(a: i32) bool {
    return (a < 4);
};

Roughly speaking, assigning a function to a const would equate to existing behavior, while assigning to a var would equate to assigning a function pointer.

Benefits

Consistency. There is alignment with the fact that aggregate types are also anonymous.
Syntactically, this paves the way for passing anonymous functions as arguments to other functions.
I have a suspicion that this will make things simpler for the parser, but I'd love to have that confirmed/debunked by someone who actually knows (hint: not me).
Slightly shrinks the grammar surface area:

- TopLevelDecl = option("pub") (FnDef | ExternDecl | GlobalVarDecl | UseDecl)
+ TopLevelDecl = option("pub") (ExternDecl | GlobalVarDecl | UseDecl)

Examples

The main function follows the same rule.

pub const main = fn() void {
    @import("std").debug.warn("hello\n");
};

The extern qualifier still goes before fn because it qualifies the function definition, but pub still goes before the identifier because it qualifies the visibility of the top level declaration.

const puts = extern fn([*]const u8) void;

pub const main = fn() void {
    puts(c"I'm a grapefruit");
};

Functions as the resulting expressions of branching constructs. As with other instances of peer type resolution, each result expression would need to be implicitly castable to the same type.

var f = if (condition) fn(x: i32) bool {
    return (x < 4);
} else fn(x: i32) bool {
    return (x == 54);
};

// Type of `g` resolves to `?fn() !void`
var g = switch (condition) {
    12...24 => fn() !void {},
    54      => fn() !void { return error.Unlucky; },
    else    => null,
};

Defining methods of a struct. Now there is more visual consistency in a struct definition: comma-separated lines show the struct members, while semicolon-terminated statements define the types, values, and methods "namespaced" to the struct.

pub const Allocator = struct.{
    allocFn:   fn(self: *Allocator, byte_count: usize, alignment: u29) Error![]u8,
    reallocFn: fn(self: *Allocator, old_mem: []u8, new_byte_count: usize, alignment: u29) Error![]u8,
    freeFn:    fn(self: *Allocator, old_mem: []u8) void,

    pub const Error = error.{OutOfMemory};

    pub const alloc = fn(self: *Allocator, comptime T: type, n: usize) ![]T {
        return self.alignedAlloc(T, @alignOf(T), n);
    };

    // ...
};

_Advanced mode, and possibly out of scope._

Calling an anonymous function directly.

defer fn() void {
    std.debug.warn(
        \\Keep it down, I'm disguised as Go.
        \\I wonder if anonymous functions would provide
        \\benefits to asynchronous programming?
    );
}();

Passing an anonymous function as an argument.

const SortFn = fn(a: var, b: var) bool; // Name the type for legibility

pub const sort = fn(comptime T: type, arr: []T, f: SortFn) void {
    // ...
};

pub const main = fn() void {
    var letters = []u8.{'g', 'e', 'r', 'm', 'a', 'n', 'i', 'u', 'm'};

    sort(u8, letters, fn(a: u8, b: u8) bool {
        return a < b;
    });
};

What it would look like to define a function in a function.

pub const main = fn() void {
    const incr = fn(x: i32) i32 {
        return x + 1;
    };

    warn("woah {}\n", incr(4));
};

Questions

Extern?

The use of extern above doesn't seem _quite_ right, because the FnProto evaluates to a type:

extern puts = fn([*]const u8) void;
              --------------------
                 this is a type

Maybe it's ok in the context of extern declaration, though. Or maybe it should look like something else instead:

extern puts: fn([*]const u8) void = undefined;

Where does the anonymous function's code get put?

I think this is more or less the same issue being discussed in #229.

Counterarguments

Instructions and data are fundamentally separated as far as both the programmer and the CPU are concerned. Because of this conceptual separation, a unique syntax for function body declaration is justifiable.
Status quo is perfectly usable and looks familiar to those who use C.

accepted proposal

hryx

👍64 ❤7 👎4

Most helpful comment

As @Rocknest suggests, there's an alternative if we want to be explicit about this. We could formally acknowledge labels and pointers as separate entities in the type system. This might not actually be as bad as it sounds. The core changes are:

Require that any var of a function label type must be comptime known, just like type, anytype, and comptime_int
Allow pointers to function types to be runtime known
Have a function expression evaluate to a function label

So in terms of this proposal, it would look like this:

const VoidFnType = fn() void;
const foo = fn () void { }; // @TypeOf(foo) is VoidFnType, foo is a function label.
const bar = foo; // bar is the same function label as foo
var baz = foo; // compile error: cannot have a function label at runtime.  Use `&foo` to get a function pointer
var quux = &bar; // quux is a runtime function pointer initialized to bar
var fab: @TypeOf(foo) = undefined; // compile error: cannot have a function label at runtime.
var fub: ?*VoidFnType = null; // fub is a nullable runtime pointer to a void fn.
var fob = &(fn () void { }); // fob is a runtime function pointer initialized to an anonymous function
const fib = &foo; // fib is a comptime-known function pointer
const foob = fib.*; // only allowed at comptime, converts a comptime-known function pointer into a label
const fuub: *VoidFnType = foo; // compile error. no implicit decay like in C/C++

comptime {
    var feb: ?VoidFnType = null; // only allowed at comptime, nullable mutable function label
    inline for (slice_of_stuff) |item| {
        // Terrible, oh yes, but great
        feb = if (feb) |lastFeb| 
                fn () void { lastFeb(); item.process(); lastFeb(); }
            else
                item.process;
    }
    feb();
}

Similar to how . and @field work on both single pointers and values, func(params) and @call should work on both function pointers and labels. Both nullable function pointers and nullable function labels require unwrapping before calling.

It is kind of nice that Zig can get so far without making a distinction between function pointers and function labels. But this ambiguity could be a reason to break that. I personally don't feel like we need to do this, but I also wouldn't mind if we took this step to be more explicit. I don't think it would lead to a lot of extra thought or typing when writing code, and you get nice compile errors if anything is done wrong.
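For reference, here is a minimal sketch of how a label/pointer split like this can be written in recent Zig releases. It assumes Zig 0.11 or later, where a function body type such as fn () i32 must be comptime-known and runtime values use a pointer type such as *const fn () i32. The names four, five, IntFn, label and ptr are made up for illustration and do not come from the comment above.

```zig
const std = @import("std");

fn four() i32 {
    return 4;
}

fn five() i32 {
    return 5;
}

// A function body type; values of this type must be comptime-known.
const IntFn = fn () i32;

// `label` is the same comptime-known function as `four` (the "label" case).
const label: IntFn = four;

// A runtime-known, reassignable function pointer (the "pointer" case).
var ptr: *const IntFn = &four;

// A nullable runtime function pointer; it must be unwrapped before calling.
var maybe_ptr: ?*const IntFn = null;

// Calls through the optional pointer if it is set, otherwise returns the fallback.
fn callOr(fallback: i32) i32 {
    if (maybe_ptr) |f| return f();
    return fallback;
}
```

Both label() and ptr() use ordinary call syntax, so most call sites would not need to care which of the two they hold.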

Edit: also abuse BoundFn
Edit2: fix very bold text

SpexGuy on 19 Apr 2020

👍13 👀1

All 64 comments

I love the idea overall, but wonder about the syntax a little. Defining the function and the function type is a little too close:

const A = fn(i32) void;
const B = fn(x: i32) void {};
var C: A = B;

@Hejsil just redid the stage 1 parse and probably could say if this can be parsed correctly.

bheads on 13 Nov 2018

👍2

given we have syntactic sugar already in the form of optional_pointer.? would it be possible to make pub fn foo() void {} syntactic sugar for pub const foo = fn() void {};?

emekoi on 13 Nov 2018

👍9

@bheads Parsing fn defs and fn protos uses the same grammatical rules already, so this proposal doesn't make a difference in how similar these constructs will be.

@emekoi Given that Zig values "only one way", probably not. Pretty sure .? exists as asserting for not null is very common when calling into C. We also don't have .! (syntactic sugar for catch unreachable).

Hejsil on 13 Nov 2018

@Hejsil according to this, optional_pointer.? was, and still is, syntactic sugar for optional_pointer orelse unreachable.

emekoi on 13 Nov 2018

@emekoi I know. We give syntactic sugar when it really affects the readability to not have it. Something like try is a good example. ((a orelse unreachable).b orelse unreachable).c is a lot worse than a.?.b.?.c so we give syntactic sugar here. I don't think there is really a value in keeping the old fn syntax if we're gonna accept this proposal.

Hejsil on 13 Nov 2018

👍5

@bheads To me the syntax seems consistent in that curly braces after a type instantiate that type.
The only missing step towards full consistency would be parameter names. The argument list of a function introduces those variables into the function's scope.

const A = struct {a: i32}; //type expression/definition
const a = A {.a = 5}; //instantiation
const F = fn(a: i32) void; //type expression/definition
const f = F { return; }; //instantiation

When instantiating a function type (F above), I would think the parameters to be exposed via the names used in the function type definition/expression. While that might decouple their declaration from their usage, it's similar to struct definitions assigning names to their members.
Alternatively, if that seems too strange, I could see a builtin of the form @functionArg(index: comptime_int) T (or possibly @functionArgs() [...] returning a tuple (#208) / anonymous struct) to serve niche/"library" use cases.

rohlem on 13 Nov 2018

👍4

@rohlem I've contemplated that "define/instantiate a function of named type F" idea before, but it breaks down quickly for a few reasons:

The parameter names are not part of the actual function type. This is fine and even useful in some cases, I think.
Imagine if you wanted to write a function that implemented function type F as specified by some other library author, but you had to use the param names that that author chose. That would cause problems, including the fact that in Zig you can't shadow or otherwise repurpose any identifiers which are currently in scope. (So if this imaginary F takes an x: i32, you'd better not already have an x in scope). In Zig, you always get to choose your var identifiers, even for imported stdlib packages.
Making it possible to define the body of a function without having the parameter names/types and return type visible immediately above that body would be _very harmful_ to readability and comprehension. Not just in 6 months, but now while you are currently writing the function. Unfortunately, a @functionArg(...) builtin wouldn't help there.

I agree that level of consistency is cool and enticing, but I think in this case it clearly works against Zig's goals.

hryx on 13 Nov 2018

👍4

@hryx For the record, I overall agree with your stances.

I agree that two function types (fn(a: i2) void) and (fn(b: i2) void) should compare equal. I think it would be possible to have the names as extra data in their type object anyway, which would require a couple of workarounds in f.e. comptime caching though, so it's not ideal.
Imagine the same with a struct retrieved from a @cImport call. Status quo Zig does not (yet) feature struct member renaming (EDIT: as in aliasing), though I'd be all for a proposal akin to that idea, which could then equally apply to function types. (Defining your own struct with different names will _probably_ work if handled carefully, but it's not 100% waterproof.) (EDIT: Now I see, I guess a function scope variable is different from a member name from a language perspective, so "shadowing" applies only to the former.)
I agree that it harms readability, but in code that instantiates a generic function type you're already reasonably decoupled from the concrete type. While copying around the function head worked well enough up until now, I don't think there's a suitable replacement for defining a function instance like f.e. callbackType{trigger_update(); return @functionArg(0);} (EDIT: with callbackType being variable, coming f.e. from a comptime type argument). I think this would be the closest alternative and Zig-iest syntax for instantiating function types.
The biggest argument I currently see against it would be the fact that the value of a type T in T { } now dictates how to parse the instantiation (member list vs function code), which moves us further away from context-free grammar.

Either way, just adding to the discussion. Sorry for hijacking the thread, I definitely don't think the details about decoupling parameters should stand in the way of the original proposal.

rohlem on 14 Nov 2018

👍2

I agree with @hryx:

Defining the function and the function type is a little too close

We could approximate the switch case syntax and do something like this, which also opens the door for function expressions:

const A = fn(i32) void;
const B = fn(x: i32) void => { block; };
const X = fn(x: i32) u8 => expression;
var C: A = B;

raulgrell on 15 Nov 2018

@raulgrell That would also solve the ambiguity with braces in the return type.

bheads on 15 Nov 2018

@bheads yep, I think it came up in the discussion. The only weird case I could come up with, from @hryx's post:

var g = switch (condition) {
    13   => fn() !void => error.Unlucky,
    else => null,
};

raulgrell on 16 Nov 2018

What if, instead of the fat arrow (=>), we used the placeholder syntax of while and for loops?

This allows the separation of parameter names from the type specification.

Examples:

// Typical declaration
const add = fn(i32, i32)i32 |a, b| {return a + b;};

// Usable inline
const sorted = std.sort(i32, .{3, 2, 4}, fn(i32,i32)bool |lhs, rhs| {return lhs >= rhs;});

// With a predefined type.
const AddFnType = fn(i32,i32)i32;
const otherAdd = AddFnType |a, b| {return a + b;};

Additionally, in line with #585, we could infer the type of the function declaration when obvious

// Type is inferred from the argument spec of sort
// However, the function type is created from the type parameter given
// earlier in the parameters, so I'm not sure how feasible this is
const sorted = std.sort(i32, .{3, 2, 4},  .|lhs, rhs| {return lhs >= rhs;});

We could even make the definition of the function take any expression, not just a block expression, but that may be taking it too far.

I think there is a lot of potential in this feature to provide inline function definition clarity without a lot of cognitive overhead.

(Please forgive any formatting faux pas, this was typed on mobile. I'll fix them later.)

williamcol3 on 4 Dec 2018

👍12

The following is already possible (version 0.4):

const functionFromOtherFile = @import("otherfile.zig").otherFunction;
_ = functionFromOtherFile(0.33);

I prefer the "standard" way of defining functions as it is more visually pleasing to me, but I don't see any real problems with this proposal either.

user00e00 on 15 May 2019

This is now accepted.

@williamcol3 interesting idea, but I'm going to stick to @hryx's original proposal. Feel free to make a case for your proposed syntax in a separate issue.

The path forward is:

Update the parsers to accept both.
Update zig fmt to update the syntax to the new canonical way.
Wait until the release cycle is done, and release a version of zig.
Delete the deprecated syntax from the parsers.

Extern can be its own syntax, or it can be downgraded to builtin function, which might actually help #1917.

andrewrk on 4 Jul 2019

🎉10

Wasn't a goal of Zig to stay close to the syntax of C? I would say with this change, there is quite a bit of difference compared to C. This would make the step for current C developers to move to Zig way bigger.

However, the change makes sense in the current expression system of Zig and I like it, but I think that this is one extra step to overcome for C developers moving to Zig.

FireFox317 on 4 Jul 2019

👍1

Extern can be its own syntax, or it can be downgraded to builtin function, which might actually help

Extern functions could just be variables with a function type, but no content:

// puts is a function value with the given function type
extern const puts :  fn([*]const u8) void;

// main is a function with the implicit type
const main = fn() {
    puts("Hello, World!\n");
};

// foo is a function of type `fn()`
const foo : fn() = fn() {
    puts("called foo\n");
};

For me this seems logical if we treat functions as values, we can also declare those values extern => consistent syntax for declaration of extern or internal functions

MasterQ32 on 4 Jul 2019

// Usable inline
const sorted = std.sort(i32, .{3, 2, 4}, fn(i32,i32)bool |lhs, rhs| {return lhs >= rhs;});

the type here could be inferred (similar to enum literals), making it:

const sorted = std.sort(i32, .{3, 2, 4}, |lhs, rhs| {return lhs >= rhs;});

Which isn't a bad "short function syntax" at all.... @williamcol3 please do make another issue for your proposal.

daurnimator on 24 Jul 2019

Could the function passed to sort be comptime, so that specialization (and inlining) can occur for each distinct function that is passed?
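As a rough illustration of what this question is asking for, current Zig already lets a function parameter be marked comptime, in which case the callee is instantiated separately for each distinct function passed. The sketch below assumes Zig 0.11 or later and uses a hypothetical insertionSort helper (not std.sort) with a hypothetical comparator ascU8. Whether the comparator call is actually inlined is still up to the optimizer.

```zig
const std = @import("std");

// Hypothetical helper, not part of the standard library.
// Because `lessThan` is a comptime parameter, each distinct comparator
// produces its own instantiation of this function.
fn insertionSort(
    comptime T: type,
    items: []T,
    comptime lessThan: fn (T, T) bool,
) void {
    var i: usize = 1;
    while (i < items.len) : (i += 1) {
        const key = items[i];
        var j = i;
        // Shift larger elements right until the slot for `key` is found.
        while (j > 0 and lessThan(key, items[j - 1])) : (j -= 1) {
            items[j] = items[j - 1];
        }
        items[j] = key;
    }
}

// Hypothetical comparator: ascending order for bytes.
fn ascU8(a: u8, b: u8) bool {
    return a < b;
}
```

With the syntax from this proposal, the comparator could presumably be written inline at the call site instead of being declared separately as ascU8.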

c-cube on 24 Jul 2019

I just noticed that this proposal has been accepted and thought I'd throw my two cents in. I don't see a way of applying the extern keyword to the function definition, as extern requires that something has a name, but with this proposal function definitions would be anonymous and only the const/var they are assigned to would have a name. This would also be consistent with how extern is applied to the declaration (the pub const bit) rather than the definition/assignment of variables and types.

SamTebbs33 on 24 Jul 2019

Why not keep it the way it is right now?

The grammar states

FnProto <- FnCC? KEYWORD_fn IDENTIFIER? LPAREN ParamDeclList RPAREN ByteAlign? LinkSection? EXCLAMATIONMARK? (KEYWORD_var / TypeExpr)

# Fn specific
FnCC
    <- KEYWORD_nakedcc
     / KEYWORD_stdcallcc
     / KEYWORD_extern
     / KEYWORD_async (LARROW TypeExpr RARROW)?

If we take the "full reroute" and make global functions also just "variables", we get this:

// main is a function with the implicit type
pub const my_c_fun = extern fn() { // the IDENTIFIER is removed here, the FnCC not
    puts("Hello, World!\n");
};

this will also work in expression context:

iterate_and_call(my_array, stdcallcc fn(x : u32) {
    put(x);
});

but: in this context, we could also infer the required calling convention by the type that is required by iterate_and_call as the function type has to match the parameter type anyways

EDIT:

I don't see a way of applying the extern keyword to the function definition, as extern requires that something has a name

The problem is that extern both states linkage as well as cdecl calling convention and also imports symbols from other translation units.

MasterQ32 on 25 Jul 2019

Where does the anonymous function's code get put?

In the current (and, as far as the zig binary is concerned, only) LLVM module. This question is not relevant to this discussion, unless you are discussing symbol visibility. As functions in LLVM always have names, we would have to auto-name them, probably based on the scope and the scoped variable being assigned to.

shawnl on 29 Jul 2019

As functions in LLVM always have names, we would have to auto-name them, probably based on the scope and perhaps the scoped variable being assigned to

Can this be made to work nicely with incremental [re]compilation/linking ? As stated it looks like simple, unrelated changes to a source file could cause recompilation of a lot of things.

Sahnvour on 29 Jul 2019

Such a big change. This is going to require pretty much every zig source file in existence to be updated to support this change.

It's refreshing to see that the language is still willing to make changes like this for the sake of being better.

marler8997 on 29 Aug 2019

👍3

@marler8997 the plan to roll this out is to have both syntaxes supported at the same time for 1 release cycle, with zig fmt converting to the canonical style. After one release cycle this way, the old syntax is removed. We are currently doing this with use/usingnamespace.

andrewrk on 30 Aug 2019

👍4

Has the grammar already been updated to reflect this change ?

ceymard on 7 Oct 2019

How would this interact with recursive calls? I can see it not being a problem at top level due to order-independence (so a recursive const f = fn() ... { f(); } could be resolved), but how would this work for a function defined inside something?

blackhole89 on 9 Dec 2019

👍1

Since #685 landed, should anonymous function literals (function or closure) be introduced? Like:

v.map(.(i) { return i+1; });
// or
v.map(.(i) -> i+1);

mogud on 13 Dec 2019

Wasn't a goal of Zig to stay close to the syntax of C?

Apparently javascript's syntax is better.

Sarcasm aside, I fail to see how this improves two goals of the zen of zig of "maintainability" and "Communicate intent precisely" and strongly disagree that this change should be made.

In my opinion this proposal:

1. Does not solve any problems that cannot already be done with the existing method.
2. Obfuscates the intent of whether a line of code is a procedure or a storage declaration (as mentioned in the counter arguments).
3. Makes code less searchable and therefore less maintainable. With this change I cannot just search for fn someName to find the function definition, but must now search for var name = fn and const name = fn and, since keywords can be added, also const name = pub fn, etc.
4. Makes it harder to determine whether the intent is to define a function type or a function itself as the syntax is more similar.
5. It requires at least two extra tokens to read and write.
6. Enables repeated code like this const foo : fn() = fn() { from a comment above
7. Encourages nested function code patterns because variables are "meant" to be passed around (as is already clear from the examples in previous comments)
8. Will immediately lead to people wanting closures

For what it's worth, I find that zig currently is _more readable and maintainable_ than c and javascript. Please don't "fix" what isn't broken :)

frmdstryr on 12 Jan 2020

👍6

The first time I saw this proposal I was positive about it; however, after a second reading I have more negative feelings.

The status quo is completely fine and somewhat familiar to ANY programmer. There are no fundamental flaws in not applying variable declaration syntax to functions by default, and I do not see benefits in forcing people to think about functions as constant pointers. However, I do see benefits in 'syntactic sugar' for such a fundamental feature of the language.

Furthermore, as far as I can understand, this proposal does not solve function signature inference:

const exec = fn (operation: fn (x: u32) u32, arg: u32) u32 {
    return operation(arg);
};

exec(fn (x) { // ??? will be possible?
    return x + 1;
}, 111);

I think it's better to keep the existing syntax. To solve the use case I described, I propose anonymous function initializers(?):

const exec: fn (fn (u32) u32, u32) u32 = .|operation, arg| {
    return operation(arg);
};

exec(.|x| {
    return x + 1;
}, 111);

Rocknest on 13 Jan 2020

👍4

@Rocknest that looks about the same as this post? https://github.com/ziglang/zig/issues/1717#issuecomment-444200663

Looks like three people now have been somewhat in favor of that idea, especially because of that inline anonymous function syntax w/ inferred type. Someone want to open a new proposal for that?

kavika13 on 13 Jan 2020

👍1

@Rocknest Inferred return types are proposed in #447. I don't know of an existing proposal for inferred parameter types. Neither is a goal of this proposal.

@blackhole89 Since a non-top-level fn decl isn't currently possible anyway, I don't think this proposal necessarily provides for it. But it might be a worthwhile follow-up proposal.

@ceymard No, a grammar/spec change is usually shipped at the same time as the implementation.

hryx on 13 Jan 2020

@hryx

Since a non-top-level fn decl isn't currently possible anyway, I don't think this proposal necessarily provides for it. But it might be a worthwhile follow-up proposal.

Not sure if this is exactly what you mean, but you can define a struct locally, and define a function inside that struct. So a non-top-level function definition can already exist right now, at least in some form:

fn some_function() void {
    const functor = struct {
        fn do_something(a: u32) void {
            std.debug.warn("Value of a: {}", .{a});
        }
    };
    some_other_function(functor.do_something);  // Pass the function pointer to some other function
}

kavika13 on 13 Jan 2020

@kavika13 Technically, a struct creates a "top level" in Zig grammar lingo, same as a file. But you are totally correct that it can be done that way, and it does come in handy. :)
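To connect this with the earlier recursion question: because a struct body behaves like a "top level" with order-independent declarations, a function nested this way can refer to itself by name. A minimal sketch in status-quo syntax, assuming Zig 0.11 or later; factorialOf, Local and fact are hypothetical names chosen for illustration.

```zig
const std = @import("std");

fn factorialOf(n: u32) u32 {
    // The local struct only serves as a namespace for the helper.
    const Local = struct {
        fn fact(x: u32) u32 {
            // `fact` is a declaration of the enclosing struct,
            // so the recursive call resolves without any extra syntax.
            return if (x <= 1) 1 else x * fact(x - 1);
        }
    };
    return Local.fact(n);
}
```

How the same recursion would be spelled for a function expression bound to a local const is not settled by this sketch.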

hryx on 13 Jan 2020

👍1

A major goal here is to unify the syntax. Zig has 2 ways of creating a variable right now: const/var foo or fn foo. It’d be more consistent to have just one syntax, and this also provides an obvious inline anon function syntax, which I think is commonly desired.

As for searching, functions can only be declared as const foo.*fn. It should be slightly easier to search now since any variable must now be const/var and shadowing is disallowed so you don’t even have to attach the trailing fn token.

Historical note: we used to have the similar distinction between declared structs and anon structs, but that was ditched for the current syntax.

fengb on 13 Jan 2020

👍3

@fengb there is a precedent of syntactic sugar for a frequently used feature: orelse unreachable - .?. Should we remove that if two ways are completely unacceptable?

Rocknest on 13 Jan 2020

Zig has 2 ways of creating a variable right now: const/var foo or fn foo

I think this is the fundamental point of disagreement. Should a function __always__ be treated as a variable or not? Is it beneficial to have the language force that way of thinking on users?

I have no problem with this syntax __only__ for functions that can be inlined. But forcing __every function__ to use this concept/syntax (which is my understanding of this PR) does not make sense to me.

frmdstryr on 13 Jan 2020

👍1

Some more points to consider:

How am I supposed to set a breakpoint in gdb on an anonymous function?
What useful stack trace can be given for a crash in an anon function?
What's the impact on the stack size?
What's the impact on code size?

Edit: the last two concerns here are addressed at https://github.com/ziglang/zig/issues/229#issuecomment-721421197. The first can be done by a line breakpoint and the stack can contain the line so this change shouldn't be a problem.

frmdstryr on 13 Jan 2020

It's quite simple. If we make function definitions expressions, there will be 2 ways of defining a function:

fn foo() { ... }          // current syntax
const foo = fn() { ... }; // new syntax

Now there are 2 ways of doing the same thing. The difference is that the current syntax only supports a subset of what the new syntax can do. Based on the Zen of Zig, the decision to remove the now redundant syntax is clear.

marler8997 on 13 Jan 2020

👍3

Addressing some of the points raised by @frmdstryr.

EDIT: I realise this was a little long, I'm not trying to pick on you - you just had the easiest to reference post =P

1. Does not solve any problems that cannot already be done with the existing method.

One thing it solves is consistency, like @fengb said. It may seem like little gain now, but will give us benefits with other features like closures.

With some of the other features proposed, it could also enable things like

const Binary = fn(a: var, b: @TypeOf(a)) @TypeOf(a);
const addInferred: Binary = .|a, b| { return a + b; };
const subInferred: Binary = .|a, b| { return a - b; };

const Op = enum { add, sub };
const opFn : Binary = switch(Op.get()) {
     .add => .|a, b| { return a + b; },
     .sub =>  .|a, b| { return a - b; }
};
opFn(a, b);

const add: Binary = fn (a: u8, b: u8) u8 { return a + b; };
const addFail: Binary =  fn (a: u8, b: u16) u8 { return a + b; }; // compile error: @TypeOf(b) != @TypeOf(a)

2. Obfuscates the intent of whether a line of code is a procedure or a storage declaration (as mentioned in the counter arguments).

I see the purpose of var/const not as declaring a storage location. It declares a name/identifier. Types don't have storage, and neither do 0-sized values, though they can both be declared with var/const. Actually, I think a function declaration does actually declare storage - in the binary. And a function pointer can take up space on the stack.

The same reasoning applies here:

I think this is the fundamental point of disagreement. Should a function always be treated as a variable or not? Is it beneficial to have the language force that way of thinking on users?

If you replace "variable" with "identifier" does it sound more reasonable?

4. Makes it harder to determine whether the intent is to define a function type or a function itself as the syntax is more similar.

Zig style tells us to name types in PascalCase and functions in camelCase, which should help. In the extreme case, we could make it a compile error to give types lowercase names.

6. Enables repeated code like this const foo : fn() = fn() { from a comment above

Sure, but the possibility of that repetition already exists:

const S = struct {};
const s: S = S{};

// Also in C++
S s = S{};
// vs
auto s = S{};

7. Encourages nested function code patterns because variables are "meant" to be passed around (as is already clear from the examples in previous comments)

If you mean declaring functions inside functions, this can be a benefit: the name can only be used in that scope, so you won't be able to call that function in unexpected places. You can precisely communicate: only call this function through here.

If you mean callback-style functions as parameters, it's a very powerful technique that can facilitate some great abstractions, which is why so many examples touch on this.

8. Will immediately lead to people wanting closures

We already want closures =P

How am I supposed to set a breakpoint in gdb on an anonymous function?
What useful stack trace can be given for a crash in an anon function?

I'm not sure if zig support in gdb will require changes in order to break on an anonymous function. You can still break on the address, though we can always give it the name of the variable it was declared in or some compiler generated name that reflects where it is defined.

What's the impact on the stack size when using these concepts?
What's the impact on code size when using these concepts

Neither should be affected by the change, it's just syntax/grammar.

raulgrell on 13 Jan 2020

👍2

@raulgrell how do you imagine closures without hidden memory allocations?

Rocknest on 14 Jan 2020

@Rocknest It's probably possible to create some basic closure implementation based on async functions already, especially now that the @call builtin allows calling a function from a stored value. I don't have a full way of doing it, but it could probably be done with something like

const T = struct { i: i32 = 0 };
var closure = closeFunction(T, fn(upvalue: *T, inc: i32) i32 {
    const res = upvalue.i;
    upvalue.i += inc;
    return res;
});

const i0 = closure.invoke(.{ 3 });
const i3 = closure.invoke(.{ 2 });
const i5 = closure.invoke(.{ 0 });

By using a similar pattern to the one used by @fengb for his generators (as closures and generators are pretty similar in the "inner" workings)

MasterQ32 on 15 Jan 2020

There's been some discussion on it already. Not all closures require heap allocation, the memory can be on the stack of the outer function. For those that do, the user must provide an allocator. From #229:

It makes sense for there to be a function inside a function, as long as it doesn't require allocating any memory, and the function is only callable while the outer function is still running.

If it were implemented as a function, then the compiler would create a struct for all the local variables in the outer function, and then pass that struct address to the inner function as a hidden parameter, and the inner function has access to the outer function's local variables.

If it were implemented as a block, the inner function's local variables would need to be a struct, and when you call the inner function, we allocate the stack space in the outer function's stack for the local variables of the inner function, and push the return address.

Other comments (also in #1048) suggested we could get a lot of the ergonomics of closures by only allowing capture by value, not allowing recursion and whatnot.

Since those discussions, the new async stuff landed which has similar issues to deal with and can probably be dealt with the same way. Say defining a closure actually returned a callable struct containing the function pointer and its stack including space for closed values.
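To make the "callable struct" idea concrete, here is a minimal sketch in status-quo Zig, assuming a recent compiler where runtime function pointers are spelled `*const fn`. The names `Closure` and `postIncrement` are hypothetical and exist only for illustration; this is not the implementation anyone in this thread proposed. The captured value is an ordinary field, so it lives wherever the caller places the struct and nothing is allocated behind the user's back.

```zig
const std = @import("std");

// Hypothetical closure-like struct: the "closed over" value is a plain field,
// stored next to a pointer to the function that operates on it.
const Closure = struct {
    upvalue: i32 = 0,
    func: *const fn (upvalue: *i32, inc: i32) i32,

    // Calls the stored function, passing the captured state explicitly.
    fn invoke(self: *Closure, inc: i32) i32 {
        return self.func(&self.upvalue, inc);
    }
};

// Returns the value before the increment, mirroring the closeFunction example.
fn postIncrement(upvalue: *i32, inc: i32) i32 {
    const res = upvalue.*;
    upvalue.* += inc;
    return res;
}

test "closure state lives on the caller's stack" {
    var closure = Closure{ .func = &postIncrement };
    try std.testing.expectEqual(@as(i32, 0), closure.invoke(3));
    try std.testing.expectEqual(@as(i32, 3), closure.invoke(2));
    try std.testing.expectEqual(@as(i32, 5), closure.invoke(0));
}
```

Cases where the closure has to outlive the enclosing function would still need the user to supply storage, as noted above.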

raulgrell on 15 Jan 2020

Now there are 2 ways of doing the same thing.

Nearly every popular language has a separate syntax for functions. Why would zig want to go against the grain here? Edit: Or more importantly, why did they choose to do that?

frmdstryr on 17 Jan 2020

👍1

Nearly every popular language has a separate syntax for functions. Why would zig want to go against the grain here?

For starters, because copying other languages defeats the point of designing a new language. The goal isn't to improve on what's already there - e.g. a C extension - it's to completely replace it with something better.

More importantly, why

Disclaimer: this is an educated guess. Please don't quote me on this.

Presumably, B had a reason, which carried on to C, and has since just been passed down because of tradition. Without an overwhelming reason _not_ to stick to it, new languages tend to keep most of the cruft that older languages have in order to appeal to programmers familiar with those languages and to stick to more well-known territory.

pixelherodev on 19 Jan 2020

👍1

@pixelherodev usually special function declaration syntax is shorter so it counts as syntactic sugar. I see no benefits in removing it.

Rocknest on 20 Jan 2020

Some more examples. Status quo zig:

pub fn makeFeatureSetFn(comptime F: type) fn([]const F) Cpu.Feature.Set {
  return struct {
    fn featureSet(features: []const F) Cpu.Feature.Set {
      var x: Cpu.Feature.Set = 0;
      for (features) |feature| {
        x |= 1 << @enumToInt(feature);
      }
      return x;
    }
  }.featureSet;
}

This proposal:

pub const makeFeatureSetFn = fn(comptime F: type) fn([]const F) Cpu.Feature.Set {
  return fn(features: []const F) Cpu.Feature.Set {
    var x: Cpu.Feature.Set = 0;
    for (features) |feature| {
      x |= 1 << @enumToInt(feature);
    }
    return x;
  };
};

vs #4170:

pub fn makeFeatureSetFn(comptime F: type) fn([]const F) Cpu.Feature.Set {
  return .|features| {
    var x: Cpu.Feature.Set = 0;
    for (features) |feature| {
      x |= 1 << @enumToInt(feature);
    }
    return x;
  };
}

Rocknest on 22 Jan 2020

👍3

Here's another advantage to this syntax: function signatures can be typechecked

Current zig:

pub const SomeStruct = struct {
   pub fn format (self: *const SomeStruct, comptime fmt: []const u8, options: std.fmt.FormatOptions, context: var, comptime Errors: type, output: fn (@TypeOf(context), []const u8) Errors!void) Errors!void {
        return std.fmt.format(context, Errors, output, "", .{});
    }
};

Using the wrong type for an argument (like *SomeStruct instead of *const SomeStruct) is caught as an error, but the return trace is useless.

This:

pub const SomeStruct = struct {
   pub const format: std.fmt.FormatFunction = fn (self: *const SomeStruct, comptime fmt: []const u8, options: std.fmt.FormatOptions, context: var, comptime Errors: type, output: fn (@TypeOf(context), []const u8) Errors!void) Errors!void {
        return std.fmt.format(context, Errors, output, "", .{});
    };
};

Using the wrong type would point to this part of the code.

pfgithub on 10 Feb 2020

👍2

@pfgithub good observation. Kinda works like an interface, where changing the interface will also show you all the implementors that need to be updated. I didn't think of that.

marler8997 on 11 Feb 2020

👍1

I think code readability is paramount for a language which tries to make "writing quality software" easier and "writing crappy software" harder. I understand all the excitement about a new feature, validating function signatures, but honestly (and this applies to struct declarations too) does anybody reading new syntax get that it's a function declaration and not an assignment statement any faster? It was way slower for me. This looks to me like JavaScript style syntax acrobatics. Why move in that direction?

adontz on 23 Feb 2020

👍2

I think code readability is paramount for a language which tries to make

And that's kinda the point. As some above already said, it's just another syntax. I've written some Zig now and I start typing auto const foo = struct { in C++ now out of habit. You get used to such declaration style pretty fast, and I don't think it hurts to remove "just another syntax" in the language. After that change, every declaration is just var x = … or const x = … which means: "I see a symbol x, where is it declared?" leads to "search for x =" project wide and you get your declaration found. No matter if it's a type, a function, a constant or a variable.

but honestly (…) does anybody reading new syntax get that it's a function declaration and not an assignment statement any faster?

No, but more important: I don't get it any slower so removing a feature from the language is good, because it was not necessary in the first place at all. It was just there because "others did it always this way" and that's not a good argument to create new stuff

MasterQ32 on 23 Feb 2020

👍1

And that's kinda the point. As some above already said, it's just another syntax. I've written some Zig now and I start typing auto const foo = struct { in C++ now out of habit.

My comment was about readability, not about typing habits. You type just once, but read many more times. It almost does not matter how easy or consistent the typing experience is if it benefits reading.

You get used to such declaration style pretty fast, and I don't think it hurts to remove "just another syntax" in the language.

Simplicity of parser implementation should hardly be a priority ever.

After that change, every declaration is just var x = … or const x = … which means: "I see a symbol x, where is it declared?" leads to "search for x =" project wide and you get your declaration found. No matter if it's a type, a function, a constant or a variable.

This is just one use case. Now find all functions (but not structs) whose names start with "PostgresBackend". Technological unification is not a good thing for the human brain. You'll invent Lisp or Forth this way.

No, but more important: I don't get it any slower so removing a feature from the language is good, because it was not necessary in the first place at all.

This syntax is similar to JavaScript. And we already know that it was so bad that TypeScript was born.

It was just there because "others did it always this way" and that's not a good argument to create new stuff

The new proposal is not previously unseen syntax either. Moreover, it's a well-known confusing syntax which led to the invention of a few new languages to fix it.

adontz on 23 Feb 2020

👍2

I decided to add an example with close syntax to clarify

const less = fn (a: i32, b: i32) bool { return a<b;};

This is bad to read: I start reading a variable initialization, and after the fn keyword I have to reinterpret.

fn less = (a: i32, b: i32) bool { return a<b;};

this is good to read, I know it is a function from the very start.

adontz on 23 Feb 2020

👎5 👍3

One thing I want to address on this proposal before implementing it is export.

There's a problem here, which is that, in status quo:

export fn foo() void {}

is different than

fn x() callconv(.C) void {}
export const foo = x;

The former exports a function directly. The latter exports a function pointer. The Zig code to call the former as an external function would be:

extern fn foo() void;

But the Zig code to call the latter as an external function would be:

extern const foo: fn() void;

The latter is rarely wanted, the former is the common case. This matters for ABI reasons; it's not merely syntax. With this proposal implemented, how would export work?

export const foo = fn() callconv(.C) void {};

This would not have the desired behavior; it would export a function pointer.

andrewrk on 18 Apr 2020

👍1

Regarding the export syntax. I haven't used it myself so forgive me if I'm misunderstanding the problem here.

Though, if the issue is that there has to exist syntax for both attaching export to the function identifier/pointer OR the function itself, maybe #4285 could be considered. The drawback with that proposal is that it has to work for a lot of other cases as well to be worthwhile. It won't do as a fix for one-off syntax issues.

user00e00 on 19 Apr 2020

I feel like export const foo = ... is a very easy mistake to make here.

However, maybe either of:

// export is like callconv; a property of the function
const foo = fn() export callconv(.C) void {};

// export is still a qualifier, on the _function_
const foo = export fn() callconv(.C) void {};

Tetralux on 19 Apr 2020

I'd like to point out that the most obvious way (to me at least) to export a function given the proposed syntax would be:

const foo = fn() callconv(.C) void {};
export foo.*;

ifreund on 19 Apr 2020

👍1

@ifreund that's missing the export name. The closest viable thing to your snippet would be dropping the export keyword and doing:

const foo = fn() callconv(.C) void {};
comptime {
    @export(foo.*, .{ .name = "foo" });
}

This is considerably worse than status quo, especially if you consider godbolt, and especially if you consider @Tetralux's observation that this is an ABI footgun, the worst kind of footgun.

andrewrk on 19 Apr 2020

👍3

export fn foo() void {}

is different than

fn x() callconv(.C) void {}
export const foo = x;

I haven't gotten deep into the compiler internals, but as a user of the language I'm kind of surprised by this. In most cases, const foo = someFn; at function scope behaves like a label, not a pointer. Some examples of what I mean:

fn func(self: Struct) void { }

const Struct = struct {
    a: i32,

    // this is equivalent to defining func inline here in every way I can think of
    pub const boundFunc = func;
};

fn examples() void {
    const x = Struct{ .a = 0 };
    func(x); // generates `call func`
    x.boundFunc(); // generates `call func`
    const boundFn = x.boundFunc;
    boundFn(); // generates `call func`
}


const importFunc = @import("other.zig").func2;
fn importExample() void {
    importFunc(); // generates `call func2`, importFunc is a label not a pointer
}


fn runtimeFuncParam(param: fn () void) void {
    // param is runtime const
    param(); // generates `call qword ptr [mem]` (or something similar)
}


fn voidFn() void { }
var runtimeVoidFn = voidFn;
fn truthTable() void {
    comptime var comptimeVar = voidFn;
    const comptimeConst = voidFn;
    var runtimeVar = voidFn;
    const runtimeConst = runtimeVar;

    comptimeVar();   // generates `call voidFn`
    comptimeConst(); // generates `call voidFn`
    runtimeVar();    // generates `call qword ptr [mem]`
    runtimeConst();  // generates `call qword ptr [mem]`

    runtimeVoidFn(); // generates `call qword ptr [mem]`
}

The ambiguity with function pointers comes from the fact that Zig doesn't differentiate between function labels and function pointers. From observing, it seems like the operating rule that Zig uses to resolve it is: if it's comptime known, it's a function label. If it's runtime known, it's a function pointer.

So based on that rule, I would expect

export const foo = x;

to export a function label, not a function pointer, because x is comptime known. If I instead wrote

export var foo = x;

I would expect that to export a function pointer, because foo is not comptime known.

Making this the rule would mean that you can't directly export a const function pointer. But maybe that's ok? I can't think of an actual use case for that. It might still be possible with @export.

SpexGuy on 19 Apr 2020

👍6

Exactly as @SpexGuy says: users do not think in terms of function pointers; if the user sees const foo = std.warn; they expect that it is THE function, not a pointer.

#1890 is similar to the footgun described above. You expect a function, but you get a pointer.

I oppose this proposal, but I do think that function-pointerness by default needs to be solved. Maybe in a similar way as array literals were solved (you need to use & with them).

fn x() callconv(.C) void {}
export const foo = x;
// works fine

fn x() callconv(.C) void {}
var foo = x;
// error: type (function) is not allowed in variables
// note: use `&` to get a pointer to a function

from #1890

fn entry() *c_void {
    return @ptrCast(*const c_void, &entry);
    // works fine
}

fn entry() *c_void {
    return @ptrCast(*const c_void, entry);
    // error: cannot cast type (function) to *const c_void
}

Rocknest on 19 Apr 2020

👍5

Require that any var of a function label type must be comptime known, just like type, anytype, and comptime_int
Allow pointers to function types to be runtime known
Have a function expression evaluate to a function label

So in terms of this proposal, it would look like this:

const VoidFnType = fn() void;
const foo = fn () void { }; // @TypeOf(foo) is VoidFnType, foo is a function label.
const bar = foo; // bar is the same function label as foo
var baz = foo; // compile error: cannot have a function label at runtime.  Use `&foo` to get a function pointer
var quux = &bar; // quux is a runtime function pointer initialized to bar
var fab: @TypeOf(foo) = undefined; // compile error: cannot have a function label at runtime.
var fub: ?*VoidFnType = null; // fub is a nullable runtime pointer to a void fn.
var fob = &(fn () void { }); // fob is a runtime function pointer initialized to an anonymous function
const fib = &foo; // fib is a comptime-known function pointer
const foob = fib.*; // only allowed at comptime, converts a comptime-known function pointer into a label
const fuub: *VoidFnType = foo; // compile error. no implicit decay like in C/C++

comptime {
    var feb: ?VoidFnType = null; // only allowed at comptime, nullable mutable function label
    inline for (slice_of_stuff) |item| {
        // Terrible, oh yes, but great
        feb = if (feb) |lastFeb| 
                fn () void { lastFeb(); item.process(); lastFeb(); }
            else
                item.process;
    }
    feb();
}
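For comparison, the label/pointer split can be sketched with today's named-function syntax. This assumes a recent Zig release, where (as far as I know) a value of type `fn () i32` is comptime-only and a runtime function pointer is written `*const fn () i32`. The function names `answer` and `other` are made up for the example.

```zig
const std = @import("std");

fn answer() i32 {
    return 42;
}

fn other() i32 {
    return 7;
}

test "function label vs. function pointer" {
    // Comptime-known: `label` has the function type `fn () i32` and is
    // called directly, like the original declaration.
    const label = answer;
    try std.testing.expectEqual(@as(i32, 42), label());

    // Runtime-known: the pointer has to be requested explicitly with `&`.
    var ptr: *const fn () i32 = &answer;
    try std.testing.expectEqual(@as(i32, 42), ptr());

    // Only the pointer can be retargeted at runtime.
    ptr = &other;
    try std.testing.expectEqual(@as(i32, 7), ptr());
}
```

Note that calling through the pointer needs no explicit dereference, in the same way that field access works on both values and pointers.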

Edit: also abuse BoundFn
Edit2: fix very bold text

SpexGuy on 19 Apr 2020

👍13 👀1

I found some more ambiguity that this would help solve. It has to do with the new comptime var anonymous functions.

const funcScope = fn () void {
    comptime {
        var Struct = struct { };
        Struct = struct {
            // references to Struct in this block resolve to the old value of Struct,
            // before the assignment has completed. 
            inner: Struct,
        };

        var func = fn () void { };

        // example 1
        func = fn () void {
            // like the struct case, this should be a comptime closure over the old
            // value of `func`, which means this is not a recursive call.
            func();
        };

        // example 2
        const lateBindingFunc = &func;
        // lateBindingFunc is semantically a pointer to function
        // the type system can't really express this properly, but I think the compiler
        // could still conceivably do the right thing here and keep track of the target
        // being comptime known.
        func = fn () void {
            // comptime closure over a pointer to the ct-known function reference
            // will call the current value of `func` when this function is called.
            lateBindingFunc.*();
        };

        // example 3
        func = fn () void {
            const innerBinding = &func;
            // does this call the `func` when the lambda was created or the `func`
            // when the value was used?  This depends on whether `func` is a reference
            // type or a value type.  If it's a reference, the reference was captured into
            // the lambda and this is late-binding.  If it's a value, the value was captured
            // into the lambda and we didn't do anything to it, so this calls the previous `func`.
            // For example 1 to work, `func` needs to be a value type.  This is consistent with
            // the previous rule that comptime-known functions are labels and runtime-known functions
            // are pointers.  So this should call the previous value of `func`, not the newest value.
            innerBinding.*();
        };
    }

    _ = struct {    
        // example 4
        // note that this is a runtime `var`.
        var func = fn () void {
            func(); // unlike example 1 above, this call is late-binding call through a runtime function pointer
        };

        const exec = fn() void {
            // example 5
            func = fn () void {
                const innerBinding = &func;
                // unlike example 3, this call is late-binding since `func` is a runtime function pointer.
                innerBinding.*();
            };
        };
    };
};

After playing around with it, I don't think there's a way to write a function that has one behavior when called at comptime and the other when called at runtime. But still, we have cases where two functions with the same text do different things in different contexts. Making the difference between functions and function pointers explicit would help to make this clearer.

// example 1
compiles with no changes, same behavior
// example 2
compiles with no changes, same behavior
// example 3
`.*` is no longer required, but will still work

// example 4
// runtime `var` of function is not allowed, must be made a pointer
var func = &(fn () void {
    func(); // it's now clear that this is definitely not recursion
});
// example 5
compiles with no changes, same behavior, but now func is explicitly a pointer

In light of this, I think making this distinction might be a good idea. We could consider going a step further and requiring that function pointers must be dereferenced to call, which would force the text of the two functions to be different, but I don't think that's necessary. The "." operator working for both pointers and values is a precedent that I think we should uphold here.

One potential downside of making this distinction is its effect on generic code. The types fn () void and *fn () void are almost identical in terms of usage, but their @typeInfo structs would be very different with an extra * in the way. On the other hand, this would also mean that the information about the indirection is exposed to generic code, which could be useful in some cases. This could also be worked around by a function std.meta.fnType which accepts both function types and single pointers to functions and returns the underlying function type.

SpexGuy on 20 Apr 2020

👍2

I think the idea of comptime-only function values is actually quite appealing.

1) They can be collapsed into a usual function call when being zig-only and functions can be deduplicated in the process

2) You can export functions as function values:

const GiveMeFun = fn() callconv(.C) u32;

const internal = fn(comptime constant: u32) GiveMeFun {
    return fn() callconv(.C) u32 {
        return constant;
    };
};

const renamed = internal; // this could yield a different function than internal, but may be collapsed into one by the compiler (same code)

export const giveme1 = internal(1);
export const giveme2 = internal(2);
export const giveme3 = internal(3);
export const giveme1ptr = &giveme1;

3) It would require distinguishing raw function values from function pointers (imho a drawback, but I can live with that; it's even a tad more explicit).

I would also allow implicit coercion from a function to a function pointer, as it would be a bit weird to code that otherwise. But I could also understand if it's not done, to keep the code clearer.

MasterQ32 on 20 May 2020

👍2

While thinking about #5421 I realized that the proposal probably makes it possible to call function literals as well. This makes the following code a valid way to handle multiple errors at once while continuing execution in the case of an error.

const main = fn() !void {
    var my_buf: [1024]u8 = undefined;

    const maybe_slice: ?[]u8 = fn (buf: []u8) ![]u8 {
        const file = try std.fs.openFileAbsolute("/path/to/file", .{});
        defer file.close();

        try file.seekTo(42);

        const len = try file.read(buf);

        return buf[0..len];
    }(&my_buf) catch |err| switch (err) {
        error.FileNotFound => null,
        else => return err,
    };

    try process(maybe_slice);

    finish();
};

It certainly feels a little weird and I'm not sure this would be considered good style. However, I thought it would be worth mentioning here as I don't think it's been brought up yet.
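For reference, a similar shape can be approximated in status-quo Zig by calling a function declared inside an anonymous struct, the same trick used in the makeFeatureSetFn example earlier in the thread. The sketch below is only an illustration under that assumption: `parseOrNull` is a hypothetical helper, and it parses an integer rather than reading a file so that it stays self-contained.

```zig
const std = @import("std");

// Hypothetical helper: parse a decimal number, treating malformed input as
// "no value" while still propagating every other error.
fn parseOrNull(text: []const u8) !?u32 {
    const maybe: ?u32 = struct {
        // Declared and called in one expression; `try` could be used freely
        // in here if there were several fallible steps.
        fn call(t: []const u8) !u32 {
            return std.fmt.parseInt(u32, t, 10);
        }
    }.call(text) catch |err| switch (err) {
        error.InvalidCharacter => null,
        else => return err,
    };
    return maybe;
}

test "immediately invoked function via an anonymous struct" {
    try std.testing.expectEqual(@as(?u32, 42), try parseOrNull("42"));
    try std.testing.expectEqual(@as(?u32, null), try parseOrNull("4x2"));
    try std.testing.expectError(error.Overflow, parseOrNull("99999999999"));
}
```

The function-literal version avoids the struct wrapper and the throwaway name, which is the ergonomic gain being pointed out here.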

ifreund on 25 May 2020

👍3

I want to note here that @SpexGuy's amendment proposal to distinguish function labels from function pointers is also accepted.

andrewrk on 8 Sep 2020

👍6
