Zig: Syntax proposal: in functions, separate the parameter list from its type declarations

Created on 5 Jun 2020 · 16 Comments · Source: ziglang/zig

This is my attempt at bringing back a feature that C used to have but was lost long ago, in my opinion more or less accidentally.

Instead of having to write

fn myFunc(x: f32, y: f32, z: f32) f32 {
    ...
}

I would like to be able to write instead:

fn myFunc(x, y, z)
    x, y, z: f32;
    f32
{
    ....
}

(This is a contrived example, with one-letter variable names and a 3-letter type name, but we have all written functions where we just couldn't fit the declaration on a single line.)

It may help to provide a bit of background here, as Zig appears to be a young people's club ;-).

[EDITED TO HOPEFULLY IMPROVE CLARITY:]
My proposal actually reflects the way things were done in the original (K&R) incarnation of C.
In C, however, this led to a problem owing to the way C functions were exported back then, i.e. through simplified declarations (usually in header files) that specified only the function's name and its return type, completely ignoring the parameters. Thus, it was entirely up to the programmer to know and provide the correct number and types of arguments; the compiler had no way to check them.

To remedy the situation, so-called prototypes were introduced. Those were extended declarations where the names of the parameters were replaced with their types. Specifying the names as well was allowed, but that wasn't - and still isn't - required in a function declaration.
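
To make the difference concrete, here is a minimal C sketch. The function name length_squared and its parameters are made up for illustration. The old K&R-style definition appears only inside a comment, because C23 dropped that form and newer compilers may reject it; the code that actually compiles is the prototype form.

```c
/*
 * K&R-style definition, kept in a comment because C23 removed this form:
 *
 *     double length_squared(x, y, z)
 *         double x, y, z;
 *     {
 *         return x * x + y * y + z * z;
 *     }
 *
 * The matching old-style declaration was just
 *
 *     double length_squared();
 *
 * which told the compiler nothing about the number or types of arguments.
 */

/* Prototype: the parameter types are required, the names are optional. */
double length_squared(double, double, double);

/* Prototype-style definition: every name carries its type inline. */
double length_squared(double x, double y, double z)
{
    return x * x + y * y + z * z;
}
```

With the prototype in scope, a call such as length_squared(1.0, 2.0) is rejected at compile time, which is exactly the check the old declarations could not provide.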

I seem to recall that initially, it was intended for those declarations to be machine-generated from the source files. Soon after, however, it was decided that typed parameter lists should be mandatory, and not just in export declarations, but in the implementations too. And this was done for both C and C++.

The net effect is that parameter lists with more than 2 or 3 parameters are, and have been for more than two decades, horribly ugly - and for some reason not just in C and C++, but in every new language with an even remotely comparable design.

Zig is a new opportunity to correct what I see as a clear mistake. It is actually a particularly good chance, since Zig already does away with the archaic constraints of order-of-declaration and the declaration/implementation split.

proposal

JPGygax68

👎10

Most helpful comment

We also need to answer the question of how this affects function types. Names are intentionally not part of a function type, though they can optionally be specified. Would this require names, or would the current syntax be kept?

To weigh the look, here's a full set of examples. I'm going to set these the way I think I would prefer to format them if this were the actual syntax. Since #1717 is accepted I'll use that syntax. I'm also going to use the suggested -> modification because I think it's ambiguous without it. These are meant to be realistic; many are pulled directly from the standard library or other libraries.

// function without args
pub const main = fn() -> anyerror!void {

// function with one arg
pub const init = fn(allocator) allocator: *Allocator -> !@This() {

// member function
pub const add = fn(self, other) self, other: @This() -> @This() {

// type declaration function
pub const ArrayList = fn(ElementT) comptime ElementT: type -> type {

// function taking both comptime and runtime parameters
pub const alignedAlloc = fn(self, T, alignment, n)
    self: *Allocator,
    comptime T: type,
    comptime alignment: ?u29,
    n: usize,
-> Error![]align(alignment orelse @alignOf(T)) T {

// function taking parameter of unknown type
pub const expectEqual = fn(expected, actual)
    expected: var, actual: @TypeOf(expected) -> void {

// function with unused argument
pub const exp_zero_case = fn(random, _)
    random: *Random, f64 -> f64 {

// function type:
pub const PFN_AllocationFunction = fn(pUserData, size, alignment, allocationScope)
    pUserData: ?*c_void,
    size, alignment: usize,
    allocationScope: SystemAllocationScope,
-> ?*c_void callconv(.Stdcall);
// not really sure where to put `callconv`,
// between -> and return type doesn't make much sense,
// and before -> is also weird.  It needs to come after the
// declarations though, because they are in scope inside
// the callconv expression.

// function type, potential nameless syntax?
pub const PFN_DebugReportCallbackEXT = fn(_,_,_,_,_,_,_,_)
    DebugReportFlagsEXT.IntType,
    DebugReportObjectTypeEXT,
    u64,
    usize,
    i32,
    ?[*:0]const u8,
    ?[*:0]const u8,
    ?*c_void,
-> Bool32 callconv(.Stdcall);

// really big function
// too many args to fit on one line
pub const debugReportCallback = fn(
    flags,
    objectType,
    object,
    location,
    messageCode,
    pLayerPrefix,
    pMessage,
    pUserData,
)
    flags: DebugReportFlagsEXT.IntType,
    objectType: DebugReportObjectTypeEXT,
    object: u64,
    location: usize,
    messageCode: i32,
    pLayerPrefix: ?[*:0]const u8,
    pMessage: ?[*:0]const u8,
    pUserData: ?*c_void,
-> Bool32 callconv(.Stdcall) {

// function taking function parameter
pub const sort = fn(T, items, lessThan)
    comptime T: type,
    items: []T,
    lessThan: fn(lhs, rhs) lhs, rhs: T -> bool,
-> void {

// function taking only one function parameter
pub const registerCallback = fn(callback) callback: fn() -> void -> void {

// extern function header
pub extern "kernel32" const CreateEventExW: fn(lpEventAttributes, lpName, dwFlags, dwDesiredAccess)
    lpEventAttributes: ?*SECURITY_ATTRIBUTES,
    lpName: [*:0]const u16,
    dwFlags: DWORD,
    dwDesiredAccess: DWORD,
-> ?HANDLE callconv(.Stdcall);

After going through all that, I'm not a fan. For functions with lots of parameters, I would have written them out on multiple lines anyway, so the initial list of names feels unnecessary and redundant. For functions with one or two parameters, I still want to put the types on the same line as the parameters so this doesn't feel any cleaner. There weren't many places where a, b: T was useful, and in a few places I felt it would actually hurt readability, because the two fields are entirely unrelated. It makes using longer parameter names feel bad, because they need to be typed twice. The extra line with all of the names feels unhelpful to me, since even with the names I don't know comptime-ness or types without reading into the type declarations, and I can't call a function without knowing its parameter types.

Overall it feels like this is trying (and failing) to force a one-parameter-per-line style. Worth noting, though, is that in every example I looked at with more than two parameters, the parameters were already written out one per line in the existing source. So I feel like this is unnecessary.

SpexGuy on 5 Jun 2020

👍3

All 16 comments

This proposal is going to be rough on my ego :-)

JPGygax68 on 5 Jun 2020

I'm struggling to understand the problem the proposal addresses. At first I thought the syntax was to combine header / implementation files since you mentioned programmers having to satisfy the linker / loader with the correct functions. But for non-externally-built-library Zig code, there is no separate header / implementation requirement. Alternatively, if you want to provide a header only API as a product to keep implementation details private, then you have no choice in the matter in duplicating the function specification anyway, right?

JesseRMeyer on 5 Jun 2020

@JesseRMeyer That was just historical background. The goal of the proposal is to make function declarations more readable, by drastically reducing the likelihood that the parameter list will require more than one line.

One thing I forgot to mention is that it would also make it much easier to document parameters, because you can append comments after the type declarations:

fn myFunc(x, y, z)
    x, y, z: f32; // vector elements
    f32 // result: vector length
{
    ....
}

If the parameters are given meaningful names, and the function is well named too, you often won't need to know the exact types of the parameters to understand how to call the function. That is what I mean by improved readability.

JPGygax68 on 5 Jun 2020

👍1

@dbandstra Thank you for the uplifting comment, I kinda needed that after the 5 downvotes :-)

Seeing the types first and the names to the right runs rather contrary to my aim.

I consider the zig fmt a good idea, though not as easy to read as my own proposal.

JPGygax68 on 5 Jun 2020

Sorry, I deleted my comment as I realized I got carried away a bit in the wrong direction. But since you spotted it I will try to replicate it. I brought up #1717 and the proposal of a function declaration syntax like this (which I believe was rejected):

const myFunc: fn(f32, f32, f32)f32 = .|a, b, c| {
    ...
};

As you said, this goes in a different and incompatible direction from your proposal.

Other than that, my gist was, I am sympathetic to your idea but think it will be a hard sell, mostly due to increased typing. And, unfortunately, I think extra justification is needed every time you try to go away from the norm in language syntax, as the friction (tiny as it is) can cost you new users. And it could mean that every time your language is brought up on HN, year after year, 90% of comments are complaining about one little syntax detail...

I'll make another attempt at a comment. First, the status quo that you are competing against:

// zig status quo
fn myFunc(
    x: f32,
    y: f32,
    z: f32,
) f32 {
    ...
}

The nice advantage of your proposal, compared to this, is seeing the argument names in one line, similar to how it may look when being called. That's a lot easier for the human to scan than when arguments are broken up into their own lines.

In your proposal the place of the return type is hard to distinguish (this wasn't a problem in C). I think this will need to be improved somehow.

Also, I notice you have another proposal in your proposal, which is the Go-style x, y, z: f32 typing shorthand. Maybe we can evaluate the proposal without that first (forget the semicolons for now too). (Btw, my personal experience is that my big-long functions don't often have consecutive arguments with the same type, so I don't benefit much from this syntax.)

So maybe like this?

// tweaked proposal
fn myFunc(x, y, z)
    x: f32,
    y: f32,
    z: f32,
f32 {
    ...
}

What if you did want to fit it in one line? Technically, this works, but it will get increasingly annoying as the argument names get longer. I also think the return type syntax is not good enough; it may need a new glyph reintroduced into the language, like -> or something.

// small function
fn myFunc(x) x: f32, f32 {
    ...
}

dbandstra on 5 Jun 2020

Thanks for the clarification.

From a usability standpoint, I dislike the requirement of having to replicate the variable names (and presumably their order). If y is removed, or the name is changed, now there is the extra step of following that change through where its type is specified. I don't see how that cost justifies the value in more but shorter lines of code. Zig is not known for its brevity.

JesseRMeyer on 5 Jun 2020

@dbandstra Thanks for the detailed reply! As you said, improved readability is my aim, and a big part of that is the natural indent, so your tweak would run contrary to that. I could however see something like this:

fn myFunc(x, y, z)
    x, y, z: f32,
    -> f32 
{
    ...
}

This would preserve the indent while still making the return type stand out clearly. As for the opening brace, IMO it is necessary to put it on its own line simply as a visual separation between the declaration and the implementation.

Something that just occurred to me is that it would not always be necessary to specify the parameter types at all - when they are var, one could simply leave out the type declaration completely. (That of course could be done independently from my proposal - though it would then preclude the possibility of specifying a comma-separated list of parameters with a single type specifier at the end.)

@JesseRMeyer The order of the type declarations would not matter, though of course a wild jumble would do anything except help readability. However, I imagine that a good editor could help with that.
Another possibility could be to allow a kind of "bullet" as an (optional) replacement for the parameter name, perhaps like this:

fn myFunc(x, y, z)
    *, *, *: f32,
    -> f32 
{
    ...
}

JPGygax68 on 5 Jun 2020

Bullets would only help if the parameters all share the same type. Otherwise you have to distinguish them, and in the limit, where every parameter type is unique, we have the opposite problem of maximizing code lines. So from a more general point of view, why do you value more lines of shorter code over a single longer line of code?

JesseRMeyer on 5 Jun 2020

@JesseRMeyer Scrolling vertically, which you usually have to do anyway, to me seems less disruptive than having to scan long lines.

Also, if you want to comment your parameters, you have no choice but to go multiline, with or without my proposal.

JPGygax68 on 5 Jun 2020

Interesting! I've never felt a disruption from either scanning or scrolling in and of themselves. As someone who was raised to first read a left-to-right language, I do not feel a disruption reading along a horizontal axis.

Comments - True, although I don't inline comments for parameters, and I don't know of any API I use that does.

JesseRMeyer on 5 Jun 2020

I only read left-to-right languages myself. But I've always felt it easier to scan something short that stands out, like a headline.
EDIT: as for comments, the only alternative to inlining is repeating all parameters in a separate comment block, javadoc-like. Not my preference.

JPGygax68 on 5 Jun 2020

I definitely understand the value of separating parameters across individual lines if your comment / documenting preference is inline.

JesseRMeyer on 5 Jun 2020

@SpexGuy Thank you a ton for this analysis, way better and more thorough than I could ever have asked for!
Regarding the intent, yes I was aiming to promote a one parameter per line style, except when working with operand-like signatures where consecutive parameters share a common type. Promote though, not force - I would see little reason to outright forbid the current syntax.

Your examples are absolutely fantastic; your alignedAlloc() I would have said makes the strongest case in my favor, tied perhaps with CreateEventExW(). I do notice though that you never indented the return type specifier - this IMO takes away from my idea of making the first line stand out like a heading before a block of text. (This same idea would also imply placing the opening brace at the beginning of the line following the return type specifier.)

The debugReportCallback() example is a clear case where the separation makes no sense at all: when the parameter list is so long that it takes multiple lines anyway, then any advantage my approach might have had is lost and replaced with a nightmare.

Overall, I think I have to concede defeat. The rule of thumb of going multiline (with the standard zig fmt formatting) when there are more than 2 parameters makes a lot of sense, and makes my proposal appear unnecessary.

JPGygax68 on 6 Jun 2020

why can't we have