Zig: Nit: anytype is a really inconsistent name

Created on 17 Jul 2020 · 31 Comments · Source: ziglang/zig

Sorry, this is kind of nitpicking, but it's been bothering me.

The value of a variable of type anyerror is some error.
The value of a variable of type anyframe is some frame.
The value of a variable of type anytype is usually not a type.

Said another way,
A variable that holds some unknown error is an anyerror.
A variable that holds some unknown frame is an anyframe.
A variable that holds some unknown type is a type.
A variable that holds some unknown value is an anytype?

var was a bad name because it was overloaded with the mutability specifier. But aside from that, IMO it was an ok name. I don't think we should restore it but I think anytype is also confusing. any might be a good choice except that it is a common enough variable or function name that the language probably shouldn't reserve it. infer was suggested previously, but doesn't convey that the type can change over the variable's lifetime. Maybe anyvalue, anyval, or vartype would be better?
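
To make the complaint concrete, here is a minimal sketch, assuming a recent Zig compiler where the std.testing helpers return errors. typeNameOf is a hypothetical helper, not something from the standard library or this thread. It reports the type the compiler inferred for an anytype parameter, which is usually the type of an ordinary value and only occasionally type itself.

```zig
const std = @import("std");

// Hypothetical helper (not from the thread): returns the name of the
// type that the compiler inferred for the `anytype` parameter.
fn typeNameOf(value: anytype) []const u8 {
    return @typeName(@TypeOf(value));
}

test "an anytype parameter usually holds a value, not a type" {
    // Ordinary values: the inferred type is the value's own type.
    try std.testing.expectEqualStrings("u32", typeNameOf(@as(u32, 5)));
    try std.testing.expectEqualStrings("bool", typeNameOf(true));
    // A type is accepted too, because types are values in Zig;
    // the inferred type is then `type` itself.
    try std.testing.expectEqualStrings("type", typeNameOf(u8));
}
```

Only the last call passes something that actually is a type, which is the case the name anytype seems to describe.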

proposal

All 31 comments

I agree, and personally like anyvalue and anyval.

An important part was missed: the declaration.

i: u32 // i is a u32 variable
T: type // T can only be a type
writer: anytype // writer is a variable of any type, not an unknown value

What's the problem?

+1 for anyval. I think that's much better.

> I agree, and personally like anyvalue and anyval.

Seconded. I found it terribly confusing when the Zig formatter started changing my vars to anytypes. I would have expected anytype to really be a type, but it's apparently just the new name for var.

> An important part was missed: the declaration.
>
> i: u32 // i is a u32 variable
> T: type // T can only be a type
> writer: anytype // writer is a variable of any type, not an unknown value
>
> What's the problem?

@data-man Something like so:

const std = @import("std");

fn printAnyType(obj: anytype) void {
    // {s} formats the string field as text; the other fields use the default formatter.
    std.debug.print("{s}, {}, {}\n", .{ obj.foo, obj.bar, obj.baz });
}

test "printAnyType" {
    printAnyType(.{ .foo = "Hello", .bar = 1, .baz = false });
}

where I would have assumed anytype to be a type, not a value. Instead, it is sort of like a generic object, a value. In a purely functional dependently-typed language like Idris, maybe types and values are truly equivalent, but in an imperative language, it feels jarring.

+1 for anyvalue since we have anyerror not anyerr, but anyval sounds nice too.

My opinion:
A value of anyerror is an error value. anyerror is the "supertype" of all more specialized error types.
A value of anyframe is a frame value. anyframe is the "supertype" of all more specialized frame types.
A value of the proposed anyvalue/anyval will be... a value value. Which kind of makes sense, because Zig treats everything, including types, conceptually as values. But when asking "what types of values can I expect" the answer is "value values". It doesn't seem quite as intuitive to me.

I suggest plain any. Then it's clear it's whatever you can think of - functions, types, values, whatever you can pass as an argument / bind to the name.
I think it's also clearer that any is even less specialized than anyerror and anyframe.

I hard disagree with this.
Personally, I believe a type definition answers the question "What type is this?".

a: anyerror -> The type of a is any error.
b: anyframe -> The type of b is any frame.
c: anytype -> The type of c is any type.
d: u32 -> The type of d is u32.

Nothing is ambiguous. The only reason you have an ambiguity here is that you are omitting the "The type of" part and rearranging the statement a bit (which is usually fine), but for some reason you are doing it incorrectly for anytype.

d: u32 -> d is a u32
But instead of this: c: anytype -> c is an any type (which is a weird thing to say, but it's correct)
You're saying this: c: anytype -> c is any type (wrong, and can be misread as saying that c itself is a type).

Additionally there is a bit of a misunderstanding conflating values themselves to types to superclasses of types.

> A variable that holds some unknown value is an anytype?

"A variable that holds some unknown value" is just a redundant way of saying "a variable".
Up there, d also has some unknown value. The value has nothing to do with it; you can clearly see why anyvalue is wrong below.

e: anyvalue -> The type of e is any value [ of any type?]

This suggests that [the type of] e could be a value like 2. But obviously the type of e must still be a type (hence anytype).

Anyway for more on superclasses, anyframe isn't actually a specific type, it's a superclass where any specific frame type for each function can be transformed into it. Same for errors, anyerror is a superclass where any specific error enum type can be turned into it.

I understand why all this confusion specifically happens for anytype though because Zig treats types like values. So you have to deal with strange (at first glance) statements like this.

c: anytype -> c is an any type (including a type itself)

Then the misinterpretation "c is any type" isn't unambiguously wrong, because technically c could be a type.

Anyway, I'm sorry if my response is long and convoluted, but please read through it again if you don't understand. I spent a long time thinking about why this was wrong but I couldn't quite put it into words. It might not be abundantly clear, but I assure you, anytype is not ambiguous and is correct.

@AssortedFantasy You seem to be putting a lot of weight on a specific translation of Zig syntax into English, and moreover on a specific syntactic ambiguity which this particular translation has. There are other ways of thinking, and many of them arrive at different places. Just because something makes sense to you doesn't mean it's the only or best possible way of thinking.

anyerror is the supertype of all errors.
anyframe is the supertype of all frames.
anytype is...the supertype of all types? Well, yes, but that's hardly a faithful description.

> but I assure you, anytype is not ambiguous and is correct.

This is rather dismissive and rude. Also, from the very fact that this thread exists, evidence suggests that its intent and meaning are not as clear as you're implying.

Easy there, I can sense the passions in this thread escalating a bit. Let's all take a moment to remind ourselves that we have each others' best interests in mind, and try to keep the arguments focused on the technical bits. One thing to remember is that the ultimate decisions here will be based on the technical points made, and nothing else. If somebody makes a bad argument, I promise you I know it is a bad argument, you can just let it slide. Please, for my sake :sweat_smile:

@EleanorNB @timmyjose I understand I might come off as a bit aggressive. Sorry about that, and I'll try to be a bit more agreeable from now on if that makes for better debate.

I was trying to avoid the words "supertype" and "superclass" because that's not really what I meant. They imply that it's somehow a different structure than a type itself, a higher-level abstraction of sorts, but it's not; it's just a union of types (which is still just a type).

Let's get away from shaky language. Maybe I can show my same point again with pure mathematics, and then I can show why I personally believe anytype is a faithful and consistent name.

In mathematics, when you're being formal, you need to specify the domain (and codomain) of your functions when you define them.

For a function f(x,y) = ln(x)+y you'd need to specify something like x ∈ R+ and y ∈ R. I believe that Zig should try to match this.
Basically I think the little : is supposed to be an "element of" symbol.

So then fn foo(x: u32, y: i32) i32 means

x ∈ u32 and y ∈ i32.

anyerror is a cool thing that's supposed to be the equivalent of a mathematical union of all errors. anyframe is supposed to be a union of all frames, and lastly anytype is supposed to be a union of all types.

Basically I don't believe

fn bar(k : anytype) is ambiguous, because a union of types makes perfect sense mathematically.

k ∈ u0 ∪ u1 ∪ ... ∪ i0 ∪ i1 ∪ ... ∪ f64 ∪ ... ∪ struct {a: i32, b:i32} ∪ ... type

The above is correct: k actually is an element of any of those things. It also shows why I don't like using anyvalue. In my opinion, "any value" doesn't make sense, because a type itself is just a union of values. So 2 is a value, and an instance of a point struct is a value, but k is not an element of 2, and k is not an element of an instance of a point struct.

I hope maybe this makes a bit more sense and is less angry sounding.

To complete my argument from the last thing, my point about how the conflating confusion shows up is also shown in the mathematics. The very last thing at the end of the union, type happens to be a set containing everything that makes up that anytype union, including type itself. Which technically isn't mathematically sound (violates axiom of regularity) and makes the language a bit confusing, but ¯\_(ツ)_/¯, at least it's rigorous.

type = { u0, u1, ... , i0, i1, ... f64, ... ,type}

Since unions of types are still types (which are themselves sets of values), type is a powerset of anytype. Regardless, even with this weird recursive structure, a single value by itself never appears on the left hand side (though all sets containing a single value do). So you can again see:

2 ∩ anytype = {} (because 2 is not a type).

The conflation happens because [in Zig] a union of all types is not completely different from a union of all values (anyvalue). Specifically, the intersection is the set of all types, because types are values (but values are not types).

anytype ∩ anyvalue = type

One way to linguistically resolve this inconsistency is to say that anyvalue is not the union of all values, but is instead a set containing all values (the set of all values would be a type).

This seemed reasonable until I realized that the only reason this makes sense grammatically is that it manages to avoid defining or mentioning the domain that you're talking about. It's the set of all things in some unmentioned universe of values. It's poorly defined: you might take it to be any value of an f128, for example, but it's not; it's any value of a ... anything? The word anytype (when it's taken to mean the union of all types) states explicitly what this universe is: anything that can be an element of any Zig type.

TL;DR: I think the fact that anytype makes functions generic, while anyerror and anyframe don't, warrants a different syntax and/or naming scheme.

I think the confusion here is caused by an inconsistency with the "any*" types.

anytype is not a type in and of itself. It is a "placeholder" for other types. Declaring an anytype parameter is what makes functions generic. The word "any" in this case refers to the type itself meaning it can be substituted with "any type".

This is in contrast with anyerror and anyframe which are actual types and do not make functions generic. They are union types which can hold "any value" from any error type or any frame type respectively. Here the word "any" means the type that can represent "any value", whereas with anytype, "any" means the type is a generic placeholder that accepts "any type".

const std = @import("std");

fn takeAnyError(x: anyerror) void {}
fn takeAnyFrame(x: anyframe) void {}
fn takeAnyType(x: anytype) void { }

pub fn main() void {
    std.debug.print("{}\n", .{&takeAnyError});
    std.debug.print("{}\n", .{&takeAnyFrame});
    // compile error because takeAnyType is generic
    //std.debug.print("{}", .{&takeAnyType});
}

At first glance, all 3 takeAny* functions appear as if they would behave similarly, but they are actually very different because only takeAnyType is generic. Regardless of the actual names, I think using a different naming convention to distinguish between generic and non-generic behavior would be more clear. Whether that means renaming anyerror/anyframe or anytype doesn't seem important but having a clear way to distinguish between them seems more so.

That is a good point. Perhaps vartype? That communicates that the type is variable, and doesn't overload syntax.

I like vartype, but the type is not variable. It's constant; we just don't know which one it is.

fn a(b: vartype) void {
    // @TypeOf(b) is *constant*, and will never change within a function permutation
}

How about something that indicates generic type?

That's not correct, the type can change if the variable is not const. This code is perfectly valid:

comptime {
    const Any = struct { value: anytype };
    var a = Any{ .value = "foo" };
    a.value = 4;
    a.value = Any{ .value = 3.0 };
}

I like any. It has precedent in TypeScript.

I agree with what @SpexGuy said in his original comment:

> any might be a good choice except that it is a common enough variable or function name that the language probably shouldn't reserve it.

How about generic, arbitrary, value, anything, etc?

> I agree with what @SpexGuy said in his original comment:
>
> > any might be a good choice except that it is a common enough variable or function name that the language probably shouldn't reserve it.

In my experience error and test are much more common names than any when writing in languages that don't reserve them.
I think the benefit of having a (in my opinion) nicer keyword is far larger than the occasional inconvenience of choosing a different name.

How about auto?

I was about to suggest autotype.

I like autotype, I think it's the best option I've heard yet.

I might prefer auto over autotype, since it doesn't imply that the value is of type type, but that might just be me.

With autotype there would not be the same confusion as with anyerror and such, since it would be the only one using "auto". Thus it "automatically" determines the type.

C++ uses auto for type inference, where the type cannot change. That might cause confusion for people learning the language, and I don't think it really conveys that the type can change. The more I think about it, the more I like any. It's such a generic word that I think it might be ok for the language to reserve, especially since variables or functions with that name can be trivially replaced by "hasAny".

> C++ uses auto for type inference, where the type cannot change. That might cause confusion for people learning the language, and I don't think it really conveys that the type can change.

@SpexGuy, you keep emphasizing this point, but I'm not sure I follow. The documentation only mentions anytype for generic function arguments, which surely must remain constant after being inferred in a particular invocation. Your above example with anytype struct fields is interesting, but I couldn't find any actual use of this in the Zig code base. Is this some kind of undocumented feature? Or even intended? I kind of struggle to imagine what a dynamic Any type would even mean in a language that does not use boxed values and runtime dispatch.

@zzyxyzz
anytype as a mutable field is quite a useful feature when writing comptime code in my opinion.
The feature is undocumented but intended.
When you use an anytype field the container type becomes comptime-only.

Afaict atm this is only used in the standard library itself in some std.builtin.TypeInfo structs (e.g. for default values of fields and sentinel values).

However, it is also very useful for building up tuples.
Here is an example from ctregex:

        // args is `anytype`, will be passed to a std.fmt function
        // We intercept u21 unicode codepoint values and encode them as utf8
        const ArgTuple = struct {
            tuple: anytype = .{},
        };
        var arg_list = ArgTuple{};
        for (args) |arg| {
            if (@TypeOf(arg) == ?u21) {
                if (arg) |cp| {
                    arg_list.tuple = arg_list.tuple ++ .{ctUtf8EncodeChar(cp)};
                } else {
                    arg_list.tuple = arg_list.tuple ++ .{"null"};
                }
            } else if (@TypeOf(arg) == u21) {
                arg_list.tuple = arg_list.tuple ++ .{ctUtf8EncodeChar(arg)};
            } else {
                arg_list.tuple = arg_list.tuple ++ .{arg};
            }
        }

I agree with @SpexGuy here, the fact that the type of an anytype (/anyvalue/w.e.) parameter is constant is just a side effect of the fact that the parameter itself is constant.

> I kind of struggle to imagine what a dynamic Any type would even mean in a language that does not use boxed values and runtime dispatch.

The type of an anytype cannot change at runtime. But comptime zig is an interpreted language with boxed values and type values and dynamic dispatch and everything, so the type of a variable stored in anytype changing at comptime makes just as much sense as the type of a var changing in JS. It's currently restricted to function parameters and struct members, but it could easily be extended to be allowed as the given type of a local comptime var in the future. I'm hoping this happens, because the Any struct workaround gives all the same functionality but is just more typing.

anytype fields are currently used by std.builtin.TypeInfo.Pointer to represent the sentinel. In order to robustly construct the type *<modifiers>[1]T from *<modifiers>T, for example, modifiers on the pointer need to be preserved. The maintainable approach is to use @typeInfo to get a type info for the pointer and modifiers, alter the target type of the pointer, and then use @Type to construct the altered type. But in order to do this, the type of the sentinel field must also change from @as(?T, null) to @as(?[1]T, null). That requires a struct member that can change types at comptime, so this behavior probably isn't going away.

@alexnask, @SpexGuy
Thanks for the explanations. Makes sense.

I would actually argue that this makes the two uses of anytype quite distinct: One for creating generic functions and one for representing comptime mutable types. That both use the same keyword is economical, but confusing. Maybe the two uses should simply be split, with auto indicating a generic parameter and any or anyval or whatever standing for a comptime dynamic type. Of course that would reserve two short keywords...

I propose anyitem, as "item" is more holistic in the English language than "object". It can express a very generic concept like an idea, a person, a place, or even a physical or conceptual object.

To me, anyitem seems adequate to signify any logical concept, even one as abstract as a value of a value.

anytype looks a bit misleading to me. Let's check whether the type system is consistent:

  • anyerror: a variable/parameter compatible with all error values.
  • anyframe: a variable/parameter compatible with all frame pointers.
  • type: a variable/parameter compatible with all types. Requires comptime. One of the earliest types. Sometimes it's inconvenient that type is a keyword.
  • anytype: a type/keyword(??) that infers the type of a value.

Let's see what @TypeOf produces:

pub fn abc(a: anyerror, b: anyframe, comptime c: type, d: anytype) void {
  @compileLog(@TypeOf(a), @TypeOf(b), @TypeOf(c), @TypeOf(d));
}
comptime { 
  abc(undefined, undefined, i32, @as(i32, 5));
}

| anyerror, anyframe, type, i32
<source>:2:3: error: found compile log statement

@TypeOf(a) == anyerror; @TypeOf(b) == anyframe; @TypeOf(c) == type; @TypeOf(d) != ~anytype~
//@TypeOf(anytype);
//error: expected token ')', found 'anytype'

As you can see, the symmetry does not hold. I believe that this type/keyword should not be prefixed with any. The type of types could be renamed to anytype instead, which would allow fields and variables named type to be created with the simple syntax.
