Go: proposal: Go 2: operator functions

Created on 10 Sep 2018 · 43 comments · Source: golang/go

The only issue that I could _find_ about operator overloading is currently #19770, although it's closed and doesn't have many details.

Goal

Operator overloading should be used to create datatypes that represent things that already exist in Go. They should not represent anything else, and should ideally have no other function.

New operators should not be allowed to be defined; we should only be able to overload currently existing operators. If you look at languages that let you define your own operators (Scala, looking at you!), the code becomes extremely messy and hard to read. It almost _requires_ an IDE because operator overloading is very heavily used.

Using an operator for anything _other_ than its purpose is a bad idea. We should not be making data structures with fancy-looking + operators to add things to the structure. Overloading the + operator in Go should _only_ be used for defining addition/concatenation operators for non-builtin types. Also, as per the Go spec, binary operators should _only_ be used for operating on two values of the same type.

Operators should only operate on and return a single type. This keeps things consistent with how operators currently work in Go. We shouldn't allow any type1 + string -> type1 stuff.

Operators should only be defined in the same package as the type they are defined on. Same rule as methods. You can't define methods for structs outside your package, and you shouldn't be able to do this with operators either.

And last but not least, operators should never mutate their operands. This should be a contract that should be listed in the Go spec. This makes operator functions predictable, which is how they should be.

Unary operators should not need to be overloaded.

+x                          is 0 + x
-x    negation              is 0 - x
^x    bitwise complement    is m ^ x  with m = "all bits set to 1" for unsigned x
                                      and  m = -1 for signed x

This part of the spec should always remain true, and should also remain true for anything using these operators. Perhaps ^x may need its own definition, as there's no good way to define "all bits set to 1" for an arbitrary type, although defining an .Invert() method is no less readable IMO.

Unary operations on structs would therefore be Type{} + t or Type{} - t, and on pointers nil + t and nil - t. These may have to be special cases in the implementation of operator functions on pointers to types.

Assignment operators should also never be overloaded.

An assignment operation x op= y where op is a binary arithmetic operator is equivalent to x = x op (y) but evaluates x only once. The op= construct is a single token.

This should remain the same just as unary operators.

If we do not permit overloading the ^x unary operator, this means that we only need to define binary operations.

Issues/Projects aided by operator overloading

#19787 - Decimal floating point (IEEE 754-2008)

#26699 - Same proposal, more detail

#19623 - Changing int to be arbitrary precision

#9455 - Adding int128 and uint128

this code - Seriously it's gross
really anything that uses math/big that isn't micro-optimized
If I went searching for longer, there'd probably be a few more that pop up

Syntax

What's a proposal without proposed syntaxes?

// A modular integer.
type Int struct {
    Val int
    Mod int
}

// ==============
// each of the functions would have the following function body:
//
//      if a == (Int{}) { // handle unary +
//          a.Mod = b.Mod
//      }
//
//      checkMod(a, b)
//      nextInt := Int{Val: a.Val + b.Val, Mod: a.Mod}
//      nextInt.Reduce()
//
//      return nextInt
//
// ==============

// In all of these, it would result in a compile error if the types of the arguments
// do not match each other and the return type.

// My new favorite. This makes for a simple grammar. It allows
// people who prefer function calls to use the `Add` function instead.
operator (Int + Int) Add
func Add(a, b Int) Int { ... }

// My old favorite. Abandoning the `func` construct clarifies
// that these should not be used like a standard function, and makes it much
// clearer that all arguments and the return type must be equal.
op(Int) (a + b) { ... }
operator(Int) (a + b) { ... }      // <- I like this better

// My old second favorite, although this looks a lot like a standard method definition.
// Maybe a good thing?
func (a + b Int) Int { ... }

// It can be fixed by adding an "op" to signify it's an operator function, although
// I do not like it because it just reads wrong. Also, it looks like we're defining a package-level
// function named `op`... which is not what we are doing.
func op (a + b Int) Int { ... }

// Although at the same time, I don't like having words
// before the `func`... I feel that all function declarations should begin with `func`
op func (a + b Int) Int { ... }

// Another idea could just be to define a method named "Plus", although this
// would cause confusion between functions like `big.Int.Plus` vs `big.Int.Add`.
// We probably need to preserve `big.Int.Add` for microoptimization purposes.
func (a Int) Plus(b Int) Int { ... }

Considering other languages' implementations.

C++

// there's several ways to declare, but we'll use this one
Type operator+(Type a, Type b)

I think C++ isn't a bad language, but there are a lot of new programmers who use it and think it's "super cool" to implement operator functions for everything they make, including stuff like overloading the = operator (which I have seen before).

I also have a couple friends from college who really enjoyed defining operator functions for _everything_... no bueno.

It gives too much power to the programmer to do everything that they want to do. Doing this creates messy code.

Swift

static func +(a: Type, b: Type) -> Type

Note that custom operators may be defined, and you can define stuff like the operator's precedence. I have not looked much into how these operators end up being used, though.

C#

public static Type operator+ (Type a, Type b)

Operator functions in C# end up being massively overused in my experience. People define all of the operators for all of their data structures. Might just be a consequence of using a language with many features, though.

Kotlin

operator fun plus(b: Type): Type // use "this" for left hand side

https://kotlinlang.org/docs/reference/operator-overloading.html, actually a really nice read.

Operator functions get used _everywhere_, and even the standard library is _littered_ with them. Using them leads to unreadable code. For instance, what does mutableList + elem mean? Does it mutate mutableList? Does it return a new MutableList instance? No one knows without looking at the documentation.

Also, defining it as a method instead of a separate function just _begs_ it to mutate this. We do not want to encourage this.

Open Questions

Which operators should be allowed to be overridden?

So, for the mathematical operators +, -, *, /, %, it's pretty clear that if this gets implemented, we'd want to allow overloading them.

List of remaining operators that should be considered for overloading:

  • << and >> (I do not think this is a good idea)
  • |, &, and &^ (also do not think this is a good idea)
  • <, >, <=, >=, ==, != (maybe a good idea?)

    • If we include <, we should include == to prevent the confusing case of x <= y && x >= y but x != y.

Overloading equality _may_ be a good thing. big.Int suffers because the only way to test equality is a.Cmp(b) == 0, which is not readable _at all_.

I have left out || and && because they should be reserved exclusively for bool or types based on bool (has anyone ever even based a type on bool?), and I see no reason to override them.

Should we even allow operator overloading on pointer types?

Allowing operator overloading on a pointer type means the possibility of mutating, which we do not want. On the other hand, allowing pointer types means less copying, especially for large structures such as matrices. This question would be resolved if the read only types proposal is accepted.

Disallowing pointer types
  • Does not allow mutation
  • No need for nil checks in operator implementation
Allowing pointer types
  • Allows operators to be consistent with the rest of the type's methods.

    • i.e. *big.Int is *big.Int everywhere else; it would be good for consistency

  • Since it's consistent, it makes it easier to pass into other functions.

    • i.e. you can't pass big.Int into a function that takes *big.Int

Perhaps it should be a compile-time error to mutate a pointer in an operator function. If read-only types were added then we could require the parameters to be read-only.

Should it reference/dereference as needed?

Methods currently do this with their receivers. For instance:

// Because the receiver for `SetInt64` is `*big.Int`,
// the second line is equivalent to `(&num).SetInt64(....)`

var num big.Int
num.SetInt64(5000000000000000000)

So should the same logic apply to operator functions?


I'm aware that this will probably not be added to Go 2, but I figured it would be a good thing to make an issue for, since the current issue for Operator Functions is quite small and, well, it's closed.


Edits:

  • Added #9455 to list of proposals which could be solved using this instead
  • Fixed a clarity issue
  • Changed examples to use a modular integer rather than a *big.Int since the math/big package is designed to be used in an efficient way, and added that read-only types would benefit this proposal
Labels: Go2, LanguageChange, NeedsInvestigation, Proposal

Most helpful comment

It also increases the learning cost of the language.

I would argue the opposite. The current implementation of the math/big package is confusing and very hard to use. A very big improvement would be if it would just support operators and would act like the base number types. That would make it very easy to learn and use.

It decreases the readability of the code. It is much more difficult to answer the question "where is the code defined?"

The code readability is greatly improved. Again, math/big clearly illustrates that. As for "where is the code defined", that has little to do with readability, and the answer, as is right now, is of course: https://godoc.org/math/big

All 43 comments

@gopherbot please add go2

I do not like this idea. Because:

  • It increases the complexity of the syntax, compilers, and tools that read the language.
  • It also increases the learning cost of the language.
  • It decreases the readability of the code. It is much more difficult to answer the question "where is the code defined?"

It increases the complexity of the syntax, compilers, and tools that read the language.

Of course it does; it's a new feature. All features inevitably do this. Although if we're worried about complicating anything, it should be the spec. Luckily a lot of the rules remain the same; the only things that change are allowing more things to be "operated" together with arithmetic operators.

It also increases the learning cost of the language.

Again, so does every new feature, and I'd say the learning cost is very small. People typically don't need to write operator functions anyway, as they are meant to be used much more often than they are written.

It decreases the readability of the code. It is much more difficult to answer the question "where is the code defined?"

That is a valid critique, but ideally you should not be worried about what it does. It doesn't matter _how_ 1+2 works; you care that it adds properly and you get 3 in return. Operator functions should not do anything other than what you expect the operator to do.

I'd say it actually increases readability as well. Again, look at literally anything that uses math/big like this quick example. People who need stuff like math/big or a future decimal64 library can take advantage of this. Honestly, APIs for these things are quite terrible, and need to be because we need workarounds for these operators.

Also, they're defined in the same place that methods are: in the same package as the type itself! So really, "where is it defined" is a question we already have the tools to answer.

It also increases the learning cost of the language.

I would argue the opposite. The current implementation of the math/big package is confusing and very hard to use. A very big improvement would be if it would just support operators and would act like the base number types. That would make it very easy to learn and use.

It decreases the readability of the code. It is much more difficult to answer the question "where is the code defined?"

The code readability is greatly improved. Again, math/big clearly illustrates that. As for "where is the code defined", that has little to do with readability, and the answer, as is right now, is of course: https://godoc.org/math/big

Another area that may be improved with such operator functions is the current generics proposal.
As it stands right now, writing a function to add 2 numbers would only be limited to the current numeric types. A hypothetical Sum generic function will never be able to work with big.Int, for example, or say, a custom rational number type.

I disagree. Only a small percentage of the math/big api can be replaced with the small set of binary operators defined in the language.


// each of the functions would have the following function body:
{
    return new(big.Int).Add(a, b)
}

I would prefer not to allocate a potentially large amount of memory for every arithmetic operation, especially if that operation looks like a normal arithmetic operation.

The package math/big is both _well-designed_ and has a learning curve. The two qualifiers are conflated in some of the examples and comments here. The stated goal of this package was to provide efficient manipulation of big numbers. The current proposal should perform no worse than the current state of affairs.

I disagree. Only a small percentage of the math/big api can be replaced with the small set of binary operators defined in the language.

Yes, only a small _percentage_ can be replaced, but what about how often those are used? (Also, this really isn't _only_ for math/big; I mainly used it as an example because it has a very steep learning curve, and the code resulting from it is typically very messy and can be hard to maintain.)

I would prefer not to allocate a potentially large amount of memory for every arithmetic operation, especially if that operation looks like a normal arithmetic operation.

The package math/big is both well-designed and has a learning curve. The two qualifiers are conflated in some of the examples and comments here. The stated goal of this package was to provide efficient manipulation of big numbers. The current proposal should perform no worse than the current state of affairs.

This is a good critique as well, but when all you want to do is to add numbers without having to worry about overflow, it could be extremely useful.

There could also be a separate type defined somewhere else with a less steep learning curve. After all, there is a pretty popular issue (#19623) to make int arbitrary precision. This could instead be a third party library (or in golang.org/x/) with operator functions defined for it.

Just as a quick demo of how this could be implemented _outside_ of math/big, which was a bad example (since it's designed to be efficient).

This could also of course be used for other mathematical constructs such as matrices, vectors, etc., as well as arbitrary-precision integers, big decimals, elliptic-curve points, etc.

A simple modular integer implementation:

// A modular integer.
//
// `Reduce` should be called if you ever modify `Val` or `Mod` directly.
// As long as this is done, the `==` operator will work properly.
type Int struct {
    Val int
    Mod int
}

// Adds two modular integers and reduces.
// Panics if a and b do not have the same modulus
operator(Int) (a + b) {
    if a == (Int{}) { // handle unary +
        a.Mod = b.Mod
    }

    checkMod(a, b)
    nextInt := Int{Val: a.Val + b.Val, Mod: a.Mod}
    nextInt.Reduce()

    return nextInt
}

// Subtracts a modular integer from another and reduces.
// Panics if a and b do not have the same modulus.
operator(Int) (a - b) {
    if a == (Int{}) { // handle unary -
        a.Mod = b.Mod
    }

    checkMod(a, b)
    nextInt := Int{Val: a.Val - b.Val, Mod: a.Mod}
    nextInt.Reduce()

    return nextInt
}

// Multiplies two modular integers and reduces.
// Panics if a and b do not have the same modulus
operator(Int) (a * b) {
    checkMod(a, b)
    nextInt := Int{Val: a.Val * b.Val, Mod: a.Mod}
    nextInt.Reduce()

    return nextInt
}

// sets i.Val to its least nonnegative residue
func (i *Int) Reduce() {
    if i.Val >= 0 && i.Val < i.Mod {
        return
    }
    i.Val = (i.Val%i.Mod + i.Mod) % i.Mod
}

// makes sure a and b have the same modulus.
// if not, it panics.
func checkMod(a, b Int) {
    if a.Mod != b.Mod {
        panic(fmt.Errorf("mod is not equal: a(%d) b(%d)", a.Mod, b.Mod))
    }
}

And of course the usage:

func main() {
    nine := modmath.Int{Val: 9, Mod: 2}
    ten := modmath.Int{Val: 10, Mod: 2}
    twentyOne := modmath.Int{Val: 21, Mod: 2}
    twentyOne.Reduce()

    fmt.Println("9 + 10 == 21 (mod 2):", nine+ten == twentyOne)
    // output: 9 + 10 == 21 (mod 2): true
}

With operators there are 5 goals:

  1. Disallow operating on different types
  2. Not wildly exported to other packages, force re-declaring
  3. Disallow mutating input args
  4. Interact with generics well
  5. A single comparison function returning an int (negative, positive, or zero) backing the operators <, >, <=, >=, ==, !=

I propose the following. Operators should be function calls, with functions that in general look like:

type T struct{}
func MyCustomAdd(*T , *T) *T {...}
func MyCustomSub(*T , *T) *T {...}
func MyCustomMul(*T , *T) *T {...}
func MyCustomMod(*T , *T) *T {...}
func MyCustomShr(*T , uint) *T {...}
func MyCustomShl(*T , uint) *T {...}
func MyCustomBwAnd(*T , *T) *T {...}
func MyCustomBwOr(*T , *T) *T {...}
func MyCustomBwClr(*T , *T) *T {...}
func MyCustomCompare(*T , *T) int {...}

Comparison is special; it is the "default operator", and maps to all of the operators <, >, <=, >=, ==, !=

Comparison is written like:

op default MyCustomCompare

Next the user can define also the other operators:

op + MyCustomAdd

If MyCustomAdd does not match the generic signature func ( * , * ) (*) this would be a compile time error. This satisfies point 1. Also, defining operator on a type that already has it would be a compile time error.

If the function is capitalized then the operator can be exported but must be also declared in the remote package, like so:

op + otherpackage.MyCustomAdd

This satisfies point 2.

All functions linked to an operator will go through additional type checking, to detect cases where you write to args or return something other than a pointer to your own variable (this rule bans returning nil, and also bans returning a and b):

func MyCustomAdd(a *MyInt ,b *MyInt) *MyInt {
    var foo MyInt // this must exist
    a.x = 7 // cannot write to arg
    if (yadda yadda) {
        return nil // this cannot exist
    } else {
       return b // this cannot exist
    }
   return &foo // this must exist
   return //this cannot exist
}

Operators that passed the additional type checking, will be internally optimized to something like

func(a *, b *,ret *)

Those who did not won't.

Next, generic function Maximum:

func Maximum (compare func (*, *)int, ... things) (out *) {...}

Can be specially written as:

func Maximum2 (op default, ... things) (out *) {...}

Now comes the main point: Maximum requires either plain generic function or auto-passed "unoptimized operator".
Maximum2 requires optimized operator. If you pass optimized operator to function that requires unoptimized one, a wrapper code will be generated and the wrapper passed.

Operators can be auto passed, like this:

Maximum(_, T{},T{},T{},T{} )
Maximum2(_, T{},T{},T{},T{} )

And used inside the generic function:

func Maximum2 (op default, things ... ) (out *) {
   if things[1] > things[0] {
          return &things[1]
    }
    return &things[0]
}

And the best part: compiler takes care of the boring details.
Like: promoting values to pointers when passing to operator functions. Creating wrappers for operators. Typechecking operator functions and returning compile time errors when you pass unoptimizable operator function to a generic function that has op param.

Could you edit to have your examples in code fences?

like this

would be
```
like this
```

Maybe a simpler syntax is also okay?

func (a mytype) +(b mytype) (mytype) {
     return a.Add(b)
}

Just like a normal method, but with an operator rune as its name.

To make sure that operators always return something, and are optimized (by writing the result to a specific memory location, using an input argument as output), I revise my proposal to:

type T struct{}
func MyCustomAdd(*T , *T, *T) *T {...}
func MyCustomSub(*T , *T, *T) *T {...}
func MyCustomMul(*T , *T, *T) *T {...}
func MyCustomMod(*T , *T, *T) *T {...}
func MyCustomShr(*T , uint, *T) *T {...}
func MyCustomShl(*T , uint, *T) *T {...}
func MyCustomBwAnd(*T , *T, *T) *T {...}
func MyCustomBwOr(*T , *T, *T) *T {...}
func MyCustomBwClr(*T , *T, *T) *T {...}
func MyCustomCompare(*T , *T) int {...}

If an operator is passed to a generic function, the operator function must return its third argument. Example:

func DoAddition(op +, operand1 *, operand2 *, result *) (*) {
    *result = operand1 + operand2
    return result
}

type T struct {x int}
func AddCorrect(a *T, b *T, ret *T) (ret2 *T) {
    (*ret).x = (*a).x + (*b).x
    return ret
}

func AddIncorrect(a *T, b *T, ret *T) (ret2 *T) {
    r := T{(*a).x + (*b).x}
    return &r
}

this will work:

op + AddCorrect
DoAddition(_, &T{}, &T{}, &T{})
// okay

this won't:

op + AddIncorrect
DoAddition(_, &T{}, &T{}, &T{})
// COMPILE TIME ERROR: Operator + mapped to AddIncorrect on line 54,
//does not return it's third argument on line 87774,
//when passed to generic function DoAddition on line 45645

Okay, I think you're micro-optimizing a bit here. The syntax should be simple.

Also, as a minor point, you don't need to dereference before using the . operator. The compiler does that for you since it's part of the spec.

I'd also like to add that read-only types (I'd link the proposal if I weren't on my phone) would work well with this proposal, as operator functions could require that all inputs are read-only.

I think forcing redeclaration of operators may lead to an "operator poisoning" of sorts... It would be especially inconvenient for people who use Go for mathematics to use a matrix, vector, and arbitrary-size int, and need to redeclare 50 operators.

Calling the comparison operators "default" doesn't make any sense to me.

I like the rest of the first part of your proposal, though. The second part not-so-much.

I am not merely speculating, I have a very serious intention to add operator overloading to my experimental language, named go-li. In fact I plan to do it next week, exactly how I propose.

Comparisons: calling them "default" is wrong, you're right. IMHO most people would want to map all of them to the same function, in the C-language tradition that returns a negative int when a < b, zero when a == b, and a positive int when a > b.

Take a look at this line:

func Foo(op comparison) {...}

How do you know that it is the comparison operator, rather than a parameter named op of type comparison? You don't. Backwards compatibility is not 100% bulletproof. For this reason at least one of the words has to be a reserved keyword. I propose:

func Foo(op default) {...}

or something like this:

op <=> MyComparison
func Foo(op <=>) {...}

or, alternatively, not using "op" but the "map" keyword to map operators.

map + MyAddition
func Foo(map +, map -, map comparison) {...}

alternatively, not using any arg names:

map + MyAddition
map <=> MyComparison
func Foo( +,  -, <=>, actual_parameter int) {...}

To prove it, here is an example of code that uses generics and would benefit from operator overloading: a Fenwick tree (a data structure that allows you to quickly sum a contiguous subsequence of a large array).
Fenwick tree in go-li language
Fenwick tree in folang
Fenwick tree example on folang playground
It needs the addition operator and negation operator, currently implemented using those callback functions.

The thing you probably disliked is how we pass operators to generic functions. Trust me, this is needed; otherwise people who don't prefer operators will have no way to call into generic functions that use operators. This way they can, by passing a concrete function that matches the callback signature as the parameter.

If we don't do this, the use of operators would spread like a disease:
I need package A.
Package A uses operators.
Therefore I define operators in my package so A can use them.
Since I defined operators, I start using them (why not? they're already here).
Suddenly my code is full of operators.
Other people need my package.
Therefore they define operators ... and so on.

I'm 100% okay with binding operators to existing function calls. What I don't like is redeclaring the operators in every package that's going to use them.

I really would like a new "op" or "operator" TLD. It would make sure source code stays readable. It could still be backward-compatible as it is a TLD and would only be used as one.

I think the best syntax would be something like

op + CustomAdd

func CustomAdd(*T, *T) *T { ... }

The only issue is that this would allow

op + func (*T, *T) *T { ... }

Sure, you could say "just don't allow inlining here", but I feel like that'd be against Go's simplistic design. Perhaps instead preventing that should just be a go vet check.

TL;DR
Dear "Go community", please read how it's implemented in C++, then come back to this proposal. Also, you will probably need generics. And this is not trolling. @go-li has actually already brought up many valid points from that ecosystem in this thread.

Long list of comments:

  • Operator overloading beyond "i can do this" brings us to value semantics and abstract algebra, and requires several levels of constraints on types of the arguments to be done right. But it also allows to radically simplify all code dealing with Number-like object (i.e: the ones that have conceptual arithmetic operations available upon. E.g.: ints, floats, bigInts, matrices, complex numbers, quaternions, currency), and also provide convenience for things like Set operations (s1 + s2 == s1.union(s2)).
  • without equality (==) and its siblings (ordering / comparison), operator overloading turns into weak syntactic sugar at best. Comparison operator definitions are quite wordy, and so C++ plans to introduce the <=> (spaceship) operator to simplify them.
  • while forcing totality on types (A, A) => A sounds logical and cool, it will not let you implement [even such a simple thing as] multiplying a matrix by a scalar
  • regarding developer ergonomics, if operators are defined on types, then only packages using those types will have new semantics, and so developers would not be surprised by behavior of the operator.

I also don't like the idea of re-declaration. If I import a package that defines operators on its types then I don't have access to those operators? That kinda makes them useless. The whole point of operator overloading is to give them to library consumers. I, as a library implementer, don't care about them and don't really need them.

Also, I don't think we should look at C++. It has too much complexity that's very specific to C++: value semantics, rvalue/lvalue, move semantics, etc. It looks, frankly, ridiculous how it works right now, and the committee understands that. <=> tries to make it slightly better. There are better examples of how it can be done.

Operator declaration without overloading is very useful. With this feature decimals can be made usable, complex numbers can be taken out of the core, and even a flexible number type becomes viable.
For me, the absolutely useless decimal type today is the reason for voting for operator methods. Operator overloading is no must for me, because I define a new type when I want to overload, but declaring a new type also means I can declare new operator methods. Overloading is useful for languages like C++, but (in my opinion) not for Go.

// Barycentric returns the barycentric coordinates for a point in a triangle.
// The barycentric coordinates can be used for interpolation of {a, b, c} to
// point p (such as normals, texture coordinates, colors, etc)
func Barycentric(a, b, c, p *glm.Vec3) (u, v, w float32) {
    var v0, v1, v2 glm.Vec3
    v0.Sub(b, a)
    v1.Sub(c, a)
    v2.Sub(p, a)
    d00 := v0.Dot(&v0)
    d01 := v0.Dot(&v1)
    d11 := v1.Dot(&v1)
    d20 := v2.Dot(&v0)
    d21 := v2.Dot(&v1)

    denom := 1 / (d00*d11 - d01*d01)
    v = (d11*d20 - d01*d21) * denom
    w = (d00*d21 - d01*d20) * denom
    u = 1 - v - w
    return
}

^ disgusting, error prone, entirely because I can't define +* on vectors

All geometry/physics libraries are full of unreadable code because this feature is missing.

@hydroflame Your example contains a lot of code with little context. Can you provide the example of imagined "non-disgusting, error minimizing" code where you are using the notation?

I would like to add some new insights to this topic:

  • when reading the original design goals of Go, I thought this (operator overloading) would never happen
  • when going forward with this, I would not reflect upon how other languages do it (see the end of the text for why)
  • when implementing this, we will quickly run into secondary decisions, as operator overloading is closely linked to function overloading (same function name, different argument types), and that is also something that Go was supposed to never implement

The proposed method for function overloading is via function names that SHOW both the operator and the position of the arguments.

Example 1:
func "_+_"(a,b int) (sum int) {
    sum = a + b 
    return
}

Example 2:
func "_++"(a *int) (result int) {
    result = *a
    *a = result+1
    return
}

Example 3:
func "++_"(a *int) (result int) {
    result = *a + 1
    *a = result
    return
}

Example 4:
func "_[_]=_"(arr map[int]string, idx string, value string){
    ...
}

Example 5:
func "_._"(arr map[string]int, value string) int {
    return arr[value]
}

As you see, this opens much wider possibilities than old-style operator overloading, but (and I have implemented this in a compiler/bytecode-interpreter language, so I know) it is bothersome, and if I faced it again I would not go for operator overloading at all.

The only reason you would want to do this is that Go currently lacks some of the more exotic data containers, but frankly, I do not think that Go needs this. It would change the language profoundly in a manner that makes it less distinct and less systematic (readable/predictable).

It also must be implemented at a low level in the parser and compiler, so this change would affect a large part of the language's code base.

I reflect on other languages to learn from their mistakes. We should avoid the _[_]=_ operator and similar operators that do not reflect arithmetic types: this proposal is strictly about creating a way to represent arithmetic types in Go. The syntax you provide is intriguing, though. I may draft a new proposal in the near future which is less about overloading operators. It's hard to explain in a quick comment, but hopefully it should still be simple.

There is an important but implicit (and probably slightly hidden) trade-off in operator overloading, which is visible if you compare the proposed functions/methods signatures with the methods of math/big:

TL;DR:

it is not possible to pass among the arguments a value or pointer where the result will be stored.

This will cause significant slowdown on large objects as arbitrary precision numbers (math/big.Int, math/big.Float, math/big.Rat), arbitrary size matrices (gonum.org/v1/gonum/mat/Dense) etc.

Full explanation:

Let's consider for example the addition between user-defined types: the syntax is not very relevant, it can be any of the existing proposals:

operator (*big.Int + *big.Int) Add
func Add(a, b *big.Int) *big.Int {
 // must allocate a new big.Int and return its address
}

or

operator(*big.Int) (a + b) {
  // idem, must allocate a new big.Int and return its address
}

or anything else.

In all cases, as highlighted in the comments above, the operator implementation must allocate a (possibly large) object where the result is stored. I used pointers *big.Int instead of values to make the issue more obvious.

On the other hand, math/big methods solve this issue by accepting an additional argument where they store the result - currently the receiver:

package big

// Add sets z to the sum x+y and returns z. 
func (z *Int) Add(x, y *Int) *Int {
  // store the result in z:
  // if its internal buffer is large enough, no allocation is needed.
}

Now, one may argue that the approach taken by math/big is a micro-optimization.
I am not really interested in discussing whether it's a "micro" optimization or not, but I think it's a very important optimization:

Imagine, instead of integers or floats, a library for linear algebra on arbitrary size matrices - for example gonum.org/v1/gonum/mat:
if operator overloading is part of Go, authors and users of such library will be tempted to support it.
At the moment, they have methods similar in spirit to math/big, for example:

package mat

// Add adds a and b element-wise, placing the result in the receiver
func (m *Dense) Add(a, b Matrix) {
  // store the result in m: also here, if its internal buffer
  // is large enough, no allocation is needed.
}

but operator overloading will not have the third argument, forcing to allocate it:

operator (Matrix + Matrix) Add
func Add(a, b Matrix) *Dense {
  // must allocate a new Dense and return its address
}

If you then write something like a + b + c + d on matrices, each + will allocate its result, which will need to be garbage collected at the end of the expression (except for the last one) - I expect it will be much slower than the (micro?) optimized

var a, b, c, d, z *Dense
// initialize a, b, c, d and z. Then:
z.Add(a, b)
z.Add(z, c)
z.Add(z, d)

So my question is about the following trade-off:

Is the (hopefully) improved readability and simplicity of operator overloading (if used correctly and not abused) worth the slowdown it causes due to multiple allocations and subsequent garbage collection?

This happens if the arguments of overloaded operators are values with internally allocated (and possibly large) buffers:
for example arbitrary precision numbers (math/big.Int, math/big.Float, math/big.Rat), arbitrary size matrices (gonum.org/v1/gonum/mat/Dense) etc.

P.S. one may attempt to solve this issue by adding an extra parameter also to overloaded operators, as for example:

operator (*big.Int + *big.Int) Add
func (z *big.Int) Add(a, b *big.Int) *big.Int {
  // fill z with the result of a + b and return z
}

but it does not work: it only moves the problem from method declaration to method use, because in many cases there is no receiver when invoking such an operator+:

var a, b, c, z *big.Int
// initialize a, b, c and z. Then:

z = a + b // a clever compiler may use z as receiver of operator+

z = a + b + c // z can be the receiver of the second addition,
              // but who is the receiver of the first addition, i.e. a + b ?

@cosmos72 You make an excellent point, but I note that the compiler can convert z = a + b + c into z = a + b; z = z + c. Any expression, no matter how complex, is just a sequence of unary and binary operations. So the question is whether and when the compiler can reliably turn

t1 = a + b
z = t1 + c

into

z = a + b
z = z + c

That is, when is it safe to replace a compiler-introduced temporary variable with the final result of the expression, and, similarly, when is it safe to combine compiler-introduced temporaries? It's a problem not dissimilar to register allocation with an unbounded number of registers, although it's possible that there are cases where some additional aliasing must be considered.

Anyhow, I think the main point is, if we have operator functions, when writing the implementation there needs to be a way to express not just the operands but also the destination aka receiver.

Thanks @ianlancetaylor

Sorry for the second long post in a row - the analysis of "compiler-introduced temporary variables" and "aliasing" contains some subtle points and, if operator overloading is added to the language, they are going to affect Go programmers and compilers writers for a long time.

It is thus important to fully understand the consequences in detail, as they are non-trivial - I hope you agree it's the only way to make an informed decision.

Let's start from a concrete example of operator overloading:

package main
import (
  "math/big"
)
var _2 = big.NewRat(2, 1)

// compute z = ax^2 + 2bxy + cy^2 + 2dx + 2fy + g
func QuadraticCurve(z, x, y, a, b, c, d, f, g *big.Rat) *big.Rat {
  z = a*x*x + _2*b*x*y + c*y*y + _2*d*x + _2*f*y + g
  return z
}

The implementation surely looks enticingly similar to the mathematical formula.

Without operator overloading, the equivalent code is ugly, difficult to read
and requires some thought to write:

// compute z = ax^2 + 2bxy + cy^2 + 2dx + 2fy + g
func QuadraticCurve(z, x, y, a, b, c, d, f, g *big.Rat) *big.Rat {
  var t big.Rat

  z.Mul(a, x).Mul(z, x)               // z = a*x*x

  t.Mul(_2, b).Mul(&t, x).Mul(&t, y)  // t = 2*b*x*y
  z.Add(z, &t)                        // we can now reuse t

  t.Mul(c, y).Mul(&t, y)              // t = c*y*y
  z.Add(z, &t)                        // we can now reuse t

  t.Mul(_2, d).Mul(&t, x)             // t = 2*d*x
  z.Add(z, &t)                        // we can now reuse t

  t.Mul(_2, f).Mul(&t, y)             // t = 2*f*y
  z.Add(z, &t)                        // we can now reuse t

  return z.Add(z, g)
}

There is no need to highlight the advantages of the first version and the disadvantages of the second.
I will just say that the first seems a high-level language, while the second has an assembler-like look,
where variables take the place of CPU registers and are managed explicitly one by one - including temporary variables.

The disadvantages of the first version can be summarized in a question:
can the compiler be sufficiently smart that both compiled versions run equally fast?
(without artificially penalizing the second, obviously)

Let's see what a sufficiently smart compiler should do.

  1. compute how many temporary variables are needed.

This is somewhat similar to register allocation, as you pointed out.
It can be visualized as a tree coloring problem, where the tree is the Abstract Syntax Tree (AST):

      =
      +
     / \________
    *           +
   / \         / \________
  a   *       *           +
     / \     / \         / \________
    x   x   2   *       *           +
               / \     / \         / \________
              b   *   c   *       *           +
                 / \     / \     / \         / \
                x   y   y   y   2   *       *   g
                                   / \     / \
                                  d   x   2   *
                                             / \
                                            f   y

The tree coloring problem can be stated as follows:

  • All leaves have no color - nothing to allocate there, as the values are stored in existing variables.
  • All internal nodes (the + and *) must be colored i.e. their result must be stored
    in some temporary variable. Each color is a different temporary variable.
  • The immediate children of an internal node must all have different colors:
    you cannot store two (or more) partial computations into the same temporary variable.

Also, the compiler cannot take advantage of any mathematical property of the operators,
such as neutral element e op x == x, commutativity a * b == b * a,
associativity (a * b) * c == a * (b * c) and distributivity (a + b) * c == a * c + b * c
because the user-defined operators may not have such properties.
For example, matrix or quaternion multiplication is not commutative
and octonion multiplication is neither commutative nor associative.

In this case, the AST is quite simple and needs only two colors, i.e. two temporary variables:

  • a*x*x is blue (i.e. all its internal nodes are blue)
  • 2*b*x*y is red (i.e. all its internal nodes are red)
  • c*y*y is red
  • 2*d*x is red
  • 2*f*y is red
  • all other internal nodes (the +) are blue
  2. analyze aliasing

What's aliasing actually?
One can attempt to define it as follows: if two values are aliased, they may reference the same mutable data.
Thus modifying the content of one value may modify the other.

In our example, the "sufficiently smart" compiler should realize that z is only used to store the final result, and thus, if it demonstrates that z is not aliased to any other variable, it can directly use z
as one of the temporary variables.
This is a tricky point, both for programmers and for the compiler:

if z is aliased to some other argument passed to QuadraticCurve(), the second version of the function will not behave as expected, i.e. it will produce a different result.

Maybe it's a bug, maybe it's intentional (and hopefully documented), but in any case there is no way to inform the compiler of such peculiarity. In other words, there is currently no Go syntax to tell the compiler which variables may alias each other and which variables are guaranteed not to alias each other.

As a side note, C (not C++) has the restrict keyword to annotate a pointer as "it will not alias anything else", but a single mistake using it will introduce subtle and hard-to-find bugs.
Not something I would recommend for Go.

In Go, analyzing aliasing at compile time can probably be analogous to escape analysis:
when a function is compiled, its callers may not be known yet, so the compiler cannot
assume anything about the aliasing among its arguments, but it can analyze the aliasing
created or destroyed by the function body.
When local variables are created, they are not aliased to anything: aliasing can only be introduced by storing shared data inside them, or by storing the same slice, map or address into multiple places.
(Note: what about closures?)

In short, aliases analysis will probably be able to produce a proof "these two variables are not aliases"
only for local variables, and only if the compiler performs an aliasing analysis on each function
while compiling it, similarly to how it currently performs escape analysis.

  3. actually creating "compiler-introduced temporary variables"

There is currently no uniform syntax for this.

Many types guarantee in their documentation (which a compiler cannot read) that the zero value is valid and useful, but other types give no such guarantee.

Also, in many cases operator overloading will accept pointers to named types, such as the *big.Int used in the example above, and a "compiler-introduced temporary variable" of type *big.Int must be a non-nil pointer to big.Int.

In general, if the compiler needs to introduce a temporary variable of type *T, should it call new(T)?
What happens if the zero value of T is not the right thing to use?

The solution I am currently experimenting with (for generics, which have a similar need for "user-introduced temporary variables") is the following:

  • a compiler-generated temporary variable of type T is the value returned by zero.New() below, provided that the code compiles and zero.New() returns a T:

    var zero T
    zero.New()

    Note that T may be a pointer type, i.e. a *U, as for the example *big.Int. In such a case, the method T.New() must have a pointer receiver and will be called with a nil receiver - which must not be dereferenced.

This is a way to introduce C++-style constructors in Go - New() becomes equivalent to a zero-argument constructor, but I could not find a better way.

And it certainly has the disadvantage that Go code using operator overloading becomes harder to reason about: the compiler will invisibly insert calls to T.New().

@cosmos72 much work, chapeau.

But the main point is just to have operator methods, as introduced by Mr. Griesemer in 2016. This is absolutely needed for a usable decimal implementation and other useful number formats, and for minimizing the core of Go. For example, complex numbers could be implemented in a separate package.

Overloading is not needed or wanted.

That this works can be seen here https://youtu.be/vLxX3yZmw5Q and here https://github.com/griesemer/dotGo2016?files=1
Maybe I missed the point. To me it seems quite simple. Here is the proposal and the solution.
The motivation is business calculations: invoices, taxes, ... These are not possible without a good decimal implementation with the usual operators.

@maj-o I was not discussing the syntax to define operator on new types - it can be operator overloading, operator methods, annotations, or any other proposal.

My work above analyzes the amount of garbage (temporary values) produced by operators on user-defined types, the techniques to minimize the garbage, and the limits of such techniques.

The reason is simple: less garbage means fewer allocations and fewer garbage collections, which translates to faster execution (provided the algorithm is not changed in other ways)

The only user-visible part of my analysis is that functions/methods that implement operators should accept an additional parameter, where they store the result before returning it. The Add, Sub, Mul, Div ... methods of types inside math/big do exactly that. Note: this has no visible effect on how operators are used.

i would be a healthier person if "math/big" had operator overloading plz do soon :)

@deanveloper: oh yes, this (#35307) takes the wind out of cosmos72's long posts. If SIMD, MISD, ... would work, then the compiler could perfectly optimize

a+b+c

of any type. That would be much better than internal variables, or how old people would solve this problem: stack pushes and pops.

My (very) long posts were meant to analyze the performance implications of operator functions on large and variable-size objects, as for example math/big.Int.

I perfectly agree that operator functions on fixed-size SIMD data (AVX/NEON etc.) would be very useful and do not need all that machinery - especially because they can (and should) be stored in CPU registers as much as possible.

@bronze1man Where the code is defined is also very important to me. How about a solution of prefixing the operators with _? So something like

a _+ b _* c. And if the user ctrl+clicks on _+, the IDE could take the user to the code that overloads the operator. Such a prefix would also inform users of the potential overhead of using that kind of operator overloading.

@vwxyzjn
Maybe I missed something - but it seams more difficult.
Overload - How can You overload an interface? Go is "method-orientated", a Reader is any instance of a type (mostly struct) with an implementation of the Reader interface. This is great. If You "derive" this "class" you have to implement its interface - this is the Go way - and this is good.

Here comes the proposal in - we can not do this for build in types, 'cause they have an interface including operators (infix-methods), which can not be declared in Go - but being able to do this, would be great.

In far future, I'd like to write
~~~go
var i int = 1
var d double = 1.1
var res1 int = i + d
var res2 double = i + d
var res3 int = d + i
var res4 double = d + i

// or - this would be a dream >>

var resDream decimal = d * d

// 'cause this would mean, we can work with any numbers in any type - getting the right result (either fast or correct)
~~~
with res1 and res3 == 2 and res2 and res4 == 2.1 and resDream 1.21 (decimal)

For correctly doing this, both int and double must implement the interface for + (maybe some generics could make this simpler ;-) )

Do not get me wrong - but infix methods like _+ seem to make things more difficult, and once again, overloading is not the target.

I'm kind of a newbie to Go, but it occurred to me: since we already have the interface block, would it not make sense to define operators in there, since interfaces define behaviour?

I'm not sure how you would go about doing it with an interface, since interfaces only describe already-existing structures.

I think the definition of unary operators in the OP is a can of worms. For instance, taken literally, it seems to open up +"abc" to mean the same as "" + "abc".

Ostensibly this is because string concatenation is fundamentally different from mathematical addition (and it could be argued they shouldn't share an operator), but such is life.

IMO, while I appreciate the attempts at simplification (trying to ensure operators are only applied to "math", and making a minimal set of sane assumptions about their usage, so we don't end up with C++), it should be noted that many typical "math" notations will go against some of the more restrictive rules, because some algebras have different ideas about "zero", "unit", etc.

For example, unifying comparisons is a nice effort, but it should not be lost that many types might have a defined equality, but not a defined total order.

For another, the SO(3) space of 3D rotations, typically represented by unit quaternions, usually uses binary * for composition and unary - for the conjugate/inverse (which is as "wrong" as string concatenation, and similarly violates the OP rules by defining unary - but not binary -).

Trying to enforce what essentially boils down to good taste is going to be hard.

Having said that: I fully agree with the restriction that operands must have the same type (in line with Go), and with attempts at ensuring value semantics (even at the cost of some performance).

I'm fairly new to Go, and from what I saw, I like this idea. I was reading through here and thought about maybe changing this:

operator (Int + Int) Add
func Add(a, b Int) Int { ... }

to this:

operator (Int + Int) Add
func (a Int) Add(b Int) Int { ... }

using the first operand in the func definition as a receiver; I don't have much to argue for it other than that it looks better to me personally.

That would actually be possible to write with the initial syntax using a neat trick!

operator (Int + Int) Int.Add
func (a Int) Add(b Int) Int { ... }

The Int.Add syntax takes what we would traditionally call func (a Int) Add(b Int) and instead evaluates it as func Add(a Int, b Int). See method expressions

Any update on this?

@k06a No updates.

The next big language change is going to be generics. If the generics proposal is adopted and implemented, I doubt we will make another big language change for some time. So honestly, I would not anticipate a change along these lines happening for some years (if ever).
