There doesn't seem to be much support for adding custom operators (see #818) to Rust. But what about calling ordinary binary functions as if they were operators, as Haskell allows?
The syntax would be:
a `dot` b
to convey scalar product of two vectors a and b.
Another option suggested by @bluss:
a \dot b
Or, if it is possible without parsing conflicts (_@thepowersgang: "but probably not desirable"_):
a dot b
Which is converted to
a.dot( b )
or, depending on the function:
dot( a, b )
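For illustration, here is what those two call forms look like in today's Rust; Vec3 and dot below are made-up names, and the infix syntax itself is of course hypothetical:
~~~ rust
// Hypothetical `a \dot b` would desugar to one of the two call forms below.
struct Vec3 { x: f64, y: f64, z: f64 }

impl Vec3 {
    fn dot(&self, other: &Vec3) -> f64 {
        self.x * other.x + self.y * other.y + self.z * other.z
    }
}

// Free-function form.
fn dot(a: &Vec3, b: &Vec3) -> f64 {
    a.dot(b)
}

fn main() {
    let a = Vec3 { x: 1.0, y: 0.0, z: 0.0 };
    let b = Vec3 { x: 0.0, y: 1.0, z: 0.0 };
    assert_eq!(a.dot(&b), 0.0);  // method form: a.dot( b )
    assert_eq!(dot(&a, &b), 0.0); // free-function form: dot( a, b )
}
~~~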
This is a TIMTOWTDI feature with very little (any?) gain. It doesn't read terribly well either.
Also discussed in: https://github.com/rust-lang/rust/issues/8824
TIMTOWTDI = There's more than one way to do it
There's nothing inherently wrong with having more ways to do one thing.
Some prefer reading a dot b
to a.dot( b )
since it gives a more "mathematical" feel to it.
a \dot b
is the nicest syntax I have seen proposed for this (taken from the previous discussion).
The idea is to introduce one more operator to end all custom operators. Instead of defining custom operators, we can just use functions for this purpose.
I know the argument that methods are infix: x.dot(y) is infix. Still, the infix binary function call proposal has advantages: x \dot y \dot z \dot w is easier to read than x.dot(y.dot(z.dot(w))), and even a parenthesized infix call version is easier to read: x \dot (y \dot (z \dot w)).
It doesn't seem beyond the wit of man that someone might reasonably expect a library-defined operator whose return type is the same as that of the first operand to be usable as an augmented assignment. How easy is that to do?
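For context, this is how augmented assignment is wired up today via the std ops traits; whether a library-defined infix function could opt into the same mechanism is exactly the open question (Acc below is just an illustration):
~~~ rust
use std::ops::AddAssign;

#[derive(Debug, PartialEq)]
struct Acc(i64);

// `a += b` desugars to `AddAssign::add_assign(&mut a, b)`.
impl AddAssign<i64> for Acc {
    fn add_assign(&mut self, rhs: i64) {
        self.0 += rhs;
    }
}

fn main() {
    let mut a = Acc(1);
    a += 41i64;
    assert_eq!(a, Acc(42));
}
~~~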
Discussion on fixity from IRC, for future reference...:
151205 vfs | Centril: are those to have some hard-coded precedence?
151242 bluss | Centril: if you want to. You can just say that either one or the other is a possible
| implementation
151301 bluss | proposals like this are judged from superficial issues like syntax too
151321 Centril | vfs: it would have to I guess, how high it should bind I don't know yet
151331 bluss | I mean, the syntax is important in the end, but it doesn't decide if the feature is worth
| having or not by itself
151357 Centril | bluss: would it be possible to skip the \ alltogether and just have a dot b ?
[...]
151924 vfs | Centril: what about associativity? bluss comment seems to suggest right associative, but
| afaik all rust operators are all left associative.
151942 Centril | vfs: haskell says:
| http://stackoverflow.com/questions/8139066/haskell-infix-function-application-precedence
151956 Centril | infixl 7 `dot`
152204 bluss | vfs: wouldn't the comment suggest left associative, if the (x (y z)) version is the one
| needing explict ()'s
152220 Centril | so in haskell, a `plus` b `plus` c == (a `plus` b) `plus` c
152256 vfs | bluss: `x.dot(y.dot(z.dot(w))))` <-- this does if i read it correctly
152353 Centril | vfs is correct that it is right associative because it evaluates z.dot(w) fist
152357 bluss | anyway, it's not intended to suggest anything
152453 Centril | vfs bluss: one could (possibly) add an attribute to a function specifying its fixity if a
| default one isnt good
152533 vfs | what if function is in another crate?
152549 oli_obk_ | Centril: and how would that work together with other infix functions?
152636 Centril | vfs: then the fixity information would have to be carried over into the metadata in some
| way i guess
152644 bluss | vfs: the attribute should be on the function
152655 bluss | You'd use this with a function intended to be callable infix
152754 Centril | or, maybe you could know if it is left associative or right depending on where \ is, so x
| \dot y or x dot\ y
152807 Centril | maybe thats a bad idea, just brainstorming atm
152828 Centril | oli_obk_: ideas :) ?
152842 vfs | bluss: so you would need to external crate metadata to know how to parse an expression?
| (maybe it is the case already now, idk)
152917 bluss | vfs: that's a good point, that doesn't sound workable to me
152919 Centril | guess we need someone who knows the compiler better to flesh that out
152931 bluss | even if crates can inject macros etc
152947 Centril | so maybe it a better idea to specify fixity at call site?
153039 oli_obk_ | Centril: you can't decide it globally. So you can say that a function has a higher fixity
| than another function. This way you get a tree of rules depending on the crate dependency
| tree. Similar to impl orphans
153117 oli_obk_ | Centril: then you can simply force infix to require brackets if you're specifying it at
| the call site
153129 oli_obk_ | and if you are doing that, you can go the macro way
153212 oli_obk_ | a `$lhs:expr $fun:path $rhs:expr` => { $fun($lhs, $rhs) } rule should do it
153255 Centril | so, umh, rephrase .P ?
153351 oli_obk_ | which part? the one about fixity at the call site or the fixity tree?
153423 Centril | all of the above :P
153555 oli_obk_ | if you want to specify the fixity at the call site, you can simply use brackets and force
| them. so it would be something like `\(a dot b)` transforms to `dot(a, b)`, but then you
| don't gain much over macros: `i!(a dot b)` might be possible, maybe one needs `i!(a \dot
| b)`
153722 oli_obk_ | if you want to specify the fixity at the definition site, you need to specify it for all
| infix functions that your infix function can be in the same expression with. That might
| not be very feasible, but at least complete.
153724 Centril | well, if you have a default fixity, say infixl, then you can write a \dot b \dot c , or
| force it: a \dot (b \dot c)
153822 Centril | oli_obk_: haskell does it at definition site, but it has a default fixity of infixl 9
153902 oli_obk_ | that works if you only have one function
153914 oli_obk_ | but what about `a \foo b \bar c` ?
153921 Centril | problems =)
153932 Centril | or: first come first served
154108 oli_obk_ | Centril: what about `vec \add_each_value vec2 \mul_each_value vec3` ? (in matlab terms:
| `vec + vec2 .* vec3`
154206 Centril | well yes, that isnt so nice since that would eval to (vec1 .+ vec2) .* vec3
[...]
154759 Centril | oli_obk_: but i think that left associative with first-come-first-serve still might be a
| good simplification that makes it crystal clear
154831 Centril | otherwise you have to read about the fixity in the documentation, etc.
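As a side note on the macro route mentioned in the log above: something close to oli_obk_'s rule can be written today, except that (as far as I know) macro follow-set rules don't allow a path fragment directly after an expr fragment, so separators are needed. A minimal sketch, with a made-up macro name:
~~~ rust
// Emulating a single infix call with a declarative macro (name is hypothetical).
macro_rules! infix {
    ($lhs:expr, $fun:path, $rhs:expr) => {
        $fun($lhs, $rhs)
    };
}

fn dot(a: (f64, f64), b: (f64, f64)) -> f64 {
    a.0 * b.0 + a.1 * b.1
}

fn main() {
    let d = infix!((1.0, 2.0), dot, (3.0, 4.0));
    assert_eq!(d, 11.0);
}
~~~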
I like custom infix operators as long as they are limited to ASCII letters. But I don't like it if you can turn any arbitrary function into an infix call. When I create a function foo, I expect the function to be used like this: foo(a,b). When I want to create custom infix notation, I expect to implement the necessary trait.
see previous discussion in https://users.rust-lang.org/t/infix-functions/4724
Does it interact with generics? Would we end up with a \dot::<Foo<'a>, Bar> b
?
@durka Probably, you'd have to specify it somehow with generics, and what you've come up with seems best. But I guess you'd save \dot
for cases when the types can be inferred from the Expr
s.
I don't want this in the language. I think it adds a lot of complication, and it's unimportant to me personally. However, I wouldn't be too opposed to it either. It seems nice, especially for mathematical code.
Infix functions are great in languages like Haskell where operators are more common, but it doesn't seem to give Rust much.
Are there examples of projects where this would result in a decent ergonomic improvement?
@AndrewBrinker I venture a guess that writing mathematics libraries, e.g. linear algebra, and using them, as well as the low-level parts of physics engines, might benefit from the increased ergonomics.
Essentially, anywhere a lot of math is needed, inside or outside such libraries.
There might be other projects of course, but off the top of my head, that's the typical use case.
@Centril The existing overloadable operators cover 95% of those use cases.
Would this mean we could write obfuscated code that compiled both in Rust and TeX? ;)
I love the infix operator notation in Haskell, partially because not all mathematical operators map to *, +, etc., even when mathematicians write them that way. I love Haskell's lack of parentheses on function calls too, though. Yet, Rust is a rather different language, so tools like currying do not always fit.
At first blush, I'm thinking one should start with just knowing when to avoid, or argue against, using object .
notation around mathematical objects. If for example, we're doing an action, scalar multiplication, etc. then object notation looks like a left action, which gets annoying if you are not a group theorist from Britain or something.
I fear this proposal makes the left vs right action issue worse. In other words, a \dot b
might be more useful for mathematics as b.dot(a)
than as a.dot(b)
, but that seems confusing in non-mathematical situations.
In the long run, one could probably build a strict functional language with Haskell-like syntax, maybe an Idris fork, that compiles to Rust code, but does not introduce a garbage collector. I'd suspect that's a better way to build DSLs that compile to Rust.
It strikes me that, although it wouldn't be super pretty, procedural macros could be used to provide as flexible a mathematical notation as is desired, without the introduction of infix function application.
Actually, looking at the IRC log @Centril includes above, it seems this idea has been mentioned previously. It would be some extra work, and would be strictly-speaking less flexible than infix function application, but it is something that would work today without changes to the language.
I was going to post this in #818 but it's more appropriate here.
I'm in favour of infix functions. User definable operators can be a bit tough to wrap your head around depending on how they're defined, but simpler, more rigorous operators like .
and $
are very convenient control operators that also eliminate the overhead of parentheses, like @bluss said. Respectively they are function composition, and function application. Here are some more concrete usage examples:
-- Operators
f $ g $ h $ 10
=
-- Backtick infix functions
f `apply` g `apply` h `apply` 10
=
-- Math-y infix functions
f apply g apply h apply 10
=
-- Dot functions; not very applicable here
f.apply(g.apply(h(10)))
=
-- Normal functions
f(g(h(10)))
let f = g . h
=
let f = g `compose` h
=
let f = g compose h
=
let f = g.compose(h)
=
let f a = g(h(a))
let newList = n `insert` oldList
=
let newList = n insert oldList
=
let newList = oldList.insert(n)
=
let newList = insert(n, oldList)
What about functions of >2 arguments, though? It's actually doable with auto-currying which is a topic that has been discussed quite a bit for Rust. This is specifically useful when you're doing higher-order programming, and sending functions around to other functions, or returning functions; like map
. And once you have auto-currying you can also partially apply infix functions.
let insertValue = (`mapInsert` value)
let addPair = insertValue(key)
addPair(hashMap)
=
(key `mapInsert` value)(hashMap)
=
mapInsert(key, value, hashMap)
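For comparison, here is roughly what that partial application looks like in Rust today with closures; map_insert and the other names are illustrative only, since Rust has no auto-currying:
~~~ rust
use std::collections::HashMap;

fn map_insert(key: u32, value: &str, mut map: HashMap<u32, String>) -> HashMap<u32, String> {
    map.insert(key, value.to_string());
    map
}

fn main() {
    // Capture the fixed arguments explicitly instead of currying.
    let insert_value = |key, map| map_insert(key, "value", map);
    let add_pair = |map| insert_value(1, map);
    let map = add_pair(HashMap::new());
    assert_eq!(map.get(&1).map(String::as_str), Some("value"));
}
~~~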
We also have to think about the default associativity and precedence of infix functions. In Haskell they are left associative and have the strongest precedence so they bind tightly. Maybe it would be nicer if they bind weakly instead? And should it be user definable? Assuming it is, how would that be made verbose to the user? It wouldn't be easy to figure out without attaching it to the name of the function itself as far as I can think, so infix functions with different associativity and precedence can be a bit opaque to users, which is alike the argument against user-definable operators.
-- Tight precedence
(10*n+1) `insert` myList
-- Weak precedence
10*n+1 `insert` myList
These are tradeoffs in tightly binding vs weakly binding by default.
@burdges: I fear this proposal makes the left vs right action issue worse. In other words, a \dot b might be more useful for mathematics as b.dot(a) than as a.dot(b), but that seems confusing in non-mathematical situations.
Yes, this seems to be the central problem... It is solvable, but one has to decide which way to go...
The various alternatives seem to be:
1. a \cross b \cross c == (a \cross b) \cross c (everything left associative),
2. or: a \cross b \cross c == a \cross (b \cross c) (everything right associative). This is likely the worst possible solution.
3. Fixity declared at the definition site, as in Haskell. You then have to look it up (in Haskell via ghci :i) to get the fixity; this can slow developers down a notch - but I guess this is a problem with function documentation anyways - you still have to read it.
4. Fixity decided at the call site, by where the infix syntax modifier is _located_, or how it _looks like_. For example: vec \add vec2 \mul vec3 == vec.add( vec2.mul( vec3 ) ), while vec /add vec2 \mul vec3 == vec3.mul( vec.add( vec2 ) ). /add could be replaced with add\, or if possible, you could use !add and add!. All in all, this seems a bit arbitrary, so I prefer definition site fixity.

@burdges: In the long run, one could probably build a strict functional language with Haskell-like syntax, maybe an Idris fork, that compiles to Rust code, but does not introduce a garbage collector. I'd suspect that's a better way to build DSLs that compile to Rust.
This sounds like an awesome idea! It could just compile to HIR
directly perhaps and let rustc
do the rest...
@AndrewBrinker: It strikes me that, although it wouldn't be super pretty, procedural macros could be used to provide as flexible a mathematical notation as is desired, without the introduction of infix function application.
Actually, looking at the IRC log @Centril includes above, it seems this idea has been mentioned previously. It would be some extra work, and would be strictly-speaking less flexible than infix function application, but it is something that would work today without changes to the language.
The problem with procedural macros here is that you have to use the procedural macro, inducing the visual overhead of the macro itself. Having to do: infix!(a \cross b \cross c)
is not really pretty.
@Shou: What about functions of >2 arguments, though? It's actually doable with auto-currying which is a topic that has been discussed quite a bit for Rust. This is specifically useful when you're doing higher-order programming, and sending functions around to other functions, or returning functions; like map. And once you have auto-currying you can also partially apply infix functions.
One has to remember that unlike Rust, Haskell is a lazy, garbage-collected language. Rust is both eager and has manual memory management. Thus, every time you partially apply a function, its arguments must be moved to the heap, and then you have to deal with Box, Rc, Arc
and so on...
Some developers might start using currying by default as a nice style of writing, but be completely unaware of the massive relative performance penalty.
I think by now, enough has been fleshed out to create an RFC out of this issue... what does @bluss think?
There is one area in particular where some help is needed...
Regarding storing the fixity and precedence in the crate metadata:
Thus, every time you partially apply a function, its arguments must be moved to the heap, and then you have to deal with Box, Rc, Arc and so on...
Some developers might start using currying by default as a nice style of writing, but be completely unaware of the massive relative performance penalty.
Lambdas are not automatically/implicitly moved to the heap, only if you explicitly box them (and with impl Trait
it will be possible to avoid that in many more cases than today). Any partial application syntax would presumably desugar the same way.
(Maybe this discussion is getting off-topic?)
Lambdas are not automatically/implicitly moved to the heap, only if you explicitly box them, and with impl Trait it will be possible to avoid that in many more cases than today. Any partial application syntax would presumably desugar the same way.
True, my bad. However, if you return the function itself, then that function's arguments must be boxed and moved, see https://doc.rust-lang.org/book/closures.html#returning-closures
It follows that if you want to pass the lambda to another function as an argument, the same logic applies.
In most cases (this is just a hypothesis) where you'd want to partially apply a function, you either want to return the function or pass it to something else. Purely partially applying a function and using it within the same stack frame doesn't seem like a big use case to me.
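A small sketch of the impl Trait point for anyone following along: with impl Trait in return position, a returned closure needs no boxing (adder is just an illustrative name):
~~~ rust
// Returning a partially applied function without any heap allocation.
fn adder(x: i32) -> impl Fn(i32) -> i32 {
    move |y| x + y
}

fn main() {
    let add2 = adder(2);
    assert_eq!(add2(40), 42);
}
~~~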
I like the \dot
syntax, especially for mathematical libraries, but I would caution against going too far with it. Porting ($)
and (.)
and half of Haskell syntax and semantics into Rust, as @Shou seems to suggest, seems like a horrible idea.
IMHO this syntax should only support binary functions. I would also recommend against any precedence or associativity declarations, like infixr 9
. The reason is that operator precedence bugs are already common enough, no need to introduce additional crate-defined rules into the mix. These "synthesized" operators should all have the same precedence and associativity. I'm personally fond of universal right-associativity, from APL, but left-associativity would probably be more obvious to everybody else.
a library-defined operator whose return type is the same as that of the first operand [should] be usable as an augmented assignment. How easy is that to do?
That would be a nice touch. a =\dot b;
or a \dot= b;
or perhaps a \=dot b;
Methods are functions, so &mut v \Vec::push 1
would be allowed if the syntax is arg1 \<callable expression of two arguments> arg2
.
I want to argue against this or anything even vaguely like this, on legibility grounds. People are going to write
x = a \dot b \wedge c \dot e \wedge f \hat a \dot g
and nobody, including the original author six months later, is going to have any idea what it means.
As a rule of thumb, a mixed chain of infix operations is illegible unless there is an operator precedence rule that everyone not only agrees on, but has had drummed into their head since elementary school (i.e. a + b * c
). We already have enough trouble with C's less obvious operators (do you remember what a | b & c
means? are you _sure?_ how about if I throw a shift in there, are you still sure?) -- I was actually a little disappointed to discover that Rust had adopted them verbatim instead of requiring parentheses.
Declaring that "all such operators have the same precedence and are universally (right|left) associative" _does not help_, because that rule hasn't been drummed into everyone's head since elementary school, so reading it that way is not automatic.
I want to argue against this or anything even vaguely like this, on legibility grounds.
I thought about this some more, and my concerns can be addressed with a little syntactic salt and a function annotation.
There are two legibility problems with user-defined operators. One is that the precedence of user-defined infix operators, relative to _any other operators_ (user-defined or not), is unclear: what does a \dot b + c
mean? The other is that their associativity is unclear: what does a \dot b \dot c
mean?
Now, mathematicians make up operators all the time — how is this not a problem for them? _They always parenthesize, unless both precedence and associativity are unambiguous._ They have a fairly broad idea of "unambiguous precedence"; I suspect a mathematician would say that _of course_ dot product has higher precedence than scalar addition. However, their idea of "unambiguous associativity" is quite _narrow:_ if a mathematician writes a $ b $ c
without parentheses, then there is a theorem or axiom that a $ (b $ c) ≣ (a $ b) $ c ∀ a,b,c
, i.e. it _doesn't matter_ whether a $ b
or b $ c
is evaluated first.
So I propose the following rules:
\foo
_has no precedence._ That means, it can appear as the sole operator in a complete arithmetic expression:
if a \isgreater b { ... }
x = y \dot z;
but it cannot be combined with any other operator unless parentheses are used to make the precedence clear:
x = (y \dot z) + c; // ok
x = y \dot (z + c); // ok
x = y \dot z + c; // syntax error
\foo
also can't be combined with _itself_ normally, but you can annotate the definition of foo
to declare that it's an associative operation in the mathematical sense, and then it's OK:
// \cross = 3D vector cross product, not associative
x = (y \cross z) \cross w; // ok
x = y \cross (z \cross w); // ok
x = y \cross z \cross w; // syntax error
// \mprod = matrix product, _is_ associative
// #[associative] annotation on the definition of \mprod
x = y \mprod z \mprod w; // ok
@zackw Thank you for a very well thought-out and reasoned reply!
I really like your solution and would very much like to write an RFC based on it with your (and anyone elses) input.
Bikeshedding:
a \dot b
vs.
a `dot` b
vs.
other syntax
?
Just my opinion: backticks should be reserved for some future hypothetical kind of string literal.
@zackw seems like a reasonable objective reason to prefer:
a \dot b
I like that your reasoning is not just based on subjective taste =)
I have yet to see _one_ really convincing use case. I can see all sorts of niche stuff, but at the end of the day, it is so uncommon that introducing a syntax just for that is bloat.
People tend to prefer infix notation, rather than function call or method syntax, for dyadic operators.
Operators are pure functions as in, deterministic and side effect-free, but there is no clear definition of which pure functions are operators and which are not. I think the term 'operator' and its distinctive infix notation are used to highlight one or more functions that have some useful algebraic structure over their input domain.
So a dot product is written as an infix operator to remind the reader that the underlying datatype has the useful and rich algebraic structure of an inner product space.
Mathematical beauty is in the eye of the mathematician. Custom operators are neither strictly needed, nor useless bloat. IMHO they fill the same kind of niche as if let
: if used properly, they can make stuff easier to read.
@ticki while I don't have any crate to offer as evidence, I think that any library or program that makes heavy use of mathematical operations on non-primitive sets and types such as vectors (R^n) will benefit from this in terms of better readability.
So, crates using a lot of linear algebra will probably benefit.
I agree with @tobia that this is as much bloat as if let
.
IMHO adding unfamiliar syntax that more or less everyone will have to look up to find out what it means the first time they see it has to meet a really high bar in terms of usefulness, and this one does not have that much advantage over just using method call syntax (which is also infix, and familiar to everyone).
Shouldn't it be possible to just write the functions without operator syntax? (a dot b
, not a \dot b
)
Since values are in most cases separated by , or ; or similar, this wouldn't be a problem.
It may also be required to use a single pair of brackets at minimum.
But without precedence rules (which are too complicated anyway), I don't think they are useful, since you still need the same number of brackets in most cases, and in other cases ((a op b op c op d)), it would be nicer to be able to write this as op(a, b, c, d).
@glaebhoerl I personally see a \dot b
as quite intuitive and often very useful, esp. in a situation where I expect mathematical functions (such as in a numeric or linear algebra crate or a use-case for one). I would probably look it up the first time I see it - just to make sure my intuitive understanding is correct - but it would be easy to remember.
How much of an advantage does that have over a.dot(b)
, though?
@glaebhoerl Not much. Until you consider (a + b) == (a \add b) == a.add(b)
(differing rules for parenthesis) which leads to
a \floor_div b \floor_div c \floor_div d == a.floor_div(b).floor_div(c).floor_div(d)
. This still isn't so bad but a single set of parenthesis suddenly puts a serious wrench in your spokes (metaphorically speaking) for method notation:
a \floor_div (b \floor_div c) \floor_div d == a.floor_div(b.floor_div(c)).floor_div(d)
. (floor_div is not associative, so the parentheses are essential.)
Yet another use-case for infix functions: println!("a ⓧ b == {}", a \cross b)
seems slightly more intuitive than println!("a ⓧ b == {}", a.cross(b))
, but the advantage over method notation gets much bigger once you have
a ⓞ (b ⓧ c) * 2 /*etc*/ == a \dot (b \cross c) \mul 2 /*etc*/ == a.dot(b.cross(c)).mul(2)./*etc*/
. (I mixed operators despite @zackw 's great proposal stating that's illegal, since in this case it improves the ease of figuring out WTF is going on in the method notation. Realize, however, that using a single operator repeatedly
(a ⓞ b ⓞ c ⓞ d == a \dot b \dot c \dot d == a.dot(b).dot(c).dot(d)
) or repeatedly with parentheses
(a ⓞ (b ⓞ c) ⓞ d == a \dot (b \dot c) \dot d == a.dot(b.dot(c)).dot(d)
) makes no difference for infix notation, while it makes matters worse for method notation; the lot of dots of two different kinds also doesn't help when you're trying to pronounce the method notation exactly ;-).)
PS. I've realized one minor irritant with the proposed notation: println!("a \\add b == {}", a \add b)
. Notice the extra backslash on the left but not on the right. I don't think this is too bad however, since we're already used to duplicating backslashes in string literals.
PPS. If we can find a good solution to the problem of custom precedence and association that solves the problem of the ambiguity of a + b ⓧ c
, infix functions will be far superior to method notation. I do think I have an idea to solve this, inspired by Scala's Spire community library and Rust's nice approach to inheritance vs composition, procedural macros, custom derive, iterators and Sized. But it's rather complex. Then again, it should add considerable power with very little (if any) downsides for the caller as well as generally little additional complexity (which you only pay for when you need it) with immense power - without the freedom to allow ambiguous or too easily unreadable user code - for the crate designer.
LOL. If I can also use HLists and implement Algebraic Effects, see especially http://goto.ucsd.edu/~nvazou/koka/padl16.pdf, my idea even adds support for something like:
// Normally everything below would be much less verbose due to `use Numeric::api::std::*`
// Notice the low learning curve way to specify exactly what types I work with. Although I'm not
// sure this is currently valid syntax.
fn my_op(left: Numeric.Ring, right: T) -> O
where T: Numeric.with_op(Numeric.ops.Op::from_symbol("/")),
// I want to only output a fixed-point real number that can be stored exactly with the fractional
// part requiring no more than 1_000_000 bits. My code will be automagically specialized
// to much cheaper types. See more about how this works after the function body.
O: Numeric.Real.with_format(Numeric.Format::Real{ float = false /*this is the default*/, max_bits_frac = 1_000_000 })
{
return (1+left) / right
}
// Trivially infers:
// Numeric.Primitives.ArbitraryReal<Numeric.Format.Inference::FromIO(Numeric.Format.ComplexityPriorities::Default)>
let pi = Numeric.math.pi();
// Infers `Numeric.Compatibility.U8` (a zero-cost wrapper for u8)
let a = Numeric.input::<Numeric.Primitives.Int, Numeric.Format::Int(max_decimals = 2)>();
// Infers Numeric.Primitives.Rational<Numeric.Format.Inference::FromIo(...::DefaultPriorities)>, notice `Numeric.Ring::from(1)` is not required.
let x = 1 \my_op pi;
// Infers z: Numeric.ops.const<Numeric.Primitives.Int, ...Inference::Literal>
let z = 0;
// If we didn't provide a type hint, we would've caused a panic here due to division by zero.
// First infers `y: Numeric.primitives.AlgebraicRational<Inference::Let>`
// (`AlgebraicRational: Algebraic + Rational` and Algebraic knows how to use L'Hospital.)
// Next infers y = 1/3: Numeric.primitives.Rational<...Inference::Literal>,
// by using L'Hospital and algebraic elimination.
// Finally infers:
// `y: Numeric.primitives.AlgebraicRational<Inference::FromIoAnd(1..=3, ...::DefaultPriorities)>`.
let y: Numeric.Primitives.Algebraic = (1 \my_op pi / z) / (2 \my_op pi / z);
// prints `1/pi` exactly correct to 20 digits with automagically optimized computational and memory costs. At this stage the compiler has enough information to specialize the type of `pi` to `...ArbitraryReal<Stability::ExactTo(min_int_decimals)>`
println!(x.format(Numeric.Format::Real{min_int_decimals = 2, round_to = 1e-20}, Exactness::Exact))
// Prints "1/3".
println!(y)
// Prints "1.234567890123456789 * 10**100). The compiler actually realizes it can do a lot
// of optimization, since `1 / 3 == 1 && 1 / (1 / n) == n` and some additional optimization since
// I'm only printing the result.
println!(y / 3 as Numeric / (1.0 as Numeric / 1234567890123456789 as Numeric * 10 \Numeric::ops::pow 100))
// Panics. I need a typehint stating the result is of type
// `....Algebraic` to avoid the panic.
/* let x = y / 3 / z / z*/
// Panics. I need a typehint stating the result is of type
// `Numeric.Primitives.DivisableByZero`
// to avoid the panic.
/* let x = y / 3 / z */
// Panics. The literal 0 loses information that L'Hospital needs:
/* let x = y / 3 / 0 / 0 */
I think I should go sleep now. ;-)
@Centril linked to this issue from the operators for Rust 2018 thread. Here is how I would like it to work:
impl Vector3 {
fn dot (&self, v2: Vector3) -> Vector3 { ... }
}
let value = vec1 dot vec2;
let value = (vec1 dot vec2) dot vec3;
let value = vec1 dot (vec2 dot vec3);
No prefix form like sin a, because normal method calls already work well enough for this: a.sin()
Idea that I'm not sure about: Require methods to be annotated with #[infix]
not having anything to signify that "dot" should suddenly be taken as an infix operator is very bad.
I'll fight for enclosing backticks all day, but i'll take a single opening backslash before I accept the idea of not having any signifier at all.
@Lokathor I don't think that a signifier is needed. It's simply three identifiers in a row. The middle one is a method on the first. The enclosing parenthesis make their relationship clear:
let value = (num1 checked_add num2).unwrap()
That said, I could live with a signifier if enough people think it's a good idea. There is no technical reason for it, but it can still be a matter of taste.
About backticks in particular: Backticks look similar to Rust's lifetimes syntax if there's just one or strings if there are two (JavaScript has strings like this called "template strings").
Don't one-argument methods already allow infix notation? Why should we add a `op` b
, when a.op(b)
already exists? Making additional infix operators only causes unnecessary confusion about precedence and associativity without adding any significant value.
@H2CO3 Math formulas would be a lot more readable:
let value = a.dot(b).cross(c.dot(d)); // Old
let value = (a dot b) cross (c dot d); // New
I have to disagree with that. I don't find the "old" way any less readable.
@H2CO3 For me it considerably reduces the time I need to parse the formula in my head (about twice as fast). However, the effect of this is probably, at least to an extent, different for different people. It depends a lot on what one's used to.
@MajorBreakfast It may be the case that JS uses backticks for some sort of goofy string, but _Haskell_ uses them for making prefix functions into infix, so that's all the reason I need to want enclosing backticks for rust as well ;P In terms of them potentially looking like something else, it's never confused me if something in backticks might be a string or a character instead of an operator, with or without syntax highlight support. My eyes aren't even the best.
Just to demonstrate how subjective this is, I do find (a dot b) cross (c dot d)
to be a lot more readable than a.dot(b).cross(c.dot(d))
, but I am strongly against infix operators with no delimiter at all (in fact, how is that even an option? isn't that trivially ambiguous and therefore unimplementable?). Despite some exposure to Haskell, I find enclosing backticks and a leading backslash to both be really bizarre choices for the delimiter. I have no strong opinion in favor of any particular delimiter. <dot>, #dot#, *dot*, etc. all seem acceptable but ungreat to me.
A bit random, but if we do custom infix operators, I'd prefer to make parenthesis mandatory whenever multiple operators are involved, rather than having a system for specifying the relative precedence of all user-defined operators like in Haskell.
Maybe it would be a good Idea to implement a mini language with infix operators, that compiles to rust, and can be used in a macro.
(or use other languages with support for using functions as infix operators instead of rust)
isn't that [no delimiter] trivially ambiguous and therefore unimplementable?
@Ixrec I think there is no problem from a technical standpoint. It's very similar to the notation for a 3-tuple. Minus the commas of course.
let value = a op b; // Top level: No parenthesis
let value = (a op b, 42); // Inside tuple
let value = (a op b) op c; // Nested
let value = func(a op b); // Inside function call
Just for fun the example of my post above with different markings:
let value = (a dot b) cross (c dot d); // No signifier
let value = (a (dot) b) (cross) (c (dot) d); // Parenthesis style
let value = (a `dot` b) `cross` (c `dot` d); // Haskell style
let value = (a \dot b) \cross (c \dot d); // LaTeX style
All of these seem a bit noisier to me. That said, proper syntax highlighting would go a long way in making it readable. The first one without markings is very short. The second one uses parentheses similar to function calls. Both (1 & 2) are a bit harder to implement for syntax highlighting. The last one, the LaTeX style syntax, is easy to parse for syntax highlighting and has the benefit that it adds only one additional character. If we want markings, then I think I prefer the last one. Similarity to LaTeX is nice, because it's quite popular and it will be familiar to a lot of people.
(<dot>
should be reserved for XML, #
is for annotations and *dot*
is multiplication)
I'd prefer to make parenthesis mandatory
@Ixrec I totally agree!
Maybe it would be a good Idea to implement a mini language with infix operators, that compiles to rust, and can be used in a macro.
@porky11 In the IRC chat log posted by @Centril above the macro i!(a dot b)
was mentioned. However, to me, the feature loses a lot of its appeal if it is not generally available.
You can invoke proc macros with attributes, right?
#[linear_algebra_dsl]
fn foo<const n: usize>(x: Matrix<n,n,f64>) -> Matrix<2*n,2*n,f64> {
...
(a dot b) cross (c dot d)
...
}
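And on the definition side, a minimal skeleton of what such an attribute proc macro could look like; the name linear_algebra_dsl is taken from the example above, and the actual rewriting of a dot b into a.dot(b) is omitted:
~~~ rust
// In a proc-macro crate (Cargo.toml: `[lib] proc-macro = true`).
use proc_macro::TokenStream;

#[proc_macro_attribute]
pub fn linear_algebra_dsl(_attr: TokenStream, item: TokenStream) -> TokenStream {
    // A real implementation would parse `item` with `syn`, rewrite infix
    // expressions like `a dot b` into `a.dot(b)`, and re-emit with `quote`.
    item
}
~~~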
In the longer term, we should encourage functional language researchers to take more interest in Rust, specifically the formal safety properties, ala the memory model, borrowing, lifetimes, etc., the interactions with advanced type system features, ala const generics vs dependent types or ATCs vs HKTs, etc., and Rust's internals like MIR, HIR, and ABI research. It'd rock if someone built a curried ML-style language that, say, uses Rust crates without any FFI, but makes writing DSLs easier.
There are a few interesting "ideological" hiccups in doing this: It's possible "big" DSLs could do more from 1ML style modules than traits, but if traits came out as a natural special case then maybe that's fine. I've forgotten the details but Rust resists currying kinda forcefully, well see past discussions around HKTs, although again maybe that's fine if those restrictions become a special case.
@burdges I like this one very much, and I think that in terms of DSLs, we should probably stick with this direction. By definition, DSLs are domain-specific, so they should certainly not be hard-wired into a general-purpose language. A proc-macro is a very neat tool for creating an EDSL: it's basically the sweet spot between keeping the language minimal and as simple as possible, for the sake of readability from the point of view of a general audience, while still retaining the ability for anyone interested to easily create an EDSL without having to go through a tedious RFC process.
(PS.: I know that the usual argument against proc-macros is that "meh, proc-macros?!", but if one doesn't use a particular EDSL often enough to be worth the effort of writing a proc-macro for it, then probably one shouldn't be using an EDSL at all, and a set of plain old functions would suffice… Besides, with crates such as syn
and quote
, it's particularly easy to write proc-macros. Currently, they are derive
-focused, but I don't see a reason why they couldn't be extended with DSL-friendly features in the near future.)
@burdges Compared to i!(a dot b)
this seems a lot more useful and a real alternative.
It's possible "big" DSLs could do more from 1ML style modules than traits, but if traits came out as a natural special case then maybe that's fine.
I don't fully understand your meaning there
Will syntax highlighting work in an RLS environment if it's a proc macro?
@MajorBreakfast
Just for fun the example of my post above with different markings:
let value = (a dot b) cross (c dot d); // No signifier
let value = (a (dot) b) (cross) (c (dot) d); // Parenthesis style
let value = (a `dot` b) `cross` (c `dot` d); // Haskell style
let value = (a \dot b) \cross (c \dot d); // LaTeX style
All of these seem a bit noisier to me.
How about?:
let value = (a `dot b) `cross ( c `dot d); // modified/lisp-ish Haskelly style
A little less noise. Should be easy, "like \LaTeX style" for syntax highlighters. I disagree that ` (for in-fix method call) and ' (for life-time) are easily mixed up. In every reasonable font I've ever used for coding, ` and ' are clearly/easily distinguishable. Also, with syntax highlighting, it's a non-issue anyway.
Another possibility would be a "block structured for infix notation" like this:
let value = `{ (a dot b) cross (c dot d) }
where `{ ... } is used to denote blocks where method calls will use in-fix notation. Which brings up the question, "Would you ever want to mix in-fix method calls and normal (arity-2) method calls in the same expression?" My take would be No!
Alternatively, write as:
let value = \{ (a dot b) cross (c dot d) }
If it is felt that ` is too easily mistaken for '.
At little less noise. Should be easy, "like \LaTeX style" for syntax highlighters. I disagree that
(for in-fix method call) and ' (for life-time) are easily mixed up. In every reasonable font I've ever used for coding,
and ' are clearly/easily distinguishable. Also, with syntax highlighting, it's a non-issue anyway.
That formatting is the biggest reason I'd rather not use `
for anything in Rust :rofl:
What is wrong with `, even if using ` multiple times?
@scottmcm: Yeah, we probably shouldn't use
for the same reason in Rust. We should probably use "gt" instead
:smiley:
It would be great to have a library with procedural macros for infix notation. If the feature turns out to be sufficiently popular and useful, then it can be added to the language at a later stage. Perhaps it could look like this:
~~~ rust
fn foo( ... ) -> ... {
    (a \cross b) \dot c
}
~~~
I could personally see myself using infix notation, but I am also worried that it will confuse newcomers, and I am slightly worried about the rust syntax budget.
After writing a lot of markdown writing a `
has become rather normal to me. But that is not how it has always been. I remember the first time I learned that there is a difference. For beginners the distinction to '
is not clear. The dream is of course that one day Rust will be the first programming language many people learn. E.g. like C++ (Arduino) and JavaScript (Web) today. The `
character is rather difficult to write because if followed by a compatible letter, it is transformed. E.g. `
+ a
= à
(I've a German keyboard layout, so correct me if that is not the case elsewhere) Even if one's aware of that, it is still a bit annoying to write.
That's absolutely not the case elsewhere. That's totally wild to me.
@Lokathor I've now checked on how it is on a French° keyboard and on an US keyboard. And they all have different ways for writing accented letters! Interesting! xD
° The German keyboard has those accents only for foreign languages like French
On Nordic/Scandinavian keyboards the ` symbol is really hard to come by: shift + acute accent button and after that, a space bar (because it's a dead key).
@golddranks The position of the key (to the left of backspace) and how it operates seems to be exactly the same as on the German keyboard.
Apologies for the derail, but..
How would you do infix operators in a curried ML-style language, without using `
? We cannot use \
since that gives lambda expressions, well not without using |..| ...
for lambda expressions. If functions required a declaration prefix, like say fn
, then you could use bare words but declare them with op
, ala
op<const M: usize, const N: usize> (x: Vector<M>) cross (x: Vector<N>) -> Matrix<M,N> = ...
or
op<const M: usize, const N: usize> Vector<M> cross Vector<N> -> Matrix<M,N>
op x cross y = ...
with the second form being an outer match if multiple forms were provided. You could still invoke cross
function style with (cross) x y
. Except..
I think this bare word makes syntax highlighting not context free, and not necessarily even determined by the file, which complicates editor tooling. Worse, it creates the same complexity for humans.
Worse still, these languages represent type parameters like value parameters, so more realistic syntax becomes:
fn cross (const M: usize) -> (const N: usize) -> (x: Vector M) -> (y: Vector N) -> Matrix M N = ..
with some early parameters being implicit or inferred when used, which matters when you represent traits with modules.
I'm not even sure if Haskell permits expressions as operators, ala x `semigroup_mult context` y
, although presumably that's bad style, if permitted.
@burdges
I can't follow your code examples. Is this Haskell?
Quick recap: The syntax proposed here is intended for method calls. See Add
trait as an example. It defines the add()
method. With this proposal it's possible to write a add b
instead of a.add(b)
. Rust is not like Swift where operators are functions instead of methods.
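For reference, this is the existing mechanism being recapped; under the proposal the last line of main could hypothetically be written as let c = a add b; (Vec2 here is just an example type):
~~~ rust
use std::ops::Add;

#[derive(Clone, Copy, Debug, PartialEq)]
struct Vec2 { x: f64, y: f64 }

impl Add for Vec2 {
    type Output = Vec2;
    fn add(self, rhs: Vec2) -> Vec2 {
        Vec2 { x: self.x + rhs.x, y: self.y + rhs.y }
    }
}

fn main() {
    let a = Vec2 { x: 1.0, y: 2.0 };
    let b = Vec2 { x: 3.0, y: 4.0 };
    let c = a.add(b); // today: a.add(b), or equivalently a + b
    assert_eq!(c, Vec2 { x: 4.0, y: 6.0 });
}
~~~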
I think this bare word makes syntax highlighting not context free
I think syntax highlighting can still be done without type information even if there is no marking like \
or `
. In my comment here I list several different examples where the infix operator appears. It's possible to detect an infix operator expression through the shape of it and colorize all items accordingly.
@burdges but... Rust _isn't_ a curried ML language? I'm not sure how any of your statements would apply to Rust. I'm not even sure how most of them would apply to Haskell. In Haskell, you can only make a prefix function into an infix operator if it's a single binding anyway:
> addMul 4 5 6
54
> 5 `addMul 4` 6
<interactive>:6:11: error: parse error on input ‘4’
> let withFour = addMul 4
> 5 `withFour` 6
54
@lokathor fun thing - because haskell is a curried language, you could do
(4 `addMul` 5) 6
also, I'd love to have this feature no matter what it looks like.
@ubsan You could 😆, but you usually don't :) I would consider that bad Haskell style.
@Centril tell that to the wonderful people behind lenses :3
@ubsan I've never used the lens
package that way and I've used it fairly extensively (including the infix operators). I usually first bind x op y
to a variable and then I use binding z
. Besides, my guess is that most of the lens operators are binary.
Tho, cc @ekmett on lens
since they made it ;)
@Centril: In lens
a lot of the operators are technically ternary, but are used in the fashion @ubsan mentions. With slightly lobotomized type signatures:
(.~) :: Lens s t a b -> b -> s -> t
(%~) :: Lens s t a b -> (a -> b) -> s -> t
(+~) :: Num a => Lens s t a a -> a -> s -> t
The common idiom is to use them like
& field1 .~ bar
& field2 .~ baz
& field3 %~ reverse
The operators are binary operators that return functions, and &
is just x & f = f x
.
You could also use them in a fashion like
whatever = (field3 %~ reverse) . (field2 .~ baz) . (field1 .~ bar)
to build a pipeline of mutations, but this is a less common form.
As far as Rust is concerned, I have no dog in this fight, but in Haskell it works quite well and I wouldn't consider it unidiomatic to pipeline functions together in a functional language. ;)
I wouldn't consider it unidiomatic to pipeline functions together in a functional language.
Nor do I, it's not about the ability to build function pipelines. My arguments were about the unnecessary nature of infix notation, since one can already achieve piping with existing language features, namely method call syntax.
Also, Rust is still a different language, and it's not an automatic positive or an obvious gain to make it more like Haskell. Haskell's got many things right, but its syntax is certainly not a part I would praise. Most of the good stuff lies in the semantics of the language, and Rust has borrowed many parts of it already, notably type classes. For which I'm glad — but I'm also glad Rust's designers didn't copy over the weird syntax along with it.
@H2CO3
My arguments were about the unnecessary nature of infix notation, since one can already achieve piping with existing language features, namely method call syntax.
Pipe-lining with method call syntax fails in the face of free functions and borrowing.
Haskell's got many things right, but its syntax is certainly not a part I would praise.
But understand that this is purely subjective. I happen to love Haskell's syntax. There are no objective reasons for why Rust's syntax is better or why Haskell's is better. It is all about what you are used to. Rust's syntax is as it is and not like OCaml's since it was purposefully designed in such a way to not waste the complexity budget on unfamiliar (to C++ programmers) lexical syntax.
One feature I love from OCaml (and Haskell too, if I'm not mistaken?) is that you have a set of characters to build names with, including function names, which is the usual letters-numbers-underscore plus the single quote, and another set of characters to build operator symbols with, which is a subset of the ASCII symbol characters.
So when you see foobar
you know it's a function or variable name, but when you see +:*
you know it's a custom infix operator. No ambiguity.
Maybe we could solve the issue like this and make math library authors happy as well? I still think custom operators should have a fixed precedence level and no associativity (= forced parentheses.)
Incidentally, I happen to love OCaml syntax, except for a few weird points. To me it's a good middle ground between Haskell and C-like languages. (Yes, I know OCaml came before Haskell, what I mean is that to me the latter is weirder, not easier.)
Yes, I know OCaml came before Haskell, what I mean is that to me the latter is weirder, not easier.
Haskell came before OCaml :)
First appeared | 1990; 28 years ago
https://en.wikipedia.org/wiki/Haskell_(programming_language)
First appeared | 1996; 22 years ago
https://en.wikipedia.org/wiki/OCaml
@Centril
But understand that this is purely subjective.
This seems to be a common misconception among programmers; unfortunately, it's false. There are objective reasons, especially when it comes to robustness and fault tolerance, for why one kind of syntax is superior to another. For example, whitespace sensitivity, and in particular, indentation-based blocks are dangerous since transmission through a channel that doesn't respect whitespace can significantly change the meaning of a program. So, no, syntactic choices are not "purely" subjective.
And even if they were: why do you feel your "I like Haskell's syntax" argument is stronger than my "I don't like Haskell's syntax" viewpoint?
Before things get more fiery, I think y'all are arguing the same point -- to quote H2CO3:
Also, Rust is still a different language, and it's not an automatic positive or an obvious gain to make it more like Haskell.
I don't think anyone is saying their preference matters more, just that there are views on both sides (and both views are partially, if not purely, subjective).
@H2CO3
This seems to be a common misconception among programmers; unfortunately, it's false.
Are there any scientific studies to this effect?
For example, whitespace sensitivity, and in particular, indentation-based blocks are dangerous since transmission through a channel that doesn't respect whitespace can significantly change the meaning of a program.
Personally, I think it is dangerous to rely on lexical syntax for fault tolerance and robustness; I would much rather as much as possible rely on a type system that makes fragile things not well typed.
From my time as a teaching assistant for a beginners course in OOP (Java), I think it is equally easy to misplace braces and parenthesis.
My view is that we really use layout syntax even with braces (based on how we format with rustfmt), and that they mostly are redundant noise in the way of reading, but I understand that this is my subjective preference and not a universal constant.
And even if they were: why do you feel your "I like Haskell's syntax" argument is stronger than my "I don't like Haskell's syntax" viewpoint?
I did not make this claim :)
Rust is certainly not going to get Haskell's syntax.
Please keep off-topic comments about Haskell's syntax out of this Rust thread :)
Please keep off-topic comments about Haskell's syntax out of this Rust thread :)
I'm not a Haskeller, but, it is difficult for me to understand why you would consider the comments about another language's syntax, that clearly falls into the category of "Prior Art", to be off-topic on the discussion thread for a proposed feature/RFC for Rust. Sounds more like trying to silence someone who you disagree with rather than arguing the merits one way or another. Odd thing indeed. I'm highly offended by the smugness of telling someone they should shut-up because you disagree with what they have to say.
@gbutler69 Nobody is telling anyone to be quiet, because this is an RFC issue and the "C" stands for "comment". Also, please read @ubsan comment again and you'll see that there is no malice intended by the words she uses. Objectively there's also nothing wrong with wanting to stay on-topic.
I gave @ubsan comment a thumbs up because the debate about Haskell syntax is really hard to follow. It'd be helpful if the relevant parts were explained, so that a Haskell noob like me could follow. I'm just seeing a bunch of squiggly infix operators xD
Objectively there's also nothing wrong with wanting to stay on-topic.
That's the problem though. As I've said, I just can't understand how they could be considered off-topic. If all you need to do is call something "Off-Topic" without justification to say that it shouldn't be included in the discussion, that is tantamount to just saying, "I disagree, so, shut-up". I don't particularly care for that. If that wasn't the intent, then, my apologies for misreading, but, understand, that is how it comes off to myself and probably many others as well. I'm sure others will perceive it differently though.
It'd be helpful if the relevant parts were explained, so that a Haskell noob like me could follow.
In that case, asking the commenter to clarify is highly desirable and appropriate. Calling it off-topic and saying that the comments don't belong because they are off-topic is not. Again, "Prior-Art" is 100% on-topic in any proposal. I just can't see how it couldn't be.
I'm suspecting @ubsan was half-joking?; I don't mind their comment at all ❤️
@MajorBreakfast
It'd be helpful if the relevant parts were explained
Sure thing! (and some bonus parts)
EDIT: I hope this was helpful / understandable; if it wasn't, let me know =)
Let's start simple.
-- Equivalent (up to memory representation) to `enum List<A> { Nil, Cons(A, Box<List<A>>) }`.
--
-- List is a type constructor from Type -> Type;
-- List Int is applying 'Int' to 'List' giving you back a type.
-- 'a' is a type variable (generic type parameter)
-- 'Nil :: List a' is a data constructor
-- 'Cons :: a -> List a -> List a' as well.
data List a = Nil | Cons a (List a)
length :: List a -> Int -- A function from List of 'a's to 'Int'
length Nil = 0 -- pattern matching on Nil
length (Cons _ xs) = 1 + length xs -- Pattern matching on Cons.
-- explicit quantification of the type variable 'a'
length :: forall a. List a -> Int
length xs = case xs of -- similar to 'match'
Nil -> 0
Cons _ xs -> 1 + length xs
-- A binary function from 'a' to 'a' to 'a'
-- proviso that 'a' satisfies the 'Num' typeclass (trait);
plusMul2 :: Num a => a -> a -> a
-- This is the same; Haskell functions are curried!
plusMul2 :: Num a => a -> (a -> a)
plusMul2 x y = (x + y) * 2
Rust equivalent of the last one (up to currying, laziness, memory model..):
use std::ops::{Add, Mul};
#[allow(non_snake_case)]
fn plusMul2<A: Add<Output = A> + Mul<Output = A> + From<u8>>(x: A, y: A) -> A {
    (x + y) * A::from(2)
}
With respect to the example lens operators:
(+~) :: Num a => Lens s t a a -> a -> s -> t
-- explicitly quantified type variables and explicitly showing the currying:
(+~) :: forall a s t. Num a => Lens s t a a -> (a -> (s -> t))
This declares (the type of) the infix ternary custom operator +~
which takes a value of type Lens s t a a
a value of type a
and a value of type s
and produces a value of type t
. The lower case a
, s
, and t
are type variables (parameters). Lens :: Type -> Type -> Type -> Type -> Type
is a type constructor taking 4 types as arguments and returning a type. Again, a
must satisfy the Num
trait / type class.
whatever = (field3 %~ reverse) . (field2 .~ baz) . (field1 .~ bar)
This defines a top level function whatever
; The .
are function composition.
More details: https://hackage.haskell.org/package/lens
and: https://www.schoolofhaskell.com/school/to-infinity-and-beyond/pick-of-the-week/a-little-lens-starter-tutorial
In Kotlin, functions marked with the infix
keyword can be called with the infix notation: https://kotlinlang.org/docs/reference/functions.html#infix-notation
Here are a few arguments for placing this feature in a library rather than in the core language
The syntax looks slightly alien compared to the rest of rust and it is not very googleable, but if we force the programmer to decorate each function with #[linalg_dsl]
or something similar, then the reader has a chance to google for the decorator and discover what is going on.
I will probably only be using the feature in a few math-heavy functions, so at least for me it doesn't require a lot of work to decorate each function separately.
If we end up with several competing macro libraries with varying syntax, then we have a chance to pick the best one.
@nielsle
The syntax looks slightly alien compared to the rest of rust
Certainly no more alien than the operators the language already has?
it is not very googleable,
Haskell uses custom search engines (hoogle) to provide you with much better results than google could ever provide. I believe we could do the same.
Decorating functions with a proc macro such as this has the problem that proc macros are a very heavy weight DSL authoring mechanism; it costs a lot to make such a DSL. Meanwhile, custom operators, or functions with infix function notation such as in Haskell are extremely easy to make new ones of.
Would it be possible to define a macro for defining infix dsls? Something like this could be useful.
~~~ rust
// In library
pub fn dot( .., ..) -> .. { ..}
pub fn cross( .., ..) -> .. { ..}
define_infix_dsl!(my_linalg, [dot, cross])
// In code
fn foo() {..}
~~~
Would it be possible to define a macro for defining infix dsls? Something like this could be useful.
Let's experiment ;)
@nielsle In a DSL, the infix call style could just be enabled for all methods (with a self
param and one other param)
I'm trying to find a language that I can write vector math and graphics stuff in more easily than in C++, while retaining good performance. With respect to the people that have commented here saying infix operators are useless or nearly so, the ability to define and use infix operators of well specified fixity makes quite a significant difference in writing legible math-related code and avoiding mistakes.
@Centril
Why not decide associativity via vararg functions (at definition time), and when they don't exist, assume left associativity by default?
Note that there are functions which are neither right nor left associative, e.g. the cartesian product:
A*B*C = *(A,B,C), yet *(A,B,C) != (A*B)*C and *(A,B,C) != A*(B*C)
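In Rust terms, the same point can be illustrated with tuple types: the flat product and the two nested products are three distinct types (a small sketch):
~~~ rust
fn main() {
    let left_nested: ((u8, bool), char) = ((1, true), 'x');
    let right_nested: (u8, (bool, char)) = (1, (true, 'x'));
    let flat: (u8, bool, char) = (1, true, 'x');
    // None of these coerce into another; converting requires explicit restructuring.
    let _ = (left_nested, right_nested, flat);
}
~~~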