Crystal: [RFC] Change tuple syntax to use parentheses

Created on 2 Mar 2016 · 28 Comments · Source: crystal-lang/crystal

Right now tuple literals use curly braces:

{1, 2, 3}

What do you think if we change it to parentheses?

(1, 2, 3)

The rationale for this is (in no specific order of importance):

According to Wikipedia: "Tuples are usually written by listing the elements within parentheses "()" and separated by commas; for example, (2, 7, 4, 1, 7) denotes a 5-tuple. Sometimes other symbols are used to surround the elements, such as square brackets "[ ]" or angle brackets "< >". Braces "{ }" are never used for tuples, as they are the standard notation for sets."
Parentheses are usually used in other languages for tuples, for example in Python, Rust, Julia and Swift (however, Erlang and Elixir use curly braces)
Parentheses feel more lightweight... but this is probably subjective
You could pass a tuple to a method without extra parentheses: foo (1, 2) instead of foo({1, 2}). It's definitely just a small gain.

In the future we could have a named tuple type, written as (x: 1, y: 2). You can think of a NamedTuple as being to a Hash what a Tuple is to an Array. We could introduce **args in method definitions and ** for splats, similar to Ruby, so you could do, for example:

def bar(x = 1, y = 2)
end

def foo(**args)
  bar(**args)
end

foo(x: 1, y: 2) # works, args will be (x: 1, y: 2) and when splatted it will match bar
foo(x: 1, z: 2) # will match foo, but won't match bar, so compile error

a = (x: 1, y: 2)
foo(**a) # works

The above will make the delegate macro (or any other similar macro) work in more cases (the block handling case still remains an issue). Of course, changing the tuple syntax to use parentheses isn't strictly related to this, but we would need to find a way to write named tuple literals, and {x: 1, y: 2} already means a hash literal.

As in other languages, to write a single-element tuple you have to write:

(1, ) # note the trailing comma

to distinguish it from a grouping expression, but this usage shouldn't be frequent.
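
For comparison, here is a minimal sketch of how the current curly-brace syntax handles the one-element case, assuming a recent Crystal compiler. Braces never denote grouping, so no trailing comma is needed. The variable names are illustrative only.

```crystal
# Parentheses only group; the value is a plain Int32.
grouped = (1)

# Curly braces build a one-element tuple, no trailing comma required.
single = {1}

puts typeof(grouped) # Int32
puts typeof(single)  # Tuple(Int32)
```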

One con is that there's code out there that uses curly braces and would need to be upgraded. We could make a release that accepts both syntaxes, and have the formatter upgrade the code to the new syntax.

Another con is that to specify an array of tuples, right now you write Array({Int32, String}), but with the new syntax it would be Array((Int32, String)). The double parens are needed, otherwise Array(Int32, String) is "wrong number of type arguments".
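
As a rough sketch of the current form described above (assuming a recent Crystal compiler; the variable names are made up for illustration), the braces make the tuple a single type argument:

```crystal
# Current (curly-brace) syntax: the tuple type is one type argument.
pairs = Array({Int32, String}).new
pairs << {1, "one"}
pairs << {2, "two"}

# The same element type spelled out with the Tuple generic.
same = Array(Tuple(Int32, String)).new
same << {3, "three"}

puts typeof(pairs) == typeof(same) # true
puts pairs.size                    # 2
```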

draft compiler

asterite

👍7 👎1

All 28 comments

I think it might be slightly confusing and conflate with function calls, especially since parens on function calls are optional. I personally don't mind the {} syntax.

refi64 on 2 Mar 2016

I don't mind the curly braces, maybe because I associate tuples with Erlang. I like that it looks different than grouping (maybe?).

Curly braces are also used as block delimiter, and using parens could help remove some confusion, or compiler limitation, for example p {a, b} doesn't compile.

So, why not? I don't mind the parens either.

ysbaddaden on 2 Mar 2016

Naive (and totally ignorant) question: how does this affect the constructor/usage of Class{}, like Set{1, 2} or Array{1, 2, 3}?

The other is a (subtle) detail about foo(args) vs foo (tuple), which might be confusing on a quick review of some code: a single space makes the difference between an error and working code, in which case the error message should be clear about it 😅

Beyond that, I don't mind.

luislavena on 3 Mar 2016

Of all the options, I prefer angle brackets.

The only issue (as in C++) is that we would probably need to write < <1, 2>, 3> instead of <<1,2>, 3>, but since generics syntax is not under question here, it won't appear as much as in C++.

Using < > seems aligned with type theory texts :-). It plays nice with named tuples in the future.

Maybe I am missing something else from the grammar. But besides the cons pointed out, I see no other reason not to go for < >.

bcardiff on 3 Mar 2016

👍1

Please also think about changing the macro brackets (I prefer <%%> <%= %>), instead of
p({{{1}}, 2, {{3}}})

{ and } are used too much

kostya on 3 Mar 2016

@kostya You are right. <% is a bit different than everything else in the language so it could be easily recognizable as "macro code". However, <%= ... %> is a bit long compared to {{...}}, and if we leave spaces inside <%= ... %> it looks weird.

macro setter(name)
  def {{name}}=(@{{name}})
  end
end

macro define_method(name)
  def <%=name%>=(@<%=name%>)
  end
end

macro define_method(name)
  def <%= name %>=(@<%= name %>)
  end
end

I do agree that {{...}} and {%...%} are maybe not the best option.

asterite on 3 Mar 2016

@luislavena No problem with Array{1, 2, 3}, though it's true that there might be some semantic relationship, but we'd have to keep the curly braces there as parens are for generic type instantiations.

@bcardiff I wouldn't use <...>, exactly because it's a mess to parse

asterite on 3 Mar 2016

Since named tuples will allow **args and curly braces with names are hashes, square brackets will conflict with arrays when no names are used with tuples, angle brackets are awful to parse, and other Unicode braces are hard to type... Let's move to parens :-)

bcardiff on 3 Mar 2016

I've been thinking of suggesting this for a long time, for the obvious reason you quoted from Wikipedia, and because braces would be better used for Sets. I've refrained because I figured it wouldn't be appreciated, given the confusion with parenthesized calls (as @luislavena exemplifies) and procs.

@bcardiff's idea of using <> is interesting, but of course, as @asterite mentioned, there's the parsing (C++11 fixed the parsing of generics btw).
Right now, as mentioned, braces are overused in the syntax, but going with parens would shift the overuse to them, to the point of more confusion.

Hard trades.

_Purely out of a code comprehension perspective, I'd vote for <>._

ozra on 3 Mar 2016

Using <...> is really hard for parsing.

a, b, c = foo <1, 2, 3>

It's really hard for the parser: is it a, b, c = foo < 1, 2, 3, ... oops, there's a >

asterite on 3 Mar 2016

FYI, C++ being able to parse something probably is _not_ a good indicator of whether or not it can be easily parsed, considering you need stuff like typename T::xyz and vector<L<s, Z<1>>, allocator<X<Y<Z>>>>.

refi64 on 3 Mar 2016

And in that @asterite example, it is ambiguous if it is
a = foo < 1 ; b = 2 ; c = 3
or
a, b, c = foo(<1, 2, 3>)

C++ does not have that issue since parens in function calls are not optional there.
This is a killer for < > I think :-( .

Maybe we could use #{ x: 2, y: 3} for named tuples. Or some other symbol. Whether or not to change the nameless tuples to use that symbol would be another thing to decide.

bcardiff on 3 Mar 2016

@kirbyfan64 - No, C++ is a really bad role model: it does lex-parse-semantics-soup (it has to know what the symbol is in order to decide what the angulars mean), which is why it's so slooow.

The angulars are a welcome idea though, cause they might serve in Onyx.

"Purely isolated" parentheses are of course superior, and more beautiful (and correct). The only con as I see it is the visual confusion in actual code, as mentioned by several (the one space difference between call with tuple arg, or multiple args).

ozra on 3 Mar 2016

I know of several languages that use #{} for sets, though...

refi64 on 3 Mar 2016

You could pass a tuple to a method without extra parentheses: foo (1, 2) instead of foo({1, 2})

This point is the biggest downside to me actually. Typos around that area will lead to more confusing error messages or even wrong behavior if a method is liberal about what types it accepts. It becomes even more confusing with single element tuples

add 1, 2   # two params
add(1, 2)  # two params, still
add (1, 2) # single param, ups

set 1    # single param
set(1)   # single param, still
set (1)  # single param or tuple?
set (1,) # ok, tuple...
set(1,)  # oh syntax error

I think in the edge cases it becomes harder to read when using regular parentheses, and after all, code is far more often read than written.
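
To make the contrast concrete, here is a small sketch of how the same calls look with the current curly-brace syntax, assuming a recent Crystal compiler. `describe` is a hypothetical method with two overloads, used only to show which form is selected.

```crystal
# Hypothetical overloads: one takes two arguments, one takes a single tuple.
def describe(a : Int32, b : Int32)
  "two arguments"
end

def describe(pair : Tuple(Int32, Int32))
  "one tuple argument"
end

with_parens = describe(1, 2)   # two arguments
without_parens = describe 1, 2 # still two arguments
with_tuple = describe({1, 2})  # a single tuple argument

puts with_parens
puts without_parens
puts with_tuple
```

With braces, the tuple call cannot be confused with the two-argument call by adding or removing a space.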

jhass on 4 Mar 2016

👍1

I believe in Ruby foo (1, 2) is a parse error at the comma because foo (1) is parsed as (1) being a grouping expression, not parentheses for a method call.

I agree it may lead to some confusion, but I'm not sure we'll see this code much. Do we often pass a tuple literal to a method call?

ysbaddaden on 4 Mar 2016

I agree with @ysbaddaden here. Passing an explicit tuple to a method is pretty rare. Also, if you make this mistake you will probably immediately receive a compiler error at some point. Or, we could prohibit passing a tuple to a method like that, forcing you to use double parentheses, maybe.

asterite on 4 Mar 2016

I'm not sure it's worth handling a special case, is it?

ysbaddaden on 4 Mar 2016

I guess not. We could add it if we later find that it's indeed a real issue (but I doubt it).

asterite on 4 Mar 2016

I wouldn't disallow foo (1,2) either.

Also I don't think there will be lots of singleton tuples (1,) so we won't see that a lot.

I would always support a trailing comma in tuples, though. That way macro-generated tuples are easy to create (a for loop will always add a comma after the element).

Empty tuples will be (). That actually raises the same issue between foo () / foo() as between foo (1,2) / foo(1,2).

And just for completeness, I would not parse (,) as an empty tuple.

I do want to see named tuples (subtyping will probably come along :-) ) and it is neat to have a similar notation for named & positional tuples.

bcardiff on 4 Mar 2016

@bcardiff +1 for named tuples

wontruefree on 5 Mar 2016

On the other hand, parentheses for tuples sometimes are hard to spot, or look weird. For example:

# with new syntax it's one of these
assert { 5.divmod(3).should eq((1, 2)) }
assert { 5.divmod(3).should eq (1, 2) }

# with current syntax
assert { 5.divmod(3).should eq({1, 2}) }

Or:

# Parentheses are used for grouping and for tuples, so when you see a parenthesis
# you have to visually check if there's a comma
a = (foo(bar, baz) + 3) * (4 + x)
b = (foo(bar, baz), 4, x)

# With curly braces maybe it's more obvious
a = (foo(bar, baz) + 3) * (4 + x)
b = {foo(bar, baz), 4, x}

Or:

# A bunch of code, there are parentheses, have to look carefully to see that it's returning tuples
res.map do |cur_piece|
  dy, dx = cur_piece.min
  cur_piece.map do |yx|
    (yx[0] - dy, yx[1] - dx)
  end
end

# With curly braces it's a bit more obvious
res.map do |cur_piece|
  dy, dx = cur_piece.min
  cur_piece.map do |yx|
    {yx[0] - dy, yx[1] - dx}
  end
end

Or:

(foo bar).baz
(foo, bar).max # kinda looks like a call on an object, and it is, but the tuple is maybe not that obvious

Or:

# This is real code in the compiler
has_pkg_config = Process.run("which", {"pkg-config"}, output: false).success?

# Trailing comma needed for one-sized tuple, ugh
has_pkg_config = Process.run("which", ("pkg-config",), output: false).success?

Maybe it's just that I'm used to curly braces for tuples already.

Of course one can argue the same with tuple literal vs. hash literal:

{1, 2, 3}
{1 => 2}
{foo: bar, baz: qux}

but I think the difference is bigger, either because of the => or because the keys get colored.

So, if we decide to use parentheses for tuples we can, in the future, use (x: 1, y: 2) for named tuples. If we stick with curly braces for tuples, my vote would be to make:

{x: 1, y: 2}

be a named tuple, not a hash. For a hash, you'd have to use {:x => 1} if you wanted a symbol as a key. We could still maybe support {"foo": 1} as a hash literal with a string key, because it's convenient and it resembles JSON, but that would be the only exception. Or just use =>, which might be overall more consistent.

asterite on 6 Mar 2016

👍1

Scala uses parentheses for tuples and I have never run into any of these problems, or had any other problem with it. Also, I think tuples with curly braces look very similar to blocks.

hangyas on 6 Mar 2016

I like curly braces more since it is easier to identify whether something is a tuple or not. I would not mind if a different character were used, but not parentheses, because IMO this makes tuples harder to spot and creates some ambiguities and special cases that one has to learn about.

Ragmaanir on 26 Mar 2016

I've implemented both parentheses and angular tuple-notation in Onyx now; I'll "share my experiences" for how it works out in this issue, if it can help in any way.

ozra on 27 Mar 2016

At the risk of a dumb idea: I was staring at my keyboard for symbols that might work.
What about colons?

:1, 2, 3, 4:
:foo: 1, bar: 2:
:foo => 1, bar => 2:

wontruefree on 29 Mar 2016

We decided that curly braces are just fine for tuples and named tuples.
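
A minimal sketch of what that decision looks like in practice, assuming a Crystal release that includes named tuples. `bar` and `foo` mirror the hypothetical methods from the original proposal, with a body added to `bar` so there is something to print.

```crystal
# bar and foo mirror the hypothetical methods from the proposal above.
def bar(x = 1, y = 2)
  x + y
end

# **args collects the named arguments into a NamedTuple.
def foo(**args)
  bar(**args)
end

# Curly braces with `name: value` pairs build a NamedTuple.
a = {x: 10, y: 20}

puts foo(x: 1, y: 2) # 3
puts foo(**a)        # 30
puts typeof(a)       # NamedTuple(x: Int32, y: Int32)

# A hash with a symbol key is written with =>.
h = {:x => 1}
puts typeof(h) # Hash(Symbol, Int32)

# foo(x: 1, z: 2) would be rejected at compile time: bar has no z parameter.
```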

asterite on 10 May 2016

🎉1

I can also chip in here, after having paren-tuples (and angular-tuples) in Onyx for some time now: it _is very unclear_ syntactically, and I'm still thinking about alternatives. Curlies are no doubt clearer to distinguish.

ozra on 10 May 2016
