Elixir: Compiler fails to catch incorrect guard usage in defmacro

Created on 23 Mar 2018  路  8Comments  路  Source: elixir-lang/elixir

Given the following:

defmodule Adder do
  defmacro add(x, y) when is_integer(x) when is_integer(y) do
    quote do
      unquote(x) + unquote(y)
    end
  end
end
  1. This actually compiles, notice the duplicate when usage which is obviously wrong
  2. If you then require and call add/2 with incorrect arguments, it blows up at runtime. In the case above that's obvious, but in the case I encountered this, it resulted in some really hard to trace behaviour because I couldn't understand how the macro in question was ever being called
iex(4)> Adder.add(1, "foo")
** (ArithmeticError) bad argument in arithmetic expression
    :erlang.+(1, "foo")

I would expect this construction to not even compile at all.

Most helpful comment

To clarify the Erlang inheritance because there isn't really multiple when and the translation is not obvious. In Erlang a guard sequence is a semi colon separated list of guards, and each guard is a comma separated list of guard expressions. A guard expression is a subset of an Erlang expression. For a guard sequence to match one of the guards must return true. For a guard to return true, all of its guard expressions must return true. If any guard expression fails (does anything but successfully return true) the guard also fails, and the next guard in the guard sequence is tried until none are left and there isn't a match.

The gotcha here is how a failure of a guard expression propagates and it can be surprising or confusing. I think many people who know Erlang for a long time will not know exactly this.

In Elixir each guard has a single expression (i.e. guard with single guard expression) by combining with and and or when required. A guard sequence is then created with multiple when - there is not an exact equivalent to Erlang's guard with multiple guard expressions. Neither does Elixir allow both and and andalso, nor or and orelse, so can skip those. Therefore there is much less complexity and avoid inheriting guard gotchas.

If there is too much cognitive overhead then likely the guards are probably too complex! Moving the logic into functions or perhaps defguard might be more appropriate to simplify the code. when could help with this though, and there are cases where it can produce simpler guards.

I think that knowing the rough Erlang inheritance can help here, if you model the flow control in your head as multiple sequences then use multiple when and if it is a single expression use a single when with or.

Sorry this was a long answer @eksperimental.

All 8 comments

That's not invalid code though...

Two when's are entirely valid and are a non-failing 'or' test, essentially this:

def blah(x, y) when is_integer(x) when is_integer(y)

Acts like:

def blah(x, y) when is_integer(x)
def blah(x, y) when is_integer(y)

Which is distinctly different than:

def blah(x, y) when is_integer(x) or is_integer(y)

Because something like this:

def blah(x) when x==0 or is_tuple(x)

Will never work if it is a tuple because if you compare a tuple with 0 then the whole guard fails out, where this will work:

def blah(x) when x==0 when is_tuple(x)

In 'general' you can think of multiple when's being just like or's except they don't early-fail. :-)

Erlang works the same way (it also supports multiple when's for identical usage).

So nope, it is not obviously wrong and is indeed entirely required at times. :-)

Multiple whens in guards are valid Erlang and Elixir code. The whens will be tried in order; when one fails (is false or raises - for example, hd(1)), we move on to the next. If one matches, we execute the clause.

If you have this module:

defmodule A do
  def foo(x)
    when is_integer(x)
    when is_atom(x) do
    {:ok, x}
  end

  def foo(_), do: :error
end

Then this happens:

iex> A.foo(1)
{:ok, 1}
iex> A.foo(:foo)
{:ok, :foo}
iex> A.foo([])
:error

This is expected and correct behaviour. I can't recall if we document this somewhere but I will check again and write something about it if it's not.

Edit

def blah(x) when x==0 or is_tuple(x) will never work if it is a tuple because if you compare a tuple with 0 then the whole guard fails out

@OvermindDL1 this is not true though. If you compare a tuple with 0 it's perfectly fine, it's just always gonna return false. A correct example would be something like this:

def blah(x) when hd(x) == 0 or is_map(x), do: :ok

In this case, if x is not a list then hd/1 will fail and the whole guard will fail. If written like:

def blah(x)
  when hd(x) == 0
  when is_map(x) do
  :ok
end

then if x is not a list we will still try the is_map/1 guard.

Other edit

There's some examples of this in Elixir itself as well. For example: https://github.com/elixir-lang/elixir/blob/92162f5738763654187151d4ddfed316f18c1ebc/lib/ex_unit/lib/ex_unit/doc_test.ex#L735-L740

Personally I think that is terrible behavior, and not at all what I would expect. It is better to require multiple clauses than to support this (imo). I spent a fair amount of time trying to figure out why a specific macro clause was being called, and it took me forever before I realized what the problem was - and if I'm hitting that after spending this long with the language, I can only imagine someone new dealing with it. Even when I realized what the problem was, I assumed it had to be a bug, I didn't even realize this was supported in Erlang.

It seems like it was intentional to support this usage, so not much to be done here, but this feels like something Elixir should prevent - whatever flexibility this was intended to provide is outweighed by the potential for pain.

So does this mean that as a rule of thumb it is always better to use multiple whens than to use "or", as it is too much overhead thinking if a previous guard would raise and the following expression never gets executed?

To clarify the Erlang inheritance because there isn't really multiple when and the translation is not obvious. In Erlang a guard sequence is a semi colon separated list of guards, and each guard is a comma separated list of guard expressions. A guard expression is a subset of an Erlang expression. For a guard sequence to match one of the guards must return true. For a guard to return true, all of its guard expressions must return true. If any guard expression fails (does anything but successfully return true) the guard also fails, and the next guard in the guard sequence is tried until none are left and there isn't a match.

The gotcha here is how a failure of a guard expression propagates and it can be surprising or confusing. I think many people who know Erlang for a long time will not know exactly this.

In Elixir each guard has a single expression (i.e. guard with single guard expression) by combining with and and or when required. A guard sequence is then created with multiple when - there is not an exact equivalent to Erlang's guard with multiple guard expressions. Neither does Elixir allow both and and andalso, nor or and orelse, so can skip those. Therefore there is much less complexity and avoid inheriting guard gotchas.

If there is too much cognitive overhead then likely the guards are probably too complex! Moving the logic into functions or perhaps defguard might be more appropriate to simplify the code. when could help with this though, and there are cases where it can produce simpler guards.

I think that knowing the rough Erlang inheritance can help here, if you model the flow control in your head as multiple sequences then use multiple when and if it is a single expression use a single when with or.

Sorry this was a long answer @eksperimental.

This is expected and correct behaviour. I can't recall if we document this somewhere but I will check again and write something about it if it's not.

We do have a page about guards and it is documented there in its own section:

https://hexdocs.pm/elixir/guards.html#multiple-guards-in-the-same-clause

@fishcakez this was an excellent and very detailed (and not unnecessary long) answer.
My comment regarding the overhead was related to the fact that I will have to double or triple check if given invalid data to the first expression this one will raise and the second expression after or will never get executed.
Thank you for your answer!

@OvermindDL1 this is not true though. If you compare a tuple with 0 it's perfectly fine, it's just always gonna return false. A correct example would be something like this:

I entirely meant to put something like elem instead of the == test there... ^.^;

_/me blames end of day brain mush from Friday... >.>_

Was this page helpful?
0 / 5 - 0 ratings

Related issues

lukaszsamson picture lukaszsamson  路  3Comments

chulkilee picture chulkilee  路  3Comments

Irio picture Irio  路  3Comments

andrewcottage picture andrewcottage  路  3Comments

GianFF picture GianFF  路  3Comments