TL;DR Replace the use of System.get_stacktrace() with a pseudo-variable __STACKTRACE__ similar to __MODULE__ or __CALLER__ available only inside catch and rescue clauses of the try expression.
Erlang recently set out to overhaul how stack traces are handled. The :erlang.get_stacktrace/0 call (and thus it's Elixir's equivalent System.get_stacktrace/0) are problematic - they may cause the emulator to hold on to the stack trace for a very long time. This can be harmful, since the stack trace can be very big - in case of FunctionClauseError it includes all the arguments. The emulator has to hold on the stack trace for the last exception in general until the next exception is raised - which can be a long time.
To rectify this potential performance issue, OTP 20 started warning on the use of :erlang.get_stacktrace() outside of the catch clause of a try expression (in Elixir this would be outside of catch and rescue). The plan was to make the call return an empty list in some future Erlang version when called outside of catch.
OTP 21 will introduce an even more restrictive mechanism, to understand it we need to understand the syntax for catch in Erlang:
try
...
catch
Kind:Reason ->
...
end
Where Kind can be one of :exit, :error and :throw. The proposed (and merged) enhancement replaces this with:
try
...
catch
Kind:Reason:Stack ->
...
end
Where the Stack variable would contain the stack-trace for the just-rescued exception. It is forbidden to pattern match on this variable or use an already-bound variable (which would be equivalent to pinning it in Elixir). It's now planned that some future Erlang release will always return empty lists from :erlang.get_stacktrace() and the only way to acquire the stacktrace will be with this syntax. This means we need a compatible Elixir solution.
To solve this, we propose introduction of a new pseudo-variable __STACKTRACE__ that would be available inside catch and rescue and would return the stack trace of the exception being currently handled.
This means that this code:
try do
...
rescue
e ->
...
reraise(e, System.get_stacktrace())
end
Would become:
try do
...
rescue
e ->
...
reraise(e, __STACKTRACE__)
end
The __STACKTRACE__ pseudo-variable would be only available inside catch and resuce parts of try - it should fail to compile if present in any other place, similar to how __CALLER__ is not available outside of macros.
Having this as this kind of pseudo-variable is also compatible with how this works in Erlang - since neither pattern matching nor pinning of the stack variable is allowed, we don't loose any features.
Since OTP 21 is not stable yet, and we also want to support multiple Erlang versions, a conditional compilation of this feature is required.
When compiling on systems without support for the new syntax, the __STACKTRACE__ pseudo-variable should compile into a call to :erlang.get_stacktrace/0.
Additionally, the call to :erlang.get_stacktrace/0 should happen as the very first thing in the block, assigning the result to some internal variable, which could be later substituted for the __STACKTRACE__ expression. This provides a very important advantage compared to the current System.get_stacktrace/0 call by protecting against the "wrong stack trace" bug, which is very easy to introduce accidentally:
try do
...
catch
kind, reason ->
cleanup()
report_error(kind, reason, System.get_stacktrace())
end
defp cleanup() do
{:ok, this_may_throw()}
catch
throw -> {:error, throw}
end
If the this_may_throw() operation does indeed throw, the System.get_stacktrace/0 call would return the value for that throw, instead of the exception we've just caught, since it always returns the stack trace of the last exception. With the use of the __STACKTRACE__ pseudo-variable, this would compile into something conceptually like:
try do
...
catch
kind, reason ->
stracktrace = :erlag.get_stacktrace()
cleanup()
report_error(kind, reason, stacktrace)
end
Which does not suffer from the issue. This means not only is the proposed feature giving us compatibility with Erlang, it is also a significant improvement in terms of preventing strange and hard to find bugs.
When compiling on a VM that supports the new syntax, the __STACKTRACE__ call should just use it, the previous example would compile into the following equivalent Erlang code:
try do
...
catch
Kind:Reason:Stacktrace ->
cleanup()
report_error(Kind, Reason, Stacktrace)
end
An alternative would be to also support something like:
try do
...
catch
kind, reason, stack ->
...
end
There are couple of problems with that approach:
rescue or catch with just one pattern (which catches only throws).Because of the conditional-compilation, the feature becomes compatible with all the Erlang releases. There is the constraint that code compiled on VM that supports the feature wouldn't work on older VMs, but this is also true today - code compiled on OTP 20 wouldn't work on OTP 18, even though Elixir in itself supports running on both VMs.
The System.get_stacktrace() call should be soft-deprecated in the next release that includes the __STACKTRACE__ feature. Additionally a loud deprecation should happen after 2 releases or as soon as Elixir declares compatibility for an Erlang VM that would always return an empty list for :erlang.get_stacktrace(). This might break the general deprecation policy approach of minimum 2 releases, but this is somewhat out of our hands.
So I looked a bit into actually implementing this, and one problem with __STACKTRACE__ is that it can be confusing and a bit inconsistent in nested trys. For example:
__STACKTRACE__ # 0
try do
__STACKTRACE__ # 1
throw :foo
catch
:foo ->
try do
__STACKTRACE__ # 2
throw :bar
catch
:bar ->
__STACKTRACE__ # 3
end
__STACKTRACE__ # 4
end
It's clear that #0 and #1 should just fail to compile. It's also clear that #4 should return the stack for throw :foo. But now #2 should return the stack for throw :foo as well, but then in #3 it should probably return the stack for throw :bar - it suddenly changes from one stack-trace to another. The erlang solution does not have this problem as you need to explicitly bind a variable to the stacktrace.
While it's true that nested trys don't appear often, it is something we should decide how to deal with.
@michalmuskala we could make #2 fail, forcing the user to explicitly set a variable before.
Yeah, this also comes down to approach to compiling the thing. I initially tried to make it work similar to __CALLER__ - translating to a variable and after translation mark a flag in elixir_erl record and later read it and prepend the __STACKTRACE__ = :erlang.get_element() call. The problem with that approach is that it won't work, since the variables would conflict.
@michalmuskala we can generate fresh variables every time. :)
@michallepicki we could even expand our catch/rescue to have three elements, the same as Erlang, but that would be completely internal. This way __STACKTRACE__ belongs exclusively to Elixir land and exclusively to the expansion pass.
You pinged the wrong Micha艂. 馃槅
But that would mean we'd need to expand rescue clauses to catch clauses in elixir, wouldn't it?
that's fine, I got used to it already, happens every few months ;)
Oops, sorry!
But that would mean we'd need to expand rescue clauses to catch clauses in elixir, wouldn't it?
I don't think we should do that because that is a behaviour that is fairly specific to Erlang. For example, ElixirScript probably wants to keep the high level semantics so rescue is also capable of rescuing JavaScript exceptions or something similar?
Maybe rescue should have two arguments too...
I am reopening because we need to do more work to prepare for Erlang 21.
How should we handle situations where System.stacktrace/0 is called when an exception doesn't occur, that is, there is no rescue/catch?
For example, we have some code where we handle error tuples by collecting information, which includes the stacktrace, and send the data to a monitoring service.
def set_error(transaction, name, message, reason) do
backtrace = ([reason] ++ System.stacktrace()) |> Enum.map(&inspect/1)
Appsignal.Transaction.set_error(transaction, name, message, backtrace)
end
@dideler You should not rely on the ability to get the stacktrace outside of a rescue/catch, it might get removed from the BEAM with any major release.
A workaround is just to throw an exception and then immediately catch it of course, which can be wrapped up in a function.
Moving to https://elixirforum.com/t/using-system-stacktrace-within-exq-middleware-behaviour-implementation/15813, sorry for the noise.
@dideler no problem at all.
Just one last note: System.stacktrace is stateful. It returns the status of the last exception. If you want to get the current stacktrace, then you should do Process.info(self(), :current_stacktrace). I will improve the docs here.
What's the correct way of printing the __STACKTRACE__? In my case, I rescue from an error but I still want to log it to console. This is what I'm doing right now:
def unreliable_method(item) do
# Do something with `item`
:ok
rescue
_err ->
Logger.error("Failed for item: #{inspect(item)}")
Logger.error(inspect(__STACKTRACE__))
{:error, :processing_failed}
end
The canonical way would be to use Exception.format_stacktrace/1 https://hexdocs.pm/elixir/Exception.html#format_stacktrace/1.
@michalmuskala Thank you. I looked at other methods of the Exception module and ended up using Exception.format/3 which made more sense for my use-case:
Logger.error(Exception.format(:error, err, __STACKTRACE__))
Most helpful comment
that's fine, I got used to it already, happens every few months ;)