Julia: tail call elimination

Created on 28 Nov 2013 · 38 comments · Source: JuliaLang/julia

It's interesting that this is valid syntax in Julia and Lua:

function rec()
    return rec()
end

The difference is that in Lua, when you call rec() it will stare at you and engage your motherboard's built-in space heater until you ctrl-C out. In Julia:

julia> rec()
ERROR: stack overflow
 in rec at none:2 (repeats 80000 times)

So the stack got 80,000 frames deep before overflowing. That's interesting.

Why does this matter? I'm not sure. But some people care a lot:

http://www.lua.org/pil/6.3.html

speculative


All 38 comments

I'm a schemehead and appreciate TCO as much as anybody, but we're not actually planning to implement this so I'm just going to close this. Apologies.

There's a volcano of conversation coming from julia-dev. I'll definitely
search it, as well as Issues, in the future before bringing something up.
Clearly this topic has been discussed. I think the final word is:

Jeff Bezanson
Nov 24
No, no update. It's not a high priority, but eventually we will
probably do some of the optimizations, but not the semantic guarantee.

On Wed, Nov 27, 2013 at 6:24 PM, Isaiah [email protected] wrote:

See
https://groups.google.com/forum/#!searchin/julia-dev/tail-call/julia-dev/POP6YXCnP-k/qjEw2OiFVd8J


Michael

Oh hey Jeff. I just quoted you from julia-dev but you had already responded
to me. How are you doing that?!? You actually showed up in person before I
finished reading the thread where you rejected the idea last year.

I want to figure out how this community is so engaged. A few hours before
thanksgiving and people are online. This is like the Reddit of ... I'm out
of metaphors. I need to go soak my head. What? Wow.

On Wed, Nov 27, 2013 at 7:14 PM, Jeff Bezanson [email protected] wrote:

I'm a schemehead and appreciate TCO as much as anybody, but we're not
actually planning to implement this so I'm just going to close this.
Apologies.


Michael

I refuse to believe you're out of metaphors. We're all enjoying them :)

Yes, we are enjoying all the metaphors. Please keep them coming. :-)

I must admit, I'm not sure why or even how I've gotten so involved with Julia. I think it has something to do with the other contributors, but the experiments proving this are still running.

I refuse to believe you're out of metaphors. We're all enjoying them :)

+1. I'm confident and hopeful that you can come up with something. I haven't yet seen one that involves both bicycle grease and wildebeests :-).

I must admit, I'm not sure why or even how I've gotten so involved with Julia. I think it has something to do with the other contributors, but the experiments proving this are still running.

I can't figure it out either, but it's extraordinarily addictive. Perhaps because it's so rewarding.

I can't figure it out either, but it's extraordinarily addictive. Perhaps because it's so rewarding.

Julia specializes in having a very high result-to-effort ratio :-)

Mentioning TCO and "tail call optimization" here so that we can avoid future duplicates of this issue.

If someone wants to go ahead and implement tail call elimination, then feel free to go for it. I'll leave this open for that eventuality.

+1

Also, language support for lazy evaluation is the other side of that coin.
LazySequences.jl demonstrates the power but not the performance that you
could have.
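
A hedged illustration of the closure-based lazy-sequence style that wants TCO (plain Julia written just for this comment, not the actual LazySequences.jl API):

# An infinite lazy stream as a head plus a thunk that produces the tail.
struct Stream
    head::Int
    tail::Function   # () -> Stream
end

countfrom(n) = Stream(n, () -> countfrom(n + 1))

# A tail-recursive consumer: every recursive call is in tail position, so
# with TCO it would run in constant stack space; without it, a very large
# n risks a StackOverflowError.
takevec(s::Stream, n, acc=Int[]) =
    n == 0 ? acc : takevec(s.tail(), n - 1, push!(acc, s.head))

takevec(countfrom(1), 5)   # => [1, 2, 3, 4, 5]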

On Thu, Jan 9, 2014 at 5:51 AM, Stefan Karpinski [email protected] wrote:

Reopened #4964 https://github.com/JuliaLang/julia/issues/4964.


Michael

Look, we don't really need this and it is just a huge pain. It also conflicts with exposing C-callable entry points.

How about supporting LLVM sibling calls? They don't mess with calling conventions and would already go a long way, assuming the last criterion, which I don't understand, isn't too bothersome.

  • _If any of the callee arguments are being passed in stack, they must be available in caller’s own incoming argument stack and the frame offsets must be the same._

This might also be relevant (and not encouraging if you do use fastcc somewhere):

  • _Implications of -tailcallopt [the LLVM flag that enables TCO]:_

    • _To support tail call optimization in situations where the callee has more arguments than the caller a ‘callee pops arguments’ convention is used. This currently causes each fastcc call that is not tail call optimized (because one or more of above constraints are not met) to be followed by a readjustment of the stack. So performance might be worse in such cases._

Yes, we might as well mark some calls as tail calls and turn on this optimization. But that last criterion is pretty strict; the caller and callee have to have some of the same arguments. I don't know how often that happens. It seems like it excludes even tail-recursive counting loops.
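
For illustration, this is the shape of tail-recursive counting loop in question (count_up is a made-up name; whether the criterion bites depends on whether the arguments end up on the stack):

# The recursive call is in tail position but passes *different* argument
# values (n - 1, acc + 1), which is what the "same frame offsets" criterion
# quoted above appears to rule out whenever arguments are passed on the stack.
count_up(n, acc) = n == 0 ? acc : count_up(n - 1, acc + 1)

count_up(10^6, 0)   # risks a StackOverflowError without tail-call elimination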

For anyone stumbling across this as I did, I implemented a little macro called @rec for Lazy.jl that may help you out.

using Lazy
@rec function fryeggs()
    return fryeggs()
end
fryeggs() # may take a few hours

@rec re-enables Julia's unique cooking facilities (compatible metal-cased laptop sold separately; disable fan for best results).

Less interesting example:

@rec reduce(f::Function, v, xs::List) =
  isempty(xs) ? v : reduce(f, f(v, first(xs)), rest(xs))

Might be nice to have simple optimisations like this as a compile step, even if optimising all tail calls is a pain.
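
For comparison, a hand-written sketch of the loop such a compile step might produce for the reduce above (not compiler output; isempty/first/rest are the Lazy.jl List operations already used above):

# The tail call becomes in-place updates of the accumulator and the
# remaining list, so the stack stays at constant depth.
function reduce_loop(f::Function, v, xs::List)
    while !isempty(xs)
        v = f(v, first(xs))
        xs = rest(xs)
    end
    return v
end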

Recursion is very much in style for Julia, because multiple dispatch is used to recognise the terminating condition based on a change of argument, so programmers are already thinking recursively.

Though I'm not sure the best use cases are solved by tail-call recursion optimisation, versus a more significant optimisation that flattens a broader (overlapping) class of recursive algorithms, which I suspect LLVM is doing already.

Please consider TCO for implementing finite state machines; in Erlang this is relatively easy because of TCO, and it would be great if Julia included it too:


%%%% get_content_service.erl
%% Implement next state machine:
%%
%%   From (A) to (B)  when "begin_tag"
%%   From (B) to (B)  when "content"
%%   From (B) to (A)  when "end_tag"
%%   From (A) to end  when "end_document"
%%
%%                end
%%                 ^                                         ___  content
%%   end_document  |                                       /     \
%%                 |             begin_tag                 |     /
%%             -> (A) ---------------------------------> (B) <-
%%            |                                           |
%%             \ ________________________________________/
%%                                end_tag
%%
%%
-module(get_content_service).
-export([start/0]).
-export([a/1]).
start() ->
  register(get_content_service, spawn(get_content_service, a, [[]])),
  run_test(),
  ok.
a(Values) ->
  receive
    {begin_tag} -> b(Values);
    {end_document, From} -> From ! {result, Values}
  end.
b(Values) ->
  receive
    {content, Value} ->
      NewValues = lists:append(Values, [Value]),
      b(NewValues);
    {end_tag} -> a(Values)
  end.
run_test() ->
  get_content_service ! {begin_tag},
  get_content_service ! {content, 1},
  get_content_service ! {content, 2},
  get_content_service ! {end_tag},
  get_content_service ! {begin_tag},
  get_content_service ! {content, 99},
  get_content_service ! {end_tag},
  get_content_service ! {end_document, self()},
  receive
    {result, Values} -> io:format("~w~n", [Values])
  end.

Then in the Erlang shell:


1> c("get_content_service").
{ok,get_content_service}
2> get_content_service:start().
[1,2,99]
ok
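
For comparison, a hedged sketch of how this mutual-recursion style can be written in current Julia without TCO, using an explicit trampoline over a Channel (all names here are invented for illustration):

# Each state function returns either a zero-argument closure for the next
# state or a final value; the driver loop keeps the stack flat, which is
# what TCO would otherwise give us for free.
function bounce(thunk)
    while thunk isa Function
        thunk = thunk()
    end
    return thunk
end

function state_a(ch, values)
    msg = take!(ch)
    msg === :begin_tag && return () -> state_b(ch, values)
    return values                        # :end_document
end

function state_b(ch, values)
    msg = take!(ch)
    msg === :end_tag && return () -> state_a(ch, values)
    push!(values, msg)                   # a content message
    return () -> state_b(ch, values)
end

ch = Channel{Any}(16)
for m in (:begin_tag, 1, 2, :end_tag, :begin_tag, 99, :end_tag, :end_document)
    put!(ch, m)
end
bounce(() -> state_a(ch, Any[]))         # => Any[1, 2, 99]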

No one here is against TCO; the problem with saying "we have TCO" is that we then have to guarantee that the optimization is performed for any call in tail position, and LLVM does not at the moment provide a way to enforce this. I would expect it to happen in simple cases though.

I would expect the pop of the GC frame would prevent TCO from happening in many cases... (not this one though)

Correct me if I'm wrong, but can't a tail-recursive call be rewritten to a loop by the Julia compiler? I mean that Julia's recursive code would be transformed into loop-shaped LLVM IR.

As an optimization, sure, with very low priority.
As a semantic guarantee, very unlikely.

@yuyichao
What do you mean by semantic guarantee?

That functions like

function rec()
    return rec()
end

will be guaranteed to use a tail call and not overflow the stack.

Thanks for replying.
And why couldn't the Julia compiler rewrite this to
while true end

There's no reason it can't for this case, but it most likely can't guarantee that.
Doing that at the compiler level requires guaranteed inferability (the compiler has to know that you are calling the same function, with the same argument types), which is not part of the semantics/guarantee. Doing it without the compiler transformation requires a runtime interface that can do tail calls, and that's pretty hard/impossible (when you need to interleave compiled and "interpreted" frames and somehow tail-call both of them).
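
For the easy self-recursive case, the rewrite being asked about looks roughly like this (a hand-written sketch of what the compiler could emit when inference succeeds, not actual compiler output):

# Tail-recursive form: the recursive call is the last thing evaluated.
fact(n, acc) = n <= 1 ? acc : fact(n - 1, acc * n)

# The equivalent loop, valid only when the compiler can prove the callee is
# the same method called with the same argument types:
function fact_loop(n, acc)
    while n > 1
        acc *= n
        n -= 1
    end
    return acc
end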

Local recursion is the easy case. Full tail-call semantics mean that every call in tail position must use no stack space, no matter how many functions are involved or what the structure of the call graph is.
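
For instance, a classic mutually recursive pair where the tail calls cross function boundaries (a sketch; large inputs risk a StackOverflowError without TCO):

# Neither call can be turned into a local loop without inlining the other
# function, so a guarantee has to hold across the whole call graph.
is_even(n) = n == 0 ? true  : is_odd(n - 1)
is_odd(n)  = n == 0 ? false : is_even(n - 1)

is_even(10^6)   # may blow the stack without tail-call elimination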

Is it an impossible task to guarantee it, or does it only slow down the compilation step? If only the latter is true, then something like a @doTCO tag in source code or julia -O3 on the command line would be a solution.
I have another question: is it possible to extend the Julia compiler via a plugin framework? I've heard that some languages support that.

It is impossible with the current compiler infrastructure if only because LLVM doesn't support it. Whether there is a hypothetical implementation of the language that could provide the guarantee is up in the air. Hooking into the compiler is generally done via generated functions at this point.

It's not impossible, it's just very difficult for little practical benefit. Also, the point of "guaranteed" TCO is not raw performance --- it's not always faster --- the point is to be able to express constant-space control flow with function calls.

Julia supports macros and @generated functions, which effectively let you extend the language at two stages. Look through the metaprogramming section of the manual. Beyond that I'd recommend asking more specific questions on discourse, explaining what you'd like to accomplish.

Maybe it's been obvious to you all along, but it just occurred to me that you could use more than one LLVM calling convention. @JeffBezanson, maybe that's what you meant by "very difficult" rather than "impossible with the current compiler infrastructure".

Couldn't you use a calling convention that supports TCO for Julia => Julia calls, and use CCC for Julia => C and C => Julia? (the latter with a thin wrapper on the C side of things to bridge the CC mismatch... alternatively, Julia functions that are called from both C and Julia could be compiled twice, once for each CC...)

That would enable TCO for pure Julia code, though of course inserting a C frame would break the optimization.

A quick DDG search yielded this: http://lists.llvm.org/pipermail/llvm-dev/2008-December/018955.html, so it is not impossible to achieve.

Edit: noting that LLVM only supports TCO on a couple of CPU architectures, I'm still not sure it is worth it. My use case would have been fast parser combinators, but I understand that isn't the main focus of the language... I've been out for a while; the progress made in the meantime is impressive :-)

Using multiple LLVM calling conventions is not the problem:

  1. We already don't use the C calling convention between Julia functions, though that does not help tail calls at all.
  2. The generic case of a tail call basically can't avoid calling a C function, since type inference and call-site devirtualization (both of which are necessary for removing C calls) are optimizations and not guaranteed.
  3. A different calling convention can't trivially solve this. Ref my comment right above https://github.com/JuliaLang/julia/issues/4964#issuecomment-117299850

Basically, to have this as a guarantee rather than an optimization, you'd need to change many different parts of the runtime so that they do not get in the way of tail calls.

Ooh, ok my bad...

Couldn't you let users request TCO explicitly (@TCO return foo()?) and have the compiler error out if type inference and devirtualization can't be achieved?

A feature that depends on type inference and compilation is not something we want to have. We could of course have a very limited version that is defined to work, but at a glance it seems hard to come up with a restriction that's both useful and easily implementable.

Sorry to bother, but after 3 years I wonder: is there tail-call support in Julia now?

No. It's easy to check with the function at the top; on the latest master (my one-day-old build) I got:

julia> rec()
ERROR: StackOverflowError:
Stacktrace:
 [1] rec()
   @ Main ./REPL[1]:2

That said, there's still the Easter-egg language (with TCO):

$ julia --lisp

Why do you need tail call elimination?

Why do you need tail call elimination?

So that (mutually) recursive functions can be compiled efficiently.

