A comment in TailRec states that
* Self-recursive calls in tail-position are replaced by jumps to a
* label at the beginning of the method. As the JVM provides no way to
* jump from a method to another one, non-recursive calls in
* tail-position are not optimized.
Is it possible to build a self-recursive composite method out of mutually recursive methods, in order to benefit from the self-recursive tail call optimization of the JVM? i.e. when the following is detected:
def a(...): Result = {
if (...) new Result(...)
else b(...)
}
def b(...): Result = {
if (...) new Result(...)
else a(...)
}
then to replace it with something like this:
def ab(..., method: Int): Result = {
method match {
case 0 => { // code from a()
if (...) new Result(...)
else ab(..., 1)
}
case 1 => { // code from b()
if (...) new Result(...)
else ab(..., 0)
}
}
}
N.B. I am not familiar with the compiler internals at all, just wondering whether such a thing is possible, as I would love for Scala to have better TCO :smile:
AFAIK not on the JVM - @densh might have something more for you on what is going on in scala-native.
On the current JVM, the only thing that is impossible to do is tail call optimisation between different functions (mutual recursion). This isn't particularly complex to implement (other languages like Scheme have had this feature from the beginning) but it would require changes to the JVM spec. For example, you'd have to change the rules about preserving the complete function call stack.
@k0ala, it is possible, and you won't even need to jump back into match around from inner branches, you'll be able to jump directly to the expected case.
There are several issues with such transformation though:
I was thinking about implementing such a transformation when I started working on Dotty. Given the restrictions, I thought that it makes sense to do it in those two cases:
@mutualTailrecIn first case, user explicitly requested this, in second case, you're side-stepping the separate compilation issue.
Additionally, it may be lot safer to do this in linker, as call-graph would be able to find the calls even between different classes. It would severely hit debuggability though.
@felixmulder I meant that the compiler should rewrite such a constellation into a single composite tail-recursive method, so that the (current) JVM is able to optimize it. But I'll also look into scala-native, thanks.
@DarkDimius Interesting. What do you think are major obstacles to this approach?
We have some other transformations which are higher priority for us.
Working on this transformation right wouldn't be the easiest way to start working on Dotty, but I'd be glad to help if some-one steps up.
@k0ala - ah well then, as Dmitry mentions - that would be possible. @DarkDimius - is this something to consider for linker?
@felixmulder, yes, this is a nice idea.
Though getting the transformed tree to Ycheck may be very complex, even more if the methods used to originate from different classes. It may be made easier after we have decided on what are the restrictions on @inline methods. Thinking about it, the restrictions in both cases seem similar to me.
Related discussion happened in comments here: http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6316156
in short, this transformation may introduce slowdowns if some method inlining stops happening.
@felixmulder Speaking of Native, as soon as we fix https://github.com/scala-native/scala-native/issues/122, LLVM should optimise tail calls for us.
@DarkDimius Interesting links about inlining/unrolling. Could it make sense to make the composite method only a kind of dispatcher, with most of the bodies of the functions "outsourced"? So that would yield something like
def ab(..., method: Int): Result = {
method match { // (or without the match statement, as you mentioned)
case 0 => {
... = aBody(...)
ab(..., 1)
}
case 1 => {
... = bBody(...)
ab(..., 0)
}
}
}
def aBody(...): ... = {
... // code from a()
}
def bBody(...): ... = {
... // code from b()
}
This would reduce the size of the method, and although it would cause the creation of stack frames when calling a body, there would always be only one stack frame on top of the loop (i.e. no StackOverflow). The different bodies could then also get inlined, or at least those that get called most frequently. The outsourcing might be difficult though.
/cc @wheaties, who is experimenting in this space https://github.com/wheaties/TwoTails
@k0ala that's a technique I'm going to explore once I get the base case of what I'm writing worked out.
@DarkDimius are you saying labels are not easily optimized?
@wheaties, lablels are gotos, vm does not know about labels. But VM knows a lot about egg, method size, and does not actually count number of executed instructions inside the method. By creating a huge method that contains bodies of all the methods in the recursive group, you would quickly hit some constants in the JIT that would make a lot of important optimisations not performed, most important of which is inlining. Note that even the current tailrec we can stop VM from applying some optimisations, see for example here: https://shipilev.net/blog/2014/java-scala-divided-we-fail/
Most helpful comment
@k0ala, it is possible, and you won't even need to jump back into
matcharound from inner branches, you'll be able to jump directly to the expected case.There are several issues with such transformation though:
I was thinking about implementing such a transformation when I started working on Dotty. Given the restrictions, I thought that it makes sense to do it in those two cases:
@mutualTailrecIn first case, user explicitly requested this, in second case, you're side-stepping the separate compilation issue.
Additionally, it may be lot safer to do this in linker, as call-graph would be able to find the calls even between different classes. It would severely hit debuggability though.