Rfcs: Reviving tail-call elimination

Created on 20 Apr 2019  路  10Comments  路  Source: rust-lang/rfcs

It's been dead for a while, but I'd like to propose starting to think about TCE again. The discussion that I saw before was mostly centred around certain algorithms being more convenient to implement using recursion, but I'd like to give an example that's impossible to implement without TCE and that I actually genuinely do have a personal need for. To create a fast interpreter with 100% safe code, one possible implementation is like so:

pub struct Instruction(pub fn(&mut Stack, &[Instruction]) -> Result<Value, Error>);

pub fn convert(instructions: &[OpCode]) -> Vec<Instruction> {
    instructions
        .iter()
        .map(|i| match i {
            OpCode::Add => Instruction(instructions::add),
            // ...
        })
        .collect()
}

mod instructions {
    use super::{Error, Instruction, Stack, Value};

    pub fn add(stack: &mut Stack, next: &[Instruction]) -> Result<Value, Error> {
        let (a, b) = (
            stack.pop().ok_or(Error::new())?,
            stack.pop().ok_or(Error::new())?,
        );
        let out = a.checked_add(b).ok_or(Error::new())?;
        if let Some((next, rest)) = next.split_first() {
            stack.push(out);
            return (next.0)(stack, rest);
        } else {
            Ok(out)
        }
    }

    // ...
}

This can compile to excellent assembly, with similar performance to a naive compiler modulo the worse instruction cache usage and jump for each instruction, but you can see in this toy implementation that it compiles to call+ret using current Rust. With proper TCE error handling should just be the cost of creating the error plus a ret. Unfortunately, it is impossible to express this in Rust right now, since TCE is not guaranteed and there's no way to assume that it will happen in debug mode. Another issue is that recursive fn definitions aren't allowed, but that doesn't prevent this from being written, it just prevents it from being safe.

A-optimization A-tail-recursion T-compiler T-lang

Most helpful comment

I am just here to add a small voice in support of TCE. Seems like this topic has not had any love in over a year :(

All 10 comments

This is besides the main point, but is there some reason this is not a suitable alternative to recursive typedefs?

struct Instruction {
    fun: fn(&mut Stack, &[Instruction])
}

This is besides the main point, but is there some reason this is not a suitable alternative to recursive typedefs?

struct Instruction {
    fun: fn(&mut Stack, &[Instruction])
}

Yes, that's a great solution that I hadn't thought of

EDIT: Updated original post to use this method

It's worth noting that LLVM doesn't even emit tail calls for noreturn functions - it emits call + ud2 https://play.rust-lang.org/?version=nightly&mode=release&edition=2018&gist=2ccde35f7cc96744cb46df6b11ff13c9

EDIT: Now that I think about it, this and the original example are probably both explicitly prevented by Rust for indirect function calls in order to preserve panic safety. Maybe Rust should only prevent TCE in the case that the caller has drop code to run.

The reason LLVM can't emit tail calls for noreturn functions is that enabling TrapUnreachable means that they're no longer tail calls at all https://github.com/rust-lang/rust/pull/45920

Would recommend explaining that TCE means tail call elimination in your post for those of us who had to read the thread to figure that out.

Would recommend explaining that TCE means tail call elimination in your post for those of us who had to read the thread to figure that out.

Good point, I've fixed the title

I am just here to add a small voice in support of TCE. Seems like this topic has not had any love in over a year :(

Is there a reason we shouldn't require caller and callee signatures to match? That would allow the use of musttail.

Is there a reason we shouldn't require caller and callee signatures to match? That would allow the use of musttail.

Mutual recursion? Requiring matching signatures is sort of acceptable as a stop-gap restriction assuming that the elimination requires the explicit use of a dedicated keyword. However, if it would be implicit, you'd get the nasty surprise of mutually recursive functions sometimes working and sometimes not, with no indication from the compiler. That would be pretty bad.

@Vurich
See #1888 for status. Would be nice, if you can close the issue after the thread is clarifying for you.

Was this page helpful?
0 / 5 - 0 ratings