Ghidra: SLEIGH: how to implement an "erepeat" instruction

Created on 7 Mar 2019 · 2 Comments · Source: NationalSecurityAgency/ghidra

Hello,

I'm trying to add Toshiba MeP support to Ghidra; however, I'm having trouble implementing the erepeat instruction.

The instruction works as follows:

                 erepeat loc_80B5AA
                 lw      $3, ($2)

 loc_80B5AA:
                 and3    $1, $3, 1
                 beqz    $1, loc_80B5B0

is equivalent to

while (1) {
                 lw      $3, ($2)

 loc_80B5AA:
                 and3    $1, $3, 1
                 beqz    $1, loc_80B5B0
}

(Notice that one more instruction is executed after the instruction at the label specified in erepeat.)

The problem is that there is no instruction at the end of the loop, so I cannot add a symbol there that would branch back to the top. The instruction/symbol is only present at the start of the loop, so I need to propagate the label somehow.

What I've tried:

I've implemented erepeat as follows:

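# erepeat: mark the loop-end address in the context and record the loop bounds
:erepeat rpe_ctx is op0015=0xe019 ; simm1631 [ rpe_ctx = inst_start + (simm1631 << 1); repeat_end = 2; globalset(rpe_ctx, repeat_end); ] {
    rpb = inst_next;  # loop begin: the instruction right after erepeat
    rpe = rpe_ctx;    # loop end: the address encoded in the instruction
}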

I'm using globalset to set a context variable on the end instruction that indicates it is the end of repeat. Also, I'm saving the next instruction address (beginning of the loop) to the $rpb register.
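
The address arithmetic in the disassembly action can be checked outside of Ghidra. The Python sketch below is only an illustration. It assumes a 4-byte erepeat encoding (so inst_next is inst_start + 4), a signed 16-bit simm1631 field, and a hypothetical erepeat address of 0x80B5A4 chosen so that the result lines up with loc_80B5AA from the example. The names sign_extend and erepeat_targets are made up for this sketch.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    mask = (1 << bits) - 1
    value &= mask
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def erepeat_targets(inst_start, imm16, inst_len=4):
    """Return (loop_begin, loop_end) for an erepeat at inst_start.

    inst_len=4 is an assumption about the encoding size, not taken from a manual.
    """
    simm = sign_extend(imm16, 16)
    loop_begin = inst_start + inst_len    # what the constructor stores in rpb
    loop_end = inst_start + (simm << 1)   # address that receives repeat_end = 2
    return loop_begin, loop_end


# Hypothetical placement: erepeat at 0x80B5A4 with an immediate of 3.
begin, end = erepeat_targets(0x80B5A4, 3)
print(hex(begin), hex(end))
```

With these assumed inputs the loop end lands on 0x80B5AA, which is the address that globalset tags with repeat_end = 2.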

Then,

# Simulate mep delay slot in repeat/erepeat instructions
:^instruction is repeat_end=2 & instruction [ repeat_end=1; globalset(inst_next, repeat_end); repeat_end=0; ] {
    build instruction;
}

:^instruction is repeat_end=1 & instruction [ repeat_end=0; ] {
    build instruction;
    goto [rpb];
}

This carries the end marker one instruction further. Once the "real" end instruction is reached, I append a branch back to the start of the loop.
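
The two-step countdown (repeat_end going from 2 to 1) can be modelled with a small Python sketch. This is a simplified model of the idea, not of how Ghidra stores context internally: a dictionary stands in for the context register, and the instruction addresses are hypothetical values chosen to match the labels in the example. The name find_back_branch is made up for this sketch.

```python
def find_back_branch(addresses, loop_end):
    """Return the address of the instruction that gets the back-branch appended.

    addresses: instruction start addresses in linear order.
    loop_end:  address tagged by erepeat with repeat_end = 2.
    """
    repeat_end = {loop_end: 2}  # models globalset(rpe_ctx, repeat_end) in erepeat
    for index, addr in enumerate(addresses):
        flag = repeat_end.get(addr, 0)
        if flag == 2 and index + 1 < len(addresses):
            # models globalset(inst_next, repeat_end) with repeat_end = 1
            repeat_end[addresses[index + 1]] = 1
        elif flag == 1:
            return addr  # the "real" end: instruction plus branch to loop start
    return None


# Hypothetical addresses for lw, and3 (at loc_80B5AA) and beqz.
body = [0x80B5A8, 0x80B5AA, 0x80B5AE]
last = find_back_branch(body, loop_end=0x80B5AA)
print(hex(last))
```

In this model the instruction at the label only forwards the marker, and the instruction after it is the one that ends the loop, which is the one-instruction delay described above.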

Now, the problem is that since I'm performing a jump to a register, it generates BRANCHIND rpb p-code. Understandably, this confuses the decompiler a lot and it generates very weird code:

[image: bad]

As a test I've tried replacing goto [rpb]; with a hardcoded integer value specific to that function, and the generated code looked much better:

[image: screenshot_20190306_185609]

So my question is how should I do it properly? Is there a way to propagate the label specified in erepeat into instructions below it as a constant, or maybe there's some other way to implement it I'm not thinking of?

Most helpful comment

Okay, I figured out a possible solution:

So first I need to propagate repeat-begin in the context so that we know where to jump (the globalset(end, rpb_ctx); part):

:erepeat end is op0015=0xe019 ; simm1631 [ rpb_ctx = inst_start + 4; end = inst_start + (simm1631 << 1); repeat_end = 2; globalset(end, repeat_end); globalset(end, rpb_ctx); ] {
    rpb = inst_next;
    rpe = end;
}

As well as propagate it in the delay-slot emulator:

# Simulate mep delay slot in repeat/erepeat instructions
:^instruction is repeat_end=2 & instruction [ repeat_end=1; globalset(inst_next, repeat_end); repeat_end=0; globalset(inst_next, rpb_ctx); ] {
    build instruction;
}

Then I'm making a new symbol that makes use of that context variable:

RepeatTgt: is rpb_ctx {
    export *[ram]:4 rpb_ctx;
}

and using it:

:^instruction is repeat_end=1 & instruction & RepeatTgt [ repeat_end=0; ] {
    build instruction;
    goto RepeatTgt;
}

And now it generates a constant branch and the decompile looks just like I expected it to:
[image]
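
To see why carrying the loop start in the context changes the emitted branch, here is a rough Python model of the lift. It is a sketch under assumptions: a dictionary stands in for the context register, the tuples are stand-ins for p-code ops rather than real Ghidra output, erepeat is assumed to be 4 bytes long, and the addresses are hypothetical. The name lift_loop is made up for this sketch.

```python
def lift_loop(addresses, erepeat_addr, loop_end, propagate_target, inst_len=4):
    """Model the linear lift of a loop body as a list of (op, operand) tuples."""
    loop_begin = erepeat_addr + inst_len  # models rpb_ctx = inst_start + 4
    # Context written by erepeat: repeat_end = 2 and, optionally, rpb_ctx.
    context = {loop_end: (2, loop_begin if propagate_target else None)}
    pcode = []
    for index, addr in enumerate(addresses):
        repeat_end, rpb_ctx = context.get(addr, (0, None))
        pcode.append(("INSN", addr))
        if repeat_end == 2 and index + 1 < len(addresses):
            # Forward both the marker and the loop start to the next instruction.
            context[addresses[index + 1]] = (1, rpb_ctx)
        elif repeat_end == 1:
            if rpb_ctx is not None:
                pcode.append(("BRANCH", rpb_ctx))   # constant target
            else:
                pcode.append(("BRANCHIND", "rpb"))  # target only known at run time
    return pcode


body = [0x80B5A8, 0x80B5AA, 0x80B5AE]  # hypothetical lw, and3, beqz addresses
constant = lift_loop(body, 0x80B5A4, 0x80B5AA, propagate_target=True)
indirect = lift_loop(body, 0x80B5A4, 0x80B5AA, propagate_target=False)
print(constant[-1], indirect[-1])
```

The only difference between the two runs is whether the loop start travels with the marker. When it does, the final op has a literal address, so the loop edge is visible without tracking the value of rpb.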

I'll leave this open just in case there's a better solution I'm not aware of (which to be honest there probably is).

Closing this out due to inactivity.
