Describe the bug
Take any ELF64 file using PowerPC64 and 32-bit addressing. I was using "PowerISA-Altivec-64-32addr", but any PPC64 language with 32-bit addressing will suffice.
This is handled by `PowerPC64_ElfExtension`. It satisfies this check:
However, when `processElf` is called, it includes this check:
This check fails because the value is 32.
This may just be an unimplemented feature (I would need to see what assumptions the class makes), but we can clearly see that files imported using this PPC variation are not properly processed.
To Reproduce
Steps to reproduce the behavior:
Observe that the TOC_BASE symbol is never created and that r2 is not set at all in any function.
If you manually set the r2 register to the correct TOC value, the decompiler sometimes resolves the addresses correctly but in many cases does not, even though the disassembly window resolves them fine.
It generates code like this:
puVar10 = &DAT_006b6f88;
....
iVar15 = **(puVar10 + -0x4888);
if(iVar15 != 0) {
...
The way it tries to access the location varies: sometimes it uses code like the above, sometimes it tries to index it as an array, and so on.
In the disassembly, it calculates the address just fine and also shows the label of that location.
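For reference, the address the decompiler should have folded can be worked out by hand. This is a minimal sketch, assuming `DAT_006b6f88` is the TOC value held in r2 and that `-0x4888` is a byte displacement; the function name is hypothetical.

```python
def toc_relative_address(toc_base: int, displacement: int) -> int:
    """Return toc_base + displacement, wrapped to a 32-bit address."""
    return (toc_base + displacement) & 0xFFFFFFFF


# Values taken from the decompiler output above. Treating DAT_006b6f88 as the
# r2 (TOC) value is an assumption, not something the output states.
toc_slot = toc_relative_address(0x006B6F88, -0x4888)
print(hex(toc_slot))  # 0x6b2700
```

Under those assumptions the TOC slot is at 0x6b2700. The double dereference in the output suggests that slot holds a 32-bit pointer to the actual data.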
I've made bold what is most relevant to this issue but here are PS3 (PPU) ABI changes for reference:
This appendix identifies the primary differences between the PPU ABI for Cell OS Lv-2 and the 64-bit
PowerPC ELF ABI.
• The size and alignment of a pointer are each 4 bytes. The PowerPC ABI defines these as 8 bytes.
• The size and alignment of long are each 4 bytes. The PowerPC ABI defines these as 8 bytes.
• The “plain” char type is signed char whereas the plain char type is an unsigned char in the
PowerPC ABI.
• Additional data types have been defined to represent some fundamental data types for OS and
library interfaces. (See Table 2.)
• Each function descriptor has two 32-bit address fields and no environment pointer. As a result, this
ABI does not support nested functions. Nested functions are a GCC language extension. For the
PowerPC ABI, each function descriptor consists of three 64-bit address fields: the address of the
function body, a pointer to the TOC referred to by the function, and the environment pointer used by
the function.
• The initial values of some registers have been changed, for process initialization.
• New relocation types, R_PPC64_TOC32, R_PPC64_DTPMOD32, R_PPC64_TPREL32 and
R_PPC64_DTPREL32 are defined.
• The OS and ABI identifier in the ELF file header is defined as 102 for Cell OS Lv-2.
• VRSAVE is not required to be updated at runtime; however, it should be initialized to 0xffff_ffff.
• Traceback tables are not supported.
• The “extended precision” format is not supported.
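The OS/ABI identifier in that list gives a cheap way to recognize these binaries before choosing how to process them. The following is a rough sketch that only inspects the first 20 bytes of the ELF header; the constant name `ELFOSABI_CELL_LV2` and the function name are made up for illustration, and the value 102 comes from the excerpt above.

```python
import struct

# Offsets into e_ident and values from the generic ELF specification.
EI_CLASS, EI_DATA, EI_OSABI = 4, 5, 7
ELFCLASS64 = 2
ELFDATA2MSB = 2          # big-endian
EM_PPC64 = 21
ELFOSABI_CELL_LV2 = 102  # value from the ABI excerpt; the name is hypothetical


def looks_like_cell_lv2_elf(header: bytes) -> bool:
    """Return True if the header is a big-endian ELF64 PPC64 file with OS/ABI 102."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return False
    if header[EI_CLASS] != ELFCLASS64 or header[EI_DATA] != ELFDATA2MSB:
        return False
    # e_machine follows the 16-byte e_ident and the 2-byte e_type.
    (e_machine,) = struct.unpack_from(">H", header, 18)
    return e_machine == EM_PPC64 and header[EI_OSABI] == ELFOSABI_CELL_LV2
```

A loader could use a check like this to decide whether 32-bit pointers and two-field function descriptors should be expected.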
Relocations are handled differently as well (note the use of patchseg and symseg): https://github.com/aerosoul94/ida_gel/blob/master/src/ps3/cell_loader.cpp#L334
OPD section recovery is likely necessary because they strip sections (or maybe just section names actually) for retail binaries.
@Bo98 A workaround would be taking the TOC from the function descriptor, then setting the r2 register value for that function.
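The lookup side of that workaround could look roughly like the sketch below. It assumes the descriptor bytes have already been located, that each descriptor is 8 bytes, and that the function address comes before the TOC address (the field order is an assumption carried over from the standard PowerPC ABI layout). All names are hypothetical, and setting r2 in Ghidra itself is left out.

```python
import struct
from typing import Dict, List, NamedTuple


class FunctionDescriptor(NamedTuple):
    entry: int  # address of the function body
    toc: int    # TOC base the function expects in r2


def parse_opd(opd: bytes) -> List[FunctionDescriptor]:
    """Decode consecutive 8-byte descriptors: two big-endian 32-bit fields each."""
    usable = len(opd) - len(opd) % 8  # ignore a trailing partial descriptor
    return [
        FunctionDescriptor(*struct.unpack_from(">II", opd, off))
        for off in range(0, usable, 8)
    ]


def toc_by_function(opd: bytes) -> Dict[int, int]:
    """Map each function entry address to the r2 value it should start with."""
    return {d.entry: d.toc for d in parse_opd(opd)}
```

The resulting map gives, per function entry point, the value to assign to r2 for that function.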
If you are using gcc to build, could you please provide a sample gcc command line with options so we can try to produce a sample? Are you working with an object file or a fully linked binary?
It sounds like this may require quite a few changes to the current ELF extension. It also sounds like a non-standard gcc toolchain may be required to build such a binary since PS3 ABI support is not present.
@ghidra1 Yeah, you probably need a non-standard toolchain.
You can use PSL1GHT (https://github.com/ps3dev/PSL1GHT), a homebrew PS3 SDK.
Otherwise, I or someone else can provide _executable_ samples (instead of source code), if that's enough.
Update:
@aerosoul94 discovered that if r2 is added to the <unaffected> list here:
https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Processors/PowerPC/data/languages/ppc_64_32.cspec#L96
then the decompiler issue with it is fixed.
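To make the change concrete, here is a small sketch that applies the same edit to a compiler spec held as a string. The `SAMPLE` fragment is a trimmed, hypothetical stand-in rather than the real contents of `ppc_64_32.cspec`, and the helper name is made up. It adds the register to every `<unaffected>` list it finds, which may be broader than the single list linked above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for a compiler spec file.
SAMPLE = """<compiler_spec>
  <default_proto>
    <prototype name="__stdcall" extrapop="0" stackshift="0">
      <unaffected>
        <register name="r1"/>
        <register name="r31"/>
      </unaffected>
    </prototype>
  </default_proto>
</compiler_spec>"""


def add_unaffected_register(cspec_xml: str, reg: str = "r2") -> str:
    """Return the spec with <register name=reg/> present in each <unaffected> list."""
    root = ET.fromstring(cspec_xml)
    for unaffected in root.iter("unaffected"):
        names = {r.get("name") for r in unaffected.findall("register")}
        if reg not in names:  # avoid adding a duplicate entry
            ET.SubElement(unaffected, "register", {"name": reg})
    return ET.tostring(root, encoding="unicode")


patched = add_unaffected_register(SAMPLE)
```

Editing the file by hand is just as good; the point is only that r2 ends up listed as a register the call does not modify.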
Here's a comparison:

Left is before the fix, right is after. Look specifically at the TOC-relative string references that are resolved after the fix.
(The right side is also on 9.2, but that doesn't matter; 9.1.2 is fixed by this too.)
Let me know if you need anything else.
Update: Even with the above fix and Ghidra 9.2 (build 1937), this still sometimes happens:
