Describe the bug
Decompiling code that uses conditional float division and multiplication does not show these operations in the output.
It also does not recognize the _ftol function and the conversion from float to dword.
Thanks
To Reproduce
Using standard options to decompile some code.
The code was presumably compiled using Visual Studio 6.0.
Attachments
Listing of the function
**************************************************************
* FUNCTION *
**************************************************************
float10 __stdcall FUN_01008f03(PVec3 * vec3, PVec2 * dest)
float10 ST0:10 <RETURN>
PVec3 * Stack[0x4]:4 vec3 XREF[1]: 01008f0f(R)
PVec2 * Stack[0x8]:4 dest XREF[1]: 01008f53(R)
undefined4 Stack[-0x8]:4 local_8 XREF[2]: 01008f1c(R),
01008f3c(R)
undefined4 Stack[-0xc]:4 local_c XREF[1]: 01008f50(R)
undefined4 Stack[-0x10]:4 local_10 XREF[2]: 01008f0b(*),
01008f3f(R)
FUN_01008f03 XREF[1]: FUN_01015c08:01015c47(c)
01008f03 8b ff MOV EDI,EDI
01008f05 55 PUSH EBP
01008f06 8b ec MOV EBP,ESP
01008f08 83 ec 0c SUB ESP,0xc
01008f0b 8d 45 f4 LEA EAX=>local_10,[EBP + -0xc]
01008f0e 50 PUSH EAX
01008f0f ff 75 08 PUSH dword ptr [EBP + vec3]
01008f12 68 60 81 PUSH PMat4X3_01028160 =
02 01
01008f17 e8 32 fe CALL Math_mul_mat4x3_vec3 undefined Math_mul_mat4x3_vec3(P
ff ff
01008f1c d9 45 fc FLD dword ptr [EBP + local_8]
01008f1f dd 05 f8 FLD qword ptr [DAT_010016f8]
16 00 01
01008f25 da e9 FUCOMPP
01008f27 df e0 FNSTSW AX
01008f29 f6 c4 44 TEST AH,0x44
01008f2c 7a 08 JP LAB_01008f36
01008f2e d9 05 18 FLD dword ptr [DAT_01001718] = FEh
17 00 01
01008f34 eb 09 JMP LAB_01008f3f
LAB_01008f36 XREF[1]: 01008f2c(j)
01008f36 d9 05 40 FLD dword ptr [DAT_01028140] = ??
81 02 01
01008f3c d8 75 fc FDIV dword ptr [EBP + local_8]
LAB_01008f3f XREF[1]: 01008f34(j)
01008f3f d9 45 f4 FLD dword ptr [EBP + local_10]
01008f42 56 PUSH ESI
01008f43 d8 c9 FMUL ST1
01008f45 d8 05 3c FADD dword ptr [DAT_0102813c] = ??
81 02 01
01008f4b e8 16 82 CALL _ftol undefined _ftol()
01 00
01008f50 d9 45 f8 FLD dword ptr [EBP + local_c]
01008f53 8b 75 0c MOV ESI,dword ptr [EBP + dest]
01008f56 d8 c9 FMUL ST1
01008f58 89 06 MOV dword ptr [ESI],EAX
01008f5a d8 05 38 FADD dword ptr [DAT_01028138] = ??
81 02 01
01008f60 e8 01 82 CALL _ftol undefined _ftol()
01 00
01008f65 dd d8 FSTP ST0
01008f67 d9 05 1c FLD dword ptr [DAT_0100171c]
17 00 01
01008f6d 89 46 04 MOV dword ptr [ESI + 0x4],EAX
01008f70 5e POP ESI
01008f71 c9 LEAVE
01008f72 c2 08 00 RET 0x8
01008f75 cc ?? CCh
01008f76 cc ?? CCh
01008f77 cc ?? CCh
01008f78 cc ?? CCh
01008f79 cc ?? CCh
Result of decompilation
float10 FUN_01008f03(PVec3 *vec3,PVec2 *dest)
{
float fVar1;
PVec3 local_10;
Math_mul_mat4x3_vec3(&PMat4X3_01028160,vec3,&local_10);
fVar1 = (float)_ftol();
dest->x = fVar1;
fVar1 = (float)_ftol();
dest->y = fVar1;
return (float10)1.00000000;
}
Also, note that this function is using global data DAT_010016f8, DAT_01001718, DAT_01028140, DAT_0102813c, DAT_01028138 and DAT_0100171c.
These are also not shown in the decompiled code.
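For comparison, here is a rough reading of what the listing above appears to compute once the x87 instructions are followed by hand. This is only a sketch: the names `ProjectionGlobals`, `project_sketch` and all field names are made up for illustration, the real values of the globals are not known, and the input is taken to be the already transformed vector that Math_mul_mat4x3_vec3 writes to local_10 / local_c / local_8.

```python
from dataclasses import dataclass


@dataclass
class ProjectionGlobals:
    """Hypothetical stand-ins for the globals; the real values are not in the report."""
    z_compare: float  # DAT_010016f8 (qword compared against z)
    fallback: float   # DAT_01001718 (loaded when z equals z_compare)
    numerator: float  # DAT_01028140 (divided by z otherwise)
    offset_x: float   # DAT_0102813c
    offset_y: float   # DAT_01028138
    result: float     # DAT_0100171c (value left in ST0 on return)


def project_sketch(x: float, y: float, z: float, g: ProjectionGlobals):
    """Return (dest_x, dest_y, return_value) for a transformed vector (x, y, z)."""
    # FUCOMPP / FNSTSW AX / TEST AH,0x44 / JP: the JP branch is taken
    # when z is not equal to the constant (or the compare is unordered).
    if z != g.z_compare:
        scale = g.numerator / z   # FLD DAT_01028140 ; FDIV local_8
    else:
        scale = g.fallback        # FLD DAT_01001718
    # FLD / FMUL ST1 / FADD, then _ftol truncates ST0 toward zero into EAX.
    dest_x = int(x * scale + g.offset_x)
    dest_y = int(y * scale + g.offset_y)
    return dest_x, dest_y, g.result
```

None of the multiply, divide, or add steps in this sketch appear in the decompiler output above, which is the behavior being reported.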
I'm not sure if it's specifically conditional floating point instructions that get messed up, but it seems like the decompiler struggles with floating point instruction sequences (e.g. nothing shows up in the decompiled output for several floating point instructions, sometimes very long sequences of floating point instructions).
It seemed like it was reading global data fine in some instances, so maybe it gives up on subsequent floating point instructions after encountering one it doesn't understand.
FUN_004ab350
MOV EAX,[DAT_00542074]
TEST EAX,EAX
JLE LAB_004ab372
FILD dword ptr [DAT_00542074]
FMUL qword ptr [DAT_004d27b0]
FSQRT
FMUL qword ptr [DAT_004d27b8]
CALL stdc::__ftol
LAB_004ab372
MOV [DAT_0054206c],EAX
RET
Decompiles as:
void FUN_004ab350(void)
{
int iVar1;
iVar1 = DAT_00542074;
if (0 < DAT_00542074) {
iVar1 = __ftol();
}
DAT_0054206c = iVar1;
return;
}
The issue is that the computed value is never written back to memory before the call to __ftol: it is still sitting in ST0, and __ftol takes it from there.
I've set the signature of __ftol to "longlong __ftol (double param_1)", with custom storage set as follows:
Return: longlong (EAX, EDX)
param1: double (ST0)
After setting this the decompilation shows:
void FUN_004ab350(void)
{
int iVar1;
longlong lVar2;
iVar1 = DAT_00542074;
if (0 < DAT_00542074) {
lVar2 = __ftol(SQRT((double)DAT_00542074 * 0.00001526) * 65536.00000000);
iVar1 = (int)lVar2;
}
DAT_0054206c = iVar1;
return;
}
Which looks correct to me.
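To sanity-check that output, the same computation can be modelled outside Ghidra. This is a minimal sketch, assuming that __ftol truncates toward zero and that the constant shown as 0.00001526 is really 1/65536 (the displayed value is rounded, so this is a guess). The names `ftol_model` and `fun_004ab350_sketch` are made up for illustration.

```python
import math

# Assumption: DAT_004d27b0 holds 1/65536, displayed rounded as 0.00001526.
INV_65536 = 1.0 / 65536.0


def ftol_model(st0: float) -> int:
    """Model of __ftol: take the value in ST0 and truncate it toward zero."""
    return int(st0)


def fun_004ab350_sketch(value: int) -> int:
    """Mirror of the decompiled body; 'value' stands in for DAT_00542074."""
    result = value
    if 0 < value:
        # FILD ; FMUL ; FSQRT ; FMUL ; CALL __ftol
        result = ftol_model(math.sqrt(value * INV_65536) * 65536.0)
    return result
```

Under that assumption the function behaves like a square root on 16.16 fixed-point values, for example `fun_004ab350_sketch(4 * 65536)` gives `2 * 65536`, while non-positive inputs pass through unchanged.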
@Jaguar83 thanks for the tip! I'm still having some issues with it. The image below shows my settings:
An example of the instructions Ghidra is getting:
PUSH EBP
MOV ECX,dword ptr [ESP + param_2]
MOV EBP,ESP
SUB ESP,0x4
MOV EAX,dword ptr [EBP + param_1]
CMP ECX,EAX
JG LAB_00424e33
XOR EAX,EAX
JMP LAB_00424e54
LAB_00424e33
MOV dword ptr [EBP + local_8],ECX
FILD dword ptr [EBP + local_8]
MOV dword ptr [EBP + local_8],EAX
FLD ST0
FISUB dword ptr [EBP + local_8]
FDIVP
FSUB qword ptr [DOUBLE_004c8f18]
FMUL qword ptr [DOUBLE_004c8f28]
CALL _ftol
LAB_00424e54
MOV ESP,EBP
POP EBP
RET
And the output I get from Ghidra afterwards is this:
undefined4 uVar1;
longlong lVar2;
if (param_1 < param_2) {
lVar2 = _ftol((double)CONCAT44(param_1,0x424e54));
uVar1 = (undefined4)lVar2;
}
else {
uVar1 = 0;
}
return uVar1;
Which still looks messed up, as the FDIV, FMUL, and FSUB aren't showing up -- the same function run through the HexRays decompiler shows the expected operations.
Storage for param_1 needs to be ST0 - the FPU register, not Stack[0x0]
Thanks! Yea, I realized that and made the change. The new output is much better, the relevant decompiler output now looks like:
lVar2 = _ftol(((double)param_2 / ((double)param_2 - (double)param_1) - 1.00000000) * 5000.00000000);
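As a quick cross-check against the assembly, the recovered expression can be written out as a standalone function. This is a sketch only: the name `ratio_sketch` is hypothetical, and `int()` stands in for _ftol on the assumption that it truncates toward zero.

```python
def ratio_sketch(param_1: int, param_2: int) -> int:
    """Mirror of the corrected decompiler output, including the CMP/JG guard."""
    if param_1 < param_2:
        # FILD param_2 ; FLD ST0 ; FISUB param_1 ; FDIVP ; FSUB 1.0 ; FMUL 5000.0
        value = (param_2 / (param_2 - param_1) - 1.0) * 5000.0
        return int(value)  # stands in for _ftol, keeping the integer part
    return 0  # XOR EAX,EAX path
```

For example, `ratio_sketch(1, 3)` evaluates `(3 / 2 - 1.0) * 5000.0` and returns 2500, and any call with `param_1 >= param_2` returns 0, matching the FDIVP, FSUB, and FMUL sequence in the listing.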
A screenshot of the settings I used for any people that come across this and want a visual:
As a general note, it would seem that the analyser would need to pick up the signature of __ftol to fix this issue in the long term.
However, as this function uses a non-standard calling convention, I don't think it would fit under standard analysis rules, so a special rule would need to be added for this function. Not sure if this is possible, as I haven't had a look at the relevant source code yet.
Last night I was trying to figure out how Ghidra knows what the function signatures and calling conventions are for the other msvcrt functions. It looks like a lot of the other msvcrt functions have entries in the gdt files here: https://github.com/NationalSecurityAgency/ghidra/tree/master/Ghidra/Features/Base/data/typeinfo/win32
I noticed that adding a _ftol function definition to the data type manager made it use that function signature for _ftol, though the data type manager doesn't seem to have any custom storage support to handle the special calling convention the _ftol function uses.
There seem to be a few other functions in the list of Internal CRT Globals and Functions that Microsoft has, with some of the subpages like _CIsin mentioning the special calling convention used. Interfacing with the _ftol2 function also appears in the LLVM mailing list, with some talk about how it is called here and here.