I have a simple program:
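#include <stdio.h>

int func(int i, int j);  /* prototype, so main() does not call an undeclared function */

int main()
{
    int a;
    a = func(15, 3);
    return a;
}

int func(int i, int j)
{
    int b1[5], b2[10];
    b2[i] = 1;               /* out of bounds when i >= 10, as in the call above */
    printf("%d\n", b1[j]);   /* reads an uninitialized element */
    return 0;
}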
I am using a Python script to get local variables from the stripped binary compiled from the program above.
I use function.getLocalVariables(), or something like function.getStackFrame().getStackVariables(), to get the local variables. Interestingly, I observed that this script doesn't give me all the variables that can be seen in the decompiler window. For example, in the above case I get the following in the decompiler window (for function func):

Here, the predicted buffers can be seen. But instead I get:
array(ghidra.program.model.listing.Variable, [[undefined4 local_5c@Stack[-0x5c]:4], [undefined4 local_60@Stack[-0x60]:4]])
which are clearly not the predicted buffers. Is there any way to get those buffers?
Hey @Ruturaj4,
What you're asking for is possible. You need to use the decompiler interface to get that information. Here's an example using Python:
from ghidra.app.decompiler import DecompileOptions
from ghidra.app.decompiler import DecompInterface
from ghidra.util.task import ConsoleTaskMonitor

# Look up the function to decompile by name.
name = "myFunctionName"
func = getGlobalFunctions(name)[0]

# Set up the decompiler for the program that contains the function.
options = DecompileOptions()
monitor = ConsoleTaskMonitor()
ifc = DecompInterface()
ifc.setOptions(options)
ifc.openProgram(func.getProgram())

# Decompile with a 60-second timeout and fetch the high-level function.
res = ifc.decompileFunction(func, 60, monitor)
high_func = res.getHighFunction()

# The local symbol map holds the variables shown in the decompiler window.
lsm = high_func.getLocalSymbolMap()
symbols = lsm.getSymbols()
for i, symbol in enumerate(symbols):
    print("Symbol {}: {} (size: {})".format(i+1, symbol.getName(), symbol.size))
And here's an example output:
Symbol 1: auStack56 (size: 40)
Symbol 2: auStack88 (size: 32)
Symbol 3: in_FS_OFFSET (size: 8)
Symbol 4: local_10 (size: 8)
Symbol 5: param_1 (size: 4)
Symbol 6: param_2 (size: 4)
Note that the sizes returned here are in bytes. So something like undefined4 auStack88 [12] will return size: 48 (12 * 4). Use print(dir(symbol)) to get more information on what you can get from these symbols. Everything you're looking for should be there.
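As a follow-up to the note on sizes, here is a minimal sketch that keeps only the stack-resident locals and reports element counts alongside byte sizes. It assumes each symbol behaves like a Ghidra HighSymbol (getName(), getSize(), isParameter(), getStorage(), getDataType()) and that array data types expose getNumElements(). The name stack_buffers is a hypothetical helper, not part of the Ghidra API.

```python
def stack_buffers(symbols):
    """Collect (name, stack_offset, size_in_bytes, element_count) per stack local.

    element_count is None when the symbol's data type is not an array.
    """
    rows = []
    for symbol in symbols:
        storage = symbol.getStorage()
        # Skip parameters and anything held in registers or other non-stack storage.
        if symbol.isParameter() or not storage.isStackStorage():
            continue
        data_type = symbol.getDataType()
        # Duck-typed array check: assumes array types provide getNumElements().
        if hasattr(data_type, "getNumElements"):
            count = data_type.getNumElements()
        else:
            count = None
        rows.append((symbol.getName(), storage.getStackOffset(), symbol.getSize(), count))
    # Lowest stack offset first.
    rows.sort(key=lambda row: row[1])
    return rows
```

Inside the script above, this could be called as stack_buffers(lsm.getSymbols()) and each returned tuple printed. For something like undefined4 auStack88 [12], the size would be 48 and the element count 12.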
I am smelling entropy here.
Variadic functions like printf and snprintf take a format string that leaks entropy:
1) The number of parameters.
2) The types of the parameters, which could be propagated.
For example, printf("%d", auStack40[param_2]) => signed int => auStack40[param_2] is a signed int => auStack40 is an array of signed int. But the decompiler analysed the variables and chose uint.
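The format-string idea can be sketched in plain Python, independent of Ghidra. This is only an illustration under simplifying assumptions: it handles common conversions, ignores '*' widths and precisions, approximates the z, j and t length modifiers as int, and does not distinguish wide-character forms. The name infer_arg_types is hypothetical and is not something the decompiler provides.

```python
import re

# Simplified printf conversion: %[flags][width][.precision][length]conversion
# Assumption: '*' widths/precisions, which consume an extra int argument, are not handled.
_SPEC = re.compile(r"%[-+ #0]*\d*(?:\.\d*)?(hh|h|ll|l|z|j|t|L)?([diouxXcspfFeEgGaA%])")


def infer_arg_types(fmt):
    """Return the C types a printf-style format string implies, in argument order."""
    types = []
    for match in _SPEC.finditer(fmt):
        length, conv = match.groups()
        if conv == "%":
            continue  # "%%" prints a literal percent sign and consumes no argument
        if conv in "diouxX":
            # h and hh arguments are promoted to int; z, j and t are
            # approximated as int in this sketch.
            core = {"l": "long", "ll": "long long"}.get(length, "int")
            types.append(core if conv in "di" else "unsigned " + core)
        elif conv in "fFeEgGaA":
            types.append("long double" if length == "L" else "double")
        elif conv == "c":
            types.append("int")
        elif conv == "s":
            types.append("char *")
        else:  # conv == "p"
            types.append("void *")
    return types


print(infer_arg_types("%d\n"))  # prints ['int']
```

The length of the returned list gives the number of variadic arguments, and each entry is a type hint for the corresponding argument. For the printf("%d\n", b1[j]) call in the question, the single int entry would suggest that the indexed buffer holds signed ints.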