Is your feature request related to a problem? Please describe.
I'm using the BinaryLoader on a number of programs that all use the same API. In this situation, the loader-specific lists of non-returning functions don't apply, so there is no way to make use of the analyzer Non-returning functions - Known.
Instead, for each program I have to manually flag whatever non-returning functions aren't discovered by Non-returning functions - Discovered.
Describe the solution you'd like
I'd like an easy way for the average user to supply their own file of non-returning function names.
How this might be done:
Describe alternatives you've considered
Additional context
See issue #2101.
Although my request addresses the BinaryLoader, it also highlights a problem with the linkage between the three specific loaders and their relatively fixed lists of non-returning functions. This is especially the case with the ELF format, which is used across many hardware architectures, operating systems, and other APIs.
The files containing such function names are part of the Ghidra installation and may be readily modified within the directory Ghidra/Features/Base/data. These files are selected based upon the load format, as identified by the specification file noReturnFunctionConstraints.xml. For example, ELF non-returning function names are contained within the file ElfFunctionsThatDoNotReturn, while PE (i.e., Windows) non-returning function names are contained within the file PEFunctionsThatDoNotReturn.
@ghidra1, there is some background for this request in #2101. In this case, he's using the BinaryLoader, so none of those files will be applied.
As a work-around, a Raw Binary entry could be added to the noReturnFunctionConstraints.xml file, along with a new data file which contains the non-returning function names:
<executable_format name="Raw Binary">
    <functionNamesFile>BinaryFunctionsThatDoNotReturn</functionNamesFile>
</executable_format>
Additional constraints can be added to the above entry (by nesting) to narrow it down to a specific case. Other constraints which may be of use include compiler, language and property. The compiler and language constraints specify an id attribute, while property constraints identify a specific _Program Information_ property by its name and value attributes. The language id also supports the use of a wildcard between the colons (e.g., id="ARM:LE:32:*"). Example:
<executable_format name="Raw Binary">
    <property name="OriginalFilename" value="mybinary">
        <functionNamesFile>MyBinaryFunctionsThatDoNotReturn</functionNamesFile>
    </property>
    <language id="ARM:LE:32:*">
        <functionNamesFile>BinaryARMFunctionsThatDoNotReturn</functionNamesFile>
    </language>
</executable_format>
NOTE: I have not actually tried the above; it is based on code inspection of the constraint parser DecisionTree.java and the various ProgramConstraint implementations.
This will probably be okay as a workaround.
I'd like to leave my feature request out there, though, in hopes that a better approach can be found for a future release.
It's not clear to me how the example constraint(s) would work.
Of course, it's obvious that the constraint(s) are only applicable to Raw Binary format.
Beyond that, though, are/is there:
Two independent constraints, one applying MyBinaryFunctionsThatDoNotReturn to _just_ raw binary files named mybinary and the other applying BinaryARMFunctionsThatDoNotReturn to _just_ raw binary files that match the language ID?
Or an ordered pair of constraints: if the file is named mybinary, the function name list comes from MyBinaryFunctionsThatDoNotReturn and the second constraint is ignored. [Else,] if the language ID matches ARM:LE:32:*, the function name list comes from BinaryARMFunctionsThatDoNotReturn?
As you pointed out these are two independent constraints, although I am unsure how the constraint precedence mechanism works. Finding out will take some code inspection or asking the right person.
The constraints can be 'and-ed' by nesting them:
<executable_format name="Raw Binary">
    <property name="OriginalFilename" value="mybinary">
        <language id="ARM:LE:32:*">
            <functionNamesFile>MyBinaryFunctionsThatDoNotReturn</functionNamesFile>
        </language>
    </property>
</executable_format>
Arranged that way, it is much clearer that the effect is like a logical "and" of the two constraint conditions. Thanks. That example is better suited for the write-up I'm making for our other team members here.
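For anyone putting together a similar write-up, the "and" reading of nesting can be illustrated with a small stand-alone Python model. This is only a toy sketch and not Ghidra's DecisionTree.java: the function names, the dictionary used to describe a program, and the per-segment wildcard matching are all assumptions made for illustration. It deliberately returns every matching file for sibling constraints, because the real precedence between independent constraints is not established in this thread.

```python
import xml.etree.ElementTree as ET
from fnmatch import fnmatchcase

# The nested entry from the thread, used here only as test input.
ENTRY = """
<executable_format name="Raw Binary">
    <property name="OriginalFilename" value="mybinary">
        <language id="ARM:LE:32:*">
            <functionNamesFile>MyBinaryFunctionsThatDoNotReturn</functionNamesFile>
        </language>
    </property>
</executable_format>
"""


def constraint_matches(elem, program):
    """Toy check of one constraint element against a dict describing a program.

    The dict layout (format / language / compiler / properties) is hypothetical.
    """
    if elem.tag == "executable_format":
        return program.get("format") == elem.get("name")
    if elem.tag == "property":
        props = program.get("properties", {})
        return props.get(elem.get("name")) == elem.get("value")
    if elem.tag == "language":
        # Assumed wildcard rule: compare colon-separated segments one by one.
        want = elem.get("id", "").split(":")
        have = program.get("language", "").split(":")
        return len(want) == len(have) and all(
            fnmatchcase(h, w) for h, w in zip(have, want))
    if elem.tag == "compiler":
        return program.get("compiler") == elem.get("id")
    return False


def matching_files(elem, program):
    """Collect functionNamesFile values whose enclosing constraints ALL match."""
    if elem.tag == "functionNamesFile":
        return [elem.text.strip()]
    if not constraint_matches(elem, program):
        return []  # a failed outer constraint hides everything nested inside it
    found = []
    for child in elem:
        found.extend(matching_files(child, program))
    return found


def files_for(xml_text, program):
    """Parse one entry and return the name files that apply to the program."""
    return matching_files(ET.fromstring(xml_text), program)
```

In this model, a program described as Raw Binary, named mybinary, with language ARM:LE:32:v7 selects MyBinaryFunctionsThatDoNotReturn, while changing either the file name or the language leaves nothing selected, which is the "and" effect of nesting.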
We are considering some changes to data type archives, and it got me thinking.
Function signatures can be tagged by name as non-returning in a data type archive.
You can apply all the function signatures from an archive by name.
So for a project or IDE, etc. there are certain functions that are non-returning.
You could apply the non-returning functions from there regardless of format.
I suppose the Known non-returning analyzer could look at whatever archives you have open, to see any tagged non-returning ones.
That said, if you import a binary as raw, where did the names come from?
If you then apply some names, you can apply the function signatures from an archive that match those names; any that carry the non-returning attribute would then be made non-returning.
This is what I've been doing. 😄 Fortunately, due to limitations on the code I can access from home due to COVID-19, the number of functions has been very small and easy to deal with.
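Once the names have been applied, the manual flagging step could also be scripted, which gets close to the user-specified file asked for in the original request. Below is a minimal sketch of a Ghidra Python script, not a tested solution. The file format (one function name per line, with # starting a comment) and the helper names parse_names and mark_non_returning are assumptions made here; the Ghidra calls used are askFile, getFunctionManager().getFunctions(True), hasNoReturn() and setNoReturn(). The Ghidra-specific part only runs when currentProgram is available, so the helpers can be tried outside Ghidra as well.

```python
def parse_names(text):
    """Return the set of function names listed in text.

    Assumed format (hypothetical): one name per line, '#' starts a comment.
    """
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            names.add(line)
    return names


def mark_non_returning(functions, names):
    """Flag each function whose name is in names; return the names changed."""
    changed = []
    for func in functions:
        if func.getName() in names and not func.hasNoReturn():
            func.setNoReturn(True)
            changed.append(func.getName())
    return changed


# currentProgram and askFile are provided by Ghidra's script environment.
# Outside Ghidra the lookup raises NameError and this part is skipped.
try:
    program = currentProgram
except NameError:
    program = None

if program is not None:
    chosen = askFile("Non-returning function names", "Apply")
    with open(chosen.getAbsolutePath()) as handle:
        wanted = parse_names(handle.read())
    all_functions = program.getFunctionManager().getFunctions(True)
    for name in mark_non_returning(all_functions, wanted):
        print("marked non-returning: " + name)
```

Matching here is by exact name only, so it shares the limitation noted above: for a raw binary, the functions have to be named before the list can do anything.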