Ghidra: Bitfield support in decompiler

Created on 1 Jun 2019 · 7 Comments · Source: NationalSecurityAgency/ghidra

Is your feature request related to a problem? Please describe.
Right now the decompiler shows bitfield access simply as shift and mask (in other words, it is unaware of bitfields).

For example, consider:

  • a big-endian bitfield that is a byte long, and
  • a member 3 bits long starting at the 2nd bit.

A member read might look like bitfield >> 3 & 0x7, and a member write like bitfield = (bitfield & 0xc7) | (member << 3 & 0x38). This makes understanding decompiler output difficult.
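As a minimal sketch, the two expressions above can be written out as C helpers. The function and macro names are hypothetical; only the shift amount and the masks 0x7, 0xc7, and 0x38 come from the example.

```c
#include <stdint.h>

/* Hypothetical names; the shift and masks are the ones quoted above. */
#define MEMBER_SHIFT 3u
#define MEMBER_MASK  0x7u   /* right-aligned mask for a 3-bit member */

/* Read: shift the member down to bit 0, then mask off the other members. */
static inline uint8_t read_member(uint8_t bitfield)
{
    return (uint8_t)((bitfield >> MEMBER_SHIFT) & MEMBER_MASK);
}

/* Write: 0xc7 clears the member's three bits and keeps the rest,
 * then the new value is shifted into place and ORed in. */
static inline uint8_t write_member(uint8_t bitfield, uint8_t member)
{
    return (uint8_t)((bitfield & 0xc7u) | ((member << MEMBER_SHIFT) & 0x38u));
}
```

This is the form the decompiler currently emits inline at every access, which is why the member's identity is lost in the output.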

The data type manager allows bitfields to be declared only by importing them through the "Parse C Source" menu item (great if you have a header file for your platform); however, the decompiler does not make use of this information.

Describe the solution you'd like

  • Ability to declare bitfields in the data type manager
  • Control over implementation-specific details like member allocation order
  • Decompiler recognizes when data or a variable is typed as a bitfield and the shift-and-mask pcode matches a defined member's offset and length, and shows the member access instead of the shift and mask

The above example would then look like var1 = bitfield.member and bitfield.member = var1 for the read and write cases.
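For illustration, here is roughly what the requested output would correspond to in C source. This is a sketch under assumptions: the struct name and the `upper`/`lower` filler fields are hypothetical, and C leaves bit-field allocation order implementation-defined, so the declaration only lines up with the 0x38 mask on a compiler that fills the storage unit from the most significant bit.

```c
/* Hypothetical layout for the one-byte example; names are illustrative.
 * Allocation order is implementation-defined in C, so which physical bits
 * "member" occupies depends on the compiler and ABI. */
struct reg_bits {
    unsigned int upper  : 2;
    unsigned int member : 3;   /* the 3-bit member from the example */
    unsigned int lower  : 3;
};

/* Read case: var1 = bitfield.member */
static inline unsigned int get_member(const struct reg_bits *bitfield)
{
    return bitfield->member;
}

/* Write case: bitfield.member = var1 (value is reduced to 3 bits) */
static inline void set_member(struct reg_bits *bitfield, unsigned int var1)
{
    bitfield->member = var1;
}
```

The compiler generates the shift and mask for these accesses, which is the information the decompiler would need to map back to the member name.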

Describe alternatives you've considered
No real alternative besides the current situation of consulting datasheets and my own notes for bitfield layout.

  • Bitfield layout will depend on architecture and endianness
  • There is no definitive way for a function to access a bitfield member. It could shift first then mask, or mask then shift. Recognizing member access, even by pcode, might not be trivial.
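To make the second point concrete, here is a small sketch of access patterns a compiler might emit for the same 3-bit member. The function names are hypothetical; the masks follow the example from the issue description.

```c
#include <stdint.h>

/* Shift first, then apply a right-aligned mask. */
static inline uint8_t read_shift_then_mask(uint8_t bitfield)
{
    return (uint8_t)((bitfield >> 3) & 0x07u);
}

/* Mask first with the in-place mask, then shift down. */
static inline uint8_t read_mask_then_shift(uint8_t bitfield)
{
    return (uint8_t)((bitfield & 0x38u) >> 3);
}

/* A comparison against zero needs no shift at all. */
static inline int member_is_nonzero(uint8_t bitfield)
{
    return (bitfield & 0x38u) != 0;
}
```

All three touch the same member, so a recognizer would have to treat them as equivalent rather than match one fixed pattern.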

Additional context
This is mainly for embedded systems that pack many short parameters into registers.

Enhancement

All 7 comments

This is something I'd really like to see implemented, both in the decompiler and in the disassembly listing view. I feel like a lot of good additions could be made to the enumerations feature as well. In particular, the ability to specify enum values within bitmasks would be great: systems that use their own flag registers may group multiple independent sets of values into a single register, each with a different mask.
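A sketch of the kind of register this describes, written in C. The register layout, masks, and enum names are all hypothetical and only illustrate two independent value sets sharing one byte.

```c
#include <stdint.h>

/* Hypothetical flag register: two independent groups share one byte. */
#define MODE_MASK   0x03u   /* bits 0..1 */
#define SPEED_MASK  0x0cu   /* bits 2..3 */

/* Each enum's values are only meaningful under its own mask. */
enum mode  { MODE_OFF  = 0x00, MODE_IDLE = 0x01, MODE_RUN   = 0x02 };
enum speed { SPEED_LOW = 0x00, SPEED_MID = 0x04, SPEED_HIGH = 0x08 };

static inline enum mode reg_mode(uint8_t reg)
{
    return (enum mode)(reg & MODE_MASK);
}

static inline enum speed reg_speed(uint8_t reg)
{
    return (enum speed)(reg & SPEED_MASK);
}
```

A single flat enum cannot express that 0x0a means MODE_RUN combined with SPEED_HIGH, which is why the mask has to be part of the definition.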

Separating enumerations from the overall "data types" in some way would make navigating them easier as well.

Are you by any chance trying to decompile MIPS binaries? In recent ISAs (R2 and above) there are specific instructions for accessing bit fields, which could be decompiled as a C bitfield operation if you have the type set up correctly.

@nihilus Don't know about MIPS, but I'm working on a PowerPC binary right now. Most bitfield access is done with the rlwinm and rlwimi instructions, which make it very clear which range of a register is being read and written. But of course this doesn't translate into decompiled output.
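For readers unfamiliar with these instructions, here is a rough C emulation of their 32-bit semantics as I understand them from the PowerPC architecture: rotate the source left, then either AND with a mask (rlwinm) or insert under that mask (rlwimi). The helper names are hypothetical, and this is not how Ghidra models them in p-code.

```c
#include <stdint.h>

static inline uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31u;
    return n ? (x << n) | (x >> (32u - n)) : x;
}

/* Ones in bits mb..me using PowerPC numbering (bit 0 is the MSB).
 * When mb > me the mask wraps around. Both arguments must be 0..31. */
static inline uint32_t ppc_mask(unsigned mb, unsigned me)
{
    uint32_t from_mb = 0xFFFFFFFFu >> mb;          /* bits mb..31 */
    uint32_t to_me   = 0xFFFFFFFFu << (31u - me);  /* bits 0..me  */
    return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* rlwinm rA,rS,SH,MB,ME: rotate left, then AND with mask. */
static inline uint32_t rlwinm(uint32_t rs, unsigned sh, unsigned mb, unsigned me)
{
    return rotl32(rs, sh) & ppc_mask(mb, me);
}

/* rlwimi rA,rS,SH,MB,ME: rotate left, then insert into rA under mask. */
static inline uint32_t rlwimi(uint32_t ra, uint32_t rs,
                              unsigned sh, unsigned mb, unsigned me)
{
    uint32_t m = ppc_mask(mb, me);
    return (rotl32(rs, sh) & m) | (ra & ~m);
}
```

With these helpers, `rlwinm(x, 29, 29, 31)` extracts the 3-bit member under mask 0x38 and `rlwimi(x, v, 3, 26, 28)` writes it, so the MB/ME operands state the field boundaries directly.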

On x86 it's a mess of shifting and masking.

I'm on ARM currently, and there are a ton of processor-specific SFRs, as well as flags within the user firmware, that would drastically benefit from this.

The ability to represent bitfields within Structures has just been added to the master branch. Support for bitfields has been added to the CParser, PDB parser, and DWARF. The PDB XML file format has changed for bitfields: any retained PDB XML files will need to be regenerated to benefit from the bitfield improvements (bitfield bit-offset information was missing from the XML). Note that "aligned" bitfield packing support is currently limited to MSB-filled-first for big-endian and LSB-filled-first for little-endian data. These bitfield component definitions are currently not conveyed to the decompiler, and there is currently no bitfield reference mechanism. Structure Data instances in memory will reflect bitfield data. See the Structure Editor help content for some additional information.
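The two fill orders mentioned here can be illustrated with a simplified model. This is a sketch only: the names are hypothetical, it handles a single storage unit of up to 32 bits, and it is not Ghidra's packing code.

```c
#include <assert.h>
#include <stdint.h>

enum fill_order { MSB_FIRST, LSB_FIRST };   /* hypothetical names */

/* Right-shift needed to bring a field down to bit 0, given how many bits
 * were declared before it (decl_offset) in a unit of unit_bits bits. */
static inline unsigned field_shift(enum fill_order order, unsigned unit_bits,
                                   unsigned decl_offset, unsigned width)
{
    assert(width > 0 && unit_bits <= 32 && decl_offset + width <= unit_bits);
    return order == MSB_FIRST ? unit_bits - decl_offset - width
                              : decl_offset;
}

/* In-place mask for the same field. */
static inline uint32_t field_mask(enum fill_order order, unsigned unit_bits,
                                  unsigned decl_offset, unsigned width)
{
    uint32_t ones = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return ones << field_shift(order, unit_bits, decl_offset, width);
}
```

For the issue's one-byte example (2 bits declared before a 3-bit member), MSB-first filling gives shift 3 and mask 0x38, while LSB-first filling gives shift 2 and mask 0x1c, so the same declaration describes different physical bits.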

I am closing this ticket since no immediate action is required. We are investigating bitfield support for the decompiler.
