Ghidra: Does not detect DOS Executables

Created on 24 Sep 2019 · 12 Comments · Source: NationalSecurityAgency/ghidra

ghidra_9.1-BETA_DEV (and also 9.0.4)

If I open a simple 16-bit DOS exe (built with the NASM assembler and ulink), Ghidra doesn't detect it as an Old-Style DOS Exe.

The exe works and is correctly assembled (checked with the DOSBox debugger and IDA Pro).
I also tested with several other assemblers, so it's a linker thing.

single.asm

; build with: 
;  nasm.exe (https://www.nasm.us/pub/nasm/releasebuilds/2.14.02/)
;  ulink.exe (ftp://ftp.styx.cabel.net/pub/UniLink/)
; 
;  nasm.exe -f obj -o single.obj single.asm
;  ulink.exe single.obj

BITS 16

segment seg000 align=16

text: db 'Hello World!',0ah,0dh,'$'

segment seg001 align=16

..start:
mov ax,seg000
mov ds,ax

push ax
pop ax

call far print

mov ax,0x4c00
int 0x21

segment seg002 align=16

print:
mov dx,text
mov ah,9
int 0x21
retf

segment seg003 stack
resb 256

Checked it with several linkers:
wlink.exe: Open Watcom Linker Version 2.0 beta Sep 13 2019 01:44:55 (64-bit)
link.exe: Microsoft (R) Segmented Executable Linker Version 5.60.339 Dec 5 1994
optlink.exe: OPTLINK (R) for Win32 Release 8.00.17 (from the dmd package: dmd.2.088.0.windows)
ulink.exe: UniLink v1.11 [beta] (build 11.27) from ftp://ftp.styx.cabel.net/pub/UniLink/

All exes except the one linked with ulink.exe are detected as Old-Style DOS Exe.
The only real difference is a "UniLink" string between the header and the relocation table.

IDA Pro detects all of them as DOS MZ Executables

optlink.single.exe -> detected as Old-Style DOS Exe

exe_header:
  signature: MZ
  bytes_in_last_block: 0x0068
  blocks_in_file: 0x0001
  num_relocs: 0x0002
  header_paragraphs: 0x0003
  min_extra_paragraphs: 0x0010
  max_extra_paragraphs: 0xffff
  ss:sp: 0x0004:0x0100
  checksum: 0x0000
  cs:ip: 0x0001:0x0000
  reloc_table_offset: 0x001e
  overlay_number: 0x0000

data between header and relocation table:
00000000  00 00                                            ..

relocation_table:
0    0x0001:0x000A
1    0x0001:0x0001

ulink.single.exe -> detected as Raw binary

exe_header:
  signature: MZ
  bytes_in_last_block: 0x0088
  blocks_in_file: 0x0001
  num_relocs: 0x0002
  header_paragraphs: 0x0005
  min_extra_paragraphs: 0x0011
  max_extra_paragraphs: 0xffff
  ss:sp: 0x0004:0x0100
  checksum: 0x0000
  cs:ip: 0x0001:0x0000
  reloc_table_offset: 0x0040
  overlay_number: 0x0000

data between header and relocation table:  
00000000  55 6E 69 4C 69 6E 6B 00 00 00 00 00 00 00 00 00  UniLink.........
00000010  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000020  00 00 00 00                                      ....

relocation_table:
0    0x0001:0x0001
1    0x0001:0x000A
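
Dumps like the two above can be reproduced with a few lines of Python. This is only a minimal sketch, assuming the usual 0x1c-byte fixed MZ header layout; the function name `parse_mz_header` and the dictionary keys are made up for illustration (the keys mirror the labels used in the dumps).

```python
import struct

# Fixed part of the old-style MZ header: signature + 13 little-endian words (0x1c bytes).
MZ_FIXED_HEADER = struct.Struct("<2s13H")

# Field order as stored in the file (assumed standard layout).
FIELDS = (
    "bytes_in_last_block", "blocks_in_file", "num_relocs",
    "header_paragraphs", "min_extra_paragraphs", "max_extra_paragraphs",
    "ss", "sp", "checksum", "ip", "cs",
    "reloc_table_offset", "overlay_number",
)

def parse_mz_header(data: bytes) -> dict:
    """Parse the fixed MZ header and the relocation table it points to."""
    if len(data) < MZ_FIXED_HEADER.size:
        raise ValueError("file too short for an MZ header")
    signature, *words = MZ_FIXED_HEADER.unpack_from(data, 0)
    if signature != b"MZ":
        raise ValueError("missing MZ signature")
    header = dict(zip(FIELDS, words))
    header["signature"] = signature
    # Each relocation entry is stored as offset word, then segment word.
    # Reverse each pair so it reads segment:offset like the dumps.
    start = header["reloc_table_offset"]
    header["relocs"] = [
        struct.unpack_from("<HH", data, start + 4 * i)[::-1]
        for i in range(header["num_relocs"])
    ]
    return header
```

Note that the relocation table is located purely through `reloc_table_offset`; nothing in this parsing depends on how many bytes a linker puts between the fixed header and the table.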

NASM sample DOS exes built with several linkers:
http://s000.tinyupload.com/?file_id=77824670479507329081

data between header and relocation table by linker:

optlink.exe: 00 00
link.exe: 01 00
wlink.exe: 00 00 00 00
ulink.exe:
00000000  55 6E 69 4C 69 6E 6B 00 00 00 00 00 00 00 00 00  UniLink.........
00000010  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000020  00 00 00 00                                      ....

All 12 comments

I don't know why that binary is not detected. I tried it myself and it really doesn't work (maybe you should try with 9.0.4 just in case). MS-DOS binaries certainly work in general, though; you can even find many issues here related to x86 in 16-bit mode.

I just wanted to clarify that the only one not working is ulink, correct?

private short [] e_res = new short[4]; // Reserved words
That is wrong: nothing prevents a linker from using more or fewer than 4 shorts. The size is not fixed and depends only on the value of e_lfarlc; as you can see, 4 linkers produce 3 different sizes.
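
To illustrate the point, the area between the fixed header and the relocation table can be sliced out using e_lfarlc alone. This is a rough sketch, not Ghidra's code; it assumes the fixed header ends at 0x1c, and `reserved_area` is a hypothetical helper name.

```python
import struct

FIXED_HEADER_SIZE = 0x1C  # e_magic .. e_ovno (assumed standard layout)
E_LFARLC_OFFSET = 0x18    # word: file offset of the relocation table

def reserved_area(data: bytes) -> bytes:
    """Return the bytes between the fixed MZ header and the relocation table."""
    (e_lfarlc,) = struct.unpack_from("<H", data, E_LFARLC_OFFSET)
    # The length is whatever the linker chose; it is not a fixed 4 shorts.
    return data[FIXED_HEADER_SIZE:e_lfarlc]
```

With the headers shown in the issue, an e_lfarlc of 0x1e (optlink) leaves 2 bytes here, while 0x40 (ulink) leaves 0x24 bytes, which is where the "UniLink" string lives.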

According to http://bytepointer.com/resources/win16_ne_exe_format_win3.0.htm:

A new-format .EXE file is
identified if the segmented executable header contains a valid
signature. If the signature is not valid, the file is assumed to be an
old-style format .EXE file

NE-file samples:
http://justsolve.archiveteam.org/wiki/New_Executable (Exe Samples)
http://cd.textfiles.com/aztechmb/AZTECH.EXE (it's a real NE file)

Even valid PE files (according to IDA) are detected by Ghidra as NE files, although there is no valid NE header (the 'NE' signature is missing from the segmented header), for example MSVC 1.5 link.exe.

AZTECH.EXE (NE) and old MSVC 1.5 LINK.EXE (PE)
http://s000.tinyupload.com/?file_id=12663273413500823409

The main issue here is that
https://github.com/NationalSecurityAgency/ghidra/blob/208433c9f7b2e4af8cd26d0b757ecb74a37e8f07/Ghidra/Features/Base/src/main/java/ghidra/app/util/bin/format/mz/DOSHeader.java#L255-L261
isn't checking what's at 0x3c when e_lfarlc is 0x40. When it's 0x40, we need to actually confirm that the NE signature is present before declaring it a new executable.

Why check e_lfarlc for the value 0x40 at all? The distance is a linker-related thing, not a standard according to the docs I've read. Why not just get the value at offset 0x3c, check for an 'NE', PE, PE32 or PE+ signature there, and if none is found treat the file as Old-Style (exactly as described in http://bytepointer.com/resources/win16_ne_exe_format_win3.0.htm)?

https://github.com/zfigura/semblance/blob/cbdadd3cec26c686cc48782b9702151f522d66ce/src/dump.c#L28-L59

I don't know if you copied the header from here, but it's just not correct: https://blog.kowalczyk.info/articles/pefileformat.html

The word at offset 18h in the old-style .EXE header contains the
relative byte offset to the stub program's relocation table. If this
offset is 40h, then the double word at offset 3Ch is assumed to be the
relative byte offset from the beginning of the file to the beginning
of the segmented executable header. A new-format .EXE file is
identified if the segmented executable header contains a valid
signature. If the signature is not valid, the file is assumed to be an
old-style format .EXE file.

According to that, 0x40 is required for it to be a New Executable. However, we still need to go to the header and actually make sure the NE signature is there before saying it's a New Executable.
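
As a rough illustration of that rule (not the actual Ghidra change, which lives in the Java `DOSHeader` class), the two conditions could be combined like this. The function name `is_new_executable` is hypothetical, and only the offsets quoted from the format description are assumed.

```python
import struct

def is_new_executable(data: bytes) -> bool:
    """Conservative check: e_lfarlc must be 0x40 AND 'NE' must actually be there."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (e_lfarlc,) = struct.unpack_from("<H", data, 0x18)
    if e_lfarlc != 0x40:
        return False
    # Only now is the dword at 0x3c treated as the offset of the segmented header.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return data[e_lfanew:e_lfanew + 2] == b"NE"
```

For the ulink sample, e_lfarlc is 0x40 but the dword at 0x3c is zero padding, so the signature comparison fails and the file falls through to the old-style case.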

0x40: I'm not sure that is needed, because of:

A new-format .EXE file is
identified if the segmented executable header contains a valid
signature.

Perhaps it's not needed, but since I'm going to fix this for 9.1, I don't want to risk changing the behavior too much. I'm just going to add the additional check to address the problem while minimizing impact.

I would also add 'PE' to the check:

if MZ
  if NE -> New Executable
  else if PE -> PE
  else -> Old-Style
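
Spelled out, that suggestion might look like the sketch below. It deliberately skips the 0x40 test on e_lfarlc, which is the commenter's proposal and not what the maintainer said would go into 9.1. The name `classify_mz` and the returned strings are made up for illustration, and other signatures (for example LE/LX) are ignored.

```python
import struct

def classify_mz(data: bytes) -> str:
    """Classify by the signature found at the offset stored at 0x3c."""
    if data[:2] != b"MZ":
        return "not MZ"
    if len(data) >= 0x40:
        (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
        # Slicing past the end of the file simply yields fewer (or no) bytes.
        signature = data[e_lfanew:e_lfanew + 4]
        if signature[:2] == b"NE":
            return "NE"
        if signature == b"PE\0\0":
            return "PE"
    return "Old-Style"
```

One trade-off of this variant: in an old-style exe the dword at 0x3c may be arbitrary code, data or relocation entries, so the classification relies entirely on the signature comparison rejecting whatever that offset happens to point at.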

Even valid PE files (according to IDA) are detected by Ghidra as NE files, although there is no valid NE header (the 'NE' signature is missing from the segmented header), for example MSVC 1.5 link.exe.

I must correct myself: IDA wrongly detects the MSVC 1.5 link.exe as a PE file, but it's an Old-Style DOS Exe with the Phar Lap extender.
http://s000.tinyupload.com/?file_id=12663273413500823409 (the PE.msvc1.5.LINK.EXE seems to be an Old-Style DOS Exe)
