Radare2: Python-compiled ELF files need to be analyzed with "-e anal.strings=true" on the r2 command line

Created on 20 Sep 2018  ·  10 Comments  ·  Source: radareorg/radare2

Work environment

| Questions | Answers
|------------------------------------------------------|--------------------
| OS/arch/bits (mandatory) | OSX Darwin Kernel Version 17.4.0 x86_64 & FreeBSD x86_32
| File format of the file you reverse (mandatory) | ELF
| Architecture/bits of the file (mandatory) | 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, stripped
| r2 -v full output, not truncated (mandatory) | radare2 3.0.0-git 19702 @ darwin-x86-64 git.2.9.0-156-gcb1be2727 commit: cb1be27272cd071823b75a88e91b2512d4f7a5f3 build: 2018-09-20__14:19:15
| sample | MD5 (sample) = 56303f9c9b3ec89f4a883a4d7b079f65

Expected behavior

[0x00401a92]> pdf 0x00401a75
            ;-- entry0:
┌ (fcn) fcn.rip 41
│   fcn.rip (int arg3);
│           ; arg int arg3 @ rdx
│           0x00401a75      31ed           xor ebp, ebp
│           0x00401a77      4989d1         mov r9, rdx                 ; arg3
│           0x00401a7a      5e             pop rsi
│           0x00401a7b      4889e2         mov rdx, rsp
│           0x00401a7e      4883e4f0       and rsp, 0xfffffffffffffff0
│           0x00401a82      50             push rax
│           0x00401a83      54             push rsp
│           0x00401a84      49c7c0a05240.  mov r8, 0x4052a0
│           0x00401a8b  ~   48c7c1305240.  mov rcx, 0x405230           ; '0R@' ; 
            0x00401a92  ~   48c7c7701a40.  mov rdi, 0x401a70 ; section..text
            0x00401a99      e822fdffff     call sym.imp.__libc_start_main ; int __libc_start_main(func main, int argc, char **ubp_av, func init, func fini, func rtld_fini, void *stack_end)
            0x00401a9e      f4             hlt

Actual behavior

[0x00401a75]> pdf
            ;-- rip:
┌ (fcn) entry0 41
│   entry0 (func rtld_fini);
│           ; arg func rtld_fini @ rdx
│           0x00401a75      31ed           xor ebp, ebp
│           0x00401a77      4989d1         mov r9, rdx                 ; func rtld_fini
│           0x00401a7a      5e             pop rsi                     ; int argc
│           0x00401a7b      4889e2         mov rdx, rsp                ; char **ubp_av
│           0x00401a7e      4883e4f0       and rsp, 0xfffffffffffffff0
│           0x00401a82      50             push rax
│           0x00401a83      54             push rsp
│           0x00401a84      49c7c0a05240.  mov r8, 0x4052a0            ; func fini
│           0x00401a8b  ~   48c7c1305240.  mov rcx, 0x405230           ; '0R@' ; "AWA\x89\xffAVI\x89\xf6AUI\x89\xd5ATL\x8d%`! " ; func init
│           ;-- "kzĊ>ܺ":
│           0x00401a8d     .string "kz\xc4\x8a>\xdc\xba" ; len=8
│           0x00401a95      701a           jo 0x401ab1                 ; fcn.00401aa0+0x11
│           0x00401a97  ~   4000e8         add al, bpl
│           ;-- "1=k`kq)":
│           0x00401a98     .string "1=k`kq)" ; len=8
[0x00401a75]>

Steps to reproduce the behavior

  • Use the sample identified by the MD5 hash above; it cannot be shared openly.
  • Open it in r2 and run the commands aaa, af, and pdf at the entry point.

Suggested problem

String parsing was overriding the opcode (assembly) decoding.

Additional Logs, screenshots, source-code, configuration dump, ...

Screenshot1: wrong opcode reading PoC:

Screenshot2: how to produce the correct opcode reading PoC:

Additional:

This is a Python-compiled ELF binary.

This "unusual phenomenon" was reported by @unixfreaxjp.

ELF bug

All 10 comments

Greetings @unixfreaxjp,

Thanks for reporting, I can reproduce.

Last time we had this issue it was due to section overlapping and got fixed, but I cannot find that issue in the tracker.

Team, I have uploaded the binary in the usual location for further review.

Regards,

Thanks for the reply.

> Last time we had this issue it was due to section overlapping and got fixed

I worry this is _maybe_ a capstone bug; let me know if that is so and it can't be fixed on our side.

A lot of other places are misread as ".string" like this one, too:

  ││└────< 0x00404e50      0f896affffff   jns 0x404dc0
│  ││ ⁝││   0x00404e56      bec0010000     mov esi, 0x1c0              ; 448
│  ││ ⁝││   0x00404e5b      4c89e7         mov rdi, r12
│  ││ ⁝││   0x00404e5e      e81dc8ffff     call sym.imp.mkdir
│  ││ │││   ;-- "}NW-0ˡ":
│  ││ └───< 0x00404e63     .string "}NW-0\xcb\xa1" ;  <==== THIS
..
│  │        0x00404e72      004881         invalid
│  │        0x00404e73      90             nop
│  │        0x00404e74      2000           and byte [rax], al
│  │        0x00404e76      0031           add byte [rcx], dh
│  │        0x00404e78      c05b5d41       rcr byte [rbx + 0x5d], 0x41
│  │        0x00404e7c      5c             pop rsp
│  │        0x00404e7d      c3             ret

cc: @Maijin @XVilka @radare @wargio

Use -e anal.strings=true and the output should be good. This is unrelated to capstone.

This bin probably has no rodata section for strings, so r2 finds raw strings anywhere.
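
As an illustration of how "raw strings" can turn up inside code, here is a minimal sketch of a naive printable-ASCII scanner. It is a toy and not radare2's actual string detection; the function name `ascii_runs` and the `min_len` threshold are assumptions made for this example. The input bytes are the encoding of `mov rcx, 0x405230` at 0x00401a8b, with the final byte reconstructed from the 32-bit immediate shown in the disassembly.

```python
import string

# Printable ASCII bytes, keeping the space but dropping other whitespace.
PRINTABLE = set(string.printable.encode("ascii")) - set(b"\t\n\r\x0b\x0c")


def ascii_runs(data: bytes, base: int = 0, min_len: int = 3):
    """Return (address, text) for every run of printable bytes of at least min_len."""
    runs, start = [], None
    # A trailing NUL acts as a sentinel so the last run is flushed.
    for i, byte in enumerate(data + b"\x00"):
        if byte in PRINTABLE:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((base + start, data[start:i].decode("ascii")))
            start = None
    return runs


# Bytes of "mov rcx, 0x405230" (48 c7 c1 + little-endian immediate).
insn = bytes.fromhex("48c7c130524000")
for addr, text in ascii_runs(insn, base=0x00401A8B):
    print(f"{addr:#010x} {text!r}")
```

The scanner reports the three immediate bytes as the text '0R@', the same text that appears in the r2 comment on that instruction. A scanner with no notion of where code ends and data begins will happily label such bytes as strings.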

@radare there are strings in .rodata, but there is also a HUGE pydata section that is probably causing overlap. Setting anal.strings=true fixes the issue for this use case.
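
To make the idea of overlapping sections concrete, here is a small sketch that checks address ranges for intersection. The section addresses and sizes below are hypothetical values chosen for illustration and are not taken from the sample; the helper name `overlapping` is likewise an assumption.

```python
from itertools import combinations


def overlapping(sections):
    """Return name pairs whose [addr, addr + size) ranges intersect."""
    pairs = []
    for (n1, a1, s1), (n2, a2, s2) in combinations(sections, 2):
        # Two half-open ranges intersect when each starts before the other ends.
        if a1 < a2 + s2 and a2 < a1 + s1:
            pairs.append((n1, n2))
    return pairs


# HYPOTHETICAL layout for illustration only, not read from the real binary.
sections = [
    (".text", 0x401000, 0x5000),
    (".rodata", 0x406000, 0x1000),
    ("pydata", 0x405800, 0x100000),
]

for a, b in overlapping(sections):
    print(f"{a} overlaps {b}")
```

With this made-up layout, the large section intersects both .text and .rodata, which is the kind of situation that could let a string scan spill into code.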

> -e anal.strings=true and output should be good.

Thank you, it seems to work. However, there are two different ways to set anal.strings=true: the first sets the variable after r2 has loaded the sample with default settings, and the second passes it on the command line while loading the sample. Only one of them gives a correct result; may I ask why? If I may comment, users will tend to do it the first way.

This one results in incorrect opcode reading (anal.strings=true):

This one results in correct opcode reading (anal.strings=true):

cc: @maijin @radare @XVilka @wargio

This eval variable anal.strings=true needs to be passed at opening time because it is checked while the information in the binary is parsed.

Okay, it seems this issue involves a "new" kind of ELF file that many people have not seen before.
The solution for handling such a sample is to open it with:

r2 -e anal.strings=true {sample-file-name}
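
For scripted analysis, the same invocation can be assembled programmatically. The following is a minimal sketch that only builds the argument list and does not run r2; the helper name `r2_argv` is an assumption, and it relies on the `-e`, `-q`, and `-c` options of the r2 command line. The point is that the eval variable is placed on the command line so it is in effect when the file is opened.

```python
import shlex


def r2_argv(sample, evals, commands=()):
    """Build an r2 command line that sets eval variables at opening time."""
    argv = ["r2"]
    for key, value in evals.items():
        # -e key=value is applied before the binary is parsed.
        argv += ["-e", f"{key}={value}"]
    if commands:
        # -q suppresses the prompt; -c runs the given r2 commands.
        argv += ["-q", "-c", "; ".join(commands)]
    argv.append(sample)
    return argv


argv = r2_argv("sample", {"anal.strings": "true"}, ["aaa", "pdf @ entry0"])
print(shlex.join(argv))
```

The resulting list can be handed to `subprocess.run` on a machine where r2 is installed; "sample" stands in for the real file name.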

I am closing this #PythonELF issue. Feel free to improve r2 (if needed) so that similar ELF files are detected automatically, maybe as an enhancement for 3.0 (just a suggestion). Thanks for the replies.

This bin should be handled properly by default. anal.strings=true fixes the issue, and that's how all other tools (bn, ida, hopper) work, but it's not the default in r2 for some reason. Maijin should have a fix in a PR soon.

Closed #11590.


> This bin should be handled properly by default

@radare Oh, that is wonderful. Thank you.

I fixed it in master now.
@unixfreaxjp if you have a legitimate binary where this behavior can be found, let us know so we can add a testcase to avoid future regressions.
