Radare2: Missing Exports and Imports on ELF

Created on 5 Nov 2018 · 30 Comments · Source: radareorg/radare2

Work environment

| Questions | Answers
|------------------------------------------------------|--------------------
| OS/arch/bits (mandatory) | Ubuntu x86 64
| File format of the file you reverse (mandatory) | ELF
| Architecture/bits of the file (mandatory) | Aarch64.
| r2 -v full output, not truncated (mandatory) | radare2 3.1.0-git 20076 @ linux-x86-64 git.3.0.1-140-ga0844ef2c commit: a0844ef2c3a2e2852e975634686f0eca4a447093 build: 2018-11-04__22:23:18

Expected behavior

Show exports (1) and imports (5 or so from libc)

  • Imports:
Address Ordinal Name    Library
0000000000020068        realloc 
0000000000020070        __cxa_finalize  
0000000000020078        __stack_chk_fail    
0000000000020080        malloc  
0000000000020088        memcpy  
0000000000020090        memset  
0000000000020098        free    
00000000000200A0        __cxa_atexit    
00000000000200A8        _bss_end__  
00000000000200B0        __bss_start 
00000000000200B8        __end__ 
00000000000200C0        __bss_start__   
00000000000200C8        _edata  
00000000000200D0        __bss_end__ 
00000000000200D8        _end    
  • Exports:
Name    Address Ordinal
dc_::Java_o__003dc_e    000000000000AA20    
start   0000000000000730    [main entry]

Actual behavior

Both lists come up empty.

Steps to reproduce the behavior

See attached file
libarm64.zip

ELF IMPORTANT RBin bug

All 30 comments

the arch is totally unrelated. this issue is related to the ELF parser

Is this then related to the segments within the binary?

I think so

@radare I don't think I will be able to look at this for 3.1. Can we move it to 3.2? I will work on ELF and Bin for the next release too, so hopefully I will fix this as part of the refactoring.

👍

can you retry @enovella? there was a similar fix last week

also we need tests

Testing new changes with no luck so far:

[13:27 edu@love tmp] >  r2 libarm64.so 
 -- Find wide-char strings with the '/w <string>' command
[0x00000730]> il
[Linked libraries]
libandroid.so
liblog.so
libm.so
libdl.so
libc.so

5 libraries

[0x00000730]> is
[Symbols]
Num Paddr      Vaddr      Bind     Type Size Name
001 0x00000730 0x00000730  LOCAL   SECT    0
002 0x00010000 0x00020000  LOCAL   SECT    0

[0x00000730]> iS
[Sections]
Nm Paddr       Size Vaddr      Memsz Perms Name
00 0x00000000     0 0x00000000     0 ---- 
01 0x00000450   278 0x00000450   278 -r-- .dynstr
02 0x0000fd90   528 0x0001fd90   528 -rw- .dynamic
03 0x000100bc   216 0x00000000   216 ---- .shstrtab

Empty import/exports:

[13:31 edu@love tmp] >  r2 -v
radare2 3.2.0-git 20713 @ linux-x86-64 git.3.0.0-717-gdc46527d1
commit: dc46527d16d757ad6d1c3299429551cbfe0a9158 build: 2019-01-03__13:24:06

[13:31 edu@love tmp] >  r2 libarm64.so 
 -- r2 talks to you. tries to make you feel well.
[0x00000730]> aaa
[x] Analyze all flags starting with sym. and entry0 (aa)
[x] Analyze value pointers (aav)
[x] Value from 0x00020058 to 0x00020060 (aav)
[x] 0x00020058-0x00020060 in 0x20058-0x20060 (aav)
[x] 0x00020058-0x00020060 in 0x1fd78-0x20058 (aav)
[x] 0x00020058-0x00020060 in 0x0-0xf860 (aav)
[x] Value from 0x0001fd78 to 0x00020058 (aav)
[x] 0x0001fd78-0x00020058 in 0x20058-0x20060 (aav)
[x] 0x0001fd78-0x00020058 in 0x1fd78-0x20058 (aav)
[x] 0x0001fd78-0x00020058 in 0x0-0xf860 (aav)
[x] Value from 0x00000000 to 0x0000f860 (aav)
[x] 0x00000000-0x0000f860 in 0x20058-0x20060 (aav)
[x] 0x00000000-0x0000f860 in 0x1fd78-0x20058 (aav)
[x] 0x00000000-0x0000f860 in 0x0-0xf860 (aav)
[x] Emulate code to find computed references (aae)
[x] Analyze function calls (aac)
[x] Analyze len bytes of instructions for references (aar)
[x] Constructing a function name for fcn.* and sym.func.* functions (aan)
[x] Type matching analysis for all functions (afta)
[x] Use -AA or aaaa to perform additional experimental analysis.
[0x00000730]> ii
[Imports]
Num  Vaddr       Bind      Type Name
   0 0x00000000  (null)  (null) 

[0x00000730]> iE
[Exports]
Num Paddr      Vaddr      Bind     Type Size Name

[0x00000730]> 

(Please note that you don't need to analyze to get imports or exports or anything related to the information command.)

hey @Maijin, yeah I knew that, but thanks for info.

@ret2libc any comments on this? I assume we should move it to 3.5 again :{ but it would be good to know if anybody has looked at it, so we can at least have an estimate of the problem

No comments yet, sorry. I'll try to prioritize this for next 3.5.0, to at least understand what's the exact problem and fix it if possible.

So I've been looking at this issue, and it seems the problem comes from the parser relying on sections for symbol resolution. As a quick overview, the file has a .dynamic and a .dynstr section but is missing the .dynsym section, which is the dynamic symbol table whose names are stored in .dynstr. However, we can obtain the same address by parsing the dynamic section/segment entries and finding the DT_SYMTAB entry, which points to the address .dynsym would point to:

 readelf -d libarm64.so | grep SYMTAB
 0x0000000000000006 (SYMTAB)             0x288

This would make it possible to recover imports and exports. Since the few sections that the binary has do not seem to be corrupted, this will work, but I would change the approach of resolving the dynamic symbols via sections and rewrite the parsing to be more segment-oriented.

readelf fails for the same reason:

readelf --dyn-syms libarm64.so 
(empty)

Will try to look at the code in the coming days. A bit busy right now, but I just wanted to comment on this so the issue is known :).
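
The DT_SYMTAB lookup described above can be sketched in a few lines of Python. This is only an illustration, not radare2's code: it assumes a little-endian ELF64 image, and the function name dynamic_entries is made up for this example. It walks the program headers, finds PT_DYNAMIC, and collects the tag/value pairs without reading any section header.

```python
import struct

PT_DYNAMIC = 2
DT_NULL, DT_HASH, DT_STRTAB, DT_SYMTAB = 0, 4, 5, 6


def dynamic_entries(data):
    """Return {tag: value} from the PT_DYNAMIC segment of an ELF64 LE image.

    Hypothetical helper for illustration; only program headers are used.
    """
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    # e_phoff lives at 0x20, e_phentsize/e_phnum at 0x36/0x38 in Elf64_Ehdr.
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        # Elf64_Phdr: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, ...
        p_type, _flags, p_offset, _vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", data, off
        )
        if p_type != PT_DYNAMIC:
            continue
        entries = {}
        # Each Elf64_Dyn is a signed 64-bit tag followed by a 64-bit value.
        for pos in range(p_offset, p_offset + p_filesz - 15, 16):
            tag, val = struct.unpack_from("<qQ", data, pos)
            if tag == DT_NULL:
                break
            entries.setdefault(tag, val)
        return entries
    return {}
```

For the attached library, dynamic_entries(...)[DT_SYMTAB] should correspond to the 0x288 that readelf -d prints. The value is a virtual address, so it still has to be mapped to a file offset through the PT_LOAD segments before the table can be read.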

Thanks for the feedback! I bet the fix should be easy: just honor the segments and use sections as a fallback. I don't think the linkers use sections at all.
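
"Honoring the segments" mostly means translating virtual addresses through the PT_LOAD program headers instead of through section headers. A minimal sketch of that translation, assuming a little-endian ELF64 image (vaddr_to_offset is a hypothetical name, not a radare2 function):

```python
import struct

PT_LOAD = 1


def vaddr_to_offset(data, vaddr):
    """Map a virtual address to a file offset using PT_LOAD segments only.

    Returns None when no loadable segment backs the address in the file.
    """
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    (e_phoff,) = struct.unpack_from("<Q", data, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 0x36)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        p_type, _flags, p_offset, p_vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", data, off
        )
        # Only the file-backed part of the segment (p_filesz) can be read.
        if p_type == PT_LOAD and p_vaddr <= vaddr < p_vaddr + p_filesz:
            return vaddr - p_vaddr + p_offset
    return None
```

In the iS listing earlier in the thread, .dynamic sits at paddr 0xfd90 and vaddr 0x1fd90, so a segment-based translation of 0x1fd90 would be expected to land on 0xfd90 as well; the exact segment layout of the sample is not shown in the thread, so treat that as an assumption.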

In terms of the exports, since there is no .symtab section they are not easy to recover, although it seems the binary has only 2 exports and one of them is the entrypoint.

can you try again @enovella ? i think this issue is fixed

Still here, it seems:
[image: screenshot not preserved]

@ret2libc is this ready for testing?

R2 is always ready

I meant for this test

[0x00000730]> ii
[Imports]
nth vaddr      bind type lib name
―――――――――――――――――――――――――――――――――
0   0x00000000 NONE NONE     


[0x00000730]> iE
[Exports]

nth paddr vaddr bind type size lib name
―――――――――――――――――――――――――――――――――――――――

[0x00000730]> !r2 -v
radare2 4.5.0-git 25192 @ linux-x86-64 git.4.4.0-64-gd3ecd271c
commit: d3ecd271cf81cc21a8a43ff9b59474023def5ead build: 2020-04-22__20:01:14

Oh, that's a pity :( I thought the latest changes done by @08A would fix this as well, but there is probably something else. Anyway, there were already a lot of changes in the right direction, but unfortunately many places use section names instead of DYNAMIC entries :(

So, I had a very quick look and I think there are two problems:
1) symbols/imports still rely on sections too much, instead of using first information from DYNAMIC and then, at most, augment them with section info
2) reloc_convert does not handle many of the relocations that are not x86/x86_64/arm

@08A since you worked heavily on elf.c, would you enjoy moving symbols/imports from sections to dynamic+sections as you have done for other parts of the code? (don't feel forced to, though, I'm just asking :) )

@enovella if you feel familiar enough with aarch64, what about doing point 2? :) It will allow finding the right relocations in AARCH64 binaries.
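
To make point 2 more concrete, here is a rough sketch of what handling the common AArch64 dynamic relocations involves. It is not a port of reloc_convert: decode_rela and the "S+A" / "B+A" labels are invented for this example, and only four relocation types from the AArch64 ELF ABI are covered. Each Elf64_Rela entry is split into symbol index and type, and the type decides whether the slot is resolved from a symbol or from the load base.

```python
import struct

# Relocation type numbers from the AArch64 ELF ABI.
R_AARCH64_ABS64 = 257
R_AARCH64_GLOB_DAT = 1025
R_AARCH64_JUMP_SLOT = 1026
R_AARCH64_RELATIVE = 1027


def decode_rela(data, offset, count):
    """Decode `count` Elf64_Rela entries starting at file offset `offset`.

    Returns (r_offset, symbol_index, kind, addend) tuples, where kind is
    "S+A" (symbol value plus addend), "B+A" (load base plus addend) or
    "unhandled" for anything this sketch does not know about.
    """
    out = []
    for i in range(count):
        # Elf64_Rela: r_offset (u64), r_info (u64), r_addend (s64) = 24 bytes.
        r_offset, r_info, r_addend = struct.unpack_from(
            "<QQq", data, offset + 24 * i
        )
        sym, rtype = r_info >> 32, r_info & 0xFFFFFFFF
        if rtype in (R_AARCH64_ABS64, R_AARCH64_GLOB_DAT, R_AARCH64_JUMP_SLOT):
            kind = "S+A"
        elif rtype == R_AARCH64_RELATIVE:
            kind = "B+A"
        else:
            kind = "unhandled"
        out.append((r_offset, sym, kind, r_addend))
    return out
```

JUMP_SLOT and GLOB_DAT entries are the ones that tie a GOT slot to an imported symbol, which is presumably how addresses such as those in the expected import list get their names.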

> @08A since you worked heavily on elf.c, would you enjoy moving symbols/imports from sections to dynamic+sections as you have done for other parts of the code? (don't feel forced to, though, I'm just asking :) )

Okay why not. But before doing that I will try to do a small refactoring on the relocations usage inside elf.c.

All relocation entries are loaded 3 times... in a row...

I see, yeah. Do you plan to cache relocs entries in ELFOBJ?

Yes

I started tracing the problem:

r_bin_object_set_items => libr/bin/bobj.c
imports => libr/bin/p/bin_elf.inc
Elf_(r_bin_elf_get_imports) => libr/bin/format/elf/elf.c
Elf_(_r_bin_elf_get_symbols_imports) => libr/bin/format/elf/elf.c
Elf_(get_phdr_symbols) => libr/bin/format/elf/elf.c
Elf_(r_bin_elf_get_phdr_symbols) => libr/bin/format/elf/elf.c
get_symbols_from_phdr => libr/bin/format/elf/elf.c

I am pretty sure that the problem comes from the function get_symbols_from_phdr.

If I hard-code the number of imports (taken from readelf --symbols --use-dynamic libarm64.so), I get this:
[screenshot: 2020-06-02-171117_378x311_scrot]

โฏ readelf --symbols --use-dynamic libarm64.so

Symbol table for image:
  Num Buc:    Value          Size   Type   Bind Vis      Ndx Name
   15   0: 0000000000000000     0 FUNC    GLOBAL DEFAULT UND memset
   13   0: 0000000000000000     0 FUNC    GLOBAL DEFAULT UND memcpy
   11   0: 0000000000020058     0 NOTYPE  GLOBAL DEFAULT UND _edata
   10   0: 0000000000020058     0 NOTYPE  GLOBAL DEFAULT UND __bss_start__
    7   0: 0000000000000000     0 FUNC    GLOBAL DEFAULT UND __stack_chk_fail
    2   0: 0000000000020000     0 SECTION LOCAL  DEFAULT bad
    1   0: 0000000000000730     0 SECTION LOCAL  DEFAULT bad
   18   1: 0000000000000000     0 FUNC    GLOBAL DEFAULT UND __cxa_atexit
   17   1: 0000000000000000     0 FUNC    GLOBAL DEFAULT UND free
   16   1: 0000000000020060     0 NOTYPE  GLOBAL DEFAULT UND _end
   14   1: 0000000000020060     0 NOTYPE  GLOBAL DEFAULT UND __bss_end__
    6   1: 000000000000aa20  1176 FUNC    GLOBAL DEFAULT bad Java_o__003dc_e
   12   2: 0000000000000000     0 FUNC    GLOBAL DEFAULT UND malloc
    9   2: 0000000000020060     0 NOTYPE  GLOBAL DEFAULT UND __end__
    8   2: 0000000000020058     0 NOTYPE  GLOBAL DEFAULT UND __bss_start
    5   2: 0000000000000000     0 FUNC    GLOBAL DEFAULT UND __cxa_finalize
    4   2: 0000000000000000     0 FUNC    GLOBAL DEFAULT UND realloc
    3   2: 0000000000020060     0 NOTYPE  GLOBAL DEFAULT UND _bss_end__

If we look at entry 2, its value (0x20000) is greater than 10000 bytes (the size of the binary).
That is why the number of symbols ends up as 0.
We have two solutions:

  • we don't check that tmp_offset > bin->size
  • we don't check that tmp_offset > bin->size and we use a better way to compute the number of symbols
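
For the second option, one common way to count dynamic symbols without section headers is to read nchain from the SysV hash table that DT_HASH points to, since nchain equals the number of symbol table entries. The thread does not say which method the eventual fix used, so the sketch below is only one possibility; symbol_count_from_sysv_hash is a hypothetical name, and it assumes 32-bit little-endian hash words, as used on AArch64.

```python
import struct


def symbol_count_from_sysv_hash(data, hash_offset):
    """Return the dynamic symbol count from a SysV (DT_HASH) hash table.

    `hash_offset` is the file offset of the table, i.e. the DT_HASH
    address already translated through the PT_LOAD segments.
    """
    if hash_offset < 0 or hash_offset + 8 > len(data):
        raise ValueError("hash table header lies outside the file")
    # Layout: nbucket, nchain, bucket[nbucket], chain[nchain] (32-bit words).
    nbucket, nchain = struct.unpack_from("<II", data, hash_offset)
    table_end = hash_offset + 4 * (2 + nbucket + nchain)
    if table_end > len(data):
        raise ValueError("hash table is truncated")
    # One chain slot exists per symbol, including the null symbol at index 0.
    return nchain
```

The bounds check here is applied to the table's file offset rather than to symbol values, so a symbol whose st_value is a large virtual address does not affect the count.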

@ret2libc we can close this issue?

Yes, thanks for fixing this!
