Radare2: Resolve calls to objc_msgSend

Created on 12 Dec 2017 · 13 Comments · Source: radareorg/radare2

This is an important feature request, because it would make analyzing iOS apps with _radare2_ possible, or at least much better.

The problem

In Objective-C, methods are not called directly. Instead, every message send is compiled into a call to a function called _objc_msgSend_, which is then responsible for calling the appropriate method [1]. This means you will see almost nothing but calls to _objc_msgSend_ in the disassembly.

[obj message] -> objc_msgSend(obj, @selector(message))

The first parameter is the receiving Objective-C object itself, the second parameter is the selector, and the remaining parameters are the arguments passed to the Objective-C method.
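
To make the mapping above concrete, here is a minimal Python sketch that rewrites a simple, non-nested message expression into the equivalent _objc_msgSend_ form. The function name to_msgsend is made up for illustration, and the parsing is deliberately naive (no nested messages, no string arguments containing spaces).

```python
import re


def to_msgsend(expr: str) -> str:
    """Rewrite '[receiver message]' as an objc_msgSend(...) call (illustrative only)."""
    text = expr.strip()
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected a bracketed message expression")
    receiver, _, message = text[1:-1].strip().partition(" ")
    message = message.strip()
    if not receiver or not message:
        raise ValueError("expected '[receiver message]'")
    if ":" not in message:
        # Unary message: the selector is the message name itself.
        return f"objc_msgSend({receiver}, @selector({message}))"
    # Keyword message: 'setX:1 y:2' -> selector 'setX:y:' and arguments 1, 2.
    pairs = re.findall(r"(\w+):\s*(\S+)", message)
    selector = "".join(f"{keyword}:" for keyword, _ in pairs)
    args = ", ".join(arg for _, arg in pairs)
    return f"objc_msgSend({receiver}, @selector({selector}), {args})"
```

For example, to_msgsend("[obj setX:1 y:2]") yields a call whose first two arguments are the receiver and the selector, followed by the method arguments.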

The problem is that _radare2_ cannot build a useful call graph or control flow graph, because it does not take the actually called methods into account. This means you cannot tell what the app is actually doing or perform any analysis.

Other tools already have a solution

Other tools are capable of resolving the calls and patching them with the actual methods. For example, IDA has long had this script by zynamics [2], [3], and with the current version 7.0 they even added their own solution to the problem [4], [5]. Hopper is also able to generate a useful control flow graph with the actual calls [6].

How to resolve the calls?

Fortunately, you can tell which method will be called by determining the arguments passed to the _objc_msgSend_ function. To do that, you need to determine which values are stored in certain registers at the call site.

Arm (32 bit):
r0: Pointer to the object that implements the method (the receiver)
r1: Name of the method (the selector)
r2, r3: Parameters passed to the method

If there are more parameters, they are stored on the stack.

Arm64 (64 bit):
x0: Pointer to the object that implements the method (the receiver)
x1: Name of the method (the selector)
x2 to x7: Parameters passed to the method
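
As a small sketch, the register roles listed above can be kept in a lookup table so that later analysis code does not hard-code register names. The names MSGSEND_REGISTERS and msgsend_registers are hypothetical, and only the two ARM variants from this post are covered.

```python
# Register roles at an objc_msgSend call site, as listed in the post.
MSGSEND_REGISTERS = {
    32: {"receiver": "r0", "selector": "r1", "method_args": ["r2", "r3"]},
    64: {
        "receiver": "x0",
        "selector": "x1",
        "method_args": ["x2", "x3", "x4", "x5", "x6", "x7"],
    },
}


def msgsend_registers(bits: int) -> dict:
    """Return the register roles for 32-bit Arm or 64-bit Arm64."""
    try:
        return MSGSEND_REGISTERS[bits]
    except KeyError:
        raise ValueError(f"unsupported word size: {bits}") from None
```

A caller would look up msgsend_registers(64)["selector"] to learn that the selector has to be read from x1.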

A research team developed a technique they call _„backward slicing and forward constant propagation“_ for determining the values [7].
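
The following is not the technique from [7], only a crude illustration of its first step: walking backward from a call to find the most recent instruction that writes a given register. The instruction dictionaries (with "offset" and "disasm" keys), the helper name last_write and the list of non-writing mnemonics are assumptions made for this sketch. It ignores register aliases such as w1/x1 and instructions that write more than one register.

```python
# Mnemonics whose first operand is not a destination register (incomplete list).
NON_WRITING = {
    "str", "strb", "strh", "stur", "stp",
    "cmp", "cmn", "tst",
    "b", "bl", "blx", "bx", "cbz", "cbnz", "ret", "nop",
}


def last_write(ops, call_index, reg):
    """Return the last instruction before ops[call_index] that writes reg, or None."""
    for op in reversed(ops[:call_index]):
        mnemonic, _, operands = op["disasm"].partition(" ")
        if mnemonic.lower() in NON_WRITING or not operands:
            continue
        # For most data-processing and load instructions the first operand
        # is the destination register.
        destination = operands.split(",")[0].strip()
        if destination == reg:
            return op
    return None
```

A real implementation would then have to evaluate the instruction that was found (and recursively its inputs), which is exactly the work that ESIL emulation can take over.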

Current approach with radare2

First I tried to evaluate all the instructions that modify r0/x0 and r1/x1. But why do all that work when _radare2_ can determine the values by emulating the instructions with ESIL? With e asm.emu=true and pdf or pdc, it even shows me the arguments of _objc_msgSend_ as comments in most cases. Unfortunately, pdfj does not include the comments, so I have to parse the arguments of _objc_msgSend_ out of the comments in the text output. And this really isn’t elegant.
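
As a rough sketch of the part that does work with JSON output, the function below collects the _objc_msgSend_ call sites from an already parsed pdfj result. It assumes that the result is a dictionary with an "ops" list whose entries carry "offset" and "disasm" keys; the helper name msgsend_call_sites is made up. It does not recover the arguments, which is the missing piece described above.

```python
def msgsend_call_sites(function_json):
    """Return (offset, disasm) for every instruction that mentions objc_msgSend."""
    sites = []
    for op in function_json.get("ops", []):
        disasm = op.get("disasm", "")
        if "objc_msgSend" in disasm:
            sites.append((op["offset"], disasm))
    return sites
```

With r2pipe, the input could come from something like r2.cmdj("pdfj") after opening and analyzing the binary; the function itself only needs the parsed dictionary.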

Feature request

I think that with ESIL, _radare2_ already has everything it needs to resolve the calls. It would be great if you could adjust the parts that are needed to provide these new commands, so we can skip the dirty parsing:

  • command 1: Override all the _objc_msgSend_ calls with the actual call (_class.method_) for all methods
  • command 2: Override all the _objc_msgSend_ calls with the actual call (_class.method_) for the current method

Then we would have useful call graphs and CFGs and could perform real analysis. This feature alone could be reason enough to choose _radare2_ for static iOS analysis.

*Note that there are 4 other functions for sending messages: _objc_msgSendSuper_, _objc_msgSendSuper_stret_, _objc_msgSend_stret_ and _objc_msgSend_fpret_ [8].
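
A resolver would therefore have to recognise all five function names, not just _objc_msgSend_. A minimal sketch follows, assuming that symbols may appear with a dotted prefix such as sym.imp. or with a leading underscore; the helper name msgsend_variant is hypothetical.

```python
MSGSEND_FUNCTIONS = (
    "objc_msgSend",
    "objc_msgSendSuper",
    "objc_msgSendSuper_stret",
    "objc_msgSend_stret",
    "objc_msgSend_fpret",
)


def msgsend_variant(symbol):
    """Return the message-sending function a symbol refers to, or None."""
    # Drop a dotted prefix ('sym.imp.') and any leading underscores.
    base = symbol.rsplit(".", 1)[-1].lstrip("_")
    return base if base in MSGSEND_FUNCTIONS else None
```

Keep in mind that the Super and stret variants do not use the same argument layout as plain _objc_msgSend_, so recognising the name is only the first step.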

All 13 comments

Just e asm.emustr=true

(In reply to the original post by Daniel Corak, 12 Dec 2017, 15:18.)

This was resolved a long time ago in r2 already. There are also some third-party scripts that do the same.

„This was resolved long time ago in r2 already.“

But how does r2 resolve the calls? Could you please explain to me how I can tell, using r2, which method of which class will be called behind an _objc_msgSend_ call, so that I can build a control flow graph for an iOS app?

„There are also some third party scripts that do the same“

But the available scripts are only for IDA Pro [1] and Hopper [2], not for r2. In fact, there is still an open issue for porting these scripts to r2: https://github.com/radare/radare2/issues/7259.

Yep, there are scripts. Not sure if they are public, but that should be implemented in C. Also, aae can emulate the code to resolve those calls, but that will not create the refs. Those are just comments.

The support can be improved. So I'm not saying that it's a perfect solution, but it works well for some use cases.

@YugoCode I wrote the script for radare2: https://github.com/alvarofe/r2-scripts/blob/master/ios/objc.py. In the upcoming days I will make a package for r2pm.

Bear in mind that it does not patch the calls; it creates xrefs.

Okay, thanks to @alvarofe we can now access the _call-xrefs_ with radare2. But how do we solve the initial issue - creating a call graph (+ control flow graph) for iOS apps with radare2?

Call Graph

My idea for creating the call graph with radare2:

1. Create a data structure, which contains all the methods from the flags classes and imports
2. Iterate through every method from the data structure
2.1. Get the call-xrefs for every method with axtj method.*class*.*methodName*
2.2. Get the method name from the _call-xref_
2.3. Add this method name as the calling method to the current iterated method in the data structure
3. We should now be able to draw a call graph from the data structure
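
The steps above could be sketched in Python roughly as follows. The xref lookup is passed in as a function so that the sketch stays independent of how the data is obtained; with r2pipe it could wrap a call such as r2.cmdj("axtj " + method). The function name build_call_graph and the fallback to a "fcn_name" key are assumptions; "realname" is the entry mentioned in this thread.

```python
def build_call_graph(methods, xrefs_to):
    """Map every caller to the sorted list of known methods it calls.

    methods:  iterable of method flag names (step 1)
    xrefs_to: function returning the list of xref dicts for one method (step 2.1)
    """
    graph = {}
    for callee in methods:
        graph.setdefault(callee, set())  # every known method becomes a node
        for xref in xrefs_to(callee) or []:
            # Step 2.2: name of the function containing the call.
            caller = xref.get("realname") or xref.get("fcn_name")
            if caller:
                # Step 2.3: record the edge caller -> callee.
                graph.setdefault(caller, set()).add(callee)
    return {name: sorted(callees) for name, callees in graph.items()}
```

The resulting dictionary is an adjacency list, which is enough to emit a graph description for drawing (step 3).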

Bug in step 2.2.

With axtj, the JSON entry "realname" should show me the real name of the calling function. And it often does, in the form method.*className*.*methodName*. But sometimes I get this: "-[MyTestClass addTwoNumbers]". This should be "method.MyTestClass.addTwoNumbers". Could you fix this in the script @alvarofe?
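
Until this is fixed in the script, a possible workaround on the consumer side is to normalise the names after reading them. This is only a sketch for the simple case shown above; the helper name normalize_method_name is made up, and selectors containing colons may need additional character replacement to match the actual flag names.

```python
import re

# Matches '-[Class selector]' and '+[Class selector]'.
_OBJC_NAME = re.compile(r"^[-+]\[(\S+)\s+([^\]]+)\]$")


def normalize_method_name(name):
    """Turn '-[Class method]' into 'method.Class.method'; leave other names alone."""
    match = _OBJC_NAME.match(name.strip())
    if not match:
        return name
    class_name, selector = match.groups()
    return f"method.{class_name}.{selector}"
```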

Control Flow Graph

For the CFG it is not enough to just know which function called another function. We need to know which actual _objc_msgSend_ call is responsible for the function call. @alvarofe Could we adjust the script somehow to store the information about which _objc_msgSend_ was responsible for the call?
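
One way to keep that information on the analysis side is to store the address of the call instruction on every edge. The sketch below assumes that each xref dictionary carries the address of the referencing instruction under a "from" key; CallEdge and edges_for are hypothetical names.

```python
from typing import NamedTuple


class CallEdge(NamedTuple):
    caller: str     # function containing the objc_msgSend call
    callee: str     # resolved method
    call_site: int  # address of the objc_msgSend call instruction


def edges_for(callee, xrefs):
    """Build one edge per xref, remembering the responsible call site."""
    edges = []
    for xref in xrefs:
        if "from" not in xref:
            continue  # without an address the call site cannot be identified
        caller = xref.get("realname") or xref.get("fcn_name") or "unknown"
        edges.append(CallEdge(caller, callee, xref["from"]))
    return edges
```

With the call site on the edge, a basic block that contains the address can be linked to the entry block of the resolved method.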

„The support can be improved.“

@radare So can we assign this issue to a milestone then (the related https://github.com/radare/radare2/issues/7259 is due for 2.2.0)? First implementing the script in C, so that we have the _call-xrefs_, and then, most importantly, patching the calls to _objc_msgSend_?

The agc command creates a call graph; did you try it? It's not tested and few people use it, so I guess it can be improved a lot. Let me know if that works for you after running alvaro’s script.

(In reply to Daniel Corak's comment of 19 Dec 2017, 13:20.)

Yes, I tried the agc command. Unfortunately, it doesn't consider the _call-xrefs_ from the script at all. The output is the same as without running the script, i.e. only calls to _objc_msgSend_.

So it's probably that the ax command doesn't put the call refs inside the function structure. That's why it's not working. The fix should be to do that. Maybe 10 lines of C are enough. Wanna send a PR?

(In reply to Daniel Corak's comment of 20 Dec 2017, 14:22.)

Should work in the current version now.

@YugoCode please open issues on the repo where the script lives, so I can tackle them when I get spare time.

closing?
