Describe the bug
iOS devices ship with all shared libraries (dylibs) in a unified cache. Ghidra seems to offer the feature to import individual dylibs from a shared cache, but does not seem to parse them correctly.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
IDA Pro successfully imports a dylib from a shared cache as if it were an individual dylib file. It correctly locates the Mach-O header and processes the dylib.
Screenshots
The first screenshot shows IDA 7.2's listing of Foundation.framework from an iPhone XS 12.3.1 shared cache. The cache can be retrieved from the IPSW here.

Note that IDA has located the dylib's Mach-O header and parsed it correctly. In Ghidra, however, the load address isn't a Mach-O header at all.

In the next two screenshots I am trying to show the NSLog() function. The correct function is displayed in IDA here:

But Ghidra has just placed the NSLog() label in the middle of some other code.

Environment (please complete the following information):
Thanks for reporting this, it's something we are aware of and it's in the queue. I chose to implement a new loader for the entire DYLD cache first, since things like jtool do a good job of extracting DYLIBs which can then be fed into Ghidra. The new loader will be released in Ghidra 9.1, but it's currently in master if you want to try it out...I would love some feedback. Its main limitation right now is speed, since there are so many embedded DYLIBs to analyze.
I do plan on investigating this issue though.
Awesome, I'll try and get the master branch going sometime this week and let you know how it goes.
So I started playing around a little with the master branch and loading an entire dyld cache, and I have to say I'm impressed. It managed to load the entire cache... I had to first bump up the amount of memory, and as expected it's a little slow, but it did it, which is something IDA can't do. :) I'm still playing around with it, but I did notice that it doesn't seem to be picking up symbol names. See the attached screenshot for NSLog (the same function above). Should I open a new issue for this or any other issues with the master branch?
Thanks for the awesome work so far on this, I'm excited to see where you guys take it.

Thanks for the positive feedback! Sure, I think new issues that pertain to the DYLD loader would be best...we can keep this issue about the DYLD filesystem and extracting individual DYLIBs from it.
Regarding the symbols though...there can be so many that it takes a really long time for the symbol tree and symbol table to display them. It looks like your symbol tree is currently trying to filter them. But if the function you are on should currently be labeled as NSLog, that is a new issue that I can look into.
Hmmm, all the symbol addresses seem to have been truncated to 32 bits. When I fix that, hopefully that will fix the behavior you are seeing.
The symbol issue should be fixed now.
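For readers wondering why 32-bit truncation misplaces labels, here is a minimal Python sketch. It is only an illustration, not Ghidra's actual code: the helper name `truncate_to_32_bits` is hypothetical, and the sample address is the `__TEXT` vmaddr visible in the JavaScriptCore hexdump later in this thread.

```python
import struct

def truncate_to_32_bits(address):
    """Keep only the low 32 bits, as a 32-bit integer field would (hypothetical helper)."""
    return address & 0xFFFFFFFF

# The 8 little-endian bytes at offset 0x38 of the jtool dump (the __TEXT vmaddr).
(text_vmaddr,) = struct.unpack("<Q", bytes.fromhex("00f0148801000000"))

truncated = truncate_to_32_bits(text_vmaddr)
print(hex(text_vmaddr))  # 0x18814f000
print(hex(truncated))    # 0x8814f000
```

Shared cache addresses on 64-bit iOS sit above 4 GiB, so dropping the upper bits yields an address that no longer corresponds to the intended location.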
Another way to split the dyld cache is to use Xcode. However, when I load the split dylib into Ghidra, it does not recognize it as a Mach-O or know which processor to use.
jtool split:
$ head -c 100 dyld_shared_cache.JavaScriptCore|hexdump -C
00000000 cf fa ed fe 0c 00 00 01 02 00 00 00 06 00 00 00 |................|
00000010 16 00 00 00 d8 0e 00 00 85 00 11 c2 00 00 00 00 |................|
00000020 19 00 00 00 68 03 00 00 5f 5f 54 45 58 54 00 00 |....h...__TEXT..|
00000030 00 00 00 00 00 00 00 00 00 f0 14 88 01 00 00 00 |................|
00000040 00 90 d3 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000050 00 90 d3 00 00 00 00 00 05 00 00 00 05 00 00 00 |................|
00000060 0a 00 00 00 |....|
00000064
Xcode split:
$ head -c 4200 System/Library/Frameworks/JavaScriptCore.framework/JavaScriptCore|hexdump -C
00000000 ca fe ba be 00 00 00 01 01 00 00 0c 00 00 00 02 |................|
00000010 00 00 10 00 01 03 d0 00 00 00 00 0c 00 00 00 00 |................|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001000 cf fa ed fe 0c 00 00 01 02 00 00 00 06 00 00 00 |................|
00001010 16 00 00 00 d8 0e 00 00 85 00 11 42 00 00 00 00 |...........B....|
00001020 19 00 00 00 68 03 00 00 5f 5f 54 45 58 54 00 00 |....h...__TEXT..|
00001030 00 00 00 00 00 00 00 00 00 f0 14 88 01 00 00 00 |................|
00001040 00 90 d3 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00001050 00 90 d3 00 00 00 00 00 05 00 00 00 05 00 00 00 |................|
00001060 0a 00 00 00 00 00 00 00 |........|
Notice that Xcode seems to be wrapping the Mach-O in a CAFEBABE header?
I have heard that Xcode only splits the dyld cache well enough to attach a debugger to, so the branch islands etc. aren't handled and symbols might not be fixed. It would be amazing if Ghidra could handle this, as I waited for many hours for Ghidra to load the entire dyld_shared_cache only to have my instance crash :(
UPDATE: I tried using Xcode 11 beta and it seems to not wrap it in the CAFEBABE so... nevermind
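For context, `ca fe ba be` is the magic of a universal ("fat") Mach-O header, which lists one or more architecture slices and the file offset of each. The Python sketch below decodes the first bytes of both dumps above, under the assumption that only the fat and 64-bit little-endian cases matter here. The function name `describe` is hypothetical and the sketch says nothing about how Ghidra's loaders handle either case.

```python
import struct

FAT_MAGIC = 0xCAFEBABE    # universal header; fields are stored big-endian
MH_MAGIC_64 = 0xFEEDFACF  # 64-bit Mach-O; appears as cf fa ed fe on disk
CPU_TYPE_ARM64 = 0x0100000C

def describe(data):
    """Return (file_offset, cputype, cpusubtype) for each Mach-O image found."""
    (magic_be,) = struct.unpack_from(">I", data, 0)
    if magic_be == FAT_MAGIC:
        (nfat_arch,) = struct.unpack_from(">I", data, 4)
        slices = []
        for i in range(nfat_arch):
            # fat_arch entry: cputype, cpusubtype, offset, size, align (20 bytes)
            cputype, cpusubtype, offset, _size, _align = struct.unpack_from(
                ">5I", data, 8 + 20 * i
            )
            slices.append((offset, cputype, cpusubtype))
        return slices
    (magic_le,) = struct.unpack_from("<I", data, 0)
    if magic_le == MH_MAGIC_64:
        cputype, cpusubtype = struct.unpack_from("<2I", data, 4)
        return [(0, cputype, cpusubtype)]
    raise ValueError("not a fat or 64-bit little-endian Mach-O header")

# Leading bytes copied from the two hexdumps above.
xcode_head = bytes.fromhex(
    "ca fe ba be 00 00 00 01 01 00 00 0c 00 00 00 02"
    "00 00 10 00 01 03 d0 00 00 00 00 0c"
)
jtool_head = bytes.fromhex("cf fa ed fe 0c 00 00 01 02 00 00 00")

print(describe(xcode_head))  # [(4096, 16777228, 2)]
print(describe(jtool_head))  # [(0, 16777228, 2)]
```

Read this way, the Xcode output is a single-slice fat file whose ARM64 image starts at offset 0x1000, which matches where `cf fa ed fe` appears in the second dump.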
👁 ❤️ 🐉
Is the best way to use Ghidra still to either load the entire cache or use another tool to split first?