FOX – Fix Objective-C XREFs in Ghidra

Hi! This is my first article on HN Security’s blog, and I thought that showcasing a little tool developed together with Marco to help us in our everyday mobile assessments would be a good pick for a new beginning!

The tool is called FOX. It’s a Ghidra script created to speed up iOS analysis by adding XREFs (and potential XREFs) to the disassembled and decompiled code of iOS binaries. This tool will hopefully be the first of a series of useful Ghidra scripts published in my repository (and in Marco’s repository).

First, why do we need such a tool?

The reason is that Objective-C methods are invoked through “selectors”, which are resolved by name at runtime. In practice, this means that if we disassemble or decompile an iOS binary and look at Objective-C method invocations, we will not find “classic” C-like function calls: we will only see calls to a small group of dispatch functions. The most common one is “objc_msgSend”; from now on, this article uses the generic term msgSend for the whole group. These functions receive as arguments the receiver (a class or an object), the method name and the method arguments (if any). For example:

In this example, the method “decryptStrFromBase64:Key:IV:” of class “FWEncryptorAES” is executed, probably (if the decompiler has not made any mistake) with arguments “local_28”, “local_30” and “local_38”.
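To make the dispatch-by-name idea concrete, here is a toy model in plain Python. It is not the Objective-C runtime: the table, `register` and `msg_send` are hypothetical names, and `decrypt_stub` is a stand-in for the real method body. It only shows why a static tool sees a call to the dispatcher rather than a call to the method itself.

```python
from typing import Callable, Dict, Tuple

# Toy method table: (class name, selector) -> implementation.
# Hypothetical model, not the real Objective-C runtime structures.
METHOD_TABLE: Dict[Tuple[str, str], Callable[..., str]] = {}


def register(class_name: str, selector: str, impl: Callable[..., str]) -> None:
    """Associate a selector of a class with its implementation."""
    METHOD_TABLE[(class_name, selector)] = impl


def msg_send(receiver: str, selector: str, *args: str) -> str:
    """Find the implementation by name at call time, then invoke it."""
    impl = METHOD_TABLE[(receiver, selector)]
    return impl(*args)


def decrypt_stub(data: str, key: str, iv: str) -> str:
    # Stand-in for the real method body, which we do not know.
    return f"decrypt({data}, {key}, {iv})"


register("FWEncryptorAES", "decryptStrFromBase64:Key:IV:", decrypt_stub)

# The call site only mentions the dispatcher and two names: there is no
# direct reference to decrypt_stub for a static tool to follow.
result = msg_send("FWEncryptorAES", "decryptStrFromBase64:Key:IV:",
                  "local_28", "local_30", "local_38")
print(result)
```

The only link between the call site and `decrypt_stub` is the pair of names, which is exactly the information FOX tries to recover.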

From a reverse engineer’s perspective, this mechanism has both pros and cons. The binary is usually more readable, because method names are passed as strings to the msgSend function. On the other hand, we lose cross-references (XREFs). XREFs are very valuable during reverse engineering sessions because they link a function to all the locations in the binary where that function is called, which makes reversing more efficient without resorting to dynamic analysis techniques.

This is FOX’s purpose: it tries to add XREFs to the methods called through the msgSend mechanism.

Let’s have a look at how it works. From a high-level perspective, the script simply scans the binary, searches for msgSend functions, tries to infer the class name and the method name from the msgSend parameters and finally adds the XREFs.

Extracting class names and method names from the disassembled code is not difficult in most situations, but sometimes it requires complex logic. Luckily, Ghidra takes care of this during its analysis tasks in order to populate the “Decompiler” pane.

FOX uses the following approach:

  1. It searches for msgSend functions in the symbol table of the binary
  2. It retrieves the list of all functions that call one of those msgSend functions
  3. It decompiles each function and stores information related to all msgSend calls, extracting class and method names if present
  4. Finally, it adds the XREFs (and some useful PRE and PLATE comments)
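Step 3 can be illustrated with a simplified sketch in plain Python. This is not how FOX is implemented (the script works through Ghidra’s decompiler, not on text): the regular expressions below assume that decompiled calls look like `_objc_msgSend(&_OBJC_CLASS_$_Name,"selector",...)`, and `extract_calls` and `MsgSendCall` are hypothetical names. Either piece of information may be missing, in which case it is recorded as `None`.

```python
import re
from typing import List, NamedTuple, Optional


class MsgSendCall(NamedTuple):
    class_name: Optional[str]   # None when the receiver is not a class symbol
    selector: Optional[str]     # None when the selector is not a string literal


# Assumed textual shape of a decompiled call: first two arguments only.
CALL_RE = re.compile(r'_objc_msgSend\w*\(\s*([^,()]+)\s*,\s*([^,()]+)')
CLASS_RE = re.compile(r'^&?_OBJC_CLASS_\$_(\w+)$')
SELECTOR_RE = re.compile(r'^"([^"]+)"$')


def extract_calls(decompiled: str) -> List[MsgSendCall]:
    """Collect (class, selector) pairs from msgSend calls in decompiled text."""
    calls = []
    for match in CALL_RE.finditer(decompiled):
        receiver = match.group(1).strip()
        selector = match.group(2).strip()
        cls = CLASS_RE.match(receiver)
        sel = SELECTOR_RE.match(selector)
        calls.append(MsgSendCall(cls.group(1) if cls else None,
                                 sel.group(1) if sel else None))
    return calls


# Hypothetical decompiler output: the second call has an unknown receiver.
sample = '''
  uVar1 = _objc_msgSend(&_OBJC_CLASS_$_FWEncryptorAES,"decryptStrFromBase64:Key:IV:",local_28,local_30,local_38);
  _objc_msgSend(uVar1,"b:",local_20);
'''
calls = extract_calls(sample)
for call in calls:
    print(call)
```

The second call is the interesting case: the selector is known but the class is not, so it cannot be turned into an XREF without further information.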

As a result, the following msgSend call in the decryptResponse function…

… produces the following XREF, reported at the beginning of method “b:” of class “A”:

Unfortunately, Ghidra’s analyzer is not always able to retrieve class and method names; the class name in particular is often missing. To overcome this issue, the user can optionally supply the output of a Frida script that can be found in the same repository. This script retrieves a complete list of all Objective-C classes and methods of the application. This list cannot be retrieved statically from the analyzed binary alone, because the binary lacks all the methods of the system libraries and of other libraries packed with the target application.

Based on this list, the script is able to insert an XREF even if the class name has not been retrieved by Ghidra’s analyzer, if and only if there is only one method in the binary with that specific method name. Without this list, the script would not be able to know if there are other methods with the same name outside the binary.
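The decision described above can be sketched as follows, in plain Python rather than with the Ghidra API. The function `resolve`, its parameters and the two result labels are hypothetical; the sketch also assumes that the Frida-generated list contains the methods of the binary itself as well as the external ones.

```python
from typing import Iterable, List, Optional, Set, Tuple

Method = Tuple[str, str]  # (class name, method name)


def resolve(selector: str,
            class_name: Optional[str],
            internal: Set[Method],
            frida_methods: Optional[Iterable[Method]] = None
            ) -> Tuple[str, List[Method]]:
    """Return ("xref", [target]) when the call can be linked to exactly one
    method of the binary, otherwise ("comment", candidates)."""
    if class_name is not None:
        # Class and method are both known from the decompiled code.
        candidates = [(class_name, selector)]
        certain = True
    else:
        # Without the Frida list we only know the methods of the binary,
        # so a unique match there proves nothing about external code.
        pool = frida_methods if frida_methods is not None else internal
        candidates = sorted(m for m in pool if m[1] == selector)
        certain = frida_methods is not None and len(candidates) == 1
    if certain and candidates[0] in internal:
        return "xref", candidates
    return "comment", candidates


internal = {("A", "b:"), ("A", "run"), ("B", "run")}
frida = sorted(internal | {("NSString", "length")})

unique_with_list = resolve("b:", None, internal, frida)
unique_without_list = resolve("b:", None, internal)
ambiguous = resolve("run", None, internal, frida)
external_only = resolve("length", None, internal, frida)
print(unique_with_list, unique_without_list, ambiguous, external_only)
```

Note how the same call to “b:” yields an XREF only when the list is supplied: without it, uniqueness inside the binary is not enough to rule out a method with the same name elsewhere.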

Additionally, a PRE comment is inserted on the msgSend line, containing the retrieved class name (either external or internal to the binary) and, if available, the function address:

If there is more than one method with the same name (or if the Frida-generated list has not been supplied), a PRE comment with a list of the potential targets (external or internal) is added to the msgSend call:

The PRE comment is also added to the msgSend call if both class name and method name are present in the decompiled code, in order to simplify analysis executed directly on the disassembly:

When we have potential XREFs related to internal methods, the script also adds a PLATE comment with the potential XREFs at the beginning of potentially referenced internal methods, like the following one:
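As a rough illustration of how such comments could be assembled, the sketch below groups ambiguous call sites by candidate method in a single pass and keeps only the methods that live in the binary. It is plain Python, not Ghidra code: `plate_comments`, the addresses and the comment text are all hypothetical and do not reproduce the format FOX actually emits.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Method = Tuple[str, str]  # (class name, method name)


def plate_comments(potential_calls: Iterable[Tuple[str, List[Method]]],
                   internal: Set[Method]) -> Dict[Method, str]:
    """Map each internal candidate method to a comment listing the call
    sites that may reach it."""
    by_target = defaultdict(list)
    for call_site, candidates in potential_calls:
        for target in candidates:
            if target in internal:      # external methods have no body to annotate
                by_target[target].append(call_site)
    return {
        target: "Potential XREFs:\n" + "\n".join(f"  {site}" for site in sites)
        for target, sites in by_target.items()
    }


internal = {("A", "run"), ("B", "run")}
potential = [
    ("0x100004a10", [("A", "run"), ("B", "run")]),
    ("0x100004b20", [("A", "run"), ("NSString", "length")]),
]
comments = plate_comments(potential, internal)
for method, text in comments.items():
    print(method, text, sep="\n")
```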

The script can be executed directly from the GUI or can be executed in headless mode as follows:

$ analyzeHeadless #PROJECT_DIRECTORY #PROJECT_NAME -import #BINARY_PATH \
-scriptPath #SCRIPT_FOLDER_PATH -postScript FOX.java #OPTIONAL_FRIDA_OUTPUT_PATH

In the same repository I added a simple Ghidra script that exports the project in the gzf archive format, which is useful if the project is analyzed on one computer and then opened on another machine. The script is a Java port of beigela’s code, posted here. Headless analysis followed by export can be executed as follows:

# without deleting the project folder after the export
$ analyzeHeadless #PROJECT_DIRECTORY #PROJECT_NAME -import #BINARY_PATH \
-scriptPath #SCRIPT_FOLDER_PATH -postScript FOX.java #OPTIONAL_FRIDA_OUTPUT_PATH \
-postScript ExportToGzf.java #ARCHIVE_OUTPUT_PATH

# deleting the project folder after the export
$ analyzeHeadless #PROJECT_DIRECTORY #PROJECT_NAME -import #BINARY_PATH \
-scriptPath #SCRIPT_FOLDER_PATH -postScript FOX.java #OPTIONAL_FRIDA_OUTPUT_PATH \
-postScript ExportToGzf.java #ARCHIVE_OUTPUT_PATH -deleteProject

The .gzf file can then be imported by creating a new Ghidra project and selecting the “Import file…” option. One word of caution: if the -deleteProject option is used, the project is deleted even when one of the scripts fails (for example, because a file already exists at the path selected for the export). The risk, especially for binaries that require a long analysis time, is waiting for hours for nothing. If you need at least the output of the Ghidra analysis and you are on a tight schedule, it is advisable to avoid the -deleteProject option. The project can be safely deleted by hand after checking the output of the export script.
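One way to reduce this risk is to check the export path before starting a long headless run. The following is a minimal sketch in plain Python and is not part of the repository: `export_problems` is a hypothetical helper, and only the “file already exists” case comes from the scenario above; the directory checks are additional assumptions about what could make the export fail.

```python
import os
import tempfile
from typing import List


def export_problems(archive_path: str) -> List[str]:
    """Return the reasons why exporting to archive_path would likely fail."""
    problems = []
    if os.path.exists(archive_path):
        problems.append("output file already exists")
    parent = os.path.dirname(os.path.abspath(archive_path))
    if not os.path.isdir(parent):
        problems.append("output directory does not exist")
    elif not os.access(parent, os.W_OK):
        problems.append("output directory is not writable")
    return problems


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "app.gzf")
    fresh = export_problems(target)            # nothing in the way yet
    open(target, "w").close()                  # simulate a leftover archive
    clash = export_problems(target)
    missing = export_problems(os.path.join(tmp, "nope", "app.gzf"))

print(fresh, clash, missing)
```

Running a check like this first, and launching analyzeHeadless with -deleteProject only when the returned list is empty, avoids losing hours of analysis to a trivial path problem.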

Finally, let’s talk about the processing time of the script. Looking at the code, it may seem more complex and convoluted than necessary (and it probably is! 😀). The reason for this complexity is that I tried to walk the binary listing as few times as possible when creating XREFs and comments: instead of going back and forth multiple times, the script populates Maps and Lists with the references to functions and msgSend calls needed to create the XREFs. Consequently, the code is less readable, but it should perform better (although there are surely much better ways to do the same thing…).

The script can be downloaded from my Ghidra scripts repository: https://github.com/federicodotta/ghidra-scripts

A lighter version of the script that uses a slightly different approach can be downloaded from Marco’s Ghidra scripts repository. This version tries to recover class and method names from the disassembly instead of relying on the decompiled code. It has fewer features than the decompiler-based one and it makes some approximations, but it usually uses less memory, and in any case it is always useful to have alternative tools for this kind of analysis.

In Marco’s repository you can also find Rhabdomancer, a simple Ghidra script that assists with vulnerability research against closed-source software written in C/C++, based on a candidate point strategy. Its purpose is to speed up vulnerability research activities on C/C++ code.

Well, if you are interested in our Ghidra scripts, follow our repositories! We will soon release other useful tools! 🙂

Cheers!