Router Firmware Cryptographic Function Identification and Localization Tool Based on Binary Similarity Detection

lilian-lilifox/Crypt_Func_Detect

Cryptographic Function Detector for Router Firmware

This project is the author’s undergraduate thesis work. The core code is adapted from two open-source repositories: NSSL-SJTU/HermesSim and Cisco-Talos/binary_function_similarity. These were integrated and extended for the task of identifying and locating cryptographic functions in router firmware. This repository contains the corresponding automation scripts and tooling.

Additionally, two datasets were constructed:

  • Dataset-Finetuning: for fine-tuning the similarity model.
  • Dataset-Crypt: for the cryptographic function identification task. (Put the binaries to be analyzed in the Binaries/Dataset-Crypt/vul/ folder following the naming convention.)

Project Structure

Binaries/         # Raw binary files
DBs/              # Preprocessed graph/data outputs
IDA_script/       # IDA Python scripts for extracting ACFG graphs
IDBs/             # IDA analysis database files
bin/              # External tools and dependencies
lifting/          # Scripts for lifting binary functions into Pcode-based graphs
preprocess/       # Scripts for graph normalization and encoding
model/            # Neural network model and experiment configurations
postprocess/      # Scripts for test-pair generation, fast evaluation, and visualization
inputs/           # Inputs for the model (iscg, tscg, sog)
outputs/          # Outputs for the model (checkpoint files, inferred embeddings, log)
Dockerfile        # An OpenWrt 23.05-specific cross-compilation environment

Note: The external tool gsat-1.0.jar must be downloaded separately and placed in bin/.

The author's intermediate and final experimental results are published in the Releases section.

Setup Instructions

  1. Python Environment (Python 3.10 required)
conda create -n cfd python=3.10
conda activate cfd
pip install -r requirements.txt \
    --extra-index-url https://download.pytorch.org/whl/cu116 \
    -f https://data.pyg.org/whl/torch-1.13.1+cu116.html
  2. IDA Pro Requirement: IDA Pro 9.1 for Linux is required. Update the IDA Pro paths in run_Finetuning.sh and run_Crypt.sh to match your installation.
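
Before running the pipeline, it can help to confirm that the pieces named in this README are in place. The following is a minimal sketch and not part of the repository: the function name preflight is hypothetical, and it only checks the Python version, the location of gsat-1.0.jar, and the Binaries/Dataset-Crypt/vul/ folder as described here. It does not verify the IDA Pro installation or the installed Python packages.

```python
import sys
from pathlib import Path


def preflight(repo_root: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper: it only checks paths and versions named in the README.
    """
    root = Path(repo_root)
    problems = []

    # The README requires Python 3.10.
    if sys.version_info[:2] != (3, 10):
        problems.append(
            f"Python 3.10 expected, found {sys.version_info.major}.{sys.version_info.minor}"
        )

    # gsat-1.0.jar has to be downloaded manually into bin/.
    if not (root / "bin" / "gsat-1.0.jar").is_file():
        problems.append("bin/gsat-1.0.jar is missing")

    # Binaries to analyze are expected under Binaries/Dataset-Crypt/vul/.
    vul_dir = root / "Binaries" / "Dataset-Crypt" / "vul"
    if not vul_dir.is_dir() or not any(p.is_file() for p in vul_dir.iterdir()):
        problems.append("Binaries/Dataset-Crypt/vul/ contains no binaries")

    return problems


if __name__ == "__main__":
    # Run from the repository root; prints nothing when every check passes.
    for problem in preflight("."):
        print("WARNING:", problem)
```

Returning the problems as a list, rather than exiting on the first one, lets all missing pieces be reported in a single run.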

How to Use

  1. Prepare firmware binaries: unpack the IoT firmware images and place the target binaries under:

    Binaries/Dataset-Crypt/vul/
    
  2. (Optional - already provided) Fine-tune the HermesSim model using:

    ./run_Finetuning.sh
    ./run_Finetuning2.sh
  3. Update config: in outputs/Finetuned/config.json, set:

    "checkpoint_name": "checkpoint_*.pt"

    to the file name of the desired checkpoint (e.g., the best-performing one).

  4. Run cryptographic function detection

    ./run_Crypt.sh
    ./run_Crypt2.sh
  5. Outputs

    • Fine-tuned checkpoints: outputs/Finetuned/graph-ggnn-batch_pair-pcode_sog
    • Detection results: outputs/Crypt
  6. (Optional) To extract top-K most similar functions from output similarity CSVs:

    python postprocess/3.pp_results/top_k.py <*_sim.csv> <num_of_results>
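
The idea behind top-K extraction can be sketched in a few lines of standard-library Python. This is an illustration only and not the repository's top_k.py: the function name top_k_rows is hypothetical, and the score column name "sim" is an assumption, since the layout of the *_sim.csv files is not documented in this README. Adjust score_column to match the actual header.

```python
import csv
import heapq


def top_k_rows(csv_path: str, k: int, score_column: str = "sim") -> list[dict]:
    """Return the k rows with the highest similarity score, best first.

    score_column is an assumed column name; change it to match the real CSV header.
    """
    with open(csv_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    # nlargest returns its result sorted from highest to lowest score.
    return heapq.nlargest(k, rows, key=lambda row: float(row[score_column]))
```

Each returned row is a dictionary keyed by the CSV header, so the remaining columns (whatever identifies the matched functions) stay attached to their score.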

Q&A

Q: I want to avoid matching very small functions. What can I do?

A: Edit the filtering rule in DBs/Dataset-Crypt/Dataset-Crypt_creation.py, line 8:

flowchart = flowchart[flowchart["bb_num"] >= 0]

Increase the threshold (e.g., to >= 5) to exclude short functions from matching.
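
To choose a threshold, it can help to see how many functions each candidate value would keep. The sketch below is a hypothetical helper, not part of the repository: kept_counts takes a plain list of basic-block counts (for example, the values of the bb_num column) and applies the same >= comparison as the filtering rule. The sample numbers are made up for illustration.

```python
def kept_counts(bb_nums: list[int], thresholds: list[int]) -> dict[int, int]:
    """Map each candidate threshold to the number of functions with bb_num >= threshold."""
    return {t: sum(1 for n in bb_nums if n >= t) for t in thresholds}


if __name__ == "__main__":
    sample = [1, 2, 3, 5, 8, 13]  # made-up basic-block counts, one per function
    for threshold, kept in kept_counts(sample, [0, 3, 5]).items():
        print(f"bb_num >= {threshold}: {kept} of {len(sample)} functions kept")
```

A threshold of 0 keeps every function, which matches the default rule shown above.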

Q: How can I create my own Dataset-Finetuning samples?

A: The source code includes a Dockerfile for building an OpenWrt 23.05-specific cross-compilation environment. You can use it to compile your own binaries for dataset generation:

docker build -t openwrt-crosscompile .
docker run -it --rm -v "$(pwd)/src:/workspace" openwrt-crosscompile

Inside the Docker container, place and build your source files under /workspace. The resulting binaries can be used to construct new samples for fine-tuning the model.
