This project is the author’s undergraduate thesis work. The core code is adapted from two open-source repositories: NSSL-SJTU/HermesSim and Cisco-Talos/binary_function_similarity. These were integrated and extended for the task of identifying and locating cryptographic functions in router firmware. This repository contains the corresponding automation scripts and tooling.
Additionally, two datasets were constructed:
- Dataset-Finetuning: for fine-tuning the similarity model.
- Dataset-Crypt: for the cryptographic function identification task. (Place the binaries to be analyzed in the `Binaries/Dataset-Crypt/vul/` folder, following the naming convention.)
Binaries/ # Raw binary files
DBs/ # Preprocessed graph/data outputs
IDA_script/ # IDA Python scripts for extracting ACFG graphs
IDBs/ # IDA analysis database files
bin/ # External tools and dependencies
lifting/ # Scripts for lifting binary functions into Pcode-based graphs
preprocess/ # Scripts for graph normalization and encoding
model/ # Neural network model and related experiment configurations
postprocess/ # Scripts for test-pair generation, fast evaluation, and visualization
inputs/ # Inputs for the model (iscg, tscg, sog)
outputs/ # Outputs for the model (checkpoint files, inferred embeddings, log)
Dockerfile # An OpenWrt 23.05-specific cross-compilation environment
Notice: The external tool gsat-1.0.jar must be downloaded and placed in bin/.
The author's intermediate and final experimental results are published in the Releases section.
- Python Environment (Python 3.10 required)
conda create -n cfd python=3.10
conda activate cfd
pip install -r requirements.txt \
--extra-index-url https://download.pytorch.org/whl/cu116 \
-f https://data.pyg.org/whl/torch-1.13.1+cu116.html
- IDA Pro Requirement
IDA Pro 9.1 for Linux is required.
Please update the paths in run_Finetuning.sh and run_Crypt.sh accordingly.
- Prepare firmware binaries
  Unpack IoT firmware images and place the target binaries under Binaries/Dataset-Crypt/vul/.
- (Optional - already provided) Fine-tune the HermesSim model using:
  ./run_Finetuning.sh
  ./run_Finetuning2.sh
- Update config
  In outputs/Finetuned/config.json, set:
  "checkpoint_name": "checkpoint_*.pt"
  to the desired checkpoint (e.g., the best-performing one).
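Editing config.json by hand works fine; if you prefer to script it, the sketch below (not part of the repository) picks a `checkpoint_*.pt` file from `outputs/Finetuned` and writes its name into `config.json`. The paths come from this README; using the most recently modified checkpoint as a stand-in for the "best" one is an assumption — substitute your own selection criterion.

```python
import glob
import json
import os

def set_latest_checkpoint(run_dir):
    """Point config.json at the newest checkpoint_*.pt in run_dir.

    Assumption: "newest by mtime" approximates the best-performing
    checkpoint; replace the max() key with your own criterion.
    """
    ckpts = glob.glob(os.path.join(run_dir, "checkpoint_*.pt"))
    if not ckpts:
        raise FileNotFoundError("no checkpoint_*.pt files in " + run_dir)
    latest = max(ckpts, key=os.path.getmtime)

    cfg_path = os.path.join(run_dir, "config.json")
    with open(cfg_path) as f:
        cfg = json.load(f)
    cfg["checkpoint_name"] = os.path.basename(latest)
    with open(cfg_path, "w") as f:
        json.dump(cfg, f, indent=2)
    return cfg["checkpoint_name"]
```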
- Run cryptographic function detection
  ./run_Crypt.sh
  ./run_Crypt2.sh
- Outputs
  - Fine-tuned checkpoints: outputs/Finetuned/graph-ggnn-batch_pair-pcode_sog
  - Detection results: outputs/Crypt
- (Optional) To extract the top-K most similar functions from the output similarity CSVs:
  python postprocess/3.pp_results/top_k.py <*_sim.csv> <num_of_results>
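If you want to post-process the similarity CSVs in your own scripts, a minimal stand-alone sketch is below. The column name `score` is an assumption — check the header of the actual `*_sim.csv` files; `top_k.py` in the repository remains the authoritative implementation.

```python
import csv

def top_k(sim_csv_path, k):
    """Return the k CSV rows with the highest similarity score.

    Assumed layout: a header row with a 'score' column holding the
    similarity value; adjust the column name to match your *_sim.csv.
    """
    with open(sim_csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    rows.sort(key=lambda r: float(r["score"]), reverse=True)
    return rows[:k]
```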
Q: I want to avoid matching very small functions. What can I do?
A: Edit the filtering rule in DBs/Dataset-Crypt/Dataset-Crypt_creation.py, line 8:
flowchart = flowchart[flowchart["bb_num"] >= 0]
Increase the threshold (e.g., to >= 5) to exclude short functions from matching.
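For reference, a minimal self-contained sketch of what that filter does, assuming `flowchart` is a pandas DataFrame with a `bb_num` (basic-block count) column; the function names here are made up for illustration:

```python
import pandas as pd

# Toy flowchart table: one row per function, with its basic-block count.
flowchart = pd.DataFrame({
    "func_name": ["tiny_stub", "aes_encrypt", "sha256_block"],
    "bb_num": [1, 12, 30],
})

# Raising the threshold from >= 0 to >= 5 drops very small functions
# (often stubs or thunks) before matching.
flowchart = flowchart[flowchart["bb_num"] >= 5]
print(flowchart["func_name"].tolist())  # ['aes_encrypt', 'sha256_block']
```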
Q: How can I create my own Dataset-Finetuning samples?
A: The source code includes a Dockerfile for building an OpenWrt 23.05-specific cross-compilation environment. You can use it to compile your own binaries for dataset generation:
docker build -t openwrt-crosscompile .
docker run -it --rm -v $(pwd)/src:/workspace openwrt-crosscompile
Inside the Docker container, place and build your source files under /workspace. The resulting binaries can be used to construct new samples for fine-tuning the model.