A Framework for Fuzz Target Generation and Evaluation

This framework uses various Large Language Models (LLMs) to generate fuzz targets for real-world C/C++/Java/Python projects and benchmarks them via the OSS-Fuzz platform.

More details are available in AI-Powered Fuzzing: Breaking the Bug Hunting Barrier.

Currently supported models are:

  • Vertex AI code-bison
  • Vertex AI code-bison-32k
  • Gemini Pro
  • Gemini Ultra
  • Gemini Experimental
  • Gemini 1.5
  • OpenAI GPT-3.5-turbo
  • OpenAI GPT-4
  • OpenAI GPT-4o
  • OpenAI GPT-4o-mini
  • OpenAI GPT-4-turbo
  • OpenAI GPT-3.5-turbo (Azure)
  • OpenAI GPT-4 (Azure)
  • OpenAI GPT-4o (Azure)

Generated fuzz targets are evaluated with four metrics against the most up-to-date data from the production environment:

  • Compilability
  • Runtime crashes
  • Runtime coverage
  • Runtime line coverage diff against existing human-written fuzz targets in OSS-Fuzz.
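The fourth metric can be pictured as a set difference. The following is a minimal sketch, not the framework's actual implementation: it assumes each covered line is identified by a hypothetical `(source file, line number)` pair, and the function name `line_coverage_diff` and the sample data are made up for illustration.

```python
from typing import Set, Tuple

# Assumption for illustration: a covered line is a (source file, line number) pair.
Line = Tuple[str, int]


def line_coverage_diff(generated: Set[Line], existing: Set[Line]) -> Set[Line]:
    """Return lines covered by the generated target but not by existing targets."""
    return generated - existing


# Hypothetical coverage data, not taken from any real project.
existing_cov = {("parser.c", 10), ("parser.c", 11), ("parser.c", 12)}
generated_cov = {("parser.c", 11), ("parser.c", 12), ("parser.c", 40), ("util.c", 7)}

new_lines = line_coverage_diff(generated_cov, existing_cov)
print(sorted(new_lines))  # [('parser.c', 40), ('util.c', 7)]
```

A generated target with an empty diff adds no new coverage over the human-written targets, even if its own runtime coverage is high.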

Here is a sample experiment result from 2024 Jan 31. The experiment included 1300+ benchmarks from 297 open-source projects.

[Image: sample experiment result]

Overall, this framework successfully leverages LLMs to generate valid fuzz targets (targets that yield a non-zero coverage increase) for 160 C/C++ projects. The maximum line coverage increase over the existing human-written targets is 29%.

Note that these reports are not public as they may contain undisclosed vulnerabilities.

Usage

Check our detailed usage guide for instructions on how to run this framework and generate reports based on the results.

Collaborations

Interested in research or open-source community collaborations? Please feel free to create an issue or email us: [email protected].

Bugs Discovered

So far, we have reported 30 new bugs/vulnerabilities found by automatically generated targets built by this framework:

| Project | Bug | LLM | Prompt Builder | Target oracle |
| --- | --- | --- | --- | --- |
| cJSON | OOB read | Vertex AI | Default | Far reach, low coverage |
| libplist | OOB read | Vertex AI | Default | Far reach, low coverage |
| hunspell | OOB read | Vertex AI | Default | Far reach, low coverage |
| zstd | OOB write | Vertex AI | Default | Far reach, low coverage |
| gdbm | Stack buffer underflow | Vertex AI | Default | Far reach, low coverage |
| hoextdown | Use of uninitialised memory | Vertex AI | Default | Far reach, low coverage |
| pjsip | OOB read | Vertex AI | Default | Low coverage with fuzz keyword + easy params far reach |
| pjsip | OOB read | Vertex AI | Default | Low coverage with fuzz keyword + easy params far reach |
| gpac | OOB read | Vertex AI | Default | Low coverage with fuzz keyword + easy params far reach |
| gpac | OOB read/write | Vertex AI | Default | All |
| gpac | OOB read | Vertex AI | Default | All |
| gpac | OOB read | Vertex AI | Default | All |
| sqlite3 | OOB read | Vertex AI | Default | All |
| htslib | OOB read | Vertex AI | Default | All |
| libical | OOB read | Vertex AI | Default | All |
| croaring | OOB read | Vertex AI | Test-to-harness | All |
| openssl | CVE-2024-9143 - OOB read/write | Vertex AI | Default | All |
| liblouis | Use of uninitialised memory | Vertex AI | Test-to-harness | Test identifier |
| libucl | OOB read | Vertex AI | Default | Low coverage with fuzz keyword + easy params far reach |
| openbabel | Use after free | Vertex AI | Default | Low coverage with fuzz keyword + easy params far reach |
| libyang | OOB read | Vertex AI | Default | All |
| openbabel | OOB read | Vertex AI | Default | All |
| exiv2 | OOB read | Vertex AI | Default | All |
| Undisclosed | Java RCE (pending maintainer triage) | Vertex AI | Default | Far reach, low coverage |
| Undisclosed | Regexp DoS (pending maintainer triage) | Vertex AI | Default | Far reach, low coverage |
| Undisclosed | OOB read | Vertex AI | Default | All |
| Undisclosed | OOB write | Vertex AI | Default | All |
| Undisclosed | OOB read | Vertex AI | Default | All |
| Undisclosed | OOB read | Vertex AI | Default | All |
| Undisclosed | Use after free | Vertex AI | Agent prompt | All |

These bugs could only have been discovered with newly generated targets. They were not reachable with existing OSS-Fuzz targets.

Current top coverage improvements by project

| Project | Total coverage gain | Total relative gain | OSS-Fuzz-gen total covered lines | OSS-Fuzz-gen new covered lines | Existing covered lines | Total project lines |
| --- | --- | --- | --- | --- | --- | --- |
| phmap | 98.42% | 205.75% | 1601 | 1181 | 574 | 1120 |
| usbguard | 97.62% | 26.04% | 24550 | 5463 | 20979 | 3564 |
| onednn | 96.67% | 7057.14% | 5434 | 5434 | 77 | 210 |
| avahi | 82.06% | 155.90% | 3358 | 2814 | 1805 | 3046 |
| pugixml | 72.98% | 194.95% | 9015 | 6646 | 3409 | 7662 |
| librdkafka | 66.88% | 845.57% | 5019 | 4490 | 531 | 1169 |
| casync | 66.75% | 903.23% | 1171 | 1120 | 124 | 1678 |
| tomlplusplus | 61.06% | 331.10% | 4755 | 3652 | 1103 | 5981 |
| astc-encoder | 59.35% | 177.88% | 2726 | 1745 | 981 | 2940 |
| mruby | 48.56% | 0.00% | 34493 | 34493 | 0 | 71038 |
| arduinojson | 42.10% | 85.80% | 3344 | 1800 | 2098 | 4276 |
| json | 41.13% | 66.51% | 5051 | 3339 | 5020 | 8119 |
| double-conversion | 40.40% | 88.12% | 1663 | 779 | 884 | 1928 |
| tinyobjloader | 38.26% | 77.01% | 1157 | 717 | 931 | 1874 |
| glog | 38.18% | 58.69% | 895 | 331 | 564 | 867 |
| cppitertools | 35.78% | 45.07% | 253 | 151 | 335 | 422 |
| eigen | 35.38% | 190.70% | 2643 | 1947 | 1021 | 5503 |
| glaze | 34.55% | 30.06% | 2920 | 2416 | 8036 | 6993 |
| rapidjson | 31.83% | 148.07% | 1585 | 958 | 647 | 3010 |
| libunwind | 30.58% | 83.25% | 2899 | 1342 | 1612 | 4388 |
| openh264 | 30.07% | 50.14% | 6607 | 5751 | 11470 | 19123 |

* "Total project lines" measures the source code of the project-under-test compiled and linked by the preexisting human-written fuzz targets from OSS-Fuzz.

* "Total coverage gain" is calculated using a denominator of the "Total project lines". "Total relative gain" is the increase in coverage compared to the old number of covered lines.

* Additional code from the project-under-test may be included when compiling the new fuzz targets, which can result in high percentage gains.
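The two ratios described in the notes above can be sketched as follows. This is an illustrative reading of the notes, not the framework's own code, and the function names are hypothetical. `relative_gain` reproduces the "Total relative gain" column from the "new covered lines" and "Existing covered lines" columns; returning 0 when there is no existing coverage is an assumption inferred from the mruby row. `coverage_gain` follows the wording of the note and is shown with made-up inputs only, since the note on additional code suggests its denominator may not be recoverable from the displayed columns.

```python
def relative_gain(new_covered: int, existing_covered: int) -> float:
    """Percentage increase over the lines already covered by existing targets."""
    if existing_covered == 0:
        # Assumption: report 0 when there is no baseline, as the mruby row does.
        return 0.0
    return 100.0 * new_covered / existing_covered


def coverage_gain(new_covered: int, total_lines: int) -> float:
    """Newly covered lines as a percentage of the total project lines."""
    if total_lines == 0:
        return 0.0
    return 100.0 * new_covered / total_lines


# Rows taken from the table above: (new covered lines, existing covered lines).
print(round(relative_gain(1181, 574), 2))   # phmap  -> 205.75
print(round(relative_gain(5434, 77), 2))    # onednn -> 7057.14
print(round(relative_gain(34493, 0), 2))    # mruby  -> 0.0

# Hypothetical inputs, not a table row.
print(coverage_gain(50, 200))               # 25.0
```

A small baseline inflates the relative gain: onednn's 77 previously covered lines turn 5434 new lines into a gain above 7000%.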

Citing This Work

Please click on the 'Cite this repository' button located on the right-hand side of this GitHub page for citation details.