Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Early results for vulnerability analysis and remediation for OSS-Fuzz bugs #458

Open
inferno-chromium opened this issue Jul 8, 2024 · 6 comments

Comments

@inferno-chromium
Copy link
Collaborator

inferno-chromium commented Jul 8, 2024

Sharing some early results from trying out vulnerability analysis and remediation for OSS-Fuzz bugs (see script here, enclosed llm output for 5 vulns). This is using Gemini 1.5 Pro with the 1M context window. The high level idea is OSS-Fuzz vulnerabilities have a precise regression range on where the issue was introduced (and sometimes even exact changeset, but a range of few commits works fine as well) and a detailed sanitizer crash stacktrace with the source files (where the issue most probably exists). If we feed Gemini the crash sanitizer stacktrace, the regression changeset git diff and source code of the files for every sanitizer stack frame (at crash revision), we see:

  • It is able to precisely explain the vulnerability and its root cause in most cases. This could be useful to automatically add in issue descriptions to help developers understand the bug and speed up the overall patching process.
  • It is able to generate pretty good patches in several cases. See example rawoutput files attached on these patches (file has oss-fuzz bug link, full cmd line and LLM output). Patching feels like an art as a developer can fix a problem in multiple ways, so it generates slightly different patches each time.

This is early work, there can be a lot of improvements that can be done:

  • Run the crash testcase under a coverage build and capture a list of all source files touched during the process (and hence is a more accurate list than just the set of files in crash stacktrace). The reason for this is sometimes the fix can go in a completely different file that ones in the crash stacktrace (and we can't feed entire codebases yet!).
  • In some cases, we run into context window limits and can optimize by passing just the touched functions and globals, instead of sending the whole file.
  • How can we detect the quality of a suggested patch and detect if it is an incorrect or a hallucinated response.
  • Explore advantages of fine-tuning on previously fixed security bugs.
  • [Stretch] How can we improve this to fix similar bug variants in other places.

69527.libical.rawoutput.txt
69527.stacktrace.txt
65777.qpdf.rawoutput.txt
65777.stacktrace.txt
64337.pcre2.rawoutput.txt
64337.stacktrace.txt
69497.ArduinoJson.rawoutput.txt
69497.stacktrace.txt

@inferno-chromium
Copy link
Collaborator Author

inferno-chromium commented Jul 10, 2024

Example:
Bug: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=65777
Cmd: python -m experimental.manual.oss_fuzz_vuln_prompt
--repo_url=https://github.com/qpdf/qpdf
--regression_range=1d96af8b9f72f66d48b77c5232b3e4d38d6683ca:452e1f5c20ec6adf62cd296cb9dddacbc06e6ffa
--crash_revision=7315AA8C9B924C9351838EC5C1351A8C75E66BF1
--crash_stacktrace=65777.stacktrace.txt > prompt.txt ; python -m experimental.manual.prompter -p prompt.txt -l vertex_ai_gemini-1-5

LLM output below:

## 1. Vulnerability Identification

**Type:** Use-after-free

**Impact:** Crash leading to denial-of-service. Potentially, arbitrary code execution depending on the heap exploitation technique used.

**Code Patterns:** 

* The crash occurs in `QPDF::read_xref`, line 715, when accessing a `QPDFXRefEntry` that has already been freed. 
* This is caused by the call to `removeObject` within the loop iterating over the `m->xref_table`. 
* The `removeObject` function, introduced in the changeset, erases entries from the `m->xref_table` while the loop continues to iterate, potentially accessing a dangling pointer.

**Approximate line number matches:**

* The stack trace points to line 715 in `QPDF::read_xref`, while the changeset introduces the `removeObject` call around line 714. This is a close match considering the changeset is applied to a slightly earlier revision.

## 2. Technical Explanation

1. The `QPDF::read_xref` function iterates through the `m->xref_table` which contains object generation information.
2. Within this loop, the `removeObject` function is called if it encounters an object with the same object number but a lower generation than the previous one.
3. `removeObject` erases the entry with the lower generation from `m->xref_table`.
4. However, the loop in `read_xref` continues iterating, potentially accessing the erased entry, which is now a dangling pointer.
5. Accessing this dangling pointer leads to a heap-use-after-free error, causing the program to crash.

**An attacker could exploit this vulnerability by:**

1. Crafting a malicious PDF file containing an xref table with duplicate object numbers but varying generations.
2. This would trigger the use-after-free in `QPDF::read_xref`, causing a denial of service.
3. Depending on the system and the specific exploitation technique, an attacker might be able to control the freed memory and achieve arbitrary code execution.

## 3. Remediation Patch

The patch below resolves the vulnerability by deferring the removal of lower-generation objects:

```diff
--- a/libqpdf/QPDF.cc
+++ b/libqpdf/QPDF.cc
@@ -707,13 +707,16 @@ QPDF::read_xref(qpdf_offset_t xref_offset)
     // never depend on its being set.
     m->deleted_objects.clear();
 
-    // Make sure we keep only the highest generation for any object.
+    // Make sure we keep only the highest generation for any object. We can't do this while
+    // iterating because removing an object may invalidate iterators later in the sequence.
+    std::set<QPDFObjGen> to_remove;
     QPDFObjGen last_og{-1, 0};
     for (auto const& og: m->xref_table) {
         if (og.first.getObj() == last_og.getObj())
-            removeObject(last_og);
+            to_remove.insert(last_og);
         last_og = og.first;
     }
+    for (auto const& og: to_remove) {
+        removeObject(og);
+    }
 }
 
 bool

This patch:

1. Creates a temporary `to_remove` set to store object generations slated for removal.
2. After iterating over the `m->xref_table`, it safely removes the lower-generation objects from the table using the `to_remove` set.
3. This approach avoids modifying the `m->xref_table` while iterating over it, preventing the use-after-free vulnerability.

This patch effectively prevents the crash and potential exploitation, while preserving the intended functionality of removing duplicate objects with lower generations. 

@Jarvx
Copy link

Jarvx commented Jul 18, 2024

Hi, would you please share the meta data of the 5 cases or give any references so that I can dig out?

@inferno-chromium
Copy link
Collaborator Author

inferno-chromium commented Jul 18, 2024

Hi, would you please share the meta data of the 5 cases or give any references so that I can dig out?

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=<bug_number> where bug_number is 69527, 65777, 64337, 69497

@Jarvx
Copy link

Jarvx commented Jul 30, 2024

Hi,

when I tried to reproduce the vulnerability of qpdf. I got this issue. Is this due to base image change?

#12 exporting to image
#12 exporting layers done
#12 writing image sha256:2bb0f793adf91fae9530eb4995cc38cfa466d130ef627038d29b35de2a7e8911 done
#12 naming to gcr.io/oss-fuzz/qpdf done
#12 DONE 0.0s

 2 warnings found (use --debug to expand):
 - MaintainerDeprecated: Maintainer instruction is deprecated in favor of using label (line 18)
 - WorkdirRelativePath: Relative workdir "qpdf" can have unexpected results if the base image changes (line 23)
ERROR:root:Failed to detect repo:
python3: can't open file '/opt/cifuzz/detect_repo.py': [Errno 2] No such file or directory

and in another project libical.

 1 warning found (use --debug to expand):
 - WorkdirRelativePath: Relative workdir "libical" can have unexpected results if the base image changes (line 22)

@oliverchang
Copy link
Collaborator

@Jarvx Sorry for the slow response. Yes, these look like they're using an older base image that doesn't have the required script. Is this https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=65777 ? If so, this is a little surprising because it's quite recent. Can you check if you're building this image with an up to date base image from oss-fuzz?

@Jarvx
Copy link

Jarvx commented Aug 12, 2024

@oliverchang Thanks for the information I have managed to reproduce this one after setting it up again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants