Logic for test-to-harness conversion #494

Open
DavidKorczynski opened this issue Jul 17, 2024 · 2 comments

@DavidKorczynski
Collaborator

Test-to-harness conversion using an LLM sounds like an interesting avenue. Converting existing tests into fuzz harnesses is also a very common approach for security engineers when first approaching a project.

I see multiple steps:

  1. Enable in experimental without use of FI (Fuzz Introspector)
  2. Enable in core without use of FI
  3. Assess overall quality
  4. See whether improvements can be made by using more program analysis data from FI
@DavidKorczynski
Collaborator Author

DavidKorczynski commented Jul 17, 2024

One example is neomutt, an OSS-Fuzz project with low coverage (5% at the time of writing; see https://introspector.oss-fuzz.com/project-profile?project=neomutt and https://storage.googleapis.com/oss-fuzz-coverage/neomutt/reports/20240716/linux/report.html) but a wealth of tests that could be converted: https://github.com/neomutt/neomutt/tree/main/test

@DavidKorczynski
Collaborator Author

Step (1) above has been implemented in #495

DavidKorczynski added a commit that referenced this issue Jul 18, 2024
Adds a fuzz harness heuristic that relies on converting existing tests.
At this stage, it is done without relying on FI (Fuzz Introspector). We
simply (1) find test files in the target project; (2) read them; and
(3) for each test file, use a simple prompt to convert it into a harness.
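
The three steps described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function names, the name-based test detection, the extension list, and the prompt wording are all assumptions made for the example, and the step of actually sending the prompt to an LLM is left out.

```python
from pathlib import Path

# Assumption: these extensions and the "test" name hint are illustrative only.
SOURCE_EXTENSIONS = {".c", ".cc", ".cpp", ".cxx"}
TEST_NAME_HINT = "test"

# Assumption: hypothetical prompt wording, not the prompt used by the project.
PROMPT_TEMPLATE = (
    "Convert the following test file into a libFuzzer harness that defines\n"
    "LLVMFuzzerTestOneInput and feeds the fuzzer-provided bytes to the same\n"
    "APIs that the test exercises.\n\n"
    "Test file: {path}\n\n{source}\n"
)


def find_test_files(project_root):
    """Step 1: collect source files whose relative path mentions 'test'."""
    root = Path(project_root)
    found = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in SOURCE_EXTENSIONS:
            continue
        # Only look at the path relative to the project, so that a parent
        # directory that happens to contain "test" does not match everything.
        rel_parts = path.relative_to(root).parts
        if any(TEST_NAME_HINT in part.lower() for part in rel_parts):
            found.append(path)
    return found


def build_conversion_prompt(rel_path, test_source):
    """Step 3: wrap one test file in a simple conversion prompt."""
    return PROMPT_TEMPLATE.format(path=rel_path, source=test_source)


def build_prompts(project_root):
    """Run steps 1 to 3 and return a mapping of test path to prompt text."""
    root = Path(project_root)
    prompts = {}
    for path in find_test_files(root):
        # Step 2: read the test file, tolerating odd encodings.
        source = path.read_text(encoding="utf-8", errors="replace")
        rel_path = path.relative_to(root).as_posix()
        prompts[rel_path] = build_conversion_prompt(rel_path, source)
    return prompts
```

Calling `build_prompts("/path/to/project")` would yield one prompt per discovered test file; each prompt would then be handed to whichever LLM client is in use.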

At this stage, it already outperforms on some existing projects, e.g.:
https://github.com/jkuhlmann/cgltf/blob/master/test/main.c

In this case, the generated harness looks quite nice:

```cpp
// Heuristic: TestConverterPrompt :: Target:
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CGLTF_IMPLEMENTATION
#include "cgltf.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    if (size < 1) {
        return 0;
    }

    cgltf_options options;
    memset(&options, 0, sizeof(cgltf_options));
    cgltf_data* parsed_data = NULL;
    cgltf_result result;

    // Parse input data
    result = cgltf_parse(&options, data, size, &parsed_data);

    if (result == cgltf_result_success) {
        result = cgltf_validate(parsed_data);
    }

    if (result == cgltf_result_success) {
        // Use the parsed data in some way
        // For example, print file type and mesh count
        printf("Type: %u\n", parsed_data->file_type);
        printf("Meshes: %u\n", (unsigned)parsed_data->meshes_count);
    }

    cgltf_free(parsed_data);

    return 0;
}
```
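
Before compiling a generated harness like the one above, a few cheap textual checks can catch obvious problems. The sketch below is hypothetical and not part of the commit: `check_harness` and its result keys are names invented here. It relies on two facts: libFuzzer calls an entry point named `LLVMFuzzerTestOneInput`, and `extern "C"` is only valid in C++, so a harness that uses it needs to be built as C++.

```python
import re

# libFuzzer entry point: int LLVMFuzzerTestOneInput(const uint8_t*, size_t)
ENTRY_POINT_RE = re.compile(r"\bint\s+LLVMFuzzerTestOneInput\s*\(")
# Matches printf( but not fprintf( or snprintf(
PRINTF_RE = re.compile(r"\bprintf\s*\(")


def check_harness(source):
    """Return a few textual facts about a generated harness (hypothetical helper)."""
    uses_extern_c = 'extern "C"' in source
    return {
        # Without the entry point the fuzzer has nothing to call.
        "has_entry_point": bool(ENTRY_POINT_RE.search(source)),
        # extern "C" is a C++ linkage specification, so save the file as C++.
        "suggested_extension": ".cpp" if uses_extern_c else ".c",
        # Printing on every input slows fuzzing; flag it for manual review.
        "writes_to_stdout": bool(PRINTF_RE.search(source)),
    }
```

For the cgltf harness above, this would report that the entry point is present, suggest a `.cpp` extension, and flag the two `printf` calls for review.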

Ref: #494

---------

Signed-off-by: David Korczynski <[email protected]>
DavidKorczynski added a commit to ossf/fuzz-introspector that referenced this issue Jul 18, 2024
First go keeping it as simple as possible

Ref: google/oss-fuzz-gen#494

Signed-off-by: David Korczynski <[email protected]>
DavidKorczynski added a commit to ossf/fuzz-introspector that referenced this issue Jul 18, 2024
core: find and store test files

First go keeping it as simple as possible

Ref: google/oss-fuzz-gen#494

Signed-off-by: David Korczynski <[email protected]>
DavidKorczynski added a commit to google/oss-fuzz that referenced this issue Jul 18, 2024
Contains updates for:

- java coverage analysis improvements
- test-to-harness conversion (google/oss-fuzz-gen#494)
DavidKorczynski added a commit to google/oss-fuzz that referenced this issue Jul 20, 2024
arthurscchan pushed a commit to arthurscchan/oss-fuzz-gen that referenced this issue Jul 24, 2024
DavidKorczynski added a commit that referenced this issue Jul 24, 2024
While working on #494 I need to adjust a few things with respect to
benchmarks, as the test-to-harness logic doesn't fit the benchmark
abstraction entirely. This commit is a step towards simplifying some of
the code around benchmarks. I also removed some code that was no longer
used, e.g. `manual_fix` in `run_one_experiment.py`.

Signed-off-by: David Korczynski <[email protected]>
DavidKorczynski added a commit that referenced this issue Jul 26, 2024
DavidKorczynski added a commit that referenced this issue Aug 2, 2024
Ref: #494

Some more comments on this PR in
#511 (comment)

---------

Signed-off-by: David Korczynski <[email protected]>
DavidKorczynski added a commit to google/oss-fuzz that referenced this issue Aug 27, 2024
Contains updates regarding test-to-harness conversion google/oss-fuzz-gen#494
DavidKorczynski added a commit to google/oss-fuzz that referenced this issue Aug 27, 2024
AlexDev08 pushed a commit to AlexDev08/fuzz-introspector that referenced this issue Nov 20, 2024
shovon58 added a commit to shovon58/oss-introspector that referenced this issue Nov 21, 2024