CodeChecker store fails for report directories above ~10GB (report deduplication on the disk) #4129

@dkrupp

Description

CodeChecker cannot store report directories that are larger than about 10GB. Unfortunately this can be a common case for C/C++ projects, because some reports for headers are repeated for almost all TUs, which may cause a report count explosion for certain checkers.

When CodeChecker executes the analyzers, it stores every finding into the output report directory.

Some of the checkers report problems for C/C++ types that are commonly used across the whole code base. Such reports are repeated at every usage, which generates a huge number of redundant findings. An example of this is the cppcoreguidelines-special-member-functions clang-tidy checker, which reports classes that define some but not all of the special member functions (copy constructor, copy assignment, move constructor, move assignment, destructor).

These reports are repeated redundantly across many PLIST files, producing an excessively large report directory (above ~10GB). Such report directories make the diff and parse commands very slow and prevent storing the results to the server (using CodeChecker store), which would throw away the duplicate findings anyway.

If the report directory were more compact, the storage could succeed and the stored content would be significantly smaller.

Deduplicating the reports before storage would make the zipped content much smaller and the parsing on the server side much faster.
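To get a feel for how much redundancy a report directory contains, something like the following rough sketch could be used. It is not part of CodeChecker: `report_key` and `count_duplicates` are hypothetical helpers, and the sketch assumes the PLIST files follow the usual Clang layout (a top-level `files` list and a `diagnostics` list whose entries carry `check_name`, `description` and a `location` with `line`, `col` and a `file` index). The choice of key fields is also an assumption and may differ from how the server decides that two reports are duplicates.

```python
import plistlib
from pathlib import Path


def report_key(diag, files):
    """Build a hashable identity for one diagnostic (assumed key fields)."""
    loc = diag.get("location", {})
    file_idx = loc.get("file", -1)
    # Resolve the per-plist file index to a path so that keys are
    # comparable across different PLIST files.
    path = files[file_idx] if 0 <= file_idx < len(files) else None
    return (
        diag.get("check_name"),
        path,
        loc.get("line"),
        loc.get("col"),
        diag.get("description"),
    )


def count_duplicates(report_dir):
    """Return (total, unique) diagnostic counts over all *.plist files."""
    seen = set()
    total = 0
    for plist_path in sorted(Path(report_dir).glob("*.plist")):
        with open(plist_path, "rb") as f:
            data = plistlib.load(f)
        files = data.get("files", [])
        for diag in data.get("diagnostics", []):
            total += 1
            seen.add(report_key(diag, files))
    return total, len(seen)
```

Calling `count_duplicates("./reports")` on the directory from the reproduction steps would return the total number of diagnostics and the number of distinct ones; a large gap between the two suggests that deduplication before storage would pay off.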

CodeChecker version
6.23.0

To Reproduce
Analyze the xerces project with --enable-all

CodeChecker analyze --enable-all ./compile_commands.json ./reports

Expected behaviour
I would expect a more efficient report directory structure that is smaller in size, can be stored, and can be handled by parse and diff.

