Improve clang AST parsing #226

Open
wants to merge 4 commits into
base: master

arnetheduck
Contributor

To prepare genbindings for accepting arbitrary libraries from the command line, we must retain parsed types not only from the header being processed but also from its includes, so that things like type hierarchies can be understood without depending on a particular module order.

This is the first baby step that addresses clang AST filtering.

The change has some collateral benefits: the improved accuracy of the AST parsing keeps more relevant information but at the same time reduces the memory footprint since filtering is done streaming instead of loading everything into memory first (ditto when writing the cache).

A consequence is that a lot more clang processes can run in parallel without OOMing.

  • parse "file" correctly to discover type provenance - using the standard go json parser for this does not work since the source locations depend on serialization order, which go discards (see
    https://github.com/dtolnay/clang-ast?tab=readme-ov-file#source-locations)
  • stream clang output to JSON decoder and stream-write the cache to reduce memory footprint
  • with the newfound memory efficiency, bump up the number of parallel jobs and use threads for the parsing as well
  • fix missing classes such as QSysInfo using the corrected file source location field
  • don't try to re-run clang if it fails (OOM is unlikely anyway)

@arnetheduck arnetheduck marked this pull request as draft May 25, 2025 08:24
@arnetheduck
Contributor Author

draft because of the numbered renames - need some help with these in case we want to merge this

@@ -29,13 +29,11 @@ typedef struct QVariant QVariant;
#endif

QQmlListReference* QQmlListReference_new();
QQmlListReference* QQmlListReference_new2(QVariant* variant);
Contributor Author

@arnetheduck arnetheduck May 28, 2025

the interesting thing about these removed overloads is that 3 and 4 are the same .. 🤷

@arnetheduck arnetheduck marked this pull request as ready for review June 3, 2025 09:37
@arnetheduck
Contributor Author

the qmllist changes look legit as far as I can tell, so undrafting

@arnetheduck
Contributor Author

ping @mappu anything more needed here?
