feat: add 'Query' derive to manage custom Utoipa Query descriptions #1890

Draft
wants to merge 3 commits into
base: main
from

Conversation

mrizzi
Collaborator

@mrizzi mrizzi commented Jul 22, 2025

This PR implements the QueryDoc custom derive macro so that the annotated struct serves as the source of the fields that clients can:

  • filter on in the q query parameter
  • sort by in the sort query parameter

AdvisoryQuery and VulnerabilityQuery are two examples of how to use the macro from now on.
The exact format of the descriptions might need further changes, depending on how effective it proves for LLM/MCP use.

@ctron in this first review I would like to understand whether the overall structure of the macro is fine and then, if needed, refactor to move it elsewhere in the project or rename it. Thank you.

Summary by Sourcery

Implement a custom derive macro to generate OpenAPI descriptions for q and sort query parameters based on struct fields, replace static documentation in the YAML spec, and integrate the new TrustifyQuery wrapper in advisory and vulnerability endpoints.

New Features:

  • Add query-derive proc-macro to automatically derive Query trait implementations for custom query descriptions
  • Introduce query crate with TrustifyQuery wrapper and Query trait to generate q and sort parameter docs based on struct fields

Enhancements:

  • Replace manual EBNF grammar definitions in openapi.yaml with dynamic descriptions generated by the Query derive
  • Update advisory and vulnerability endpoints to use TrustifyQuery<T> and derive Query on local query parameter structs

Build:

  • Add query and query-derive crates to the workspace and update Cargo.toml configurations

@mrizzi mrizzi requested a review from ctron July 22, 2025 15:27

sourcery-ai bot commented Jul 22, 2025

Reviewer's Guide

This PR introduces a custom derive macro and a wrapper type to automate generation of EBNF-based query and sort parameter descriptions, replaces manual grammar entries in the OpenAPI spec, and integrates these into advisory and vulnerability endpoints, alongside necessary workspace configuration updates.

Entity relationship diagram for query parameter field mapping

erDiagram
    ADVISORY_QUERY {
        UUID id
        STRING identifier
        STRING version
        STRING document_id
        BOOL deprecated
        UUID issuer_id
        DATETIME published
        DATETIME modified
        DATETIME withdrawn
        STRING title
        DATETIME ingested
        STRING label
    }
    VULNERABILITY_QUERY {
        STRING id
        STRING title
        DATETIME reserved
        DATETIME published
        DATETIME modified
        DATETIME withdrawn
        STRING[] cwes
        FLOAT base_score
        SEVERITY base_severity
    }

    ADVISORY_QUERY ||--o{ TrustifyQuery : "used as T"
    VULNERABILITY_QUERY ||--o{ TrustifyQuery : "used as T"

Class diagram for the new Query derive macro and TrustifyQuery wrapper

classDiagram
    class Query {
        <<trait>>
        +generate_query_description() String
        +generate_sort_description() String
    }

    class TrustifyQuery~T: Query~ {
        -phantom: PhantomData<T>
    }
    TrustifyQuery ..|> IntoParams
    TrustifyQuery ..|> Query

    class IntoParams {
        <<trait>>
        +into_params(parameter_in_provider) Vec<Parameter>
    }

    Query <|.. TrustifyQuery
    IntoParams <|.. TrustifyQuery

    class Query_derive_macro {
        <<proc-macro derive(Query)>>
        // Implements Query for struct
    }
    Query_derive_macro ..> Query : implements
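The relationship in the class diagram can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: `ParamDoc` and `param_docs` are hypothetical stand-ins for utoipa's `Parameter` type and `IntoParams::into_params`, and the `Query` impl is written by hand where the PR would use `#[derive(Query)]`. The description strings are placeholders.

```rust
use std::marker::PhantomData;

/// Trait as named in the class diagram; in the PR it is implemented by the derive macro.
trait Query {
    fn generate_query_description() -> String;
    fn generate_sort_description() -> String;
}

/// Hypothetical stand-in for utoipa's `Parameter`: only a name and a description.
#[derive(Debug, PartialEq)]
struct ParamDoc {
    name: &'static str,
    description: String,
}

/// Wrapper that carries `T` only at the type level, so it stores no data.
#[allow(dead_code)]
struct TrustifyQuery<T: Query> {
    phantom: PhantomData<T>,
}

impl<T: Query> TrustifyQuery<T> {
    /// Stand-in for `IntoParams::into_params`: one entry for `q`, one for `sort`.
    fn param_docs() -> Vec<ParamDoc> {
        vec![
            ParamDoc { name: "q", description: T::generate_query_description() },
            ParamDoc { name: "sort", description: T::generate_sort_description() },
        ]
    }
}

/// Hand-written impl standing in for what the derive would emit (placeholder text).
struct VulnerabilityQuery;

impl Query for VulnerabilityQuery {
    fn generate_query_description() -> String {
        "filter fields: id, title, base_score".to_string()
    }

    fn generate_sort_description() -> String {
        "sort fields: id, title, base_score".to_string()
    }
}
```

Calling `TrustifyQuery::<VulnerabilityQuery>::param_docs()` yields the two parameter entries without ever constructing a `VulnerabilityQuery` value, which is why a `PhantomData` field is enough.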

Class diagram for AdvisoryQuery and VulnerabilityQuery usage

classDiagram
    class AdvisoryQuery {
        +id: Uuid
        +identifier: String
        +version: Option<String>
        +document_id: String
        +deprecated: bool
        +issuer_id: Option<Uuid>
        +published: Option<OffsetDateTime>
        +modified: Option<OffsetDateTime>
        +withdrawn: Option<OffsetDateTime>
        +title: Option<String>
        +ingested: OffsetDateTime
        +label: String
    }
    AdvisoryQuery ..|> Query

    class VulnerabilityQuery {
        +id: String
        +title: Option<String>
        +reserved: Option<OffsetDateTime>
        +published: Option<OffsetDateTime>
        +modified: Option<OffsetDateTime>
        +withdrawn: Option<OffsetDateTime>
        +cwes: Option<Vec<String>>
        +base_score: Option<f64>
        +base_severity: Option<Severity>
    }
    VulnerabilityQuery ..|> Query

    class TrustifyQuery~T: Query~
    TrustifyQuery <.. AdvisoryQuery : used as T
    TrustifyQuery <.. VulnerabilityQuery : used as T

File-Level Changes

Change: Implement Query derive macro for generating query and sort descriptions
Details:
  • Add proc-macro derive implementation in query-derive crate
  • Parse struct fields and emit Query trait impl that returns EBNF grammar strings
Files: query/query-derive/src/lib.rs

Change: Add TrustifyQuery wrapper type for OpenAPI parameter injection
Details:
  • Define TrustifyQuery with PhantomData
  • Implement IntoParams to emit q and sort parameters using Query trait
Files: query/src/lib.rs

Change: Replace manual OpenAPI grammar in spec with generated descriptions
Details:
  • Remove long hand-written EBNF blocks for q and sort in openapi.yaml
  • Insert standardized -prefixed descriptions compliant with derive output

Change: Integrate custom derive and wrapper in endpoint modules
Details:
  • Define AdvisoryQuery and VulnerabilityQuery structs with #[derive(Query)]
  • Switch endpoint params from raw Query to TrustifyQuery
Files: modules/fundamental/src/advisory/endpoints/mod.rs, modules/fundamental/src/vulnerability/endpoints/mod.rs

Change: Update workspace configuration to include query crates
Details:
  • Add query and query-derive entries in root Cargo.toml
  • Include query and query-derive in modules/fundamental Cargo.toml
Files: Cargo.toml, modules/fundamental/Cargo.toml

codecov bot commented Jul 22, 2025

Codecov Report

Attention: Patch coverage is 96.66667% with 3 lines in your changes missing coverage. Please review.

Project coverage is 68.25%. Comparing base (da07c38) to head (78bde90).
Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
query/query-derive/src/lib.rs 95.71% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1890      +/-   ##
==========================================
+ Coverage   68.06%   68.25%   +0.19%     
==========================================
  Files         365      367       +2     
  Lines       23063    23211     +148     
  Branches    23063    23211     +148     
==========================================
+ Hits        15698    15843     +145     
- Misses       6486     6488       +2     
- Partials      879      880       +1     

Contributor

@jcrossley3 jcrossley3 left a comment


If the goal is to add the valid field names to openapi.yaml, can't this entire PR be replaced by the following suggested changes?

I'm having a hard time justifying the complexity of the extra modules/macros.

Comment on lines 58 to 66
#[allow(dead_code)]
#[derive(QueryDoc)]
struct AdvisoryQuery {
    average_score: i32,
    average_severity: String,
    modified: Date,
    title: String,
}

Contributor

Suggested change
- #[allow(dead_code)]
- #[derive(QueryDoc)]
- struct AdvisoryQuery {
-     average_score: i32,
-     average_severity: String,
-     modified: Date,
-     title: String,
- }
+ /// List advisories
+ ///
+ /// Valid field names to use in sort/filter queries:
+ /// - average_score
+ /// - average_severity
+ /// - modified
+ /// - title
+ ///

Collaborator Author

The goal is to have a customized OpenAPI description for each endpoint similar to the detailed (but hardcoded) EBNF grammar definition (without having to copy & paste the same grammar in each endpoint) currently available in

pub struct Query {
/// EBNF grammar for the _q_ parameter:
/// ```text
/// q = ( values | filter ) { '&' q }
/// values = value { '|', values }
/// filter = field, operator, values
/// operator = "=" | "!=" | "~" | "!~" | ">=" | ">" | "<=" | "<"
/// value = (* any text but escape special characters with '\' *)
/// field = (* must match an entity attribute name *)
/// ```
/// Any values in a _q_ will result in a case-insensitive "full
/// text search", effectively producing an OR clause of LIKE
/// clauses for every string-ish field in the resource being
/// queried.
///
/// Examples:
/// - `foo` - any field containing 'foo'
/// - `foo|bar` - any field containing either 'foo' OR 'bar'
/// - `foo&bar` - some field contains 'foo' AND some field contains 'bar'
///
/// A _filter_ may also be used to constrain the results. The
/// filter's field name must correspond to one of the resource's
/// attributes. If it doesn't, an error will be returned
/// containing a list of the valid fields for that resource.
///
/// The value 'null' is treated specially for [Not]Equal filters:
/// it returns resources on which the field isn't set. Use the
/// LIKE operator, `~`, to match a literal "null" string. Omit the
/// value to match an empty string.
///
/// Examples:
/// - `name=foo` - entity's _name_ matches 'foo' exactly
/// - `name~foo` - entity's _name_ contains 'foo', case-insensitive
/// - `name~foo|bar` - entity's _name_ contains either 'foo' OR 'bar', case-insensitive
/// - `name=` - entity's _name_ is the empty string, ''
/// - `name=null` - entity's _name_ isn't set
/// - `published>3 days ago` - date values can be "human time"
///
/// Multiple full text searches and/or filters should be
/// '&'-delimited -- they are logically AND'd together.
///
/// - `red hat|fedora&labels:type=cve|osv&published>last wednesday 17:00`
///
/// Fields corresponding to JSON objects in the database may use a
/// ':' to delimit the column name and the object key,
/// e.g. `purl:qualifiers:type=pom`
///
/// Any operator or special character, e.g. '|', '&', within a
/// value should be escaped by prefixing it with a backslash.
///
#[serde(default)]
pub q: String,
/// EBNF grammar for the _sort_ parameter:
/// ```text
/// sort = field [ ':', order ] { ',' sort }
/// order = ( "asc" | "desc" )
/// field = (* must match the name of entity's attributes *)
/// ```
/// The optional _order_ should be one of "asc" or "desc". If
/// omitted, the order defaults to "asc".
///
/// Each _field_ name must correspond to one of the columns of the
/// table holding the entities being queried. Those corresponding
/// to JSON objects in the database may use a ':' to delimit the
/// column name and the object key,
/// e.g. `purl:qualifiers:type:desc`
///
#[serde(default)]
pub sort: String,

Since this PR contains all of the derive procedural macro code, using it afterwards is just a matter of defining a struct with the list of fields: the same list you have in the proposed comment, but with the benefit that we can improve and manage it further in the future as needed.
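To make the quoted sort grammar concrete, here is a minimal parsing sketch. It is not the project's parser: `parse_sort`, the `Order` enum and the `allowed` field list are hypothetical names, and checking only the first `:`-separated segment against the allowed columns is an assumption made to accommodate JSON keys such as `purl:qualifiers:type`.

```rust
#[derive(Debug, PartialEq)]
enum Order {
    Asc,
    Desc,
}

/// Parses a `sort` value such as "published:desc,title" into (field, order) pairs.
/// `allowed` lists the valid column names (an assumption of this sketch).
fn parse_sort(input: &str, allowed: &[&str]) -> Result<Vec<(String, Order)>, String> {
    let mut out = Vec::new();
    for term in input.split(',').filter(|t| !t.is_empty()) {
        // A trailing ":asc" or ":desc" is the order; any other ':' belongs to the field.
        let (field, order) = match term.rsplit_once(':') {
            Some((f, "asc")) => (f, Order::Asc),
            Some((f, "desc")) => (f, Order::Desc),
            // No explicit order: the grammar says it defaults to "asc".
            _ => (term, Order::Asc),
        };
        // The first segment is the column name; later segments are JSON object keys.
        let column = field.split(':').next().unwrap_or(field);
        if !allowed.contains(&column) {
            return Err(format!("unknown sort field: {}", column));
        }
        out.push((field.to_string(), order));
    }
    Ok(out)
}
```

One limitation of this sketch: a JSON key literally named `asc` or `desc` in the last position would be read as the order.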

Comment on lines 33 to 41
#[allow(dead_code)]
#[derive(QueryDoc)]
struct VulnerabilityQuery {
    base_score: i32,
    base_severity: String,
    modified: Date,
    title: String,
}

Contributor

Suggested change
- #[allow(dead_code)]
- #[derive(QueryDoc)]
- struct VulnerabilityQuery {
-     base_score: i32,
-     base_severity: String,
-     modified: Date,
-     title: String,
- }
+ /// List vulnerabilities
+ ///
+ /// Valid field names to use in sort/filter queries:
+ /// - base_score
+ /// - base_severity
+ /// - modified
+ /// - title
+ ///

Contributor

@ctron ctron left a comment


I like the PR. Learning about this and leveraging it for a seamless integration.

Maybe it makes sense to change the name away from "doc", indicating that this could be more than just doc in the future.

@mrizzi
Collaborator Author

mrizzi commented Jul 23, 2025

I like the PR. Learning about this and leveraging it for a seamless integration.

Maybe it makes sense to change the name away from "doc", indicating that this could be more than just doc in the future.

Cool, I've renamed it 👍

@bobmcwhirter
Contributor

If this is purely to add common docs, could you use something like the following on each operation to append the shared docs?

#[doc = include_str!("../../EBNF_LANGUAGE_DEETS.md")]

ref: https://doc.rust-lang.org/rustdoc/write-documentation/the-doc-attribute.html

@mrizzi
Collaborator Author

mrizzi commented Jul 23, 2025

If this is purely to add common docs, could you use something like the following on each operation to append the shared docs?

#[doc = include_str!("../../EBNF_LANGUAGE_DEETS.md")]

ref: https://doc.rust-lang.org/rustdoc/write-documentation/the-doc-attribute.html

This PR is more about having a documentation template (i.e. the EBNF grammar) populated with the custom fields each endpoint manages, which gives us a solution we can keep improving.
It looks to me like the referenced md file would be the same for all of the endpoints anyway, with no way to customize it per endpoint.
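The difference between a shared static file and a populated template can be sketched as follows. The output format of the PR's generated descriptions is not shown in this thread, so the rendering below is an assumption, and `sort_description` is a hypothetical helper rather than the macro's actual output.

```rust
/// Renders the sort grammar with the `field` rule listing concrete field names.
/// A static file pulled in with `include_str!` would be identical for every
/// endpoint; here the last rule changes with the field list passed in.
fn sort_description(fields: &[&str]) -> String {
    // Quote each field name and join them as EBNF alternatives.
    let alternatives = fields
        .iter()
        .map(|f| format!("\"{}\"", f))
        .collect::<Vec<_>>()
        .join(" | ");
    format!(
        "sort = field [ ':', order ] {{ ',' sort }}\n\
         order = ( \"asc\" | \"desc\" )\n\
         field = {}",
        alternatives
    )
}
```

With this shape, an advisory endpoint and a vulnerability endpoint share the first two grammar rules and differ only in the final `field` rule.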

@mrizzi
Collaborator Author

mrizzi commented Jul 23, 2025

@sourcery-ai summary

@mrizzi mrizzi force-pushed the feat-macro-generate-openapi branch from f7302a4 to ed3e90b Compare July 23, 2025 17:00
@mrizzi mrizzi changed the title feat: add 'QueryDoc' derive to manage custom Utoipa Query descriptions feat: add 'Query' derive to manage custom Utoipa Query descriptions Jul 23, 2025
@mrizzi mrizzi force-pushed the feat-macro-generate-openapi branch from ed3e90b to 5d6172d Compare July 24, 2025 08:38
@mrizzi mrizzi force-pushed the feat-macro-generate-openapi branch from d7d2603 to 78bde90 Compare July 24, 2025 13:28
@mrizzi
Collaborator Author

mrizzi commented Jul 24, 2025

@sourcery-ai summary

@mrizzi
Collaborator Author

mrizzi commented Jul 24, 2025

@sourcery-ai review

@sourcery-ai sourcery-ai bot left a comment


Hey @mrizzi - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments

### Comment 1
<location> `query/query-derive/src/lib.rs:16` </location>
<code_context>
+
+fn impl_query(ast: &syn::DeriveInput) -> TokenStream {
+    let name = &ast.ident;
+    let fields = match &ast.data {
+        Data::Struct(data_struct) => data_struct
+            .fields
+            .clone()
+            .into_iter()
+            .map(|field| match field.ident {
+                Some(ref ident) => ident.to_string(),
+                None => String::default(),
+            })
+            .collect::<Vec<String>>(),
+        Data::Enum(_) => vec![],
+        Data::Union(_) => vec![],
+    };
</code_context>

<issue_to_address>
The macro currently ignores tuple and unnamed struct fields.

If tuple structs aren't supported, add a compile error or explicit panic to prevent silent failures.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
fn impl_query(ast: &syn::DeriveInput) -> TokenStream {
    let name = &ast.ident;
    let fields = match &ast.data {
        Data::Struct(data_struct) => data_struct
            .fields
            .clone()
            .into_iter()
            .map(|field| match field.ident {
                Some(ref ident) => ident.to_string(),
                None => String::default(),
            })
            .collect::<Vec<String>>(),
        Data::Enum(_) => vec![],
        Data::Union(_) => vec![],
    };
=======
fn impl_query(ast: &syn::DeriveInput) -> TokenStream {
    let name = &ast.ident;
    let fields = match &ast.data {
        Data::Struct(data_struct) => {
            match &data_struct.fields {
                syn::Fields::Named(fields_named) => {
                    fields_named.named.iter()
                        .map(|field| field.ident.as_ref().unwrap().to_string())
                        .collect::<Vec<String>>()
                }
                syn::Fields::Unnamed(_) | syn::Fields::Unit => {
                    return syn::Error::new_spanned(
                        &ast.ident,
                        "Tuple structs and unit structs are not supported by #[derive(Query)]"
                    )
                    .to_compile_error()
                    .into();
                }
            }
        }
        Data::Enum(_) => vec![],
        Data::Union(_) => vec![],
    };
>>>>>>> REPLACE

</suggested_fix>

### Comment 2
<location> `query/query-derive/src/lib.rs:69` </location>
<code_context>
+    );
+
+    let generated = quote! {
+        impl query::Query for #name {
+            fn generate_query_description() -> String {
+                #query_description.to_string()
</code_context>

<issue_to_address>
The macro assumes the `query` crate is always imported as `query`.

If `query` is renamed or missing, this will fail. `$crate` is only expanded inside `macro_rules!` macros, so in a derive macro use an absolute path such as `::query::Query` (which still assumes the crate is not renamed), or document the required import path.
</issue_to_address>

<suggested_fix>
<<<<<<< SEARCH
    let generated = quote! {
        impl query::Query for #name {
            fn generate_query_description() -> String {
                #query_description.to_string()
            }

            fn generate_sort_description() -> String {
                #sort_description.to_string()
            }
        }
    };
=======
    let generated = quote! {
        impl ::query::Query for #name {
            fn generate_query_description() -> String {
                #query_description.to_string()
            }

            fn generate_sort_description() -> String {
                #sort_description.to_string()
            }
        }
    };
>>>>>>> REPLACE

</suggested_fix>

### Comment 3
<location> `modules/fundamental/src/advisory/endpoints/mod.rs:60` </location>
<code_context>
 }

+#[allow(dead_code)]
+#[derive(Query)]
+struct AdvisoryQuery {
+    id: Uuid,
+    identifier: String,
+    version: Option<String>,
+    document_id: String,
+    deprecated: bool,
+    issuer_id: Option<Uuid>,
+    published: Option<OffsetDateTime>,
+    modified: Option<OffsetDateTime>,
+    withdrawn: Option<OffsetDateTime>,
+    title: Option<String>,
+    ingested: OffsetDateTime,
+    label: String,
+}
+
</code_context>

<issue_to_address>
The AdvisoryQuery struct uses non-Option types for some fields, which may not match query parameter semantics.

Since these fields are not Option types, they must always be provided, which may not align with how query parameters are typically used. This could lead to deserialization errors if any of these fields are missing in incoming queries.
</issue_to_address>


Comment on lines +14 to +28
fn impl_query(ast: &syn::DeriveInput) -> TokenStream {
    let name = &ast.ident;
    let fields = match &ast.data {
        Data::Struct(data_struct) => data_struct
            .fields
            .clone()
            .into_iter()
            .map(|field| match field.ident {
                Some(ref ident) => ident.to_string(),
                None => String::default(),
            })
            .collect::<Vec<String>>(),
        Data::Enum(_) => vec![],
        Data::Union(_) => vec![],
    };

suggestion (bug_risk): The macro currently ignores tuple and unnamed struct fields.

If tuple structs aren't supported, add a compile error or explicit panic to prevent silent failures.

Suggested change
- fn impl_query(ast: &syn::DeriveInput) -> TokenStream {
-     let name = &ast.ident;
-     let fields = match &ast.data {
-         Data::Struct(data_struct) => data_struct
-             .fields
-             .clone()
-             .into_iter()
-             .map(|field| match field.ident {
-                 Some(ref ident) => ident.to_string(),
-                 None => String::default(),
-             })
-             .collect::<Vec<String>>(),
-         Data::Enum(_) => vec![],
-         Data::Union(_) => vec![],
-     };
+ fn impl_query(ast: &syn::DeriveInput) -> TokenStream {
+     let name = &ast.ident;
+     let fields = match &ast.data {
+         Data::Struct(data_struct) => {
+             match &data_struct.fields {
+                 syn::Fields::Named(fields_named) => {
+                     fields_named.named.iter()
+                         .map(|field| field.ident.as_ref().unwrap().to_string())
+                         .collect::<Vec<String>>()
+                 }
+                 syn::Fields::Unnamed(_) | syn::Fields::Unit => {
+                     return syn::Error::new_spanned(
+                         &ast.ident,
+                         "Tuple structs and unit structs are not supported by #[derive(Query)]"
+                     )
+                     .to_compile_error()
+                     .into();
+                 }
+             }
+         }
+         Data::Enum(_) => vec![],
+         Data::Union(_) => vec![],
+     };
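The control flow of this suggestion can be tried out without the `syn` crate. The sketch below is an illustration only: the `Fields` enum is a simplified stand-in for `syn::Fields`, and `field_names` is a hypothetical helper that returns an error message where the real macro would emit a compile error.

```rust
/// Simplified stand-in for `syn::Fields`; not the real syn type.
enum Fields {
    /// Named fields, e.g. `struct S { id: Uuid }`, reduced to their names.
    Named(Vec<String>),
    /// Tuple struct, e.g. `struct S(u32, u32)`, reduced to its arity.
    Unnamed(usize),
    /// Unit struct, e.g. `struct S;`.
    Unit,
}

/// Returns the field names, or an error for struct shapes without named fields,
/// so that unsupported input fails loudly instead of yielding empty names.
fn field_names(struct_name: &str, fields: &Fields) -> Result<Vec<String>, String> {
    match fields {
        Fields::Named(names) => Ok(names.clone()),
        Fields::Unnamed(_) | Fields::Unit => Err(format!(
            "`{}`: tuple structs and unit structs are not supported by #[derive(Query)]",
            struct_name
        )),
    }
}
```

Returning a `Result` here mirrors the early `return` with `to_compile_error()` in the suggestion: the unsupported shapes never reach the code that builds the descriptions.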

Comment on lines +60 to +69
#[derive(Query)]
struct AdvisoryQuery {
    id: Uuid,
    identifier: String,
    version: Option<String>,
    document_id: String,
    deprecated: bool,
    issuer_id: Option<Uuid>,
    published: Option<OffsetDateTime>,
    modified: Option<OffsetDateTime>,

issue (bug_risk): The AdvisoryQuery struct uses non-Option types for some fields, which may not match query parameter semantics.

Since these fields are not Option types, they must always be provided, which may not align with how query parameters are typically used. This could lead to deserialization errors if any of these fields are missing in incoming queries.
