Skip to content

Conversation

@dmihalcik-virtru
Copy link
Member

Proposed Changes

  • Audit events should be added to the context stack as soon as possible in the handler, notably before the network request can be cancelled, or timeouts due to slow backend calls.
  • This PR introduces new audit methods that return updateable audit.Event objects, which in turn have their own log method.
  • To get the benefit of this feature, all existing calls to audit events must be re-written to use the new deferred log pattern:
auditEvent := as.logger.Audit.[EventType](ctx, ...)
defer auditEvent.Log(ctx)

... do background work ...
auditEvent.UpdateWith(newValues...)

Checklist

  • I have added or updated unit tests
  • I have added or updated integration tests (if appropriate)
  • I have added or updated documentation

Testing Instructions

@dmihalcik-virtru dmihalcik-virtru requested review from a team as code owners December 22, 2025 14:51
@github-actions github-actions bot added comp:policy Policy Configuration ( attributes, subject mappings, resource mappings, kas registry) comp:kas Key Access Server comp:authorization comp:middleware:auth size/xl labels Dec 22, 2025
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @dmihalcik-virtru, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors the audit logging mechanism to improve the reliability and completeness of audit trails. By introducing a deferred logging pattern, audit events are now created early in a request's execution and progressively updated with relevant information as the operation proceeds. The final log entry is then emitted only when the operation concludes, whether successfully or with an error, or if the request is cancelled. This change ensures that audit logs accurately reflect the full state and outcome of each request, providing more robust tracking and debugging capabilities.

Highlights

  • Deferred Audit Logging: Introduced a new pattern for audit logging where events are initiated early in a request lifecycle but logged only upon completion or failure. This ensures that audit events capture the full context of an operation, even if it's cancelled or times out prematurely.
  • Enhanced Audit Event Objects: New audit methods now return updateable audit.Event objects. These objects allow for progressive enrichment of audit data throughout an operation, such as updating entitlements or entity decisions, before the final log is emitted.
  • Centralized Context Management: Implemented a new auditTransaction mechanism within the request context. This centralizes the collection and management of audit events for a given request, ensuring all related events are processed consistently.
  • Migration to New Audit Pattern: All existing audit logging calls across various services (Authorization, KAS, Policy CRUD operations for Actions, Attributes, KAS Registry, Key Management, Namespaces, Obligations, Registered Resources, Resource Mapping, Subject Mapping, and Unsafe operations) have been refactored to utilize the new deferred audit event pattern.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


A log begun, then held in wait, Context gathered, sealed by fate. Defer's embrace, a watchful eye, To track the truth, 'til tasks pass by.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant and valuable refactoring to use deferred audit log events, which will improve failure tracking, especially for cancelled or timed-out requests. The implementation of the deferred event pattern is well-executed across the majority of the services. I've identified one high-severity issue where successful operations might be incorrectly audited as 'cancelled' if a subsequent operation in the same request panics. I've also noted a minor misuse of the new pattern in one location that could be simplified for better clarity. Overall, this is a solid improvement to the auditing mechanism.

Comment on lines 90 to 92
if !success {
auditEvent.Action.Result = ActionResultCancel
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This logic overrides the result of all audit events in a transaction to cancel if the request as a whole fails (e.g., panics), even if some operations within the request have already completed successfully and were logged as such. For example, in DeleteAllUnmappedSubjectConditionSets, database deletions are committed, and then corresponding success audit events are queued. If the request panics later, these successful deletions will be incorrectly audited as cancel. This can lead to misleading audit trails.

A test case for an operation that succeeds and is audited with Success(), followed by a panic in the same request, would demonstrate this issue.

Consider changing this to only mark events as cancel if they haven't been explicitly marked as success already. This would preserve the correct status of completed operations within a failed request.

Comment on lines 394 to 396
auditEvent := s.logger.Audit.PolicyCRUD(ctx, auditParams)
auditEvent.Success(scs)
auditEvent.Log()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The call to auditEvent.Log() here is redundant. auditEvent.Success(scs) has already marked the event as complete and queued it for logging within the current audit transaction. The Log() method is primarily intended for use with defer to handle cases where an operation might fail before Success() is called. Since the database operation has already completed successfully and you are just logging the results, the explicit Log() call is not needed and can be removed for clarity.

Suggested change
auditEvent := s.logger.Audit.PolicyCRUD(ctx, auditParams)
auditEvent.Success(scs)
auditEvent.Log()
auditEvent := s.logger.Audit.PolicyCRUD(ctx, auditParams)
auditEvent.Success(scs)

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 190.878964ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 106.841103ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 361.123451ms
Throughput 276.91 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 39.467069606s
Average Latency 392.682968ms
Throughput 126.69 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 28.063772644s
Average Latency 279.669423ms
Throughput 178.17 requests/second

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 188.390161ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 104.093924ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 358.903519ms
Throughput 278.63 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 38.242830483s
Average Latency 380.650996ms
Throughput 130.74 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 27.218634952s
Average Latency 271.324226ms
Throughput 183.70 requests/second

@jakedoublev
Copy link
Contributor

If a request times out or is canceled, there is no response given back to the requestor. It seems like there are some audit events where audit-worthy things could have happened (i.e. policy write mutations) without a response, but the most critical audit events around decisions and rewraps have no meaning to clients without a response received back. In essence, those two events arguably never happened without a response because the client action is so dependent upon the response. Should there be any distinction made for this case?

@jentfoo
Copy link
Contributor

jentfoo commented Dec 22, 2025

If a request times out or is canceled, there is no response given back to the requestor.

This is only true in kind circumstances. Response delivery, and ACK of that delivery (in TCP) are not atomic. It is possible for someone to receive a response, fail to ack it, then close the connection.

These new audit methods allow properly registering an event as going-to-happen, so that they are properly logged in the event of a cancellation or panic.
prompt: There seems to be some code duplication with the different audit types (DeferredX). Is there an idiomatic way in go to combine them?
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-2007-advanced-audacity branch from ec479d1 to d948ad6 Compare December 22, 2025 16:47
@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 185.345935ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 107.652648ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 351.872548ms
Throughput 284.19 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 38.751643411s
Average Latency 386.353881ms
Throughput 129.03 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 27.612451358s
Average Latency 275.097624ms
Throughput 181.08 requests/second

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 189.606083ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 107.364876ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 372.736221ms
Throughput 268.29 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 39.092874081s
Average Latency 389.361529ms
Throughput 127.90 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 27.957475809s
Average Latency 278.371375ms
Throughput 178.84 requests/second

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 181.573172ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 109.253154ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 360.579769ms
Throughput 277.33 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 38.79907452s
Average Latency 385.996299ms
Throughput 128.87 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 28.610831663s
Average Latency 284.705577ms
Throughput 174.76 requests/second

@dmihalcik-virtru
Copy link
Member Author

decisions and rewraps have no meaning to clients without a response received back. In essence, those two events arguably never happened without a response because the client action is so dependent upon the response. Should there be any distinction made for this case?

These events will now be logged with a cancel status, and an optional cancellation_error in their metadata. Filtering by Result should allow ignoring these events.

So, depending on how the events are aggregated and summarized, some events that are meaningless will now be available, and potentially cause problems. Do we have any sample audit processors I can look into to see how they can be modified, or if such modification may be required?

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 190.293633ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 101.868341ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 358.645691ms
Throughput 278.83 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 38.476216894s
Average Latency 383.038813ms
Throughput 129.95 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 27.36747374s
Average Latency 272.608722ms
Throughput 182.70 requests/second

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 186.691763ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 104.314348ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 360.751811ms
Throughput 277.20 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 39.341396383s
Average Latency 391.334327ms
Throughput 127.09 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 27.877008516s
Average Latency 277.820561ms
Throughput 179.36 requests/second

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 159.492254ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 88.003653ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 370.681917ms
Throughput 269.77 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 41.320364904s
Average Latency 411.765958ms
Throughput 121.01 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 27.902756057s
Average Latency 278.005222ms
Throughput 179.19 requests/second

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 160.754489ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 96.230893ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 380.020028ms
Throughput 263.14 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 39.115325508s
Average Latency 389.155845ms
Throughput 127.83 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 28.243547568s
Average Latency 281.515164ms
Throughput 177.03 requests/second

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 165.431211ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 89.133102ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 372.867004ms
Throughput 268.19 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 40.048047937s
Average Latency 398.596123ms
Throughput 124.85 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 27.398708675s
Average Latency 272.872393ms
Throughput 182.49 requests/second

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 189.73195ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 107.695068ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 364.937889ms
Throughput 274.02 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 39.674593918s
Average Latency 395.316887ms
Throughput 126.03 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 28.060553615s
Average Latency 279.594806ms
Throughput 178.19 requests/second

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp:authorization comp:kas Key Access Server comp:middleware:auth comp:policy Policy Configuration ( attributes, subject mappings, resource mappings, kas registry) size/xl

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants