Skip to content

Conversation

@jakedoublev
Copy link
Contributor

Proposed Changes

Checklist

  • I have added or updated unit tests
  • I have added or updated integration tests (if appropriate)
  • I have added or updated documentation

Testing Instructions

@github-actions github-actions bot added comp:db DB component comp:policy Policy Configuration ( attributes, subject mappings, resource mappings, kas registry) docs Documentation size/xl labels Dec 15, 2025
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @jakedoublev, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the platform's PostgreSQL capabilities by introducing horizontal scaling for read operations and robust high availability for the primary database. It enables the system to handle increased read loads more efficiently through read replicas with intelligent load balancing and automatic health checking. Furthermore, it provides automatic failover mechanisms for the primary database, ensuring greater resilience against outages. These changes are designed to be backward compatible and opt-in, allowing for a smooth transition and flexible deployment configurations.

Highlights

  • PostgreSQL Horizontal Scaling: Introduced support for PostgreSQL read replicas with round-robin load balancing and circuit breaker health checking, allowing for horizontal scaling of read operations.
  • Multi-Host Primary Failover: Implemented multi-host primary configuration using pgx's target_session_attrs=primary for automatic failover, enhancing database high availability.
  • Circuit Breaker Health Checking: Integrated sony/gobreaker for robust circuit breaker health checking on read replicas, preventing cascading failures and ensuring automatic recovery.
  • Context-Based Primary Routing: Added WithForcePrimary context functionality to explicitly route queries to the primary database, crucial for read-after-write consistency and critical reads.
  • Comprehensive Integration Tests: Developed extensive testcontainers-based integration tests to validate read replica functionality, circuit breaker behavior, and multi-host primary failover scenarios.
  • Enhanced Local Development & Configuration: Provided three Docker Compose scenarios for local testing of single primary, primary + replicas, and full multi-host primary + replicas setups, along with configuration validation to prevent ambiguous setups.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Database expands, Reads flow free, no longer bound, High availability.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request significantly enhances the database layer by introducing PostgreSQL horizontal scaling and high availability features. Key changes include support for read replicas with round-robin load balancing and circuit breaker health checking using sony/gobreaker, multi-host primary failover via pgx's target_session_attrs=primary, and context-based primary routing (WithForcePrimary) to ensure read-after-write consistency. The docker-compose.yaml and opentdf-example.yaml files have been updated with new configurations and commented-out scenarios for local development and testing, alongside production-ready PostgreSQL replication settings. Comprehensive integration tests using Testcontainers have been added to validate these new capabilities, covering circuit breaker behavior, multi-host failover, and read replica functionality. The db.Client interface has been extended to manage replica connections and integrate circuit breakers, with Query and QueryRow methods now intelligently routing reads to healthy replicas or falling back to the primary. A critical review comment highlighted a flaw in executeQueryRow's circuit breaker integration, noting that it only checks the circuit state but doesn't wrap the execution, potentially leading to uncounted failures. Another comment pointed out an inconsistent and less secure pg_hba.conf setting in the docker-compose.yaml for the secondary primary, which should be aligned with the main primary's more restrictive configuration. Additionally, the Exec method's signature and behavior changed, no longer erroring on zero rows affected, which is a breaking API change requiring clear communication to consumers.

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 185.4979ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 105.66348ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 365.202276ms
Throughput 273.82 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 39.898329401s
Average Latency 397.296456ms
Throughput 125.32 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 28.453655783s
Average Latency 283.643347ms
Throughput 175.72 requests/second

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 174.489464ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 90.898839ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 354.119154ms
Throughput 282.39 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 39.548705497s
Average Latency 393.830612ms
Throughput 126.43 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 27.998057435s
Average Latency 279.097534ms
Throughput 178.58 requests/second

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 179.370772ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 98.487989ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 356.8135ms
Throughput 280.26 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 39.675859032s
Average Latency 395.54031ms
Throughput 126.02 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 27.617968439s
Average Latency 275.065552ms
Throughput 181.04 requests/second

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 192.477081ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 101.898398ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 360.85936ms
Throughput 277.12 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 40.391819009s
Average Latency 402.180678ms
Throughput 123.79 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 28.475130284s
Average Latency 283.453659ms
Throughput 175.59 requests/second

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 163.093124ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 86.424138ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 381.786282ms
Throughput 261.93 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 42.230605407s
Average Latency 418.316314ms
Throughput 118.40 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 30.734803953s
Average Latency 306.282362ms
Throughput 162.68 requests/second

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 191.671392ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 106.979989ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 386.218449ms
Throughput 258.92 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 45.085044864s
Average Latency 448.545537ms
Throughput 110.90 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 33.13220489s
Average Latency 330.327222ms
Throughput 150.91 requests/second

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 185.848076ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 96.492235ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 381.765531ms
Throughput 261.94 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 44.479641958s
Average Latency 442.999187ms
Throughput 112.41 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 32.827236813s
Average Latency 327.181468ms
Throughput 152.31 requests/second

@github-actions
Copy link
Contributor

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 190.086404ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 109.717641ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 385.544886ms
Throughput 259.37 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 45.112165957s
Average Latency 448.673441ms
Throughput 110.83 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 33.304135262s
Average Latency 331.938013ms
Throughput 150.13 requests/second

@github-actions
Copy link
Contributor


# Start PostgreSQL with query logging enabled
exec postgres \
-c log_statement=all \
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not log all queries in any postgres servers for production, though useful for debugging.

@github-actions
Copy link
Contributor

Benchmark results, click to expand

Benchmark authorization.GetDecisions Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 203.205837ms

Benchmark authorization.v2.GetMultiResourceDecision Results:

Metric Value
Approved Decision Requests 1000
Denied Decision Requests 0
Total Time 112.325028ms

Benchmark Statistics

Name № Requests Avg Duration Min Duration Max Duration

Bulk Benchmark Results

Metric Value
Total Decrypts 100
Successful Decrypts 100
Failed Decrypts 0
Total Time 390.427429ms
Throughput 256.13 requests/second

TDF3 Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 42.063658965s
Average Latency 419.269914ms
Throughput 118.87 requests/second

NANOTDF Benchmark Results:

Metric Value
Total Requests 5000
Successful Requests 5000
Failed Requests 0
Concurrent Requests 50
Total Time 29.888545934s
Average Latency 297.787004ms
Throughput 167.29 requests/second

@github-actions
Copy link
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp:db DB component comp:policy Policy Configuration ( attributes, subject mappings, resource mappings, kas registry) docs Documentation size/xl

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants