Skip to content

Refactor JdbcMetadata to take in TableLocationProvider #25582

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 22, 2025

Conversation

pradeepvaka
Copy link
Contributor

@pradeepvaka pradeepvaka commented Jul 18, 2025

Description

This PR fixes issue introduced in #25127 . JDBC connectors were tightly coupled to BaseJdbcConfig.connectionUrl for table location information, making it difficult for connectors like XDB to use alternative connection management (e.g., SMC tier).

  • Add changes to populate data source metadata to support combined lineage tracking

Introduced TableLocationProvider interface to decouple table location logic from JDBC configuration:

  • Created TableLocationProvider interface for flexible table location resolution
  • Added DefaultTableLocationProvider that uses connection-url from BaseJdbcConfig (maintains backward compatibility)
  • Refactored JdbcMetadata and JdbcMetadataFactory to accept TableLocationProvider instead of BaseJdbcConfig

This enables connectors to provide custom table location logic while maintaining full backward compatibility for existing JDBC connectors.

Added test coverage.
Differential Revision: D78571326

Impact

Test Plan

Release Notes

Please follow release notes guidelines and fill in the release notes below.

 == RELEASE NOTES ==

JDBC Connector Changes 
* Fixes issue introduced in #25127 by introducing `TableLocationProvider` interface to decouple table location logic from JDBC configuration

@pradeepvaka pradeepvaka requested a review from a team as a code owner July 18, 2025 23:15
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D78571326

@rschlussel
Copy link
Contributor

Also, add more description to the commit message and PR description and link to the original PR, and also add a release note.

pradeepvaka added a commit to pradeepvaka/presto that referenced this pull request Jul 21, 2025
Summary:
**Issue**: JDBC connectors were tightly coupled to `BaseJdbcConfig.connectionUrl` for table location information, making it difficult for connectors like XDB to use alternative connection management (e.g., SMC tier).

**Solution**: Introduced `TableLocationProvider` interface to decouple table location logic from JDBC configuration:
- Created `TableLocationProvider` interface for flexible table location resolution
- Added `DefaultTableLocationProvider` that uses connection-url from `BaseJdbcConfig` (maintains backward compatibility)
- Refactored `JdbcMetadata` and `JdbcMetadataFactory` to accept `TableLocationProvider` instead of `BaseJdbcConfig`
- Made `BaseJdbcConfig.getConnectionUrl()` nullable to support connectors that don't use traditional JDBC URLs
- Updated `JdbcModule` to bind the default implementation

This enables connectors to provide custom table location logic while maintaining full backward compatibility for existing JDBC connectors.


Differential Revision: D78571326
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D78571326

pradeepvaka added a commit to pradeepvaka/presto that referenced this pull request Jul 21, 2025
Summary:
**Issue**: JDBC connectors were tightly coupled to `BaseJdbcConfig.connectionUrl` for table location information, making it difficult for connectors like XDB to use alternative connection management (e.g., SMC tier).

**Solution**: Introduced `TableLocationProvider` interface to decouple table location logic from JDBC configuration:
- Created `TableLocationProvider` interface for flexible table location resolution
- Added `DefaultTableLocationProvider` that uses connection-url from `BaseJdbcConfig` (maintains backward compatibility)
- Refactored `JdbcMetadata` and `JdbcMetadataFactory` to accept `TableLocationProvider` instead of `BaseJdbcConfig`

This enables connectors to provide custom table location logic while maintaining full backward compatibility for existing JDBC connectors.

Added test coverage.


Differential Revision: D78571326
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D78571326

pradeepvaka added a commit to pradeepvaka/presto that referenced this pull request Jul 21, 2025
Summary:
**Issue**: JDBC connectors were tightly coupled to `BaseJdbcConfig.connectionUrl` for table location information, making it difficult for connectors like XDB to use alternative connection management (e.g., SMC tier).

**Solution**: Introduced `TableLocationProvider` interface to decouple table location logic from JDBC configuration:
- Created `TableLocationProvider` interface for flexible table location resolution
- Added `DefaultTableLocationProvider` that uses connection-url from `BaseJdbcConfig` (maintains backward compatibility)
- Refactored `JdbcMetadata` and `JdbcMetadataFactory` to accept `TableLocationProvider` instead of `BaseJdbcConfig`

This enables connectors to provide custom table location logic while maintaining full backward compatibility for existing JDBC connectors.

Added test coverage.


Differential Revision: D78571326
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D78571326

Copy link
Contributor

@rschlussel rschlussel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tests! just a small comment.

pradeepvaka added a commit to pradeepvaka/presto that referenced this pull request Jul 21, 2025
Summary:
**Issue**: JDBC connectors were tightly coupled to `BaseJdbcConfig.connectionUrl` for table location information, making it difficult for connectors like XDB to use alternative connection management (e.g., SMC tier).

**Solution**: Introduced `TableLocationProvider` interface to decouple table location logic from JDBC configuration:
- Created `TableLocationProvider` interface for flexible table location resolution
- Added `DefaultTableLocationProvider` that uses connection-url from `BaseJdbcConfig` (maintains backward compatibility)
- Refactored `JdbcMetadata` and `JdbcMetadataFactory` to accept `TableLocationProvider` instead of `BaseJdbcConfig`

This enables connectors to provide custom table location logic while maintaining full backward compatibility for existing JDBC connectors.

Added test coverage.


Differential Revision: D78571326
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D78571326

pradeepvaka added a commit to pradeepvaka/presto that referenced this pull request Jul 21, 2025
Summary:
**Issue**: JDBC connectors were tightly coupled to `BaseJdbcConfig.connectionUrl` for table location information, making it difficult for connectors like XDB to use alternative connection management (e.g., SMC tier).

**Solution**: Introduced `TableLocationProvider` interface to decouple table location logic from JDBC configuration:
- Created `TableLocationProvider` interface for flexible table location resolution
- Added `DefaultTableLocationProvider` that uses connection-url from `BaseJdbcConfig` (maintains backward compatibility)
- Refactored `JdbcMetadata` and `JdbcMetadataFactory` to accept `TableLocationProvider` instead of `BaseJdbcConfig`

This enables connectors to provide custom table location logic while maintaining full backward compatibility for existing JDBC connectors.

Added test coverage.

Pull Request resolved: prestodb#25582

Differential Revision: D78571326
pradeepvaka added a commit to pradeepvaka/presto that referenced this pull request Jul 22, 2025
Summary:
**Issue**: JDBC connectors were tightly coupled to `BaseJdbcConfig.connectionUrl` for table location information, making it difficult for connectors like XDB to use alternative connection management (e.g., SMC tier).

**Solution**: Introduced `TableLocationProvider` interface to decouple table location logic from JDBC configuration:
- Created `TableLocationProvider` interface for flexible table location resolution
- Added `DefaultTableLocationProvider` that uses connection-url from `BaseJdbcConfig` (maintains backward compatibility)
- Refactored `JdbcMetadata` and `JdbcMetadataFactory` to accept `TableLocationProvider` instead of `BaseJdbcConfig`

This enables connectors to provide custom table location logic while maintaining full backward compatibility for existing JDBC connectors.

Added test coverage.


Differential Revision: D78571326
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D78571326

pradeepvaka added a commit to pradeepvaka/presto that referenced this pull request Jul 22, 2025
Summary:
**Issue**: JDBC connectors were tightly coupled to `BaseJdbcConfig.connectionUrl` for table location information, making it difficult for connectors like XDB to use alternative connection management (e.g., SMC tier).

**Solution**: Introduced `TableLocationProvider` interface to decouple table location logic from JDBC configuration:
- Created `TableLocationProvider` interface for flexible table location resolution
- Added `DefaultTableLocationProvider` that uses connection-url from `BaseJdbcConfig` (maintains backward compatibility)
- Refactored `JdbcMetadata` and `JdbcMetadataFactory` to accept `TableLocationProvider` instead of `BaseJdbcConfig`

This enables connectors to provide custom table location logic while maintaining full backward compatibility for existing JDBC connectors.

Added test coverage.


Differential Revision: D78571326
Summary:
**Issue**: JDBC connectors were tightly coupled to `BaseJdbcConfig.connectionUrl` for table location information, making it difficult for connectors like XDB to use alternative connection management (e.g., SMC tier).

**Solution**: Introduced `TableLocationProvider` interface to decouple table location logic from JDBC configuration:
- Created `TableLocationProvider` interface for flexible table location resolution
- Added `DefaultTableLocationProvider` that uses connection-url from `BaseJdbcConfig` (maintains backward compatibility)
- Refactored `JdbcMetadata` and `JdbcMetadataFactory` to accept `TableLocationProvider` instead of `BaseJdbcConfig`

This enables connectors to provide custom table location logic while maintaining full backward compatibility for existing JDBC connectors.

Added test coverage.

Pull Request resolved: prestodb#25582

Differential Revision: D78571326
@facebook-github-bot
Copy link
Collaborator

This pull request was exported from Phabricator. Differential Revision: D78571326

@rschlussel rschlussel merged commit d95fca4 into prestodb:master Jul 22, 2025
112 of 120 checks passed
Copy link
Contributor

@tdcmeehan tdcmeehan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this got merged as I was typing. I don't think this is a good abstraction to introduce, as I explain below.

Comment on lines +39 to +42
* This class is provided for convenience and contains common configuration properties
* that many JDBC connectors may need. However, core JDBC functionality should not
* depend on this class, as JDBC connectors may choose to use their own mechanisms
* for connection management, authentication, and other configuration needs.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would remove this comment, unless it's possible to explain what "core JDBC functionality" specifically means.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can reword, but I think it's helpful because it makes clear that connectors may not choose to use this class for configuring DB connections.

* This demonstrates that TableLocationProvider implementations can use their own
* configuration mechanisms without depending on BaseJdbcConfig.
*/
public class SimpleTestTableLocationProvider
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What additional test coverage does this class give us over and above what testing DefaultTableLocationProvider would give us? Can it be removed, and can we simply test the latter?

* Different implementations can provide location information based on
* different sources (e.g., connection URL, service discovery, etc.)
*/
public interface TableLocationProvider
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to take a minute to reconsider this abstraction.

TableLocation in the context of JDBC does not have meaning. Moreover, one who extends this class might easily misunderstand that this class actually does not influence anything regarding table locations returned from the driver. What has meaning in JDBC are JDBC URLs/connection strings--this is the only service discovery mechanism found in JDBC.

I understand you need to get this to work with SMC and stuff that doesn't use JDBC connection strings for service discovery. But what's more problematic is there is already an abstraction to help with that, it's the ConnectionFactory. I think it would be better to converge on that single abstraction rather than having two parallel ones:

  • We add a new freeform text field, connectionString to ConnectionFactory.
  • The default implementation, DriverConnectionFactory simply returns the connection string from the config.
  • Someone who doesn't use JDBC connection strings for service discovery may put whatever string in there needed.
  • To expose this to the rest of the application, the JdbcClient also exposes connectionString. References in JdbcMetadata and JdbcMetadataFactory simply return this value from the client.

Let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the current name is reasonable, but if you have something better it can definitely be renamed. but I disagree that this should be merged with ConnectionFactory. The information added to the JDBC Metadata is for lineage purposes, and may or may not be the same as the connection information. There's no reason to conflate the two. (Also, FWIW we don't use ConnectionFactory either)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants