[feat]Refactor: Storage Property Conversion Separation and Unification #50031
## Background

Previously, all storage-related property conversions were handled in a single class: PropertyConvert. This class included logic for multiple components such as:

- BE storage configuration
- Frontend (FE) object storage
- HDFS (FE) configuration

Over time, this approach introduced several problems:

- Tight Coupling: Different storage types (e.g., S3, OSS, COS, HDFS) were processed in a mixed manner.
- Inconsistent Behavior: The same storage type behaved differently across components. For instance:
  - Some services accepted https:// style URIs.
  - Others only accepted s3:// style URIs.
- High Maintenance Cost: Adding or updating logic for a single storage type risked breaking unrelated paths.
- Low Extensibility: Introducing new storage types or protocols required invasive changes to centralized logic.

## Changed

This PR refactors the property conversion logic with the following goals:

### Separation of Responsibility:

- Each storage type (e.g., S3, COS, HDFS) now manages its own property parsing and conversion.
- No cross-dependency between different storage implementations.

### Unified Interface for Upper Layers:

- A single unified interface is exposed to business logic (e.g., generating properties for BE).
- Upper layers no longer care about the specific storage type or URI scheme.

### Consistent Behavior Across Components:

- Each storage implementation defines its own rules.
- Eliminates inconsistencies like accepting different URI styles in different parts of the system.

### Future-Friendly Design:

- Lays the groundwork for plugin-based SPI support.
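As a rough illustration of the separation described above, the following sketch shows one way a per-storage-type hierarchy with a single entry point could look. It is not the code from this PR: the class names (`StoragePropsSketch`, `S3PropsSketch`, `HdfsPropsSketch`), the detection keys, and the `backend.*` output keys are all hypothetical and chosen only for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: class names and property keys are illustrative only.
abstract class StoragePropsSketch {
    protected final Map<String, String> userProps;

    protected StoragePropsSketch(Map<String, String> userProps) {
        this.userProps = userProps;
    }

    // Each storage type owns the conversion of its own properties.
    abstract Map<String, String> toBackendProperties();

    // Single entry point for upper layers, which never inspect the storage type themselves.
    static StoragePropsSketch create(Map<String, String> userProps) {
        if (userProps.containsKey("s3.endpoint")) {
            return new S3PropsSketch(userProps);
        }
        if (userProps.containsKey("fs.defaultFS")) {
            return new HdfsPropsSketch(userProps);
        }
        throw new IllegalArgumentException("Unsupported storage properties: " + userProps.keySet());
    }
}

final class S3PropsSketch extends StoragePropsSketch {
    S3PropsSketch(Map<String, String> userProps) {
        super(userProps);
    }

    @Override
    Map<String, String> toBackendProperties() {
        Map<String, String> out = new HashMap<>();
        // Made-up output keys; a real implementation would use the keys the backend expects.
        out.put("backend.s3.endpoint", userProps.get("s3.endpoint"));
        out.put("backend.s3.region", userProps.getOrDefault("s3.region", ""));
        return out;
    }
}

final class HdfsPropsSketch extends StoragePropsSketch {
    HdfsPropsSketch(Map<String, String> userProps) {
        super(userProps);
    }

    @Override
    Map<String, String> toBackendProperties() {
        Map<String, String> out = new HashMap<>();
        out.put("backend.hdfs.default_fs", userProps.get("fs.defaultFS"));
        return out;
    }
}
```

With this shape, a caller only writes `StoragePropsSketch.create(props).toBackendProperties()`, and adding a new storage type means adding one subclass plus one branch in the factory (or, later, an SPI lookup).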
run buildall
TPC-H: Total hot run time: 34175 ms
TPC-DS: Total hot run time: 186226 ms
ClickBench: Total hot run time: 31.16 s
fe/fe-core/src/main/java/org/apache/doris/datasource/property/storage/StorageProperties.java
fe/fe-core/src/main/java/org/apache/doris/datasource/property/storage/HDFSProperties.java
fe/fe-core/src/main/java/org/apache/doris/datasource/property/ConnectionProperties.java
    try {
        S3URI s3uri = S3URI.create(uri, usePathStyle, forceParsingByStandardUri);
        return s3uri.getEndpoint().orElse(null);
    } catch (UserException e) {
I think we should log this exception for debugging.
Or maybe we can just rethrow it.
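The two options raised in this comment could look roughly like the sketch below. It is self-contained and does not use the Doris classes: `parseEndpoint` is a hypothetical stand-in for `S3URI.create(...).getEndpoint()`, `IllegalArgumentException` stands in for `UserException`, and `java.util.logging` is used only to avoid external dependencies.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; not the PR's code.
final class EndpointParseSketch {
    private static final Logger LOG = Logger.getLogger(EndpointParseSketch.class.getName());

    // Option 1: keep returning null on failure, but log the cause so it can be debugged.
    static String endpointOrNull(String uri) {
        try {
            return parseEndpoint(uri);
        } catch (IllegalArgumentException e) {
            LOG.log(Level.WARNING, "Failed to parse endpoint from URI: " + uri, e);
            return null;
        }
    }

    // Option 2: let the exception propagate so the caller sees why parsing failed.
    static String endpointOrThrow(String uri) {
        return parseEndpoint(uri);
    }

    // Stand-in parser: treats the URI host as the endpoint.
    private static String parseEndpoint(String uri) {
        try {
            String host = new URI(uri).getHost();
            if (host == null) {
                throw new IllegalArgumentException("No host in URI: " + uri);
            }
            return host;
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid URI: " + uri, e);
        }
    }
}
```

Option 1 preserves the existing return contract; option 2 surfaces the failure earlier, at the cost of requiring callers to handle it.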
fe/fe-core/src/main/java/org/apache/doris/datasource/property/storage/HDFSProperties.java
fe/fe-core/src/main/java/org/apache/doris/datasource/property/storage/HdfsPropertiesUtils.java
    import java.util.Map;
    import java.util.Set;

    public class HdfsPropertiesUtils {
Rename it to HDFSPropertiesUtils, or rename HDFSProperties to HdfsProperties, so the naming is consistent.
        }
    }

    public String getRegion() {
If cosRegion is null and cosEndpoint does not contain "myqcloud.com", this will return null.
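One way to make that case explicit is to return an `Optional` and fail fast where a region is required. The sketch below is an assumption about how such a fallback might work, not the PR's implementation: the class and method names are hypothetical, and it assumes endpoints of the form `cos.<region>.myqcloud.com`.

```java
import java.util.Optional;

// Hypothetical sketch; names and the endpoint-parsing rule are assumptions.
final class CosRegionSketch {

    // Prefer the explicit region; otherwise try to derive it from the endpoint.
    static Optional<String> resolveRegion(String cosRegion, String cosEndpoint) {
        if (cosRegion != null && !cosRegion.isEmpty()) {
            return Optional.of(cosRegion);
        }
        if (cosEndpoint != null && cosEndpoint.contains("myqcloud.com")) {
            // Assumed endpoint shape: [scheme://]cos.<region>.myqcloud.com
            String host = cosEndpoint.replaceFirst("^https?://", "");
            String[] parts = host.split("\\.");
            if (parts.length >= 4 && "cos".equals(parts[0])) {
                return Optional.of(parts[1]);
            }
        }
        // No silent null: the caller has to decide what a missing region means.
        return Optional.empty();
    }

    // Fail fast with a descriptive message when a region is mandatory.
    static String requireRegion(String cosRegion, String cosEndpoint) {
        return resolveRegion(cosRegion, cosEndpoint).orElseThrow(() ->
                new IllegalArgumentException(
                        "Cannot determine COS region; set it explicitly (endpoint: " + cosEndpoint + ")"));
    }
}
```

Callers that can tolerate a missing region use `resolveRegion`; callers that cannot use `requireRegion` and get a clear error instead of a later `NullPointerException`.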
run buildall
3c0bb17 to 2ac771f
run buildall
run buildall
run performance
run cloud_p0
TPC-H: Total hot run time: 34063 ms
TPC-DS: Total hot run time: 185198 ms
ClickBench: Total hot run time: 29.96 s
run cloud_p0
morningman
left a comment
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
… Unification (apache#50031) (cherry picked from commit be7617e)
Issue Number: #50238