Skip to content

[Feature] doris cross-cluster query #57884

@HonestManXin

Description

@HonestManXin

Search before asking

  • I had searched in the issues and found no similar issues.

Description

Similar to other catalogs, we implements ExternalCatalog, ExternalDatabase, and ExternalTable. In the implementation of ExternalTable, it will pull OLAPTable from the peer Fe and generate an ExternalOLAPTable (a subclass of OLAPTable) on the local Fe.
when generating query plan, once it is determined to be an External Doris table, just like generating an OlapScanNode for an inner table, all subsequent processes will be the same as those for the original OlapScanNode (only the judgment regarding Colocate must be forced to be for inner tables, we can dig more to see if can do colocate when two table in same catalog).
In the generation of distributed execution plans (Fe Coordinator), instead of generating a map from backendId->Address for some logical processing in the handling of Backends, it is now processed in the manner of catalogId->backendId->Address, mainly to accommodate the case where backendIds in different clusters may be the same.
Through the above solution, it is basically equivalent to treating the BE of the external cluster as if it were the BE of this cluster. The query performance is comparable to that of internal tables, and subsequent optimizations related to execution plans basically do not require additional adaptation. Currently existing optimizations such as topn & runtime filters can all be utilized.

Use case

No response

Related issues

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

No one assigned

    Labels

    kind/featureCategorizes issue or PR as related to a new feature.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions