
[Feature] Support incremental computation in Doris #57921

@VividByteWorker

Search before asking

  • I have searched the issues and found no similar issues.

Description

Currently, the Doris execution engine uses batch computation. It handles near-real-time online analytical scenarios well, but it consumes considerable resources when computing over large datasets. If Doris supported incremental computation, it could compute more efficiently in some scenarios and serve more real-time use cases such as live streaming, logistics, and e-commerce. I therefore propose introducing Table Stream and Dynamic Table to support incremental data computation and make Doris a streaming warehouse.
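
To make the Table Stream idea concrete, here is a minimal Python sketch of one way it could behave: a cursor over a table's change log that returns only the changes not yet consumed. The names `ToyTable`, `ToyTableStream`, `Change`, and `poll` are hypothetical and exist only for illustration; this is not an existing Doris API, and a real design would also need persistence, transactions, and offset management.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Change:
    """One change record (hypothetical shape, for illustration only)."""
    op: str                  # "INSERT", "UPDATE" or "DELETE"
    key: Any
    before: Optional[dict]   # row image before the change, if any
    after: Optional[dict]    # row image after the change, if any


class ToyTable:
    """In-memory keyed table that appends every write to a change log."""

    def __init__(self) -> None:
        self.rows: Dict[Any, dict] = {}
        self.log: List[Change] = []

    def upsert(self, key: Any, row: dict) -> None:
        before = self.rows.get(key)
        op = "UPDATE" if before is not None else "INSERT"
        self.rows[key] = dict(row)
        self.log.append(Change(op, key, before, dict(row)))

    def delete(self, key: Any) -> None:
        before = self.rows.pop(key, None)
        if before is not None:
            self.log.append(Change("DELETE", key, before, None))


class ToyTableStream:
    """Cursor over a table's change log; each poll returns only unseen changes."""

    def __init__(self, table: ToyTable) -> None:
        self.table = table
        self.offset = len(table.log)  # start at the current end of the log

    def poll(self) -> List[Change]:
        changes = self.table.log[self.offset:]
        self.offset = len(self.table.log)  # advance past what was returned
        return changes


if __name__ == "__main__":
    orders = ToyTable()
    stream = ToyTableStream(orders)
    orders.upsert(1, {"amount": 10})
    orders.upsert(1, {"amount": 15})
    for change in stream.poll():
        print(change.op, change.before, change.after)
```

A downstream consumer only ever sees the delta since its last `poll`, which is what lets later stages avoid rescanning the base table.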

Use case

The benefits are as follows:

  1. Capture changes from table streams to implement CDC (Change Data Capture).
  2. Support online dual-stream joins for more efficient column concatenation across multiple streams.
  3. Enable online, stream-based incremental aggregation.
  4. Asynchronous materialized views with incremental computation: implement materialized views that are refreshed incrementally, accelerating near-real-time online serving queries.
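
As a rough illustration of the incremental aggregation and incremental refresh items above, the Python sketch below maintains a per-group `SUM` from change records instead of rescanning the base data. The class `IncrementalSumByKey` and the `(op, before, after)` change tuple are assumptions made for this example; they do not describe how Doris would implement the feature.

```python
from typing import Dict, Iterable, Optional, Tuple

# A change is (op, before_row, after_row); op is informational only here.
ChangeTuple = Tuple[str, Optional[dict], Optional[dict]]


class IncrementalSumByKey:
    """Maintains SUM(value_col) GROUP BY group_col from a stream of changes."""

    def __init__(self, group_col: str, value_col: str) -> None:
        self.group_col = group_col
        self.value_col = value_col
        self.sums: Dict[object, int] = {}
        self.counts: Dict[object, int] = {}

    def _accumulate(self, row: dict) -> None:
        group = row[self.group_col]
        self.sums[group] = self.sums.get(group, 0) + row[self.value_col]
        self.counts[group] = self.counts.get(group, 0) + 1

    def _retract(self, row: dict) -> None:
        group = row[self.group_col]
        self.sums[group] -= row[self.value_col]
        self.counts[group] -= 1
        if self.counts[group] == 0:
            # No rows left in this group: drop it from the result entirely.
            del self.sums[group]
            del self.counts[group]

    def apply(self, changes: Iterable[ChangeTuple]) -> None:
        """Apply a batch of changes; cost is proportional to the delta size."""
        for _op, before, after in changes:
            if before is not None:   # UPDATE or DELETE: undo the old row image
                self._retract(before)
            if after is not None:    # INSERT or UPDATE: add the new row image
                self._accumulate(after)

    def result(self) -> Dict[object, int]:
        return dict(self.sums)


if __name__ == "__main__":
    agg = IncrementalSumByKey("city", "amount")
    agg.apply([
        ("INSERT", None, {"city": "A", "amount": 10}),
        ("INSERT", None, {"city": "B", "amount": 5}),
    ])
    agg.apply([("UPDATE", {"city": "A", "amount": 10}, {"city": "A", "amount": 12})])
    print(agg.result())
```

Treating an update as a retraction of the old row followed by an accumulation of the new one keeps the logic uniform. This works directly for `SUM` and `COUNT`; aggregates such as `MIN` and `MAX` need extra state to handle retractions.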

Related issues

  1. Add table streams to capture CDC
  2. Add dynamic tables
  3. Add an incremental computation runtime
  4. Add optimizer rules for incremental computation

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Labels

kind/feature: Categorizes issue or PR as related to a new feature.
