Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(connector): make SplitEnumerator/Reader dyn #20098

Merged
merged 6 commits into from
Jan 14, 2025
Merged

Conversation

xxchan
Copy link
Member

@xxchan xxchan commented Jan 10, 2025

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Previously we use dispatch_source_prop macro to dispatch a large enum, and use a generic function inside the macro. This PR moved these lines into method of ConnectorProperties, and make it return dyn. The traits are refactored to be dyn-compatible (aka object safe)

benefits:

  • This is not IDE friendly, and hard to maintain. image
  • (not sure) dyn may slightly compile faster than generics, because generics will lead to code bloat. And async fn compiles especially slow if the body is very large.
    • I did some preliminary benchmarks but didn't get a conclusion, since the results are too unstable.
    • One data is the difference in code size: (2.3MB smaller)
Before:

 File  .text     Size                           Crate Name
 0.0%   0.1% 214.2KiB                      aws_sdk_s3 aws_sdk_s3::config::endpoint::internals::resolve_endpoint
 0.0%   0.0% 139.4KiB                         parquet brotli::enc::prior_eval::PriorEval<Alloc>::update_cost_base
 0.0%   0.0% 139.4KiB                         parquet brotli::enc::prior_eval::PriorEval<Alloc>::update_cost_base
 0.0%   0.0% 118.3KiB                 risingwave_meta risingwave_meta::manager::sink_coordination::coordinator_worker::CoordinatorWorker::...
 0.0%   0.0% 111.4KiB               risingwave_stream risingwave_connector::source::reader::reader::SourceReader::build_stream::{{closure}}
 0.0%   0.0% 111.4KiB      risingwave_batch_executors risingwave_connector::source::reader::reader::SourceReader::build_stream::{{closure}}
 0.0%   0.0% 101.7KiB                 risingwave_meta risingwave_meta::controller::scale::<impl risingwave_meta::controller::catalog::Cata...
 0.0%   0.0%  99.5KiB               risingwave_stream risingwave_connector::source::reader::reader::SourceReader::build_stream_for_backfil...
 0.0%   0.0%  86.5KiB               risingwave_stream risingwave_stream::executor::backfill::cdc::cdc_backfill::CdcBackfillExecutor<S>::ex...
 0.0%   0.0%  85.8KiB                 risingwave_meta risingwave_meta::stream::source_manager::worker::create_source_worker_async::{{closu...
 0.0%   0.0%  79.7KiB                          blake2 blake2::Blake2bVarCore::compress
 0.0%   0.0%  78.0KiB             risingwave_frontend risingwave_frontend::handler::handle::{{closure}}
 0.0%   0.0%  77.3KiB risingwave_meta_model_migration <risingwave_meta_model_migration::m20230908_072257_init::Migration as sea_orm_migrat...
 0.0%   0.0%  77.1KiB                         lz4_sys _LZ4_compress_fast_continue
 0.0%   0.0%  77.1KiB                       [Unknown] _prof_backtrace_impl
 0.0%   0.0%  75.6KiB               risingwave_stream risingwave_stream::executor::monitor::streaming_stats::StreamingMetrics::new
 0.0%   0.0%  72.2KiB                 risingwave_meta risingwave_meta::rpc::metrics::MetaMetrics::new
 0.0%   0.0%  71.8KiB                         lz4_sys _LZ4_compress_fast_extState_fastReset
 0.0%   0.0%  70.7KiB                          blake2 blake2::Blake2sVarCore::compress
 0.0%   0.0%  69.4KiB                          rustls rustls::msgs::ffdhe_groups::FfdheGroup::named_group
 0.0%   0.0%  63.7KiB                              h2 h2::codec::framed_read::decode_frame
 0.0%   0.0%  63.6KiB                              h2 h2::codec::framed_read::decode_frame
 0.0%   0.0%  63.2KiB                       sqlparser <sqlparser::ast::Statement as core::fmt::Display>::fmt
 0.0%   0.0%  62.5KiB         risingwave_meta_service <risingwave_meta_service::cloud_service::CloudServiceImpl as risingwave_pb::cloud_se...
 0.0%   0.0%  61.6KiB               tikv_jemalloc_sys __rjem_mallocx
 0.0%   0.0%  61.6KiB               tikv_jemalloc_sys __rjem_realloc
 0.0%   0.0%  61.3KiB               tikv_jemalloc_sys __rjem_calloc
 0.0%   0.0%  61.3KiB               tikv_jemalloc_sys __rjem_je_malloc_default
 0.0%   0.0%  61.0KiB                          blake3 _blake3_hash4_neon
 0.0%   0.0%  58.3KiB                       [Unknown] _rd_kafka_parse_Metadata0
 0.0%   0.0%  57.5KiB                      aws_sdk_s3 aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB                     aws_sdk_sts aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB                     aws_sdk_sso aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB                 aws_sdk_ssooidc aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB                    aws_sdk_glue aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB                aws_sdk_dynamodb aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB           aws_sdk_emrserverless aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB                 aws_sdk_kinesis aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  56.4KiB                       chrono_tz <chrono_tz::timezones::Tz as chrono_tz::timezone_impl::TimeSpans>::timespans
 0.0%   0.0%  56.0KiB               risingwave_stream risingwave_stream::executor::source::source_executor::SourceExecutor<S>::execute_wit...
 0.0%   0.0%  56.0KiB               risingwave_stream risingwave_stream::executor::source::source_executor::SourceExecutor<S>::execute_wit...
 0.0%   0.0%  55.8KiB               risingwave_stream risingwave_stream::executor::source::source_backfill_executor::SourceBackfillExecuto...
 0.0%   0.0%  53.0KiB                  risingwave_ctl <risingwave_ctl::HummockCommands as clap_builder::derive::Subcommand>::augment_subco...
 0.0%   0.0%  52.6KiB            datafusion_optimizer <datafusion_optimizer::simplify_expressions::expr_simplifier::Simplifier<S> as dataf...
 0.0%   0.0%  49.0KiB                      aws_sdk_s3 aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  49.0KiB                     aws_sdk_sts aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  49.0KiB                     aws_sdk_sso aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  49.0KiB                 aws_sdk_ssooidc aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  49.0KiB                    aws_sdk_glue aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  49.0KiB                aws_sdk_dynamodb aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
30.7%  99.0% 361.7MiB                                 And 1767757 smaller methods. Use -n N to show more.
31.0% 100.0% 365.2MiB                                 .text section size, the file size is 1176.5MiB


After:


 File  .text     Size                           Crate Name
 0.0%   0.1% 214.2KiB                      aws_sdk_s3 aws_sdk_s3::config::endpoint::internals::resolve_endpoint
 0.0%   0.0% 139.4KiB                         parquet brotli::enc::prior_eval::PriorEval<Alloc>::update_cost_base
 0.0%   0.0% 139.4KiB                         parquet brotli::enc::prior_eval::PriorEval<Alloc>::update_cost_base
 0.0%   0.0% 118.3KiB                 risingwave_meta risingwave_meta::manager::sink_coordination::coordinator_worker::CoordinatorWorker::...
 0.0%   0.0% 101.7KiB                 risingwave_meta risingwave_meta::controller::scale::<impl risingwave_meta::controller::catalog::Cata...
 0.0%   0.0%  86.5KiB               risingwave_stream risingwave_stream::executor::backfill::cdc::cdc_backfill::CdcBackfillExecutor<S>::ex...
 0.0%   0.0%  79.7KiB                          blake2 blake2::Blake2bVarCore::compress
 0.0%   0.0%  78.0KiB             risingwave_frontend risingwave_frontend::handler::handle::{{closure}}
 0.0%   0.0%  77.3KiB risingwave_meta_model_migration <risingwave_meta_model_migration::m20230908_072257_init::Migration as sea_orm_migrat...
 0.0%   0.0%  77.1KiB                         lz4_sys _LZ4_compress_fast_continue
 0.0%   0.0%  77.1KiB                       [Unknown] _prof_backtrace_impl
 0.0%   0.0%  75.6KiB               risingwave_stream risingwave_stream::executor::monitor::streaming_stats::StreamingMetrics::new
 0.0%   0.0%  72.2KiB                 risingwave_meta risingwave_meta::rpc::metrics::MetaMetrics::new
 0.0%   0.0%  71.8KiB                         lz4_sys _LZ4_compress_fast_extState_fastReset
 0.0%   0.0%  70.7KiB                          blake2 blake2::Blake2sVarCore::compress
 0.0%   0.0%  69.4KiB                          rustls rustls::msgs::ffdhe_groups::FfdheGroup::named_group
 0.0%   0.0%  63.7KiB                              h2 h2::codec::framed_read::decode_frame
 0.0%   0.0%  63.6KiB                              h2 h2::codec::framed_read::decode_frame
 0.0%   0.0%  63.2KiB                       sqlparser <sqlparser::ast::Statement as core::fmt::Display>::fmt
 0.0%   0.0%  61.6KiB               tikv_jemalloc_sys __rjem_mallocx
 0.0%   0.0%  61.6KiB               tikv_jemalloc_sys __rjem_realloc
 0.0%   0.0%  61.3KiB               tikv_jemalloc_sys __rjem_calloc
 0.0%   0.0%  61.3KiB               tikv_jemalloc_sys __rjem_je_malloc_default
 0.0%   0.0%  61.0KiB                          blake3 _blake3_hash4_neon
 0.0%   0.0%  58.3KiB               risingwave_stream risingwave_stream::executor::source::source_backfill_executor::SourceBackfillExecuto...
 0.0%   0.0%  58.3KiB                       [Unknown] _rd_kafka_parse_Metadata0
 0.0%   0.0%  57.5KiB                      aws_sdk_s3 aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB                     aws_sdk_sts aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB                     aws_sdk_sso aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB                 aws_sdk_ssooidc aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB                    aws_sdk_glue aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB                aws_sdk_dynamodb aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB           aws_sdk_emrserverless aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  57.5KiB                 aws_sdk_kinesis aws_smithy_runtime::client::orchestrator::try_op::{{closure}}::{{closure}}
 0.0%   0.0%  56.4KiB                       chrono_tz <chrono_tz::timezones::Tz as chrono_tz::timezone_impl::TimeSpans>::timespans
 0.0%   0.0%  56.1KiB               risingwave_stream risingwave_stream::executor::source::source_executor::SourceExecutor<S>::execute_wit...
 0.0%   0.0%  56.1KiB               risingwave_stream risingwave_stream::executor::source::source_executor::SourceExecutor<S>::execute_wit...
 0.0%   0.0%  53.0KiB                  risingwave_ctl <risingwave_ctl::HummockCommands as clap_builder::derive::Subcommand>::augment_subco...
 0.0%   0.0%  52.6KiB            datafusion_optimizer <datafusion_optimizer::simplify_expressions::expr_simplifier::Simplifier<S> as dataf...
 0.0%   0.0%  49.0KiB                      aws_sdk_s3 aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  49.0KiB                     aws_sdk_sts aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  49.0KiB                     aws_sdk_sso aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  49.0KiB                 aws_sdk_ssooidc aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  49.0KiB                    aws_sdk_glue aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  49.0KiB                aws_sdk_dynamodb aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  49.0KiB           aws_sdk_emrserverless aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  49.0KiB                 aws_sdk_kinesis aws_smithy_runtime::client::orchestrator::try_attempt::{{closure}}::{{closure}}
 0.0%   0.0%  48.6KiB                      datafusion datafusion::physical_planner::DefaultPhysicalPlanner::map_logical_node_to_physical::...
 0.0%   0.0%  48.0KiB                              h2 h2::frame::headers::HeaderBlock::load::{{closure}}
 0.0%   0.0%  47.8KiB                              h2 h2::frame::headers::HeaderBlock::load::{{closure}}
30.8%  99.1% 359.6MiB                                 And 1755089 smaller methods. Use -n N to show more.
31.1% 100.0% 362.9MiB                                 .text section size, the file size is 1168.8MiB

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • My PR contains critical fixes that are necessary to be merged into the latest release.

Documentation

  • My PR needs documentation updates.
Release note

@xxchan xxchan force-pushed the xxchan/compile branch 2 times, most recently from 3fb5aa4 to 8a6de63 Compare January 13, 2025 03:28
Copy link
Member Author

xxchan commented Jan 13, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@xxchan xxchan changed the title Xxchan/compile refactor: make SplitEnumerator/Reader dyn Jan 13, 2025
@xxchan xxchan marked this pull request as ready for review January 13, 2025 03:36
@xxchan xxchan requested a review from BugenZhao January 13, 2025 03:36
@xxchan xxchan changed the title refactor: make SplitEnumerator/Reader dyn refactor(connector): make SplitEnumerator/Reader dyn Jan 13, 2025
src/connector/src/macros.rs Outdated Show resolved Hide resolved
src/meta/src/stream/source_manager/worker.rs Show resolved Hide resolved
src/connector/src/source/base.rs Outdated Show resolved Hide resolved
Signed-off-by: xxchan <[email protected]>
@xxchan xxchan enabled auto-merge January 14, 2025 09:41
@xxchan xxchan added this pull request to the merge queue Jan 14, 2025
Merged via the queue into main with commit 325404c Jan 14, 2025
31 of 32 checks passed
@xxchan xxchan deleted the xxchan/compile branch January 14, 2025 11:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants