Correcting the miscalculation of the median_vehicle_position_age. #3520
Description
Applying distinct to vehicle_message_age caused the median vehicle position age per agency to be abnormally elevated. In this PR, I corrected this issue by removing the distinct function from that calculation. While this change is necessary for calculating median_vehicle_message_age, we still need to apply distinct processing to the headers when calculating median_header_message_age, to deduplicate them to the overall message level, so the distinct function remains in place for calculations involving the feed header.

Resolves #3512
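To illustrate the effect described above, here is a minimal Python sketch rather than the actual dbt SQL. The rows, the message IDs, and the age values are all hypothetical, and it assumes one row per vehicle entity with the header age repeated on every row of the same message. It shows that collapsing vehicle ages to distinct values drops the many repeated low ages and pushes the median up, while header ages need to be reduced to one value per message before taking the median.

```python
from statistics import median

# Hypothetical rows: (message_id, header_message_age, vehicle_message_age),
# ages in seconds. One row per vehicle entity in a feed message.
rows = [
    ("m1", 2, 5), ("m1", 2, 5), ("m1", 2, 5), ("m1", 2, 6),
    ("m2", 4, 5), ("m2", 4, 30),
    ("m3", 20, 5), ("m3", 20, 120),
]

vehicle_ages = [vehicle_age for _, _, vehicle_age in rows]

# Median over every vehicle observation (the behavior after this PR).
median_vehicle_age = median(vehicle_ages)

# Median over distinct values only (the earlier behavior): the repeated
# low ages collapse to a single value, so rare high ages dominate.
median_vehicle_age_distinct = median(set(vehicle_ages))

# Header ages repeat once per vehicle row, so they are reduced to one
# value per message before the median is taken.
header_age_by_message = {message_id: header_age for message_id, header_age, _ in rows}
median_header_age = median(header_age_by_message.values())

print(median_vehicle_age, median_vehicle_age_distinct, median_header_age)
```

In this toy data the distinct-based vehicle median is well above the per-observation median, which matches the symptom reported in #3512. The real model deduplicates with distinct in SQL, so the dictionary keyed by message ID is only a stand-in for that step.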
Type of change
How has this been tested?
poetry run dbt run -s +fct_daily_vehicle_positions_latency_statistics
22:27:16 Running with dbt=1.5.1
22:27:19 [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
22:27:20 Found 424 models, 950 tests, 0 snapshots, 0 analyses, 852 macros, 0 operations, 12 seed files, 175 sources, 4 exposures, 0 metrics, 0 groups
22:27:20
22:27:25 Concurrency: 8 threads (target='dev')
22:27:25
22:27:25 1 of 16 START sql view model farhad_staging.stg_gtfs_rt__vehicle_positions ..... [RUN]
22:27:25 2 of 16 START sql view model farhad_staging.stg_gtfs_schedule__agency .......... [RUN]
22:27:25 3 of 16 START sql view model farhad_staging.stg_gtfs_schedule__download_outcomes [RUN]
22:27:25 4 of 16 START sql view model farhad_staging.stg_gtfs_schedule__file_parse_outcomes [RUN]
22:27:25 5 of 16 START sql view model farhad_staging.stg_gtfs_schedule__unzip_outcomes .. [RUN]
22:27:25 6 of 16 START sql view model farhad_staging.stg_transit_database__gtfs_datasets [RUN]
22:27:26 3 of 16 OK created sql view model farhad_staging.stg_gtfs_schedule__download_outcomes [CREATE VIEW (0 processed) in 1.13s]
22:27:26 6 of 16 OK created sql view model farhad_staging.stg_transit_database__gtfs_datasets [CREATE VIEW (0 processed) in 1.11s]
22:27:26 7 of 16 START sql table model farhad_staging.int_transit_database__gtfs_datasets_dim [RUN]
22:27:26 1 of 16 OK created sql view model farhad_staging.stg_gtfs_rt__vehicle_positions [CREATE VIEW (0 processed) in 1.20s]
22:27:26 4 of 16 OK created sql view model farhad_staging.stg_gtfs_schedule__file_parse_outcomes [CREATE VIEW (0 processed) in 1.18s]
22:27:26 8 of 16 START sql view model farhad_staging.int_gtfs_schedule__grouped_feed_file_parse_outcomes [RUN]
22:27:26 5 of 16 OK created sql view model farhad_staging.stg_gtfs_schedule__unzip_outcomes [CREATE VIEW (0 processed) in 1.26s]
22:27:26 2 of 16 OK created sql view model farhad_staging.stg_gtfs_schedule__agency ..... [CREATE VIEW (0 processed) in 1.30s]
22:27:27 8 of 16 OK created sql view model farhad_staging.int_gtfs_schedule__grouped_feed_file_parse_outcomes [CREATE VIEW (0 processed) in 1.33s]
22:27:27 9 of 16 START sql view model farhad_staging.int_gtfs_schedule__joined_feed_outcomes [RUN]
22:27:29 9 of 16 OK created sql view model farhad_staging.int_gtfs_schedule__joined_feed_outcomes [CREATE VIEW (0 processed) in 1.34s]
22:27:29 10 of 16 START sql table model farhad_mart_gtfs.dim_schedule_feeds ............. [RUN]
22:27:31 7 of 16 OK created sql table model farhad_staging.int_transit_database__gtfs_datasets_dim [CREATE TABLE (4.9k rows, 5.0 GiB processed) in 4.59s]
22:27:31 11 of 16 START sql table model farhad_mart_transit_database.bridge_schedule_dataset_for_validation [RUN]
22:27:31 12 of 16 START sql table model farhad_mart_transit_database.dim_gtfs_datasets .. [RUN]
22:27:33 12 of 16 OK created sql table model farhad_mart_transit_database.dim_gtfs_datasets [CREATE TABLE (4.9k rows, 1.6 MiB processed) in 2.34s]
22:27:33 13 of 16 START sql table model farhad_staging.int_transit_database__urls_to_gtfs_datasets [RUN]
22:27:33 11 of 16 OK created sql table model farhad_mart_transit_database.bridge_schedule_dataset_for_validation [CREATE TABLE (2.8k rows, 511.7 KiB processed) in 2.54s]
22:27:35 13 of 16 OK created sql table model farhad_staging.int_transit_database__urls_to_gtfs_datasets [CREATE TABLE (4.9k rows, 868.9 KiB processed) in 2.35s]
22:28:37 10 of 16 OK created sql table model farhad_mart_gtfs.dim_schedule_feeds ........ [CREATE TABLE (14.1k rows, 11.4 GiB processed) in 68.10s]
22:28:37 14 of 16 START sql table model farhad_mart_gtfs.fct_daily_schedule_feeds ....... [RUN]
22:28:42 14 of 16 OK created sql table model farhad_mart_gtfs.fct_daily_schedule_feeds .. [CREATE TABLE (280.3k rows, 2.8 MiB processed) in 5.20s]
22:28:42 15 of 16 START sql view model farhad_mart_gtfs.fct_vehicle_positions_messages .. [RUN]
22:28:43 15 of 16 OK created sql view model farhad_mart_gtfs.fct_vehicle_positions_messages [CREATE VIEW (0 processed) in 1.25s]
22:28:43 16 of 16 START sql incremental model farhad_mart_gtfs_quality.fct_daily_vehicle_positions_latency_statistics [RUN]
22:32:07 16 of 16 OK created sql incremental model farhad_mart_gtfs_quality.fct_daily_vehicle_positions_latency_statistics [CREATE TABLE (989.0 rows, 391.9 GiB processed) in 203.33s]
22:32:07
22:32:07 Finished running 9 view models, 6 table models, 1 incremental model in 0 hours 4 minutes and 47.05 seconds (287.05s).
22:32:07
22:32:07 Completed successfully
22:32:07
22:32:07 Done. PASS=16 WARN=0 ERROR=0 SKIP=0 TOTAL=16
Post-merge follow-ups
The next follow-up step for this PR is to replace the old table, fct_daily_vehicle_positions_message_age_summary, with this new table in Metabase dashboards. This will ensure the dashboards display accurate data for vehicle position latency.