
feat: add daily rating count recalculation to ETL pipeline#227

Merged
ayuki-joto merged 6 commits into main from feat/update-note-rating-status on Mar 9, 2026

Conversation

@ayuki-joto
Contributor

Summary

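  • Adds a daily recalculation of note rating counts to the ETL pipeline (Closes "Recalculation of ratings" #220)
  • The recalculate_rating_counts() function aggregates from the row_note_ratings table and bulk-updates rate_count, helpful_count, somewhat_helpful_count, and not_helpful_count in the notes table
  • Runs automatically right after the daily ratings extraction phase, bringing the aggregate values of all notes up to date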
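The aggregate-then-update step described in the summary can be sketched roughly as below. This is a minimal illustration against SQLite and not the code merged in this PR: the `helpfulness_level` column, its `HELPFUL` / `SOMEWHAT_HELPFUL` / `NOT_HELPFUL` values, and the `note_id` key are assumptions made for the example.

```python
import sqlite3


def recalculate_rating_counts(conn: sqlite3.Connection) -> int:
    """Recompute per-note rating counts and return the number of notes updated.

    Assumed (hypothetical) schema: row_note_ratings(note_id, helpfulness_level)
    and notes(note_id, rate_count, helpful_count, somewhat_helpful_count,
    not_helpful_count).
    """
    # Aggregate all ratings per note in a single pass.
    aggregated = conn.execute(
        """
        SELECT COUNT(*),
               SUM(CASE WHEN helpfulness_level = 'HELPFUL' THEN 1 ELSE 0 END),
               SUM(CASE WHEN helpfulness_level = 'SOMEWHAT_HELPFUL' THEN 1 ELSE 0 END),
               SUM(CASE WHEN helpfulness_level = 'NOT_HELPFUL' THEN 1 ELSE 0 END),
               note_id
        FROM row_note_ratings
        GROUP BY note_id
        """
    ).fetchall()
    if not aggregated:
        return 0

    # Bulk-update the notes table, one parameter tuple per aggregated note.
    cur = conn.executemany(
        """
        UPDATE notes
        SET rate_count = ?,
            helpful_count = ?,
            somewhat_helpful_count = ?,
            not_helpful_count = ?
        WHERE note_id = ?
        """,
        aggregated,
    )
    conn.commit()
    return cur.rowcount
```

In this sketch, notes that have no rows in `row_note_ratings` are left untouched rather than reset to zero; whether the real function resets them is not stated in the PR.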

Changes

  • etl/src/birdxplorer_etl/extract_ecs.py: adds the recalculate_rating_counts() function and integrates it into extract_data()
  • etl/tests/test_recalculate_rating_counts.py: 4 unit tests (happy path, zero rows updated, log output, error propagation)
  • etl/tests/test_postlookup_lambda.py: reformatted by the Black formatter only
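For reviewers unfamiliar with the pattern, two of the four test cases listed above (happy path and error propagation) could look roughly like this with a mocked database session. The `recalculate` stand-in, its rollback-on-error behaviour, and the log message are hypothetical; the real tests in `test_recalculate_rating_counts.py` may be structured differently.

```python
import logging
from unittest.mock import MagicMock

logger = logging.getLogger("etl_sketch")


def recalculate(session) -> int:
    """Hypothetical stand-in: run one UPDATE, commit, log, return the row count."""
    try:
        result = session.execute("UPDATE notes SET rate_count = 0")  # placeholder SQL
        session.commit()
    except Exception:
        # Assumed behaviour: roll back and let the caller see the original error.
        session.rollback()
        raise
    logger.info("Recalculated rating counts for %d notes", result.rowcount)
    return result.rowcount


def check_normal_case() -> None:
    session = MagicMock()
    session.execute.return_value.rowcount = 5
    assert recalculate(session) == 5
    session.commit.assert_called_once()


def check_error_propagates() -> None:
    session = MagicMock()
    session.execute.side_effect = RuntimeError("db down")
    try:
        recalculate(session)
    except RuntimeError:
        session.rollback.assert_called_once()
        session.commit.assert_not_called()
    else:
        raise AssertionError("expected RuntimeError to propagate")
```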

Test plan

  • All 4 new tests pass
  • Existing tests (35 in test_postlookup_lambda.py) pass
  • Confirm that the aggregate values in the notes table are updated after an ETL run in the staging environment

🤖 Generated with Claude Code

ayuki-joto and others added 6 commits March 2, 2026 13:56
…d readability

- Standardized exception message formatting by collapsing multiline strings to single lines.
- Updated dictionary reformatting for cleaner structure and readability.
Add recalculate_rating_counts() that aggregates rating data from
row_note_ratings and bulk-updates notes table columns (rate_count,
helpful_count, somewhat_helpful_count, not_helpful_count) after the
daily ratings extraction phase. Closes #220.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
The ETL test job only installed etl[dev] dependencies, but
birdxplorer_common is a prod-only dependency. Tests that import
extract_ecs.py (which uses birdxplorer_common.storage) failed with
ModuleNotFoundError in CI. This matches what tox.ini already does
with `-e ../common`.

Co-Authored-By: Claude Opus 4.6 <[email protected]>
…rmance and maintainability

- Moved validation logic for rating rows to a dedicated `_validate_rating_row` function.
- Refactored `_process_rating_rows` to perform bulk loading using COPY for improved efficiency.
- Introduced staging table with deduplication and atomic swap to ensure data integrity during ratings reload.
- Added helper functions for staging table creation, deduplication, swap, and cleanup.
- Resolved potential naming conflicts by dynamically querying and renaming PK indexes during staging table swaps.
- Ensured accurate normalization of PK index names for both old and new tables.
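
The staging-table reload described in this commit (load, deduplicate, atomic swap, cleanup) can be illustrated with the sketch below. It uses SQLite with `executemany` in place of PostgreSQL `COPY`, so it is a swapped-in simplification and not the merged code; the table name `ratings_staging`, the column names, and the "keep the last loaded row per key" rule are assumptions. The PK index renaming mentioned above is PostgreSQL-specific and is not covered here.

```python
import sqlite3
from typing import Iterable, Tuple

# Assumed row shape: (note_id, rater_participant_id, helpfulness_level)
Row = Tuple[str, str, str]


def reload_ratings(conn: sqlite3.Connection, rows: Iterable[Row]) -> None:
    """Rebuild row_note_ratings via a staging table and a transactional swap.

    Expects a connection opened with isolation_level=None so that the
    explicit BEGIN/COMMIT below controls the transaction.
    """
    # 1. Create a fresh staging table and bulk-load into it.
    conn.execute("DROP TABLE IF EXISTS ratings_staging")
    conn.execute(
        "CREATE TABLE ratings_staging "
        "(note_id TEXT, rater_participant_id TEXT, helpfulness_level TEXT)"
    )
    conn.executemany("INSERT INTO ratings_staging VALUES (?, ?, ?)", rows)

    # 2. Deduplicate: keep only the last loaded row per (note, rater) pair.
    conn.execute(
        """
        DELETE FROM ratings_staging
        WHERE rowid NOT IN (
            SELECT MAX(rowid) FROM ratings_staging
            GROUP BY note_id, rater_participant_id
        )
        """
    )

    # 3. Swap the tables inside one transaction so readers never see a gap.
    conn.execute("BEGIN")
    try:
        conn.execute("ALTER TABLE row_note_ratings RENAME TO ratings_old")
        conn.execute("ALTER TABLE ratings_staging RENAME TO row_note_ratings")
        conn.execute("DROP TABLE ratings_old")  # 4. Cleanup of the old table.
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

If any statement in the swap fails, the rollback leaves the original `row_note_ratings` in place, which is the data-integrity property the commit message describes.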
@ayuki-joto ayuki-joto merged commit 2b70cd6 into main Mar 9, 2026
12 checks passed
@ayuki-joto ayuki-joto deleted the feat/update-note-rating-status branch March 9, 2026 05:21

Development

Successfully merging this pull request may close these issues.

Recalculation of ratings (#220)