Skip to content

Conversation

@Ceredron
Copy link
Collaborator

@Ceredron Ceredron commented Nov 14, 2025

Description

Changing from lateral joins to distinct on surprisingly decreased query time by a factor of ~100.

Related Issue(s)

  • #{issue number}

Verification

  • Your code builds clean without any errors or warnings
  • Manual testing done (required)
  • Relevant automated test added (if you find this hard, leave it and we'll help out)
  • All tests run green
  • If pre- or post-deploy actions (including database migrations) are needed, add a description, include a "Pre/Post-deploy actions" section below, and mark the PR title with ⚠️

Documentation

  • User documentation is updated with a separate linked PR in altinn-studio-docs. (if applicable)

Summary by CodeRabbit

  • Refactor
    • Restructured query construction in the data access layer for improved performance in file transfer operations.
    • Enhanced filtering capabilities for file transfer and recipient status handling through optimized database queries.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 14, 2025

📝 Walkthrough

Walkthrough

FileTransferRepository.cs undergoes SQL query restructuring, replacing inner lateral joins with CTEs and introducing status-filtering aliases (las, lfs). WHERE clauses consolidated with conditional status joins; parameter handling adjusted. No public API signatures modified.

Changes

Cohort / File(s) Change Summary
SQL Query Restructuring
src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs
Refactored multiple repository methods to replace inner lateral joins with common table expressions (CTEs). Introduced new aliases (las, lfs) for recipient and sender status filtering. Consolidated WHERE clauses with baseline WHERE 1=1 pattern. Added conditional CTE additions and optional JOINs for file transfer and recipient status filtering. Adjusted parameter handling (@From/@to parameter names) and conditional parameter additions to align with new query structure.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45–60 minutes

  • SQL query logic density: Multiple methods contain restructured query construction with CTE introduction, requiring careful verification of correctness and equivalence to original lateral join logic.
  • Parameter handling changes: Adjusted parameter names and conditional additions must be traced through all affected code paths.
  • Status filtering implementation: New aliases and optional JOINs introduce expanded filtering capabilities that require verification against business logic requirements.
  • Architectural shift: Lateral joins replaced with CTEs represents a meaningful change in query construction pattern that warrants careful review for potential performance or correctness implications.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Significantly improved performance on GetAvailableFiles query' directly and clearly describes the main change—a performance optimization to the GetAvailableFiles query, which aligns with the raw summary and PR objectives.
Description check ✅ Passed The PR description includes the required template sections (Description, Related Issue(s), Verification, and Documentation) with most verification checkboxes marked complete. However, the Related Issue reference is incomplete (just '#' without a number) and pre/post-deploy actions section is missing, though verification indicates none are needed.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch performance/getavailablefiles-query

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@Ceredron Ceredron added the kind/chore Routine task or minor maintenance work that doesn't introduce new features or functionality label Nov 14, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs (1)

316-450: Verify DISTINCT ON ordering and multi-recipient semantics in new CTE-based query

The new LegacyGetFilesForRecipientsWithRecipientStatus query looks good from a parameterization and performance perspective (CTEs, DISTINCT ON, ANY(@actorIds), and no string interpolation of values). Two points are worth double-checking:

  1. Ordering criterion for “latest” actor status

    latest_actor_status currently uses:

    SELECT DISTINCT ON (file_transfer_id_fk)
      file_transfer_id_fk,
      actor_file_transfer_status_description_id_fk
    FROM broker.actor_file_transfer_status
    ...
    ORDER BY file_transfer_id_fk, actor_file_transfer_status_id_pk DESC

    Previous guidance for this repository was to order by *_status_description_id_fk DESC to get the highest/most advanced status rather than the most recent by date. In this query the DISTINCT ON row is chosen by actor_file_transfer_status_id_pk DESC, which corresponds to “last inserted” rather than “highest status code”.

    If the domain assumption is that statuses never regress, both will coincide; otherwise, this could subtly change which status is considered current for filtering. A possible compromise that preserves both intent and determinism would be:

    ORDER BY file_transfer_id_fk,
             actor_file_transfer_status_description_id_fk DESC,
             actor_file_transfer_status_id_pk DESC

    so DISTINCT ON picks the highest status, breaking ties by latest row. Please confirm that the current actor_file_transfer_status_id_pk DESC ordering is intentional given the earlier decision about using status_description_id_fk for progression. Based on learnings

  2. Behavior when multiple recipient actor IDs are supplied

    When actorIds.Length > 1, the CTE filters with actor_id_fk = ANY(@actorIds) but still does DISTINCT ON (file_transfer_id_fk), effectively collapsing multiple recipients’ statuses to a single row per file, and then applies the recipient-status filter on that single row:

    INNER JOIN latest_actor_status las
      ON las.file_transfer_id_fk = f.file_transfer_id_pk
    ...
    AND las.actor_file_transfer_status_description_id_fk = @recipientFileTransferStatus

    This yields “include the file if the selected recipient (the one whose row survives DISTINCT ON) matches the desired status”. If the intended semantics are “any of the requested recipients with that status”, this is fine; if you need “all of them” or per-recipient guarantees, further adjustment may be required.

    Also, please confirm that fileTransferSearch.GetActorIds() is guaranteed to return at least one actor; if it may be empty, the CTE would run without an actor filter and effectively consider statuses from all actors, which probably isn’t desired.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3035df7 and 5499553.

📒 Files selected for processing (1)
  • src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs (8 hunks)
🧰 Additional context used
🧠 Learnings (8)
📓 Common learnings
Learnt from: Ceredron
Repo: Altinn/altinn-broker PR: 582
File: src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs:176-177
Timestamp: 2024-11-14T07:12:14.214Z
Learning: In `src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs`, avoid suggesting to extract inline SQL queries into constants.
Learnt from: tomshag
Repo: Altinn/altinn-broker PR: 790
File: src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs:349-352
Timestamp: 2025-09-30T11:29:20.339Z
Learning: In the Altinn Broker file transfer system, file transfer status and actor file transfer status use status_description_id_fk values that are ordered by progression level. Ordering by status_description_id_fk DESC is intentional to get the highest/most advanced status, not the most recent by date. This applies to queries in FileTransferRepository.cs.
Learnt from: Ceredron
Repo: Altinn/altinn-broker PR: 560
File: tests/Altinn.Broker.Tests/Factories/FileTransferEntityFactory.cs:8-45
Timestamp: 2024-10-29T12:12:50.346Z
Learning: When suggesting code improvements in `tests/Altinn.Broker.Tests/Factories/FileTransferEntityFactory.cs`, avoid adding excessive comments and unnecessary code changes.
📚 Learning: 2024-11-14T07:12:14.214Z
Learnt from: Ceredron
Repo: Altinn/altinn-broker PR: 582
File: src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs:176-177
Timestamp: 2024-11-14T07:12:14.214Z
Learning: In `src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs`, avoid suggesting to extract inline SQL queries into constants.

Applied to files:

  • src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs
📚 Learning: 2024-10-29T12:12:50.346Z
Learnt from: Ceredron
Repo: Altinn/altinn-broker PR: 560
File: tests/Altinn.Broker.Tests/Factories/FileTransferEntityFactory.cs:8-45
Timestamp: 2024-10-29T12:12:50.346Z
Learning: When suggesting code improvements in `tests/Altinn.Broker.Tests/Factories/FileTransferEntityFactory.cs`, avoid adding excessive comments and unnecessary code changes.

Applied to files:

  • src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs
📚 Learning: 2025-09-30T11:29:20.339Z
Learnt from: tomshag
Repo: Altinn/altinn-broker PR: 790
File: src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs:349-352
Timestamp: 2025-09-30T11:29:20.339Z
Learning: In the Altinn Broker file transfer system, file transfer status and actor file transfer status use status_description_id_fk values that are ordered by progression level. Ordering by status_description_id_fk DESC is intentional to get the highest/most advanced status, not the most recent by date. This applies to queries in FileTransferRepository.cs.

Applied to files:

  • src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs
📚 Learning: 2024-10-29T12:51:11.540Z
Learnt from: Ceredron
Repo: Altinn/altinn-broker PR: 560
File: tests/Altinn.Broker.Tests/ManifestDownloadStreamTests.cs:24-39
Timestamp: 2024-10-29T12:51:11.540Z
Learning: In `tests/Altinn.Broker.Tests/ManifestDownloadStreamTests.cs`, refactoring suggestions aimed at enhancing test organization and coverage may be considered less readable and of dubious value.

Applied to files:

  • src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs
📚 Learning: 2024-10-29T12:50:05.699Z
Learnt from: Ceredron
Repo: Altinn/altinn-broker PR: 560
File: tests/Altinn.Broker.Tests/Helpers/ManifestDownloadStreamHelpers.cs:18-28
Timestamp: 2024-10-29T12:50:05.699Z
Learning: In `tests/Altinn.Broker.Tests/Helpers/ManifestDownloadStreamHelpers.cs`, the test code is static and will not change, so refactoring suggestions on this file are unnecessary.

Applied to files:

  • src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs
📚 Learning: 2024-11-14T07:20:20.158Z
Learnt from: Ceredron
Repo: Altinn/altinn-broker PR: 582
File: src/Altinn.Broker.Persistence/Repositories/ServiceOwnerRepository.cs:36-47
Timestamp: 2024-11-14T07:20:20.158Z
Learning: In `src/Altinn.Broker.Persistence/Repositories/ServiceOwnerRepository.cs`, the `created` field retrieved from the database is non-nullable, so null checks for the `created` field are not necessary.

Applied to files:

  • src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs
📚 Learning: 2024-11-14T07:01:04.663Z
Learnt from: Ceredron
Repo: Altinn/altinn-broker PR: 582
File: tests/Altinn.Broker.Tests.LargeFile/Program.cs:147-147
Timestamp: 2024-11-14T07:01:04.663Z
Learning: In the project `Altinn.Broker.Tests.LargeFile`, the codebase uses C# 12 features like collection expressions, such as initializing collections with `PropertyList = []`. Do not flag this syntax as invalid in future reviews.

Applied to files:

  • src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs
🧬 Code graph analysis (1)
src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs (4)
src/Altinn.Broker.API/Mappers/FileTransferStatusDetailsExtMapper.cs (2)
  • List (34-39)
  • List (41-50)
src/Altinn.Broker.API/Mappers/LegacyFileStatusOverviewExtMapper.cs (4)
  • List (12-15)
  • List (107-122)
  • FileTransferStatus (54-73)
  • ActorFileTransferStatus (75-89)
src/Altinn.Broker.API/Mappers/FileStatusOverviewExtMapper.cs (1)
  • List (70-85)
src/Altinn.Broker.Core/Domain/ActorFileTransferStatusEntity.cs (1)
  • ActorFileTransferStatusEntity (3-9)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Analyze (csharp)
🔇 Additional comments (6)
src/Altinn.Broker.Persistence/Repositories/FileTransferRepository.cs (6)

112-172: GetFileTransfers: no behavioral changes spotted

The method still guards against empty fileTransferIds, uses a parameterized UUID array, and relies on the existing overviewCommandMultiple SQL. The change around line 118 appears to be non-functional; the implementation remains sound.


238-274: GetLatestRecipientFileTransferStatuses: aggregation query remains consistent

The query still groups by actor_id_fk and actor_external_id and uses MAX on both status id and date; changes around the command creation are non-functional. The method remains aligned with the existing domain model.


276-314: AddFileTransfer: actor resolution and insert logic unchanged

The only visible change around line 291 is structural/formatting; actor lookup/creation, file_transfer insert, and metadata handling behave as before. No correctness or security issues identified.


519-531: GetFileTransfersAssociatedWithActor: minor formatting only

The added whitespace around the reader loop (lines 522 and 529) is cosmetic; the UNION-based query, parameter binding, and retry pattern remain unchanged and correct.


592-604: GetFileTransfersForRecipientWithRecipientStatus: no behavioral changes

Changes around the reader loop (lines 595 and 602) are purely structural. The lateral joins, status/date filtering, and ordering logic remain as before, with parameters still fully bound.


625-641: GetMetadata: command execution still safe and parameterized

The small edits around command execution (lines 629–635) don’t affect behavior. The method continues to use a parameterized query over file_transfer_id_fk and safely materializes the key/value dictionary.

Copy link
Collaborator

@RagnarFatland RagnarFatland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks exiting.

@Ceredron Ceredron merged commit 12808a8 into main Nov 14, 2025
7 of 8 checks passed
@Ceredron Ceredron deleted the performance/getavailablefiles-query branch November 14, 2025 13:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kind/chore Routine task or minor maintenance work that doesn't introduce new features or functionality

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants