ihvol-freenome commented Oct 10, 2025

Tracking issue

Closes #4583

Why are the changes needed?

The Flyte datacatalog tags table lacked a composite index on the DatasetUUID and ArtifactID columns.
As a result, queries filtering on both fields (for example, fetching the tags for a specific dataset and artifact) could not be answered from a single index covering both columns, which led to unnecessary full-table scans in some cases.

This change adds a composite index to improve query performance and reduce latency in database lookups involving both columns.

What changes were proposed in this pull request?

This PR adds a composite GORM index on the DatasetUUID and ArtifactID fields in the tags model.

- ArtifactID  string
- DatasetUUID string   `gorm:"type:uuid;index:tags_dataset_uuid_idx"`
+ ArtifactID  string   `gorm:"index:tags_dataset_uuid_artifact_id_idx,priority:2"`
+ DatasetUUID string   `gorm:"type:uuid;index:tags_dataset_uuid_idx;index:tags_dataset_uuid_artifact_id_idx,priority:1"`

How this fixes the issue:

  • Introduces a composite index named tags_dataset_uuid_artifact_id_idx.
  • Sets the index priority to 1 for DatasetUUID and 2 for ArtifactID, so DatasetUUID is the leading column of the index, matching typical query patterns.
  • Addresses #4583 by improving query efficiency. The existing columns and the single-column tags_dataset_uuid_idx index are left unchanged.
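
To make the priority ordering concrete, here is a minimal sketch that reads the struct tags from the diff above and lists the fields of each index in priority order. It uses only the Go standard library. The `Tag` struct mirrors just the two fields touched by this PR, and `indexFields` is a hypothetical helper written for illustration, not GORM's actual tag parser. The fallback priority of 10 is assumed from GORM's documented default.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
	"strconv"
	"strings"
)

// Tag mirrors only the two fields changed in this PR; the real model has more.
type Tag struct {
	ArtifactID  string `gorm:"index:tags_dataset_uuid_artifact_id_idx,priority:2"`
	DatasetUUID string `gorm:"type:uuid;index:tags_dataset_uuid_idx;index:tags_dataset_uuid_artifact_id_idx,priority:1"`
}

// indexFields is a hypothetical helper (not part of GORM). It returns the
// struct fields that declare the given index name, sorted by ascending priority.
func indexFields(model any, indexName string) []string {
	type col struct {
		name     string
		priority int
	}
	var cols []col

	t := reflect.TypeOf(model)
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		// Settings inside a gorm tag are separated by ';'.
		for _, setting := range strings.Split(f.Tag.Get("gorm"), ";") {
			// Options for one setting are separated by ','.
			parts := strings.Split(setting, ",")
			if strings.TrimSpace(parts[0]) != "index:"+indexName {
				continue
			}
			priority := 10 // assumed default when no priority is given
			for _, opt := range parts[1:] {
				opt = strings.TrimSpace(opt)
				if strings.HasPrefix(opt, "priority:") {
					if n, err := strconv.Atoi(strings.TrimPrefix(opt, "priority:")); err == nil {
						priority = n
					}
				}
			}
			cols = append(cols, col{name: f.Name, priority: priority})
		}
	}

	// A stable sort keeps declaration order for fields with equal priority.
	sort.SliceStable(cols, func(a, b int) bool { return cols[a].priority < cols[b].priority })

	names := make([]string, 0, len(cols))
	for _, c := range cols {
		names = append(names, c.name)
	}
	return names
}

func main() {
	fmt.Println(indexFields(Tag{}, "tags_dataset_uuid_artifact_id_idx"))
	fmt.Println(indexFields(Tag{}, "tags_dataset_uuid_idx"))
}
```

Although ArtifactID is declared first in the struct, the explicit priorities place DatasetUUID ahead of it in the composite index, while the single-column index still contains only DatasetUUID.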

How was this patch tested?

Labels

  • changed: For changes in existing functionality.
  • fixed: For any bug fixed.

This is important to improve the readability of release notes.

Setup process

No special setup required. Run database migrations after updating Flyte to apply the new index.
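
For reviewers who want to picture the result of the migration, the sketch below builds the kind of statement that would create an equivalent index by hand. It is an approximation under stated assumptions: the table name `tags` and the column names `dataset_uuid` and `artifact_id` are inferred from the index names and GORM's usual snake_case naming, `createIndexSQL` is a hypothetical helper, and the exact DDL issued by the migrator may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// createIndexSQL is a hypothetical helper, not part of Flyte or GORM.
// It formats a CREATE INDEX statement with columns in the given order,
// which is the order the database uses for the composite key.
func createIndexSQL(indexName, table string, columns ...string) string {
	return fmt.Sprintf(
		"CREATE INDEX IF NOT EXISTS %s ON %s (%s)",
		indexName, table, strings.Join(columns, ", "),
	)
}

func main() {
	// Assumed table and column names; priority 1 comes first, priority 2 second.
	fmt.Println(createIndexSQL(
		"tags_dataset_uuid_artifact_id_idx",
		"tags",
		"dataset_uuid", "artifact_id",
	))
}
```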

Screenshots

N/A — schema-level optimization.

Check all the applicable boxes

  • I updated the documentation accordingly.
  • All new and existing tests passed.
  • All commits are signed-off.

Related PRs

N/A

Docs link

N/A — this change affects backend model definitions only and does not modify public APIs or user-facing documentation.

Summary by Bito

This pull request introduces a composite index on the DatasetUUID and ArtifactID fields in the tags model, significantly enhancing query performance and addressing inefficiencies in database lookups. The changes optimize query execution for common patterns, improving overall application performance.

codecov bot commented Oct 11, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.56%. Comparing base (996504d) to head (a2c06bc).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6672      +/-   ##
==========================================
+ Coverage   57.88%   58.56%   +0.67%     
==========================================
  Files         770      929     +159     
  Lines       63679    70875    +7196     
==========================================
+ Hits        36858    41505    +4647     
- Misses      23972    26216    +2244     
- Partials     2849     3154     +305     
Flag Coverage Δ
unittests-datacatalog 59.03% <ø> (ø)
unittests-flyteadmin 56.13% <ø> (ø)
unittests-flytecopilot 40.87% <ø> (ø)
unittests-flytectl 64.64% <ø> (?)
unittests-flyteidl 76.12% <ø> (ø)
unittests-flyteplugins 61.00% <ø> (ø)
unittests-flytepropeller 55.03% <ø> (-0.04%) ⬇️
unittests-flytestdlib 63.05% <ø> (+0.01%) ⬆️

Development

Successfully merging this pull request may close these issues.

[BUG] Performance degradation with datacatalog.tags table