AN-140 Metadata index for cost capping #7567

Open: wants to merge 12 commits into base: develop

aednichols
Collaborator

@aednichols aednichols commented Oct 2, 2024

Description

With the new index, the cost-capping query below drops from 1m30s to 250ms on the 20M-row test workflow.

select SQL_NO_CACHE
    `WORKFLOW_EXECUTION_UUID`,
    `CALL_FQN`,
    `JOB_SCATTER_INDEX`,
    `JOB_RETRY_ATTEMPT`,
    `METADATA_KEY`,
    `METADATA_VALUE`,
    `METADATA_VALUE_TYPE`,
    `METADATA_TIMESTAMP`,
    `METADATA_JOURNAL_ID`
from
    `METADATA_ENTRY`
where
    (
        (`WORKFLOW_EXECUTION_UUID` = '69e8259c-856a-4d87-9cdf-bd709f8e5ce3')
        and (
            (
                (
                    (`METADATA_KEY` like 'vmStartTime%')
                    or (`METADATA_KEY` like 'vmEndTime%')
                )
                or (`METADATA_KEY` like 'vmCostPerHour%')
            )
            or (`METADATA_KEY` like 'subWorkflowId%')
        )
    )
    and (
        (not false)
        or (
            (
                (`CALL_FQN` is null)
                and (`JOB_SCATTER_INDEX` is null)
            )
            and (`JOB_RETRY_ATTEMPT` is null)
        )
    )
order by
    `METADATA_TIMESTAMP`;
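
Because `(not false)` is always true, the second top-level condition never filters anything, so the query effectively selects on `WORKFLOW_EXECUTION_UUID` plus a `METADATA_KEY` prefix. One way to check that an index is being picked up might be to run `EXPLAIN` on a hand-simplified form of the query. This is only a sketch: the simplification is an assumption made for readability and is not what Cromwell emits, and the plan MySQL reports depends on the server version and table statistics.

```sql
-- Sketch only: hand-simplified equivalent of the generated query above.
-- The "(not false) or (...)" clause is dropped because it always evaluates to true.
EXPLAIN
SELECT
    `WORKFLOW_EXECUTION_UUID`,
    `CALL_FQN`,
    `JOB_SCATTER_INDEX`,
    `JOB_RETRY_ATTEMPT`,
    `METADATA_KEY`,
    `METADATA_VALUE`,
    `METADATA_VALUE_TYPE`,
    `METADATA_TIMESTAMP`,
    `METADATA_JOURNAL_ID`
FROM
    `METADATA_ENTRY`
WHERE
    `WORKFLOW_EXECUTION_UUID` = '69e8259c-856a-4d87-9cdf-bd709f8e5ce3'
    AND (
        `METADATA_KEY` LIKE 'vmStartTime%'
        OR `METADATA_KEY` LIKE 'vmEndTime%'
        OR `METADATA_KEY` LIKE 'vmCostPerHour%'
        OR `METADATA_KEY` LIKE 'subWorkflowId%'
    )
ORDER BY
    `METADATA_TIMESTAMP`;
```

The `key` column of the `EXPLAIN` output names the index the optimizer chose, if any. The `ORDER BY` may still require a separate sort step, since `METADATA_TIMESTAMP` is not part of the filter.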

This is how Liquibase checks for index existence on MySQL:

SELECT 
  TABLE_CATALOG AS TABLE_CAT, 
  TABLE_SCHEMA AS TABLE_SCHEM, 
  TABLE_NAME, 
  NON_UNIQUE, 
  NULL AS INDEX_QUALIFIER, 
  INDEX_NAME, 
  3 AS TYPE, 
  SEQ_IN_INDEX AS ORDINAL_POSITION, 
  COLUMN_NAME, 
  COLLATION AS ASC_OR_DESC, 
  CARDINALITY, 
  0 AS PAGES, 
  NULL AS FILTER_CONDITION 
FROM 
  INFORMATION_SCHEMA.STATISTICS 
WHERE 
  TABLE_SCHEMA = 'cromwell_test' 
  AND INDEX_NAME = 'IX_METADATA_ENTRY_WEU_MK' 
ORDER BY 
  NON_UNIQUE, 
  INDEX_NAME, 
  SEQ_IN_INDEX
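
For a manual spot check outside Liquibase, MySQL's `SHOW INDEX` statement reports similar information more compactly. A minimal sketch, assuming the same `cromwell_test` schema and index name as in the query above:

```sql
-- Sketch only: manual check for the index, not something Liquibase runs.
SHOW INDEX FROM `METADATA_ENTRY` IN `cromwell_test`
WHERE Key_name = 'IX_METADATA_ENTRY_WEU_MK';
```

An empty result means the index does not exist. Otherwise there is one row per indexed column.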

And these are the index creation statements:

CREATE INDEX `IX_METADATA_ENTRY_WEU_MK` ON `cromwell_test`.`METADATA_ENTRY`(
  `WORKFLOW_EXECUTION_UUID`, `METADATA_KEY`
);
CREATE INDEX `IX_METADATA_ENTRY_WEU_CF_JSI_JRA_MK` ON `cromwell_test`.`METADATA_ENTRY`(
  `WORKFLOW_EXECUTION_UUID`, `CALL_FQN`, 
  `JOB_SCATTER_INDEX`, `JOB_RETRY_ATTEMPT`, 
  `METADATA_KEY`
);

A reincarnation of #4736

Release Notes Confirmation

CHANGELOG.md

  • I updated CHANGELOG.md in this PR
  • I assert that this change shouldn't be included in CHANGELOG.md because it doesn't impact community users

Terra Release Notes

  • I added a suggested release notes entry in this Jira ticket
  • I assert that this change doesn't need Jira release notes because it doesn't impact Terra users

Contributor

@salonishah11 salonishah11 left a comment

🎉

@aednichols aednichols requested a review from a team as a code owner October 16, 2024 19:13
@aednichols aednichols changed the title WX-1878 Metadata index for cost capping AN-140 Metadata index for cost capping Oct 16, 2024
3 participants