@mvdbeek
Member

@mvdbeek mvdbeek commented Jan 20, 2026

The _exclude_jobs_with_deleted_outputs function was using unaliased model classes (e.g., model.HistoryDatasetAssociation) in its exists() subqueries, while the main query from _build_stmt_for_hdca uses aliased versions of these same tables (e.g., candidate_hda).

While exists() subqueries should create independent table references, using the same model class without aliasing in both the main query and the exists() subquery could potentially cause ambiguity in PostgreSQL's query planner, leading to non-deterministic query behavior.

This fix adds explicit aliases (job_output_collection_assoc, output_hdca, job_output_dataset_assoc, output_hda) to ensure the tables in the exists() subqueries are clearly distinct from any tables in the outer query.
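The aliasing idea can be sketched at the SQL level. This is only an illustration, not the Galaxy query: the tables `hda`, `job`, and `job_output` and their columns are hypothetical simplifications, and SQLite stands in for PostgreSQL so the snippet runs with the standard library alone. The alias names `candidate_hda`, `job_output_dataset_assoc`, and `output_hda` are borrowed from the description above.

```python
import sqlite3


def build_demo_db():
    # Hypothetical, heavily simplified schema (not Galaxy's real tables).
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE hda (id INTEGER PRIMARY KEY, deleted INTEGER NOT NULL);
        CREATE TABLE job (id INTEGER PRIMARY KEY, input_hda_id INTEGER NOT NULL);
        CREATE TABLE job_output (job_id INTEGER NOT NULL, hda_id INTEGER NOT NULL);
        INSERT INTO hda VALUES (1, 0), (2, 0), (3, 1);
        INSERT INTO job VALUES (10, 1), (11, 1);
        INSERT INTO job_output VALUES (10, 2), (11, 3);
        """
    )
    return conn


def jobs_without_deleted_outputs(conn):
    # The outer query reads hda as candidate_hda; the EXISTS subquery reads
    # the same table under a different alias (output_hda), so every column
    # reference names exactly one table instance.
    sql = """
        SELECT job.id
        FROM job
        JOIN hda AS candidate_hda ON candidate_hda.id = job.input_hda_id
        WHERE candidate_hda.deleted = 0
          AND NOT EXISTS (
              SELECT 1
              FROM job_output AS job_output_dataset_assoc
              JOIN hda AS output_hda
                ON output_hda.id = job_output_dataset_assoc.hda_id
              WHERE job_output_dataset_assoc.job_id = job.id
                AND output_hda.deleted = 1
          )
        ORDER BY job.id
    """
    return [row[0] for row in conn.execute(sql)]
```

In this toy data set, job 11 has a deleted output and is filtered out, while job 10 is kept.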

This may address the flaky test_search_delete_hdca_output issue tracked in GitHub issue #21230.

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

@github-actions github-actions bot added this to the 26.0 milestone Jan 20, 2026
@guerler guerler added the kind/enhancement, area/database, and area/backend labels Jan 20, 2026
@mvdbeek
Member Author

mvdbeek commented Jan 20, 2026

That's not it; it failed on the third re-run.

@mvdbeek mvdbeek force-pushed the claude/fix-flaky-jobs-test-Lfb3h branch from 699bdba to 1ad4ed0 Compare January 20, 2026 14:49
@jmchilton
Member

Such an alluring theory though!

The _exclude_jobs_with_deleted_outputs function was using unaliased
model classes (e.g., model.HistoryDatasetAssociation) in its exists()
subqueries, while the main query from _build_stmt_for_hdca uses aliased
versions of these same tables (e.g., candidate_hda).

While exists() subqueries should create independent table references,
using the same model class without aliasing in both the main query and
the exists() subquery could potentially cause ambiguity in PostgreSQL's
query planner, leading to non-deterministic query behavior.

This fix adds explicit aliases (job_output_collection_assoc, output_hdca,
job_output_dataset_assoc, output_hda) to ensure the tables in the
exists() subqueries are clearly distinct from any tables in the outer
query.

This may fix the flaky test_search_delete_hdca_output issue tracked
in GitHub issue galaxyproject#21230.

The job_ids_cte in _filter_jobs was misleadingly named
"job_ids_materialized_cte" but was not actually materialized.
This CTE contains the complex HDCA signature matching logic.

PostgreSQL 12+ can choose to inline CTEs (re-evaluate them as
subqueries) rather than materialize them (evaluate once and store).
This optimization decision is based on cost estimates which can
vary between query executions.

By explicitly materializing this CTE when supported (PostgreSQL 12+),
we ensure the signature matching results are evaluated once and
reused consistently throughout the query, reducing potential for
non-deterministic behavior.
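A minimal sketch of the "materialize when supported" idea, under stated assumptions: the `job` table, its `signature` column, and the helper names are hypothetical, and SQLite stands in for PostgreSQL because it accepts the same `AS MATERIALIZED` keyword from version 3.35. The real code builds its statement differently; this only shows the conditional hint.

```python
import sqlite3


def build_job_search_sql(supports_materialized):
    # Only emit the MATERIALIZED hint when the backend understands it;
    # otherwise fall back to a plain CTE and let the planner decide.
    keyword = "AS MATERIALIZED" if supports_materialized else "AS"
    return f"""
        WITH job_ids_cte {keyword} (
            SELECT id FROM job WHERE signature = :signature
        )
        SELECT job.id
        FROM job
        JOIN job_ids_cte ON job_ids_cte.id = job.id
        WHERE job.id >= (SELECT MIN(id) FROM job_ids_cte)
        ORDER BY job.id
    """


def find_jobs(conn, signature):
    # SQLite gained the MATERIALIZED hint in 3.35; PostgreSQL in version 12.
    supports_materialized = sqlite3.sqlite_version_info >= (3, 35, 0)
    sql = build_job_search_sql(supports_materialized)
    return [row[0] for row in conn.execute(sql, {"signature": signature})]


def build_demo_db():
    # Hypothetical single-table schema used only for this illustration.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE job (id INTEGER PRIMARY KEY, signature TEXT NOT NULL);
        INSERT INTO job VALUES (1, 'a'), (2, 'b'), (3, 'a');
        """
    )
    return conn
```

The CTE is referenced twice in the outer query, which is the situation where evaluating it once and reusing the result matters.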

This is a follow-up to the explicit aliases fix, both addressing
the flaky test_search_delete_hdca_output (GitHub issue galaxyproject#21230).

Use unique_id in all is:published searches to avoid picking up
published histories created by other tests running in parallel.
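The isolation idea in this commit can be sketched as follows. Everything here is hypothetical: the in-memory list stands in for a server shared by parallel tests, and `search_published` only mimics a published-history search narrowed by name; it is not the Galaxy API.

```python
import uuid


def new_unique_id():
    # A random token that no other test run will plausibly reuse.
    return uuid.uuid4().hex


def search_published(histories, name_fragment):
    # Stand-in for an "is:published" search that is also filtered by name.
    return [
        h["name"]
        for h in histories
        if h["published"] and name_fragment in h["name"]
    ]


def demo():
    # One published history left behind by some other, parallel test.
    shared = [{"name": "history from another test", "published": True}]
    unique_id = new_unique_id()
    shared.append({"name": f"published {unique_id}", "published": True})
    shared.append({"name": f"private {unique_id}", "published": False})
    unscoped = search_published(shared, "")
    scoped = search_published(shared, unique_id)
    return unique_id, unscoped, scoped
```

The unscoped search also returns the other test's history, so any assertion on its length depends on what else is running; the scoped search only sees this test's own published history.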
@mvdbeek mvdbeek force-pushed the claude/fix-flaky-jobs-test-Lfb3h branch from f68f6ed to cabc18b Compare January 21, 2026 11:53
@mvdbeek
Member Author

mvdbeek commented Jan 21, 2026

OK, I ran the test for the first 2 commits 10 times on my fork and they passed ... we could try it.

@mvdbeek
Member Author

mvdbeek commented Jan 21, 2026

lol, first try failed, cool ...

@mvdbeek mvdbeek marked this pull request as draft January 21, 2026 13:24
@mvdbeek mvdbeek removed this from the 26.0 milestone Jan 21, 2026

Labels

area/backend, area/database, kind/bug

3 participants