Conversation

gbggrant
Collaborator

@gbggrant gbggrant commented Oct 2, 2025

No description provided.

rsasch and others added 30 commits February 27, 2024 16:53
* Remove AS_YNG from Extracted VCFs
* Update test data to handle code change
* Updated truth
* Rebuild and update gatk docker.
ExtractCohortToPgen tool and workflow for running it based on the GVS VCF extract code

Co-authored-by: Miguel Covarrubias <[email protected]>
* Had Hail Integration test tell VCF Integration test to ignore diffs in disk space or cost (which may be transient)
* Added support for all_chrs truth.
* Changed naming of cost and table sizes files to make replacement easier.
* Add new parameter to set filtered genotypes to no-calls to ExtractCohort
* Modified ExtractCohortEngine to optionally set genotypes that are filtered (FT flag set at the genotype level) to no-calls.
* Renamed VQSR Classic to 'VQSR'
* Renamed VQSR Lite to 'VETS'
* Updated VCF and pgen tests for code changes
* Fixed cost calculation to ignore duplicate rows caused by preemption.
* Resetting tolerances to 5% across the board. We shall see.
* Optionally extract to bgz format.
* Set bgzipping to be off (everywhere) by default.
* Update assert_identical_outputs to handle bgzipped outputs.
* Have GvsAssignIds.wdl validate that input sample names (in the provided input file) are unique.
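
As an illustration of the sample-name uniqueness check described in the last commit above, here is a minimal Python sketch. It is not the code in GvsAssignIds.wdl; the function name `find_duplicate_sample_names` and the one-name-per-line input layout are assumptions made for the example.

```python
from collections import Counter


def find_duplicate_sample_names(lines):
    """Return sample names that appear more than once, in first-seen order.

    Hypothetical helper: assumes one sample name per line and ignores
    blank lines and surrounding whitespace.
    """
    names = [line.strip() for line in lines if line.strip()]
    counts = Counter(names)
    # Counter keeps insertion order, so duplicates come back in the
    # order they were first encountered.
    return [name for name in counts if counts[name] > 1]


# Example input: the third line repeats the first sample name.
sample_lines = ["NA12878\n", "NA12891\n", "NA12878\n", "\n"]
duplicates = find_duplicate_sample_names(sample_lines)
if duplicates:
    print("Duplicate sample names:", ", ".join(duplicates))
```

A workflow task could exit non-zero when the returned list is non-empty, so the failure surfaces before any IDs are assigned.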
mcovarr and others added 30 commits July 10, 2025 12:48
…ants tasks for FoxTrot Callset

* Increase Default Disk for MergeVCFs task for FoxTrot Callset
* Upping memory for SelectVariants for Foxtrot.
* VS-1716. Have Extract check for the existence of the sample_chromosome_ploidy table and use it if it exists. This should allow backwards compatibility with Delta callset.
…sites in a new location (#9255)

* Update GvsCallsetStatistics.wdl to look for table gnomad_v3_sites in a new location.
* adding a testing WDL for VAT changes

* Adding in first pass at keeping the highest AC synonym and removing the rest

* trying a different loop.  Last one didn't work

* Seeing if a string comparison instead of an integer comparison was behind the issue

* Now with my debug info

* Another round of better awking

* piping synonyms to top-level test wdl

* problem might be in awk version.  Printing it

* version of awk doesn't support --version??

* undoing version check

* using some more version-portable awk

* Creating full VAT wdl to run against Echo data

* Rounding up and saving the detailed list of lower ac synonyms that we filtered out

* covering an edge case that we'll never actually see, but that Copilot pointed out. For completeness, I suppose

* doing it as an inline python for loop

* updating code to write the expected file instead of dumping to stdout

* Removing helping wdl for efficient testing.  Updating comments as per review feedback
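
The idea running through the commits above (keep the highest-AC synonym, save the ones filtered out, do it as a Python loop) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `(group_key, variant_id, ac)` record layout, the function name `keep_highest_ac`, and the keep-first tie-breaking rule are all hypothetical and not taken from the PR.

```python
def keep_highest_ac(records):
    """Split records into (kept, dropped) by highest AC per group key.

    Each record is a hypothetical (group_key, variant_id, ac) tuple.
    ac must be an int: compared as strings, "9" would sort above "12".
    On ties the first record seen is kept (an assumption of this sketch).
    """
    best = {}
    dropped = []
    for record in records:
        key, _, ac = record
        current = best.get(key)
        if current is None:
            best[key] = record
        elif ac > current[2]:
            # A higher AC displaces the previous best, which is recorded
            # as filtered out rather than silently discarded.
            dropped.append(current)
            best[key] = record
        else:
            dropped.append(record)
    return list(best.values()), dropped


records = [
    ("g1", "1-100-A-T", 5),
    ("g1", "1-100-AA-TA", 12),
    ("g2", "2-200-C-G", 3),
    ("g1", "1-100-AAA-TAA", 12),
]
kept, dropped = keep_highest_ac(records)
```

Returning the dropped records alongside the kept ones mirrors the commit that saves the detailed list of lower-AC synonyms to a file.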
* Just do the query in two passes: even- and odd-numbered samples.
…somes (#9259)

* Generalized the chromosome splitting.
* Add the variant id to the pgen output files
* Adding in check for contigs not in the weighted bed file so we no longer fail on that case

* Pushing the changes to dockstore so I can check there instead of locally

* updating GATK docker

* Pointing us back to the normal weighted bed file instead of the intentionally broken one for testing
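
A check for contigs that are missing from the weighted bed file might look something like the sketch below. It assumes a tab-separated BED-style file whose first column is the contig name; the function names and the example contigs are hypothetical, and the actual change in this PR may work differently.

```python
def contigs_in_bed(bed_lines):
    """Collect contig names from the first column of BED-style lines."""
    contigs = set()
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        contigs.add(line.split("\t")[0])
    return contigs


def split_known_contigs(requested, bed_lines):
    """Return (present, missing) so callers can skip or warn, not fail."""
    known = contigs_in_bed(bed_lines)
    present = [c for c in requested if c in known]
    missing = [c for c in requested if c not in known]
    return present, missing


# Example: chrM is requested but has no entry in the bed lines.
bed_lines = ["chr1\t0\t1000\t.\t12.5\n", "chr2\t0\t500\t.\t3.0\n"]
present, missing = split_known_contigs(["chr1", "chrM"], bed_lines)
```

Reporting the missing contigs separately lets the caller decide how to handle them instead of raising on the first unknown contig.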
* Update VAT alt allele cutoff to 100
* Change logic for n_chunks to be echo-ish
* VS-1570 - Adding new validation for the VAT
* For VS-1644. Increase disk for ExcludeSitesFromSitesOnlyVcf task.
Making it a task parameter so it is potentially overridable.
* VS-1743. Modify pgen .pvar file to use a new ID field format.
Specifically, 'chr:pos:ref'
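
For reference, building an ID in the 'chr:pos:ref' format could be as simple as the sketch below. The function name `pvar_variant_id` is hypothetical, and whether the contig keeps its "chr" prefix in the real output is an assumption.

```python
def pvar_variant_id(chrom, pos, ref):
    """Format a variant ID as 'chr:pos:ref' (hypothetical helper)."""
    return f"{chrom}:{pos}:{ref}"


# Example with an assumed contig naming style.
variant_id = pvar_variant_id("chr1", 12345, "A")
print(variant_id)
```

Because the alternate allele is not part of this format, two records with the same contig, position, and reference allele produce the same ID.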
10 participants