Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add TRGT Special-purpose caller #412
base: main
Are you sure you want to change the base?
Add TRGT Special-purpose caller #412
Changes from 3 commits
98df965
5f76777
d20883a
7206419
53f8e5d
d433589
3974b44
673bb03
15adde1
12b0413
f6f8c10
37eaee2
9454e44
723b058
78bfab5
0a895f0
c2c4a98
a54ed63
aa66981
6c7fc74
269d0d9
5b796bc
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a general comment about including
trgt
in this production pipeline.It's a bit complicated, so I'll outline the main logic here:
The production pipeline is intended for running routine and stable tools on all production samples. Its primary responsibility is two-fold: alignments merging (and associated QC/metric) + variant calling (SV, SNV, indel). The tools incorporated here are well accepted by the community and undergo less frequent updates. This means we don't need to update & rerun the workflow constantly.
There's another set of callers that are not quite "validated" yet, i.e. they are more special-purpose focused and/or target a specific type of variants. They are under active development and evaluation so their release frequencies are much higher. IMO
trgt
falls under this category.So here's my proposal:
Let's leave
trgt
out of the main variant calling pipeline, and promote it to a standaloneworkflow
(i.e. place it under/wdl/pipelines/PacBio/VariantCalling/
). We have other means to ensure that such "special-purpose" callers are also run routinely, in addition to the main ones. This also has the benefit that if we decide to update the version oftrgt
or how to run it (added annotations, etc), we only need to update its WDLs and rerun that workflow, i.e. separation of concerns .How does this sound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You make an excellent point. I definitely agree that TRGT is much more 'under active development' than may be ideal for the main pipeline.
I agree with your plan and have gone ahead and created a new pipeline under /wdl/pipelines/PacBio/VariantCalling/ which is called PBCCS_CallTRs.wdl. This pipeline runs the processWithTRGT task in the TRGT.wdl file (under /wdl/tasks/VariantCalling/).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never tried running workflows that try to localize HTTP files. So I worry this default value will fail the workflow's executions.
This points to a larger issue: where should these annotations be stored?
I don't mean we should resolve this here, but just getting it out there for us to think about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct. Apparently you can import WDLs from https, but can't localize files that way. But yes, this is obviously just because I didn't have a good place to store the annotation file. It is about 275MB, so not really appropriate for Github. Ideally, we would put it somewhere on gs. But I don't have an account where I can make a file publicly available there.
For now, I went ahead and moved this to a wget command within the Dockerfile so the annotation is built in to the image (and then referenced locally). But if we can make a place for me to host the annotation on gs, that would obviously be a much better solution as the annotation will certainly change over time and re-building the docker image for that is clearly not optimal.