Hello,
I have an issue similar to #318 (although with bacterial/haploid data). In both the VCFs and the pileup, I can see that under peculiar circumstances, near regions of high SNP density, the pileup variant calls correctly assign an alternate allele, but the full alignment incorrectly types the same position as a RefCall. Using the same run_clair3.sh (v1.0.10) parameters shown below, a serial isolate gets the correct call. I also found that in some instances when running with --haploid_precise, the GT was called 0/1 and the variant was removed even though the evidence for ALT had >0.9 AF and high coverage (100X), so I've dropped that option for now. Here's a picture of the region in question, with the two serial isolates aligned to a reference from a different sequence type, and the VCFs showing the discordant pileup vs. full-alignment variant calls:
CR-0005 Full Alignment VCF:
CR-0005 Pileup VCF:
CR-0063 Full Alignment VCF:
CR-0063 Pileup VCF:
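In case it helps with reproducing this, here is a minimal sketch of how the discordant sites could be listed from the two VCFs. It is not part of Clair3. It assumes that the FILTER column is `PASS` for the pileup call and `RefCall` for the full-alignment record, and that `AF` is present in the FORMAT fields. The function names and the file names in the trailing comment (`pileup.vcf.gz`, `full_alignment.vcf.gz` in the output directory) are my own assumptions for illustration.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip-compressed VCF as text."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path)


def read_vcf_records(lines):
    """Yield (chrom, pos, ref, alt, filter, sample_dict) for each data line."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue  # no FORMAT/sample columns, nothing to compare
        chrom, pos, _id, ref, alt, _qual, flt = cols[:7]
        sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
        yield chrom, int(pos), ref, alt, flt, sample


def discordant_sites(pileup_lines, full_alignment_lines):
    """Return sites called ALT (PASS) in pileup but RefCall in full alignment."""
    fa_refcalls = {}
    for chrom, pos, _ref, _alt, flt, sample in read_vcf_records(full_alignment_lines):
        if flt == "RefCall":  # assumption: RefCall is reported in FILTER
            fa_refcalls[(chrom, pos)] = sample
    hits = []
    for chrom, pos, ref, alt, flt, sample in read_vcf_records(pileup_lines):
        if flt == "PASS" and alt != "." and (chrom, pos) in fa_refcalls:
            fa_sample = fa_refcalls[(chrom, pos)]
            # Report the AF seen by each caller so they can be compared.
            hits.append((chrom, pos, ref, alt, sample.get("AF"), fa_sample.get("AF")))
    return hits


# Hypothetical usage (file names are assumptions):
# with open_vcf("pileup.vcf.gz") as p, open_vcf("full_alignment.vcf.gz") as f:
#     for site in discordant_sites(p, f):
#         print(*site, sep="\t")
```

Keying on (contig, position) keeps the comparison simple; it does not try to normalise indels, so it is only meant for SNP-like sites such as the ones shown above.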
And command here:
clair3_cmd = [
    "run_clair3.sh",
    f"--bam_fn={os.path.abspath(bam_output)}",
    f"--ref_fn={os.path.abspath(ref)}",
    f"--threads={threads}",
    "--platform=ont",
    f"--parallel={parallel_path}",
    f"--model_path={os.path.abspath(model_path)}",
    f"--output={clair3_output_dir}",
    f"--sample_name={sample}",
    "--include_all_ctgs",
    "--no_phasing_for_fa",
    "--enable_long_indel",
]
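For completeness, this is roughly how the command is assembled and launched on my side. It is a sketch rather than the exact pipeline code: `build_clair3_cmd` and `run_clair3` are hypothetical helper names, and the `haploid_precise` switch is only there to make it easy to compare runs with and without that flag.

```python
import os
import subprocess


def build_clair3_cmd(bam, ref, model_path, output_dir, sample,
                     threads, parallel_path, haploid_precise=False):
    """Build the run_clair3.sh argument list used in this report."""
    cmd = [
        "run_clair3.sh",
        f"--bam_fn={os.path.abspath(bam)}",
        f"--ref_fn={os.path.abspath(ref)}",
        f"--threads={threads}",
        "--platform=ont",
        f"--parallel={parallel_path}",
        f"--model_path={os.path.abspath(model_path)}",
        f"--output={output_dir}",
        f"--sample_name={sample}",
        "--include_all_ctgs",
        "--no_phasing_for_fa",
        "--enable_long_indel",
    ]
    if haploid_precise:
        # Dropped in my runs for now because of the 0/1 calls described above.
        cmd.append("--haploid_precise")
    return cmd


def run_clair3(cmd):
    """Run Clair3 and raise CalledProcessError if it exits non-zero."""
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems with paths that contain spaces.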
I'm using the r1041_e82_400bps_sup_v500 model. I can provide data if need be. Love your tool though! With this fixed and more consistent masking of recombinant regions, I should have a decent core genome alignment pipeline.
Best,
Will