Haploid Haplotype Reconstruction

**What is your question?**
@eblerjana I am working on reconstructing a haploid haplotype using the imputed genotypes from PanGenie. Currently, I am using the following commands:

```bash
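# Genotype the reads with PanGenie, then compress the resulting VCF
PanGenie -i Reads.fq -r MHC-CHM13.ref.fa -v MHC_49-MC.vcf -o temp/APD_PG -t32 && bgzip temp/APD_PG_genotyping.vcf
# Index the compressed VCF and delete an earlier FASTA output
tabix -p vcf temp/APD_PG_genotyping.vcf.gz && rm -rf APD_rec_PG.fasta
# Exclude heterozygous genotype calls, then compress and index the filtered VCF
bcftools view -e 'GT="het"' temp/APD_PG_genotyping.vcf.gz | bgzip > temp/APD_PG_genotyping_no_homo.vcf.gz && tabix -p vcf temp/APD_PG_genotyping_no_homo.vcf.gz
# Apply the remaining variants to the reference to build the consensus sequence
bcftools consensus -f MHC-CHM13.ref.fa -o Rec_PG.fasta temp/APD_PG_genotyping_no_homo.vcf.gz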
```

In the commands above, I genotype haploid reads with PanGenie, filter out the heterozygous calls, and then apply the remaining imputed genotypes to the reference with `bcftools consensus` to reconstruct the haploid haplotype.
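To see how many sites the heterozygous filter removes, the genotype classes in the PanGenie output could be tallied before filtering. The following is only a rough sketch: `count_gt_classes` is a hypothetical helper name (not part of PanGenie or bcftools), and it assumes a single-sample, uncompressed VCF on stdin with a `GT` key in the FORMAT column.

```bash
#!/usr/bin/env bash
# Tally genotype classes in the first sample column of an uncompressed VCF read from stdin.
# count_gt_classes is a hypothetical helper, not part of PanGenie or bcftools.
count_gt_classes() {
  awk -F'\t' '
    /^#/ { next }                          # skip header lines
    {
      split($9, fmt, ":")                  # FORMAT keys
      split($10, val, ":")                 # values of the first sample
      gt = "."
      for (i in fmt) if (fmt[i] == "GT") gt = val[i]
      n = split(gt, a, "[/|]")             # alleles, phased or unphased
      if (gt ~ /\./)                   cls = "missing"
      else if (n == 2 && a[1] != a[2]) cls = "het"
      else if (a[1] == "0")            cls = "hom_ref"
      else                             cls = "hom_alt"
      count[cls]++
    }
    END {
      printf "hom_ref=%d hom_alt=%d het=%d missing=%d\n",
             count["hom_ref"], count["hom_alt"], count["het"], count["missing"]
    }'
}
```

With the file names used above, this could be run as `bgzip -dc temp/APD_PG_genotyping.vcf.gz | count_gt_classes`. A large `het` count for haploid reads would suggest that many sites are being dropped by the filter.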

My question is: Is this the correct way to use PanGenie to reconstruct haplotypes? The input VCF is a phased diploid VCF generated by the minigraph-cactus pipeline and preprocessed with the "prepare-mc-vcf" pipeline.

**If applicable: which version of PanGenie are you using?**
v3.1.0

**If applicable: how did you run PanGenie?**
Please provide the command lines used. Did you run it using Singularity?
I installed PanGenie with conda.

**If applicable: what data are you running PanGenie on?**
Which species are you analyzing? Which input reads are used? What does the input VCF look like (number of input samples, how was it produced etc.)?
An MHC VCF file generated with the Minigraph-Cactus pipeline and preprocessed with the "prepare-mc-vcf" pipeline.


Haploid Haplotype Reconstruction #88
