Issue with Generating VCF for PanGenie Using the Pipeline #95

@HuangYihang1222

Dear Jana,

I encountered an issue while using the pipeline from this repository to generate the VCF files required for PanGenie.

Upon inspection, I found that the VCF files in the calls folder were generated correctly. The following files are present and appear normal:

all-haplotypes-callable.log
all-haplotypes.vcf
all-haplotypes-callable.vcf
sample1-hap0.vcf.gz
sample1-hap0.vcf.gz.tbi
sample1-hap1.vcf.gz
sample1-hap1.vcf.gz.tbi

However, in the multisample-vcfs folder, the callset-filtered.vcf and graph-filtered.vcf files appear to be problematic. They contain only header lines (lines starting with #) and no variant entries.
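
For anyone who wants to confirm the same symptom, here is a minimal sketch that counts header lines and variant records in a VCF. The function name `count_vcf_records` is hypothetical and not part of the pipeline; it only assumes that header lines start with `#`, as described above, and that `.gz` files are gzip/bgzip compressed.

```python
import gzip


def count_vcf_records(path):
    """Return (header_lines, variant_records) for a plain or gzipped VCF."""
    # bgzip output is gzip-compatible, so gzip.open can read .vcf.gz files.
    opener = gzip.open if path.endswith(".gz") else open
    headers = 0
    records = 0
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                headers += 1      # meta-information or column header line
            elif line.strip():
                records += 1      # any non-empty, non-header line is a record
    return headers, records
```

Calling it on callset-filtered.vcf or graph-filtered.vcf (with the path adjusted to wherever the file lives) returns a record count of 0 when the file contains only header lines.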

Image

After checking the log file generated with:

nohup snakemake -j 50 --use-conda > snakemake.log 2>&1 &

and comparing it with the Snakefile, I found no error messages in the log. However, the sort_bed and callable_regions steps do not appear to have been executed.

snakemake.log
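
The comparison between the log and the Snakefile can be sketched roughly as follows. This assumes that Snakemake announces each scheduled job with a line of the form `rule <name>:` (or `localrule <name>:`) at the start of a line; the helper names `rules_in_log` and `missing_rules` are hypothetical and only for illustration.

```python
import re

# Assumed job-announcement format in the Snakemake log, e.g. "rule sort_bed:".
RULE_LINE = re.compile(r"^(?:local)?(?:rule|checkpoint) (\w+):", re.MULTILINE)


def rules_in_log(log_text):
    """Return the set of rule names that appear as job announcements."""
    return set(RULE_LINE.findall(log_text))


def missing_rules(log_text, expected):
    """Return the expected rule names that never appear in the log, in order."""
    seen = rules_in_log(log_text)
    return [name for name in expected if name not in seen]
```

Reading snakemake.log into a string and calling `missing_rules(text, ["sort_bed", "callable_regions"])` would list whichever of the two rules was never scheduled.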

I would like to understand what might be causing this issue and how I can resolve it. Additionally, my config.json file is as follows:

{
    "reference": {
        "filename": "/home/huangyh/pangenome/1_cactus_construction/CN.fasta"
    },
    "assemblies": {
        "sample1": [
            "/home/huangyh/pangenome/1_cactus_construction/NA1.fasta",
            "/home/huangyh/pangenome/1_cactus_construction/NA2.fasta"
        ]
    },
    "trios": {},
    "scripts": "scripts",
    "outdir": "/home/huangyh/pangenome/4_pangenie/"
}

Thank you for your time and assistance!

Best regards,
Yihang Huang
