
Why does my PGGB graph look different from the tutorial example? #464

@lcyi0208


Hi,

I would like to understand why the graph I generated with the tutorial commands looks different from the one shown in your example. The tutorial uses this command:
./pggb -i data/HLA/DRB1-3123.fa.gz -n 12 -t 16 -V 'gi|568815561' -o out -M
Since the name gi|568815561 does not comply with the PanSN naming convention, I renamed the sequences using:
zcat DRB1-3123.fa.gz | awk '/^>/{print ">sample"++i"#1#chr6"} !/^>/' > renamed.fa
This generated a new FASTA file, and I then created a new .fai index with faidx. After that, I ran:
pggb -i ./HLA/renamed.fa -n 12 -t 16 -V 'sample4' -o ./out -M
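Because the awk renaming is positional, which original record ends up as sample4 (the name passed to -V) depends on the order of records in the FASTA. A minimal sketch of writing an old-to-new name table next to the renamed file, using a hypothetical three-record toy FASTA (toy.fa and its gi|111-style names are made up for illustration, not taken from DRB1-3123.fa.gz):

```shell
# Toy input standing in for the real FASTA (hypothetical records).
workdir=$(mktemp -d)
printf '>gi|111\nACGT\n>gi|222\nGGCC\n>gi|333\nTTAA\n' > "$workdir/toy.fa"

# Same renaming rule as in the issue: the i-th header becomes sample<i>#1#chr6.
awk '/^>/{print ">sample"++i"#1#chr6"} !/^>/' "$workdir/toy.fa" > "$workdir/renamed.fa"

# Mapping table: new name <TAB> original name (first word of the old header).
awk '/^>/{printf "sample%d#1#chr6\t%s\n", ++i, substr($1, 2)}' "$workdir/toy.fa" \
  > "$workdir/name_map.tsv"

cat "$workdir/name_map.tsv"
grep '^>' "$workdir/renamed.fa"
```

Looking up the original reference name in such a table would confirm whether sample4 is really the sequence the tutorial passes to -V.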
However, the resulting 2D plot is still different from the one in the tutorial.
I noticed that the output image name looks like this:
DRB1-3123.fa.gz.pggb-E-s5000-l15000-p80-n10-a0-K16-k8-w50000-j5000-e5000-I0-R0-N.smooth.chop.og.lay.draw_mqc
and includes parameters that differ from the tutorial command. So I tried:
pggb -i ./HLA/renamed.fa -n 12 -t 16 -V 'sample4' -o ./out -M
pggb -i ./HLA/renamed.fa -p 80 -n 12 -K 16 -k 8 -t 16 -V 'sample4' -o ./out -M
But the graph still looks different.
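To compare the settings encoded in that image name with the flags on a command line, the name can be split into its fields. A minimal sketch; it only splits the string and makes no claim about which pggb option each field corresponds to:

```shell
name='DRB1-3123.fa.gz.pggb-E-s5000-l15000-p80-n10-a0-K16-k8-w50000-j5000-e5000-I0-R0-N.smooth.chop.og.lay.draw_mqc'

# Keep what follows ".pggb-", drop the ".smooth..." suffix, one field per line.
fields=$(printf '%s\n' "$name" \
  | sed -e 's/^.*\.pggb-//' -e 's/\..*$//' \
  | tr '-' '\n')

printf '%s\n' "$fields"
```

In the name above one field reads n10, whereas the commands in this issue pass -n 12; whether that field corresponds to -n is an assumption here, but it is one visible difference worth checking.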
I also tried specifying the provided PAF file under /data/paf (and updated the names accordingly), but the generated graph still turned out relatively simple.
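For reference, updating the names in a PAF file means rewriting column 1 (query name) and column 6 (target name) consistently with the FASTA. A minimal sketch under stated assumptions: map.tsv (old name, tab, new name) and toy.paf are hypothetical files created here for illustration, not the files shipped under /data/paf:

```shell
workdir=$(mktemp -d)

# Hypothetical mapping: old name <TAB> new name.
printf 'gi|111\tsample1#1#chr6\ngi|222\tsample2#1#chr6\n' > "$workdir/map.tsv"

# One made-up 12-column PAF record (query in column 1, target in column 6).
printf 'gi|111\t4\t0\t4\t+\tgi|222\t4\t0\t4\t4\t4\t60\n' > "$workdir/toy.paf"

# First file fills the lookup table; second file gets columns 1 and 6 replaced.
awk 'BEGIN { FS = OFS = "\t" }
     NR == FNR { m[$1] = $2; next }
     { if ($1 in m) $1 = m[$1]; if ($6 in m) $6 = m[$6]; print }' \
  "$workdir/map.tsv" "$workdir/toy.paf" > "$workdir/renamed.paf"

cat "$workdir/renamed.paf"
```

Names missing from the table are left unchanged, so a mismatch between the PAF and FASTA names would still be visible afterwards.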
Could you share the exact command used to generate the sample graph? I’d like to understand how this dataset was processed to produce a graph with so many cycles.

