Complexity curve #96

@bazyliszek

  1. Looking into the complexity curve: for paired-end (PE) reads, do we actually need to pass the -P parameter to preseq? The pipeline did not detect this automatically and ran:
     samtools sort 6-10A_S13_L001_R1_001_val_1_bismark_bt2_pe.bam \
         -m 8589934592 \
         -@ 1 \
         -o 6-10A_S13_L001_R1_001_val_1_bismark_bt2_pe.sorted.bam
     preseq lc_extrap -v -B 6-10A_S13_L001_R1_001_val_1_bismark_bt2_pe.sorted.bam \
         -o 6-10A_S13_L001_R1_001_val_1_bismark_bt2_pe.ccurve.txt
  2. Also, preseq does not work for very small numbers of reads, though I guess that is a problem with my sample, which has only 119,573 reads in total. Error: max count before zero is less than min required count (4), sample not sufficiently deep or duplicates removed.

Output of the original command:

BAM_INPUT
TOTAL READS     = 119573
DISTINCT READS  = 119456
DISTINCT COUNTS = 3
MAX COUNT       = 3
COUNTS OF 1     = 119342
MAX TERMS       = 2
OBSERVED COUNTS (4)
1       119342
2       111
3       3

ERROR:  max count before zero is les than min required count (4), sample not sufficiently deep or duplicates removed

And with paired-end mode enabled (-P):

PAIRED_END_BAM_INPUT
paired = 119572
unpaired = 0
MERGED PAIRED END READS = 119572
MATES PROCESSED = 239144
TOTAL READS     = 119572
DISTINCT READS  = 119485
DISTINCT COUNTS = 2
MAX COUNT       = 2
COUNTS OF 1     = 119398
MAX TERMS       = 2
OBSERVED COUNTS (3)
1       119398
2       87

ERROR:  max count before zero is les than min required count (4), sample not sufficiently deep or duplicates removed
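
Both runs fail the same way, and the OBSERVED COUNTS tables suggest why: no read is seen more than 3 times (single-end) or 2 times (paired-end), while the message asks for at least 4. Below is a minimal sketch of that check, assuming "max count before zero" means the highest duplication level reached before the first empty bin of the histogram. This is a reading of the error message, not preseq's actual code, and `max_count_before_zero` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of preseq): read a "count frequency"
# histogram on stdin and print the largest count reached before the
# first count that has no observations.
max_count_before_zero() {
  awk '
    NF >= 2 && $2 > 0 { seen[$1] = 1 }   # remember every non-empty bin
    END {
      j = 0
      while ((j + 1) in seen) j++        # walk 1, 2, 3, ... until a gap
      print j
    }
  '
}

# Histogram from the single-end run above; prints 3, which is below
# the minimum of 4 named in the error message.
printf '1\t119342\n2\t111\n3\t3\n' | max_count_before_zero
```

Feeding it the paired-end histogram (1: 119398, 2: 87) gives 2, so under this reading adding -P does not help here: the sample has too few duplicated reads either way.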
