Description
Hi @blahah and folks!
I'm evaluating different assemblies to build a reference transcriptome. I'm using a hybrid approach, i.e. combining long reads (PacBio) with short reads (Illumina). After running the assembly I tried to evaluate it with TransRate, and the result looks a little odd.
The proportion of fragments mapped to the assembly, as reported by the read mapping metrics module, was very low: only 29%. I then aligned the same reads with another tool (bowtie2) and the result was much better (84% properly paired). Is something wrong with the TransRate metrics? Is it normal for 100% of contigs to be reported as both low-covered and uncovered? Is it normal not to find any potential bridges?
Can I trust TransRate to remove all misassembled transcripts and continue the pipeline using only the "good" transcripts?
This issue is related to #220 and #208.
TRANSRATE LOG
[ INFO] 2020-07-16 15:37:25 : Calculating contig metrics...
[ INFO] 2020-07-16 15:37:48 : Contig metrics:
[ INFO] 2020-07-16 15:37:48 : -----------------------------------
[ INFO] 2020-07-16 15:37:48 : n seqs 198411
[ INFO] 2020-07-16 15:37:48 : smallest 72
[ INFO] 2020-07-16 15:37:48 : largest 18293
[ INFO] 2020-07-16 15:37:48 : n bases 165212363
[ INFO] 2020-07-16 15:37:48 : mean len 821.66
[ INFO] 2020-07-16 15:37:48 : n under 200 11451
[ INFO] 2020-07-16 15:37:48 : n over 1k 41481
[ INFO] 2020-07-16 15:37:48 : n over 10k 137
[ INFO] 2020-07-16 15:37:48 : n with orf 35726
[ INFO] 2020-07-16 15:37:48 : mean orf percent 43.54
[ INFO] 2020-07-16 15:37:48 : n90 290
[ INFO] 2020-07-16 15:37:48 : n70 878
[ INFO] 2020-07-16 15:37:48 : n50 2041
[ INFO] 2020-07-16 15:37:48 : n30 3533
[ INFO] 2020-07-16 15:37:48 : n10 6370
[ INFO] 2020-07-16 15:37:48 : gc 0.33
[ INFO] 2020-07-16 15:37:48 : bases n 425802
[ INFO] 2020-07-16 15:37:48 : proportion n 0.0
[ INFO] 2020-07-16 15:37:48 : Contig metrics done in 23 seconds
[ INFO] 2020-07-16 15:37:48 : Calculating read diagnostics...
[ INFO] 2020-07-16 15:55:07 : Read mapping metrics:
[ INFO] 2020-07-16 15:55:07 : -----------------------------------
[ INFO] 2020-07-16 15:55:07 : fragments 47147400
[ INFO] 2020-07-16 15:55:07 : fragments mapped 13764007
[ INFO] 2020-07-16 15:55:07 : p fragments mapped 0.29
[ INFO] 2020-07-16 15:55:07 : good mappings 12549696
[ INFO] 2020-07-16 15:55:07 : p good mapping 0.27
[ INFO] 2020-07-16 15:55:07 : bad mappings 1214311
[ INFO] 2020-07-16 15:55:07 : potential bridges 0
[ INFO] 2020-07-16 15:55:07 : bases uncovered 70095137
[ INFO] 2020-07-16 15:55:07 : p bases uncovered 0.42
[ INFO] 2020-07-16 15:55:07 : contigs uncovbase 99185
[ INFO] 2020-07-16 15:55:07 : p contigs uncovbase 0.5
[ INFO] 2020-07-16 15:55:07 : contigs uncovered 198411
[ INFO] 2020-07-16 15:55:07 : p contigs uncovered 1.0
[ INFO] 2020-07-16 15:55:07 : contigs lowcovered 198411
[ INFO] 2020-07-16 15:55:07 : p contigs lowcovered 1.0
[ INFO] 2020-07-16 15:55:07 : contigs segmented 33054
[ INFO] 2020-07-16 15:55:07 : p contigs segmented 0.17
[ INFO] 2020-07-16 15:55:07 : Read metrics done in 1039 seconds
[ INFO] 2020-07-16 15:55:07 : Calculating comparative metrics...
[ INFO] 2020-07-16 15:57:05 : Comparative metrics:
[ INFO] 2020-07-16 15:57:05 : -----------------------------------
[ INFO] 2020-07-16 15:57:05 : CRBB hits 33246
[ INFO] 2020-07-16 15:57:05 : n contigs with CRBB 33246
[ INFO] 2020-07-16 15:57:05 : p contigs with CRBB 0.17
[ INFO] 2020-07-16 15:57:05 : rbh per reference 1.01
[ INFO] 2020-07-16 15:57:05 : n refs with CRBB 14438
[ INFO] 2020-07-16 15:57:05 : p refs with CRBB 0.44
[ INFO] 2020-07-16 15:57:05 : cov25 6721
[ INFO] 2020-07-16 15:57:05 : p cov25 0.2
[ INFO] 2020-07-16 15:57:05 : cov50 4760
[ INFO] 2020-07-16 15:57:05 : p cov50 0.14
[ INFO] 2020-07-16 15:57:05 : cov75 3351
[ INFO] 2020-07-16 15:57:05 : p cov75 0.1
[ INFO] 2020-07-16 15:57:05 : cov85 2901
[ INFO] 2020-07-16 15:57:05 : p cov85 0.09
[ INFO] 2020-07-16 15:57:05 : cov95 2318
[ INFO] 2020-07-16 15:57:05 : p cov95 0.07
[ INFO] 2020-07-16 15:57:05 : reference coverage 0.16
[ INFO] 2020-07-16 15:57:05 : Comparative metrics done in 118 seconds
[ INFO] 2020-07-16 15:57:05 : -----------------------------------
[ INFO] 2020-07-16 15:57:26 : TRANSRATE ASSEMBLY SCORE 0.1066
[ INFO] 2020-07-16 15:57:26 : -----------------------------------
[ INFO] 2020-07-16 15:57:26 : TRANSRATE OPTIMAL SCORE 0.1308
[ INFO] 2020-07-16 15:57:26 : TRANSRATE OPTIMAL CUTOFF 0.014
[ INFO] 2020-07-16 15:57:26 : good contigs 187971
[ INFO] 2020-07-16 15:57:26 : p good contigs 0.95
BOWTIE2 ALIGNMENT STATS
94294800 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
89466567 + 0 mapped (94.88% : N/A)
94294800 + 0 paired in sequencing
47147400 + 0 read1
47147400 + 0 read2
80055734 + 0 properly paired (84.90% : N/A)
86651820 + 0 with itself and mate mapped
2814747 + 0 singletons (2.99% : N/A)
6178086 + 0 with mate mapped to a different chr
1736018 + 0 with mate mapped to a different chr (mapQ>=5)
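To put the two logs on the same footing, here is a minimal sketch that recomputes the proportions from the counts pasted above. The variable names are made up for illustration and are not part of TransRate or samtools. It assumes that TransRate counts fragments (read pairs) while the bowtie2 stats count individual reads, which is consistent with the read total being exactly twice the fragment total.

```python
# Counts copied from the TransRate log above (unit: fragments, i.e. read pairs).
transrate_fragments = 47_147_400
transrate_fragments_mapped = 13_764_007

# Counts copied from the bowtie2 alignment stats above (unit: individual reads).
flagstat_total_reads = 94_294_800
flagstat_mapped_reads = 89_466_567
flagstat_properly_paired_reads = 80_055_734

# Assumption: each fragment contributes two reads, so the totals should match.
reads_match_fragments = flagstat_total_reads == 2 * transrate_fragments

# Proportions within each tool's own unit.
p_transrate_mapped = transrate_fragments_mapped / transrate_fragments
p_bowtie2_mapped = flagstat_mapped_reads / flagstat_total_reads
p_bowtie2_properly_paired = flagstat_properly_paired_reads / flagstat_total_reads

# Convert properly paired reads to fragments for a like-for-like comparison.
bowtie2_properly_paired_fragments = flagstat_properly_paired_reads // 2
fragment_gap = bowtie2_properly_paired_fragments - transrate_fragments_mapped

print(f"totals consistent (reads == 2 x fragments): {reads_match_fragments}")
print(f"TransRate p fragments mapped:   {p_transrate_mapped:.4f}")
print(f"bowtie2 p reads mapped:         {p_bowtie2_mapped:.4f}")
print(f"bowtie2 p reads properly paired: {p_bowtie2_properly_paired:.4f}")
print(f"fragments properly paired by bowtie2 but not mapped by TransRate: {fragment_gap}")
```

The difference in units does not explain the gap: even after converting reads to fragments, bowtie2 places far more pairs on the assembly than the TransRate read mapping step reports.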