Description
Is there a quality metric associated with the UMI / barcode grouping? For example, "UMI1" is identified as a set of n reads using some strategy. In the simplest case, identity, all reads have the same UMI sequence but with varying base quality. Is there a set of metrics that describes the confidence of assigning each read to UMI1? This is a little complicated, because it depends not only on the DNA sequence quality but also on the UMI length (and, if we're really going to model error "correctly", on the representation of bases or other UMIs and their edit distances).
What I'd ideally like is a per-read likelihood ratio: the likelihood that the read belongs to its assigned UMI group versus the next-nearest group. This can be calculated from the Phred scores of the read's UMI bases and the nearest observed alternative UMI.
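
To make the request concrete, here is a rough sketch of the kind of calculation I have in mind. It is not taken from any existing tool: the function names are hypothetical, and it assumes substitution-only errors, independent bases, equal-length UMIs, and that a miscalled base is equally likely to be any of the other three. It also ignores the prior abundance of each UMI, which a more complete model would include.

```python
import math


def phred_to_error(q):
    """Convert a Phred score to an error probability, clamped to avoid log(0).

    The upper clamp of 0.75 corresponds to a completely uninformative base call.
    """
    return min(max(10 ** (-q / 10), 1e-10), 0.75)


def umi_log_likelihood(observed, quals, candidate):
    """log P(observed UMI read | true UMI is `candidate`).

    Assumes independent substitution errors, with a miscalled base
    equally likely to be any of the three other bases.
    """
    if not (len(observed) == len(quals) == len(candidate)):
        raise ValueError("observed, quals and candidate must have equal length")
    total = 0.0
    for base, q, true_base in zip(observed, quals, candidate):
        e = phred_to_error(q)
        total += math.log(1 - e) if base == true_base else math.log(e / 3)
    return total


def umi_log_likelihood_ratio(observed, quals, assigned, alternative):
    """Log likelihood ratio of the assigned UMI versus the nearest alternative.

    Positive values favour the assigned group; values near zero mean the
    read's UMI bases cannot distinguish the two groups.
    """
    return (umi_log_likelihood(observed, quals, assigned)
            - umi_log_likelihood(observed, quals, alternative))


# Example: the read matches its assigned UMI exactly, and the nearest
# alternative UMI differs at the last base only.
llr_high_q = umi_log_likelihood_ratio("ACGT", [30, 30, 30, 30], "ACGT", "ACGA")
llr_low_q = umi_log_likelihood_ratio("ACGT", [30, 30, 30, 10], "ACGT", "ACGA")
print(llr_high_q, llr_low_q)
```

Only the positions where the two candidate UMIs differ contribute to the ratio, so a low-quality base at a distinguishing position lowers the confidence, while longer UMIs with larger edit distances between groups raise it.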