Parvovirus‐GLUE Alignment Tree
Multiple sequence alignments (MSAs) in the Parvovirus-GLUE core project include:
- A root alignment constructed to represent homology between the two largest subgroupings in the Parvoviridae.
- Subfamily alignments constructed to represent proposed homologies between representative members of Parvoviridae subfamilies.
- Cross-genus alignments constructed to represent proposed homologies between representative members of 'minor' Parvoviridae lineages.
Please note that the repository also contains genus-level alignments constructed to represent proposed homologies between the genomes of representative members of specific parvovirus genera. These are imported when genus-level extensions are added to the core project.
In GLUE, multiple sequence alignments can be represented in two forms: constrained and unconstrained. In unconstrained alignments, no fixed numbering system is applied, and columns are inserted into the alignment as necessary to represent the homologies between sequences. By contrast, constrained alignments are anchored on a specific reference sequence, and use the coordinates of that reference to refer to homologies at specific nucleotide and amino acid positions.
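To make the idea of reference-anchored coordinates concrete, here is a minimal Python sketch of a constrained alignment row. It is an illustration under stated assumptions, not GLUE's actual implementation: the `AlignedSegment` class, its field names, and the `member_position` helper are hypothetical, and the coordinates are invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AlignedSegment:
    """Hypothetical record: one ungapped block of homology, 1-based inclusive."""
    ref_start: int
    ref_end: int
    member_start: int
    member_end: int


def member_position(segments: List[AlignedSegment], ref_pos: int) -> Optional[int]:
    """Return the member nucleotide homologous to a reference position, if any."""
    for seg in segments:
        if seg.ref_start <= ref_pos <= seg.ref_end:
            # Offset within the block is the same on both sequences.
            return seg.member_start + (ref_pos - seg.ref_start)
    return None  # the reference position is not covered by this member


# Invented example: the member carries three extra nucleotides (35-37)
# between the two blocks, which have no coordinate on the reference.
segments = [
    AlignedSegment(ref_start=1, ref_end=30, member_start=5, member_end=34),
    AlignedSegment(ref_start=31, ref_end=60, member_start=38, member_end=67),
]

print(member_position(segments, 10))
print(member_position(segments, 31))
print(member_position(segments, 61))
```

In this toy model every homology statement is expressed in the reference's numbering, which is what allows positions to be compared across all members that share the same reference.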
Parvovirus-GLUE makes use of GLUE's Alignment Tree data structure.
For the highest taxonomic levels (i.e. at the root) we aligned only the most conserved regions of the NS gene, whereas for the lower taxonomic levels (i.e. within and below genus level) we aligned complete genomes.
The root alignment contains reference sequences for major clades, whereas all children of the root inherit at least one reference from their immediate parent. Thus, all alignments are linked to one another via our chosen set of master reference sequences.
The schematic figure above shows the 'alignment tree' data structure currently implemented in Parvovirus-GLUE. For the highest taxonomic levels (i.e. at the root) we aligned only the most conserved regions of the genome, whereas for the lower taxonomic levels (i.e. within and below genus level) we aligned complete coding sequences. We used an alignment tree data structure to link these alignments via a set of common reference sequences.
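The linking rule described above, where each child alignment shares a reference with its immediate parent, can be sketched in Python as follows. This is a simplified model, not GLUE's own data structure: the `AlignmentNode` class, the `path_to_root` helper, and all alignment and sequence names (`root`, `subfamily_A`, `REF_MASTER`, and so on) are hypothetical placeholders. It also assumes one constraining reference per alignment.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AlignmentNode:
    """Hypothetical alignment-tree node with one constraining reference."""
    name: str
    reference: str
    members: List[str] = field(default_factory=list)
    parent: Optional["AlignmentNode"] = field(default=None, repr=False)
    children: List["AlignmentNode"] = field(default_factory=list)

    def add_child(self, child: "AlignmentNode") -> "AlignmentNode":
        # Assumed linking rule: the child's reference must already be a
        # member of the parent alignment, so the two levels share a sequence.
        if child.reference not in self.members:
            raise ValueError(
                f"{child.reference} is not a member of alignment {self.name}"
            )
        child.parent = self
        self.children.append(child)
        return child


def path_to_root(node: Optional[AlignmentNode]) -> List[str]:
    """List alignment names from the given node up to the root."""
    path = []
    while node is not None:
        path.append(node.name)
        node = node.parent
    return path


# Placeholder names only; these are not real Parvovirus-GLUE identifiers.
root = AlignmentNode(
    "root", "REF_MASTER", members=["REF_MASTER", "REF_SUBFAM_A", "REF_SUBFAM_B"]
)
subfamily_a = root.add_child(
    AlignmentNode("subfamily_A", "REF_SUBFAM_A", members=["REF_SUBFAM_A", "REF_GENUS_A1"])
)
genus_a1 = subfamily_a.add_child(
    AlignmentNode("genus_A1", "REF_GENUS_A1", members=["REF_GENUS_A1", "SEQ_1"])
)

print(path_to_root(genus_a1))
```

Because each child's reference is also a member of its parent, a position in a genus-level alignment can in principle be traced upward through the shared sequences to the root.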
Parvovirus-GLUE by Robert J Gifford Lab.
For questions, issues, or feedback, please open an issue on the GitHub repository.