
Parvovirus‐GLUE Alignment Tree

Robert J. Gifford edited this page Sep 27, 2024 · 16 revisions

Alignments Included in Parvovirus-GLUE

Multiple sequence alignments (MSAs) in the Parvovirus-GLUE core project include:

  1. A root alignment constructed to represent homology between the two largest subgroupings in the Parvoviridae.
  2. Subfamily alignments constructed to represent proposed homologies between representative members of Parvoviridae subfamilies.
  3. Cross-genus alignments constructed to represent proposed homologies between representative members of 'minor' Parvoviridae lineages.

Please note that the repository also contains genus-level alignments constructed to represent proposed homologies between the genomes of representative members of specific parvovirus genera. These are imported when genus-level extensions are added to the core project.

Constrained versus Unconstrained Alignments

In GLUE, multiple sequence alignments can be represented in two forms: constrained and unconstrained. In unconstrained alignments, no fixed numbering system is applied, and columns are inserted into the alignment as necessary to represent the homologies between sequences. By contrast, constrained alignments are anchored on a specific reference sequence and use the coordinates of that reference to refer to homologies at specific nucleotide and amino acid positions.
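To illustrate the difference, the sketch below converts one row of an unconstrained (gapped) alignment into segments expressed in reference coordinates. It is a simplified Python model, not GLUE's own code: the `Segment` class, the `constrain` function, and the toy sequences are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A gap-free block of homology, in 1-based inclusive coordinates."""
    ref_start: int
    ref_end: int
    member_start: int
    member_end: int


def constrain(ref_row: str, member_row: str) -> list:
    """Convert two gapped rows ('-' = gap) into reference-anchored segments."""
    if len(ref_row) != len(member_row):
        raise ValueError("alignment rows must have equal length")
    segments = []
    current = None  # [ref_start, ref_end, member_start, member_end]
    ref_pos = member_pos = 0
    for r, m in zip(ref_row, member_row):
        if r != "-":
            ref_pos += 1
        if m != "-":
            member_pos += 1
        if r == "-" or m == "-":
            continue  # gap column: no homologous pair to record
        contiguous = (
            current is not None
            and current[1] == ref_pos - 1
            and current[3] == member_pos - 1
        )
        if contiguous:
            # Extend the open segment by one position on both sequences.
            current[1], current[3] = ref_pos, member_pos
        else:
            if current is not None:
                segments.append(Segment(*current))
            current = [ref_pos, ref_pos, member_pos, member_pos]
    if current is not None:
        segments.append(Segment(*current))
    return segments


# Toy example: the member lacks reference position 3 and carries one
# extra nucleotide (member position 4) that the reference does not have.
for seg in constrain("ACGT-ACG", "AC-TTACG"):
    print(seg)
```

In this model, a column in which the reference has a gap has no reference coordinate, so the extra member nucleotide at position 4 falls outside every segment. This is the trade-off of anchoring on a reference: positions are numbered consistently, but only homology to the reference itself is recorded.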

The Alignment Tree in Parvovirus-GLUE

Parvovirus-GLUE makes use of GLUE's Alignment Tree data structure.

For the highest taxonomic levels (i.e. at the root) we aligned only the most conserved regions of the NS gene, whereas for the lower taxonomic levels (i.e. within and below genus level) we aligned complete genomes.

The root alignment contains reference sequences for major clades, whereas all children of the root inherit at least one reference from their immediate parent. Thus, all alignments are linked to one another via our chosen set of master reference sequences.
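The linking rule described above can be sketched as a small tree in which every child alignment must share at least one reference sequence with its parent. This is a minimal Python model under stated assumptions, not GLUE's implementation: the `AlignmentNode` class, its methods, and all alignment and reference names (`ROOT`, `REF_A`, and so on) are hypothetical placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(eq=False)
class AlignmentNode:
    """One alignment in the tree, with the reference sequences it contains."""
    name: str
    references: set
    parent: AlignmentNode | None = field(default=None, repr=False)
    children: list = field(default_factory=list, repr=False)

    def add_child(self, name: str, references: set) -> AlignmentNode:
        """Attach a child alignment; it must share a reference with this node."""
        shared = self.references & set(references)
        if not shared:
            raise ValueError(
                f"{name} shares no reference sequence with parent {self.name}"
            )
        child = AlignmentNode(name, set(references), parent=self)
        self.children.append(child)
        return child

    def path_to_root(self) -> list:
        """Names of the alignments linking this node to the root."""
        path = []
        node = self
        while node is not None:
            path.append(node.name)
            node = node.parent
        return path


# Hypothetical three-level tree: root -> subfamily -> genus.
root = AlignmentNode("ROOT", {"REF_A", "REF_B"})
subfamily = root.add_child("SUBFAMILY_A", {"REF_A", "REF_A1"})
genus = subfamily.add_child("GENUS_A1", {"REF_A1", "REF_A1x"})
print(genus.path_to_root())
```

Because each parent and child pair shares a reference, a position in any alignment can in principle be traced up to the root by stepping through the shared references along `path_to_root()`.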

[Figure: Parvovirus alignment tree]

The schematic figure above shows the 'alignment tree' data structure currently implemented in Parvovirus-GLUE. For the highest taxonomic levels (i.e. at the root) we aligned only the most conserved regions of the genome, whereas for the lower taxonomic levels (i.e. within and below genus level) we aligned complete coding sequences. We used the alignment tree to link these alignments via a set of common reference sequences.
