Hi,
Thank you for the useful tool!
While testing treetime ancestral on a small dummy dataset (5 sequences, each 20 bp long), I encountered some unexpected behavior that I’d like to understand better.
Here is the command I used:
treetime ancestral --gtr infer --aln "gaps_to_N_alignment.fasta" --outdir "$OUTPUT_DIR" --tree gubbins.node_labelled.final_tree.tre
In the alignment:
- At position 4, only one sequence (Seq5) contains an "N", while the others have "C". However, the ancestral reconstruction output reported a "T>C" mutation for this position across the tree.
- At position 15, all sequences have a "C", yet a "T>C" mutation appeared in internal nodes.
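The column contents described above can be double-checked with a short script along these lines. This is only a minimal sketch using the Python standard library, not part of treetime: the helper names `parse_fasta` and `column_counts` are made up for illustration, the toy alignment is hypothetical and is not the attached data, and positions are assumed to be counted from 1.

```python
from collections import Counter


def parse_fasta(text):
    """Parse FASTA text into a dict of {name: sequence} (hypothetical helper)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            # Sequences may be wrapped over several lines.
            records[name] += line.upper()
    return records


def column_counts(records, position):
    """Count the characters in one alignment column (position is 1-based)."""
    return Counter(seq[position - 1] for seq in records.values())


# Hypothetical toy alignment, NOT the attached data: one "N" at position 4.
toy = """>Seq1
ACGCA
>Seq2
ACGCA
>Seq5
ACGNA
"""

records = parse_fasta(toy)
print(4, dict(column_counts(records, 4)))  # prints: 4 {'C': 2, 'N': 1}
```

To run the same check on the real input, the toy string could be replaced with the contents of gaps_to_N_alignment.fasta, and the positions of interest (4 and 15) passed to `column_counts`.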
I have attached both the input alignment and the tree file used for this run, along with the output ancestral_sequences.fasta file.
Could you please advise whether this is expected behavior, or whether it might be an artifact of how the reconstruction handles ambiguous bases like "N", or of the small size of the dataset?
Thank you very much.
treetime_files.zip