Kenji Fukushima edited this page Dec 21, 2024 · 7 revisions

Setting up

CDSKIT commands

  • accession2fasta: Retrieving FASTA sequences from a list of GenBank accessions

  • aggregate: Extracting the longest sequence among those grouped by a sequence name regex

  • backtrim: Back-translating a trimmed protein alignment

  • hammer: Removing less-occupied codon columns from a gappy alignment

  • intersection: Dropping non-overlapping sequence labels between two sequence files or between a sequence file and a GFF file

  • label: Modifying sequence labels

  • mask: Masking ambiguous and/or stop codons

  • pad: Making nucleotide sequences in-frame by head and tail padding

  • parsegb: Converting the GenBank format

  • printseq: Printing a subset of sequences with a regex

  • rmseq: Removing a subset of sequences by using a sequence name regex and by detecting problematic sequence characters

  • stats: Printing sequence statistics
