Hi,
I have a protein FASTA file and used rmdup to remove duplicate sequences (I expected it to remove only completely identical sequences). However, in my output file I find that it also removed a few short sequences that are substrings of another sequence. Does rmdup also remove sequences that are substrings of other sequences?
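To make the two cases concrete, here is a minimal Python sketch of the check I have in mind. It is only my own illustration and says nothing about how rmdup works internally; the function names `parse_fasta` and `classify` are hypothetical, and it assumes every header in the file is unique.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def classify(records):
    """Label each record 'unique', 'exact_duplicate', or 'substring'.

    Assumes headers are unique (they are used as dictionary keys).
    """
    all_seqs = {seq for _, seq in records}
    seen = set()
    labels = {}
    for header, seq in records:
        if seq in seen:
            # Identical to a sequence that appeared earlier in the file.
            labels[header] = "exact_duplicate"
        elif any(seq != other and seq in other for other in all_seqs):
            # Strictly contained in some longer, different sequence.
            labels[header] = "substring"
        else:
            labels[header] = "unique"
        seen.add(seq)
    return labels


example = """>seqA
MKTAYIAK
>seqB
MKTAYIAK
>seqC
TAYI
>seqD
GGGG
"""

for name, label in classify(parse_fasta(example)).items():
    print(name, label)
```

I expected only records like seqB (an exact copy of seqA) to be removed, but records like seqC (a substring of seqA) were removed as well.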
Thanks,
Sophia