First of all, thank you very much for your work and dedication; I really enjoy using the t_coffee suite.
I am running a multiple alignment of a FASTA file containing ~7,500 sequences with the following length statistics:
file     format  type     num_seqs    sum_len  min_len  avg_len  max_len
1_1.faa  FASTA   Protein     7,582  7,901,142      300  1,042.1   13,093
These are the commands that I'm using:
t_coffee -other_pg seq_reformat -in 1_1.faa -output code_name > 1_1.code_name
t_coffee -other_pg seq_reformat -code 1_1.code_name -in 1_1.faa > 1_1.coded.fasta
t_coffee -reg -seq 1_1.coded.fasta -nseq 750 -tree mbed -method maffteinsi_msa -outfile 1_1.coded.aln -outtree 1_1.coded.mbed -thread 20
t_coffee -other_pg seq_reformat -decode 1_1.code_name -in 1_1.coded.aln > 1_1.aln
At a certain point, the output remains in this state
In addition, t_coffee remains in process state S (interruptible sleep), consuming ~60 GB of RAM. I have been checking the process for several days and it does not seem to move from there. I have tried restarting the alignment several times, but it keeps stopping at the same point.
Do I have to leave it until it finishes?
Is it possible to estimate how long it may take? I have aligned other, slightly smaller files without this kind of problem. Could it be due to the large difference in sequence length (from 300 aa to more than 13,000 aa)?
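One way to test the length hypothesis would be to set the longest sequences aside and align the rest on their own. Below is a minimal Python sketch of that split, assuming plain FASTA input. The function names, the output file names, and any cutoff value are placeholders chosen for illustration, not t_coffee recommendations.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_by_length(records: Iterable[Record], max_len: int):
    """Partition records into (kept, too_long) using a length cutoff."""
    kept: List[Record] = []
    too_long: List[Record] = []
    for header, seq in records:
        if len(seq) > max_len:
            too_long.append((header, seq))
        else:
            kept.append((header, seq))
    return kept, too_long


def write_fasta(records: Iterable[Record], path: str, width: int = 60) -> None:
    """Write records to path, wrapping sequences at `width` characters."""
    with open(path, "w") as out:
        for header, seq in records:
            out.write(f">{header}\n")
            for i in range(0, len(seq), width):
                out.write(seq[i:i + width] + "\n")


def split_file(in_path: str, kept_path: str, long_path: str, max_len: int) -> None:
    """Read in_path and write short and long sequences to separate files."""
    with open(in_path) as handle:
        kept, too_long = split_by_length(read_fasta(handle), max_len)
    write_fasta(kept, kept_path)
    write_fasta(too_long, long_path)
```

For example, `split_file("1_1.faa", "1_1.short.faa", "1_1.long.faa", 5000)` would write two files, and the same t_coffee commands could then be run on the shorter set. If that run completes, it would suggest that the very long sequences are involved in the stall. The 5000 cutoff here is arbitrary.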
