buildSNPdb #14

Open
aemjunior opened this issue Jul 25, 2021 · 1 comment

Comments

@aemjunior

aemjunior commented Jul 25, 2021

What does the "buildSNPdb = 1" actually do? Not sure whether this has been asked before.

I am running an analysis with reads from 300 Vibrio cholerae isolates, and in a first attempt to run PhaME, there was an error: "Error running buildSNPDB.pl -i /home/ubuntu/Augusto/data/phame_analysis_folder/workdir/results -r /home/ubuntu/Augusto/data/phame_analysis_fo",

The .error file indicates that no disk space was available (after 5 very agonizing days of waiting for the results):
"Unable to flush stdout: No space left on deviceWarning: unable to close filehandle properly: No space left on device during global destruction."
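
Since the failure only surfaced after days of runtime, a pre-flight check of free space on the working directory's filesystem can help before restarting. The following is a minimal sketch using Python's standard library; the path `"."` and the 500 GiB threshold are hypothetical placeholders, not values recommended by PhaME, and the helper names are made up for illustration.

```python
import shutil

def free_gib(path):
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 2**30

def has_enough_space(path, required_gib):
    """True if at least `required_gib` GiB are free on the filesystem of `path`."""
    return free_gib(path) >= required_gib

# Hypothetical values: point `workdir` at the PhaME working directory and
# choose a threshold that fits the size of the read set.
workdir = "."
required = 500
if has_enough_space(workdir, required):
    print(f"OK: {free_gib(workdir):.1f} GiB free")
else:
    print(f"Only {free_gib(workdir):.1f} GiB free; wanted at least {required} GiB")
```

This only reports the space available at launch; intermediate files written during the run can still fill the disk, so it is a sanity check rather than a guarantee.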

According to the documentation, setting this parameter to "1" increases the processing time significantly, and given that this is the step where the error occurred, I decided to find an answer before restarting the analysis.

My PhaME version is 1.0.2 (the one I was able to install using conda environments), and these are the versions of my dependencies:
samtools --version is >= 1.3 - ok, have 1.7
bcftools --version is >= 1.3 - ok, have 1.8
nucmer --version is >= 3.1 - ok, have 3.1
bowtie2 --version is >= 2.3 - ok, have 2.4
bwa is >= 0.7 - ok, have 0.7
FastTree -expert is >= 2.1 - ok, have 2.1

My control file looked like this:
refdir = refdir # Contains only 1 reference genome and the corresponding annotation file in gff3 format.
workdir = workdir # Decompressed reads are in this directory
reference = 1
reffile = combined_IEC224.fasta
project = 300_genomes
cdsSNPS = 1
buildSNPdb = 1
FirstTime = 1
data = 2
reads = 2
tree = 2
bootstrap = 1
N = 100
PosSelect = 2
code = 0
clean = 0
threads = 32
cutoff = 0.1
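
For double-checking settings such as `buildSNPdb` before a multi-day run, the control file above can be read back as plain `key = value` lines with optional `#` comments. This is an illustrative sketch only: `parse_control` is a hypothetical helper, not PhaME's own parser, and it assumes that `#` always starts a comment.

```python
def parse_control(text):
    """Parse 'key = value  # comment' lines into a dict of strings."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comment
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings

# A few lines taken from the control file above.
example = """\
refdir = refdir # Contains only 1 reference genome
buildSNPdb = 1
threads = 32
"""
settings = parse_control(example)
print(settings["buildSNPdb"])
```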

Thank you in advance.

@mshakya
Member

mshakya commented Jul 26, 2021

Hi @aemjunior. Thank you for your interest in PhaME. buildSNPdb performs an all-vs-all alignment of your reference genomes. But since you are only using one reference genome (I think), that shouldn't make a huge difference. It's likely taking a long time because of the read mapping step.
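
To illustrate why a single reference makes the all-vs-all step negligible, the sketch below counts the genome pairs such a comparison would involve. It assumes unordered pairs (each pair aligned once); whether PhaME aligns each pair once or in both directions is not stated here, and the `ref_*` names are hypothetical.

```python
from itertools import combinations

def all_vs_all_pairs(genomes):
    """Return every unordered pair of genomes (assumption: each pair counted once)."""
    return list(combinations(genomes, 2))

# One reference, as in the control file above: no pairs to align.
print(len(all_vs_all_pairs(["combined_IEC224.fasta"])))  # 0

# Ten hypothetical references: the pair count grows quadratically.
ten_refs = [f"ref_{i}.fasta" for i in range(10)]
print(len(all_vs_all_pairs(ten_refs)))  # 45
```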
