Hey there,
While using the mergeMSTs branch, I ran into some trouble with mst and query.
mst
mantis mst doesn't seem to work.
It wants to load eqclass_rr.cls files:
Lines 33 to 34 in 7406e8f
    eqclass_files =
        mantis::fs::GetFilesExt(prefix.c_str(), mantis::EQCLASS_FILE);
This will later lead to a segmentation fault because the files do not exist.
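A guard along the following lines would turn the crash into a readable error. This is only a sketch: `requireEqclassFiles` is a hypothetical helper that does not exist in mantis, and it assumes the segmentation fault comes from using the empty list returned by the file lookup.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of mantis): validates the list returned by a
// lookup such as mantis::fs::GetFilesExt before any element is accessed.
inline const std::vector<std::string>& requireEqclassFiles(
    const std::vector<std::string>& files, const std::string& prefix) {
  if (files.empty()) {
    // Fail early with a message instead of crashing later on an empty list.
    throw std::runtime_error("No eqclass_rr.cls files found under '" + prefix +
                             "'; they may have been deleted by mantis build.");
  }
  return files;
}
```

Wrapping the lookup result in such a check would make the missing files visible right where they are loaded.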
mantis build will always delete eqclass_rr.cls files at the end:
Lines 729 to 737 in 7406e8f
    if (opt.remove_colorClasses && !opt.keep_colorclasses) {
      for (auto &f : mantis::fs::GetFilesExt(opt.prefix.c_str(), mantis::EQCLASS_FILE)) {
        std::cerr << f.c_str() << "\n";
        if (std::remove(f.c_str()) != 0) {
          std::cerr << "Unable to delete file " << f << "\n";
          std::exit(1);
        }
      }
    }
mantis build doesn't have an option to toggle this behavior.
Changing qopt.remove_colorClasses = true; to qopt.remove_colorClasses = false; here fixes the issue:
Line 308 in 7406e8f
    qopt.prefix = bopt.out; qopt.numThreads = bopt.numthreads; qopt.remove_colorClasses = true;
query
The default non-bulk query only works if the eqclass_rr.cls files are present and -1 is used:
mantis query -1 -k 20 -p index/ reads.fasta
To have eqclass_rr.cls files, the above fix is needed, and mst must have been run with -k.
Alternatively, bulk mode (-b) works without the eqclass_rr.cls files, so mst can also be run with -d.
mantis query -b -k 20 -p index/ reads.fasta
The problem in non-bulk query seems to be that findSamples is called for every query sequence:
Lines 492 to 498 in 7406e8f
    while (ipfile >> read) {
      mstQuery.reset();
      mstQuery.parseKmers(numOfQueries, read, indexK);
      mstQuery.findSamples(cdbg, cache_lru, &rs, queryStats, 1);
      output_results(mstQuery, opfile, sampleNames, queryStats, 1);
      numOfQueries++;
    }
The function then accesses cdbg.get_current_cqf()->keybits():
Line 132 in 7406e8f
    uint64_t ksize{cdbg.get_current_cqf()->keybits()}, numBlocks{cdbg.get_numBlocks()};
This works fine for the first query, but for the second one there is no CQF to access because it has been replaced with an invalid one:
Line 181 in 7406e8f
    cdbg.replaceCQFInMemory(invalid);
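To illustrate the failure mode, here is a minimal sketch of reading the key bits once, while a CQF is still loaded, and reusing the cached value afterwards. `ToyCqf`, `ToyCdbg`, and `KeybitsCache` are toy stand-ins invented for this example. They are not the real mantis types, and the real `keybits()` call is modelled here as a plain field.

```cpp
#include <cstdint>
#include <memory>
#include <stdexcept>

// Toy stand-ins for illustration only; not the real mantis classes.
struct ToyCqf {
  uint64_t keybits;
};

struct ToyCdbg {
  std::unique_ptr<ToyCqf> current;
  const ToyCqf* get_current_cqf() const { return current.get(); }
  // Models the CQF being replaced by an invalid one after a query.
  void invalidate() { current.reset(); }
};

// Reads the key bits from the CQF on first use and returns the cached value
// on every later call, so later queries do not touch the replaced CQF.
class KeybitsCache {
 public:
  uint64_t get(const ToyCdbg& cdbg) {
    if (!has_value_) {
      const ToyCqf* cqf = cdbg.get_current_cqf();
      if (cqf == nullptr) {
        throw std::runtime_error("no CQF loaded and no cached key bits");
      }
      value_ = cqf->keybits;
      has_value_ = true;
    }
    return value_;
  }

 private:
  bool has_value_ = false;
  uint64_t value_ = 0;
};
```

In this toy model the second lookup succeeds even after `invalidate()`, which is the behaviour the non-bulk query loop would need across reads.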
I tried loading the first block (block 0) at the beginning of findSamples and just passing the keybits as an extra parameter.
But then there is an out-of-bounds access at
Line 254 in 7406e8f
    allQueries[q][numSamples]++;
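To narrow down which index goes out of range, the increment could temporarily be routed through a bounds-checked helper. The sketch below assumes `allQueries` behaves like a vector of vectors of counters. That is a guess about its type, and `checkedIncrement` is a hypothetical debugging aid rather than mantis code.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical debugging helper; the real type of allQueries may differ.
// Increments table[row][col], but throws with both indices and sizes in the
// message instead of writing out of bounds.
inline void checkedIncrement(std::vector<std::vector<uint64_t>>& table,
                             std::size_t row, std::size_t col) {
  if (row >= table.size()) {
    throw std::out_of_range("row " + std::to_string(row) + " >= row count " +
                            std::to_string(table.size()));
  }
  if (col >= table[row].size()) {
    throw std::out_of_range("col " + std::to_string(col) + " >= row length " +
                            std::to_string(table[row].size()));
  }
  ++table[row][col];
}
```

Calling `checkedIncrement(allQueries, q, numSamples)` in place of the raw increment would show whether `q` or `numSamples` exceeds the allocated size on the second query.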