Add tests for preseq #99

Closed
edmundmiller opened this issue Dec 10, 2020 · 5 comments · Fixed by #112
Labels
good first issue Good for newcomers help wanted Extra attention is needed tests Related to automated tests

Comments

@edmundmiller
Contributor

No description provided.

@edmundmiller edmundmiller added help wanted Extra attention is needed good first issue Good for newcomers tests Related to automated tests labels Dec 10, 2020
@edmundmiller edmundmiller added this to the Test All the Modules milestone Dec 10, 2020
@sruthipsuresh
Contributor

For the preseq module, do we need a different input file?
In my tests, preseq fails on our existing test files in both BED and BAM format because their read counts are too low.

From the user manual:
preseq lc_extrap can fail to estimate the curve if there are not enough reads in the
aligned bam file for the calculation. The error reported is:
ERROR: max count before zero is less than min required count (4), sample not sufficiently deep or duplicates removed

Additionally, there's a bug in the module: the $bam file is passed without the -bam flag, even though preseq expects BED input by default.
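
The two problems described above can be illustrated with a minimal Python sketch. It is not the module's actual Nextflow code: the helper names `build_lc_extrap_cmd` and `is_shallow_sample_error` and the file names are hypothetical. It assumes the `-output`, `-bam`, and `-pe` options as listed in the preseq `lc_extrap` help, and the error text is the one quoted from the user manual.

```python
from pathlib import Path

# Fragment of the message preseq reports when the library is too shallow
# (taken from the user-manual quote in this thread).
SHALLOW_ERROR = "max count before zero is less than min required count"


def build_lc_extrap_cmd(reads, output, paired_end=False):
    """Return the argument list for a `preseq lc_extrap` call (hypothetical helper)."""
    cmd = ["preseq", "lc_extrap", "-output", str(output)]
    # preseq expects BED input by default, so BAM input must be flagged explicitly.
    if Path(reads).suffix.lower() == ".bam":
        cmd.append("-bam")
    # Assumption: paired-end input is requested with -pe.
    if paired_end:
        cmd.append("-pe")
    cmd.append(str(reads))
    return cmd


def is_shallow_sample_error(stderr_text):
    """Return True if stderr contains the 'not sufficiently deep' failure."""
    return SHALLOW_ERROR in stderr_text
```

For example, `build_lc_extrap_cmd("test.bam", "test.ccurve.txt")` places `-bam` before the input path, while a `.bed` input leaves the flag out. A test harness could use `is_shallow_sample_error` to tell "test data too small" apart from other failures.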

@drpatelh
Member

drpatelh commented Jan 4, 2021

Good question, @sruthipsuresh! Yes, Preseq hasn't been updated in quite a while now and I doubt it will be anytime soon...

Does it fail for both single-end and paired-end data? Given the issues you mentioned, I think it would be good enough to get it passing with single-end data only. We may need to change the test data for this.

@sruthipsuresh
Contributor

Unfortunately, it fails for both types of data. I also tried creating a sorted BED file from the test BAM files we already have (as described here), but that fails as well.

The single-end test does work with this sample file from the preseq repo. Should we use it as the test data instead?

@edmundmiller
Contributor Author

I think that's a good option. I couldn't find any test data in their repo last night, so good job!

Ah this was just what I was looking for too! https://www.nextflow.io/docs/edge/script.html#http-ftp-files

You should be able to use something like https://github.com/smithlabcode/preseq/raw/master/data/SRR1003759_5M_subset.mr for each of the files for input.

So

mr = file('https://github.com/smithlabcode/preseq/raw/master/data/SRR1003759_5M_subset.mr')

@sruthipsuresh
Contributor

I'll do that instead of adding the file to the input folder directly! Thank you!

3 participants