Plasmid poly(A) Disagreement Between 0.8.0 and 0.9.1 #1233
Comments
Hi @VBHerrenC, PolyA estimation is under continuous review, and there were some changes between those versions. Does your polyA transcript have a non-A linker section? dorado-0.9.0 is a bit stricter about breaking at non-A sections unless the appropriate |
Hi @malton-ont, There aren't any non-A linkers - tail_interrupt_length in the config file was set to 0. Thanks, |
Hi @VBHerrenC, Are you able to share any data? I think this will be very hard to diagnose without. One thing to note is that dorado expects the polyA section to be somewhere within the sequence - i.e. the cleave point for the plasmid can't be within the polyA sequence or flanks. You can get some useful insights into how the region is determined by adding the |
Hi @malton-ont, Unfortunately we can't share the data, but we're happy to do some testing and report back. Since we input circular plasmid to the library prep and it cleaves randomly, I would expect the number of times it happens to cut in the poly(A) or flanks to be relatively low. Calleigh |
Hi @VBHerrenC, In that case, if you could gather a small (~20 reads) dataset of reads that report significantly differently between the two versions and run these with |
Hi @malton-ont, Thanks so much! Just opened a ticket and attached the requested logs. Let me know if you need anything else. Calleigh |
Issue Report
Please describe the issue:
We ran basecalling on an SQK-RBK114-24 plasmid dataset with --estimate-poly-a and a config file. We initially ran basecalling with 0.9.0 and [email protected], and got the following results:
Although these were somewhat unexpected, they were definitely feasible and so we did not question the results. However, after receiving poly(A) data via another instrument and method, we became suspicious that these results were not accurate - nanopore and the alternate method usually agree quite closely. I re-ran the same dataset on v0.9.1 and [email protected] and got the same results. Still suspicious, I bumped us back down to Dorado 0.8.0 and [email protected] and then got this distribution of poly(A) estimations:
This distribution matches much more closely to the alternate method, and the distribution shape in general matches our historical data much better.
Steps to reproduce the issue:
Basecall the same dataset with the same model, parameters, and config file. The only difference is Dorado 0.8.0 vs 0.9.X.
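
For anyone trying to reproduce this, one way to pick out the reads that disagree most between the two versions is to compare the per-read estimates directly. The following is only a rough sketch, not part of Dorado: it assumes both outputs have been dumped to SAM text (for example with `samtools view`) and that the poly(A) estimate is carried in the `pt:i` tag. The function names and the 20 nt threshold are made up for illustration.

```python
def parse_poly_a(sam_lines):
    """Map read ID -> poly(A) estimate taken from the pt:i tag of SAM text lines.

    Header lines and reads without a pt:i tag are skipped.
    """
    estimates = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # blank line or SAM header
        fields = line.rstrip("\n").split("\t")
        # SAM has 11 mandatory columns; optional tags start at index 11.
        for tag in fields[11:]:
            if tag.startswith("pt:i:"):
                estimates[fields[0]] = int(tag[len("pt:i:"):])
                break
    return estimates


def discordant_reads(old, new, min_diff=20):
    """Return (read_id, old_len, new_len) for reads present in both runs whose
    estimates differ by at least min_diff, largest difference first.

    min_diff=20 is an arbitrary illustrative threshold, not a recommended value.
    """
    shared = old.keys() & new.keys()
    rows = [
        (read_id, old[read_id], new[read_id])
        for read_id in shared
        if abs(old[read_id] - new[read_id]) >= min_diff
    ]
    # Sort by size of disagreement, then by read ID for a stable order.
    rows.sort(key=lambda r: (-abs(r[1] - r[2]), r[0]))
    return rows
```

With the two SAM files opened as `old_fh` and `new_fh`, `discordant_reads(parse_poly_a(old_fh), parse_poly_a(new_fh))[:20]` would give a short list of read IDs to extract and rerun under each version.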
Run environment:
/path/pod5
--min-qscore 14
--estimate-poly-a
--poly-a-config /path/poly_a_config.toml
--no-trim
--device 'cuda:all' --verbose > dorado_sup.bam
Logs