J-Moravec
Contributor

Second round of improvements. The ultimate aim is to build a unit-tested path towards annotateWithGeneParts.

In this stage, we are improving readTranscriptFeatures:

  • Used Map instead of mapply when parsing introns/exons, since Map always uses SIMPLIFY = FALSE. mapply defaults to SIMPLIFY = TRUE, which can cause unexpected issues in some limited cases (e.g., simplifying the result to a matrix when all elements have the same length).

  • Added a check that the input bed file has 12 columns, which are expected further down the pipeline. This communicates the intent more clearly than the cryptic subsetting errors that would otherwise come from the intron, exon, promoter, and TSS functions.

  • Added two valid bed files, one with 6 and one with 12 columns, and unit tests for readTranscriptFeatures. At present, this is more of a regression test, since I haven't checked the correctness of the output in detail.
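
To illustrate the first point, here is a minimal sketch in base R of how mapply and Map differ. The `starts` and `sizes` lists are made-up stand-ins for per-transcript block coordinates, not data from the package.

```r
# Hypothetical per-transcript block starts and sizes (illustration only).
starts <- list(c(0L, 100L), c(0L, 50L))
sizes  <- list(c(10L, 20L), c(5L, 15L))

# mapply defaults to SIMPLIFY = TRUE: because every result has length 2,
# the results are collapsed into a 2 x 2 matrix instead of a list.
ends_mapply <- mapply(function(s, w) s + w, starts, sizes)
is.matrix(ends_mapply)

# Map never simplifies, so the result is always a list with one
# element per transcript, regardless of how many blocks each one has.
ends_map <- Map(function(s, w) s + w, starts, sizes)
is.list(ends_map)

# With unequal block counts mapply cannot simplify and returns a list,
# so its return type depends on the data; Map stays consistent.
starts_uneven <- list(c(0L, 100L), c(0L, 50L, 90L))
sizes_uneven  <- list(c(10L, 20L), c(5L, 15L, 10L))
ends_uneven <- mapply(function(s, w) s + w, starts_uneven, sizes_uneven)
is.list(ends_uneven)
```

Calling `mapply(..., SIMPLIFY = FALSE)` would behave the same way as Map; Map is simply the shorter spelling.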

With readTranscriptFeatures covered by tests and a valid bed12 file available, the next step is to use this with a sample GRanges object and annotate it with annotateWithGeneParts (and then finally test the as.data.frame method).
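
For reference, the 12-column check described above could look roughly like the sketch below. `check_bed12` is a hypothetical helper written for illustration, not the function added in this PR, and the data frames are dummies that stand in for parsed bed files.

```r
# Hypothetical helper: fail early with a clear message when the parsed
# bed table does not have the 12 columns the downstream code relies on.
check_bed12 <- function(bed) {
  if (ncol(bed) != 12L) {
    stop("expected a bed file with 12 columns, found ", ncol(bed),
         call. = FALSE)
  }
  invisible(bed)
}

# Dummy tables standing in for a bed6 and a bed12 file.
bed6  <- data.frame(matrix(0, nrow = 1, ncol = 6))
bed12 <- data.frame(matrix(0, nrow = 1, ncol = 12))

# The bed6 table is rejected with a readable message ...
msg <- tryCatch(check_bed12(bed6), error = function(e) conditionMessage(e))

# ... while the bed12 table passes through unchanged.
ok <- check_bed12(bed12)
```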

readTranscriptFeatures should now be a little bit safer

 * Used Map instead of mapply when parsing introns/exons, since Map uses SIMPLIFY = FALSE; mapply with its default SIMPLIFY = TRUE can cause unexpected issues in some limited cases.

 * Added a check that the input bed file has 12 columns, which are expected further down the pipeline.

 * Added two valid bed files, one with 6 and one with 12 columns.
This is more of a regression test than a solid behaviour test.
@J-Moravec
Contributor Author

@frenkiboy I have finally found time to finish the simple unit tests leading towards as.data.frame.

This PR is ready to be reviewed and merged.
