Support NCBI microbe GTF/GFF with no transcripts (CDS only) #1627
Fixes #1620
Support NCBI GTF/GFF annotation files that contain only CDS lines. In these files, each CDS is a child of a gene (rather than of a transcript, as is usual in Ensembl annotation files) and has no exon children.
If a CDS is a child of a gene and has no exons of its own, it is parsed as a single-exon transcript with the same strand, start and end as the CDS.
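The rule above can be illustrated with a minimal Python sketch. This is not the code in this PR: the `Feature` class, the helper names and the sample GFF3 records (`seq_example`, `gene-A`, `cds-A`) are hypothetical and exist only to show the idea of promoting a gene-parented, exon-less CDS to a single-exon transcript.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    seqid: str
    type: str
    start: int
    end: int
    strand: str
    id: str
    parent: str


def parse_gff3_line(line):
    """Parse one tab-separated GFF3 record into a Feature."""
    cols = line.rstrip("\n").split("\t")
    attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
    return Feature(cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6],
                   attrs.get("ID", ""), attrs.get("Parent", ""))


def cds_as_transcripts(features):
    """Build single-exon transcripts from CDS records whose parent is a gene.

    A CDS qualifies only if its Parent is a gene ID and no exon record
    names the CDS as its own parent.
    """
    gene_ids = {f.id for f in features if f.type == "gene"}
    parents_of_exons = {f.parent for f in features if f.type == "exon"}
    transcripts = []
    for f in features:
        if f.type == "CDS" and f.parent in gene_ids and f.id not in parents_of_exons:
            transcripts.append({
                "transcript_id": f.id,
                "gene_id": f.parent,
                "strand": f.strand,
                "exons": [(f.start, f.end)],  # one exon spanning the CDS
                "cds": (f.start, f.end),
            })
    return transcripts


# Hypothetical CDS-only annotation: one gene with a CDS child and no exons.
sample = [
    "seq_example\tsrc\tgene\t100\t900\t.\t+\t.\tID=gene-A",
    "seq_example\tsrc\tCDS\t100\t900\t.\t+\t0\tID=cds-A;Parent=gene-A",
]
features = [parse_gff3_line(line) for line in sample]
result = cds_as_transcripts(features)
print(result)
```

The sketch assumes each CDS occupies a single line; a CDS split across several records that share one ID would need extra handling that is not shown here.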
TODO
--cds_as_transcript_gxf
Document --cds_as_transcript_gxf in public docs
Testing
Example files for avian paramyxovirus 1
Example VCF
Example command
Test conditions
Without --cds_as_transcript_gxf: should return a warning if there are CDS in the annotation whose parent is a gene record
With --cds_as_transcript_gxf: should successfully use the CDS in the annotation as single-exon transcripts
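As a rough illustration of the two expected behaviours, the Python sketch below models the flag as a boolean argument. The function `select_gene_parented_cds`, its arguments and the warning text are hypothetical and are not taken from the VEP code base.

```python
import warnings


def select_gene_parented_cds(cds_parents, gene_ids, cds_as_transcript=False):
    """Return the CDS IDs to treat as single-exon transcripts.

    cds_parents maps each CDS ID to its parent ID. When the flag is off,
    gene-parented CDS are reported with a warning and none are used.
    """
    gene_parented = [cds for cds, parent in cds_parents.items() if parent in gene_ids]
    if gene_parented and not cds_as_transcript:
        warnings.warn(
            f"{len(gene_parented)} CDS record(s) have a gene as parent; "
            "enable the CDS-as-transcript option to use them"
        )
        return []
    return gene_parented


# Hypothetical input: one CDS whose parent is a gene record.
cds_parents = {"cds-A": "gene-A"}
gene_ids = {"gene-A"}

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    without_flag = select_gene_parented_cds(cds_parents, gene_ids)
    with_flag = select_gene_parented_cds(cds_parents, gene_ids, cds_as_transcript=True)

print(without_flag, with_flag, len(caught))
```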