Bio::Tools::GFF _gffX_string update #327

Open
Juke34 opened this issue Nov 6, 2019 · 1 comment

Comments

@Juke34
Contributor

Juke34 commented Nov 6, 2019

Hi, I would like to push some updates to the methods _gff2_string and _gff25_string to remove some inconsistencies with the format specifications (here is a review of the specifications that I have done).
Currently, the difference between the two methods is that _gff25_string puts the Target attribute first in the attribute list.

Point 1) As the order shouldn't matter, I was wondering whether we could remove the attribute sorting. The code is quite old (2004). I'm sceptical because of a comment saying # need to put the target info before other tag/value pairs - mw, and because the description of the _gff25_string method says: Function: To get a format of GFF that is peculiar to Gbrowse/Bio::DB::GFF. But why have a general method handle a case that is specific to Gbrowse? I guess Gbrowse has fixed this peculiarity since then...
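To make point 1 concrete, here is a minimal sketch of the two behaviours being compared: moving Target to the front of the attribute list versus leaving the order untouched. It is written in Python purely for illustration and is not the BioPerl code; the function name `order_attributes` and the `target_first` flag are hypothetical.

```python
def order_attributes(attrs, target_first):
    """Return (tag, value) pairs, optionally moving 'Target' entries to the front.

    Hypothetical helper for illustration only; not part of Bio::Tools::GFF.
    """
    if not target_first:
        # Keep the attributes exactly in the order they were given.
        return list(attrs)
    targets = [pair for pair in attrs if pair[0] == "Target"]
    others = [pair for pair in attrs if pair[0] != "Target"]
    return targets + others


attrs = [("Note", "example"), ("Target", "seq1 1 100"), ("ID", "f1")]
print(order_attributes(attrs, target_first=True))
print(order_attributes(attrs, target_first=False))
```

If the order really does not matter to downstream parsers, the `target_first=False` branch is all that would remain.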

Both produce an attribute list like this (note the two spaces before the semicolon; one would be enough...):

tag1 "value 1"  ; tag2 value2

The _gff2_string method follows the GFF2 specification. About the attribute field, the specification says: From version 2 onwards, the attribute field must have an tag value structure following the syntax used within objects in a .ace file, flattened onto one line by semicolon separators.
Point 2) The specification does not ask for spaces around the semicolon; should we remove them? I guess that, to avoid potential compatibility issues, it's easier to keep them as they are...
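As a rough illustration of point 2, the sketch below joins tag/value pairs with a configurable separator, so the spaced style (" ; ") and a compact style ("; ") can be compared side by side. It is Python written for this discussion, not the BioPerl implementation: the function name `format_gff2_attributes` is hypothetical, the quoting rule (quote any value containing a character other than a letter, digit or underscore) is an assumption, and it does not reproduce the doubled space mentioned above.

```python
import re


def format_gff2_attributes(attrs, separator=" ; "):
    """Flatten (tag, value) pairs onto one line, GFF2 style.

    Hypothetical helper; the quoting rule below is an assumption.
    """
    parts = []
    for tag, value in attrs:
        # Assumption: quote values containing anything other than
        # letters, digits or underscores (for example a space).
        if re.search(r"\W", value):
            value = '"' + value + '"'
        parts.append(f"{tag} {value}")
    return separator.join(parts)


attrs = [("tag1", "value 1"), ("tag2", "value2")]
print(format_gff2_attributes(attrs))        # tag1 "value 1" ; tag2 value2
print(format_gff2_attributes(attrs, "; "))  # tag1 "value 1"; tag2 value2
```

Keeping the separator as a parameter would let the spaced form stay the default for compatibility while still allowing the compact one.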

The _gff25_string method is similar to _gff2_string but should follow the GTF2 format (GFF2.5 = GTF). In that sense, the attributes must look like:

tag1 "value 1"; tag2 value2;
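For comparison, a minimal sketch of the GTF-style layout shown above, where every attribute ends with a semicolon and attributes are separated by a single space. Again this is illustrative Python, not the BioPerl code: `format_gtf_attributes` is a hypothetical name, and quoting only values that contain whitespace is an assumption chosen to match the example line.

```python
import re


def format_gtf_attributes(attrs):
    """Render (tag, value) pairs so that each attribute ends with ';'.

    Hypothetical helper; the quoting rule below is an assumption.
    """
    parts = []
    for tag, value in attrs:
        # Assumption: only values containing whitespace are quoted.
        if re.search(r"\s", value):
            value = f'"{value}"'
        parts.append(f"{tag} {value};")
    return " ".join(parts)


print(format_gtf_attributes([("tag1", "value 1"), ("tag2", "value2")]))
# tag1 "value 1"; tag2 value2;
```

The difference from the GFF2 layout is that the semicolon is a terminator attached to each attribute rather than a separator placed between them.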

Point 3) For me this is the most important point: the _gff25_string method must create GTF2/GFF2.5 format, and not be a fix of the _gff2_string method adapted to the peculiar GBrowse case.

I am pinging @fangly @bosborne @hyphaltip @cjfields because I have seen that you have worked on that package at some point.

I will adapt my modifications according to your feedback.
Best regards,

Jacques

@cjfields
Member

cjfields commented Nov 6, 2019

@Juke34 Based on the documentation I think it would be good to have you involved with the GFF specification discussions, though those have gone a bit dormant in the last few years.

I'm all for updating to ensure the specifications are in place. @scottcain would you have any comments on the above, as it could affect GBrowse? Maybe it doesn't matter if everyone is moving to using JBrowse and/or Bio::DB::SeqFeature?
