New script to move attributes between features of the same record (agat_sp_move_attributes_within_records) #413

Merged
merged 11 commits into from
Jan 10, 2024

@Juke34
Collaborator

Juke34 commented Jan 9, 2024

@LucileSol @MartinPippel @mahesh-panchal
Does this script work for you? Could you give it a try?

Example usage:

agat_sp_move_attributes_within_records.pl --gff infile.gff --feature_copy mRNA  --feature_paste CDS,exon --attribute Dbxref,Ontology

@MartinPippel

Hej @Juke34, thanks for the script. I tested it, but it is not exactly doing what we need.

The input looks like this:

ptg000002l      AUGUSTUS        mRNA    3255    4626    0.5     +       .       ID=NBISM00000000001;Parent=NBISG00000000001;Dbxref=CDD:cd07067,Gene3D:G3DSA:3.40.50.1240,InterPro:IPR013078,InterPro:IPR029033,;Name=ARB_03491;Ontology_term=-;makerName=g1.t1;product=Probable phosphoglycerate mutase ARB_03491;uniprot_id=D4B4V1
ID=NBISE00000000009;Parent=NBISM00000000001;makerName=g1.t1.exon9
ptg000002l      AUGUSTUS        CDS     3255    3275    0.98    +       0       ID=NBISC00000000001;Parent=NBISM00000000001;makerName=g1.t1.CDS1

and we want all attributes (except makerName) copied to the CDS entries:

ptg000002l      AUGUSTUS        mRNA    3255    4626    0.5     +       .       ID=NBISM00000000001;Parent=NBISG00000000001;Dbxref=CDD:cd07067,Gene3D:G3DSA:3.40.50.1240,InterPro:IPR013078,InterPro:IPR029033,;Name=ARB_03491;Ontology_term=-;makerName=g1.t1;product=Probable phosphoglycerate mutase ARB_03491;uniprot_id=D4B4V1
ID=NBISE00000000009;Parent=NBISM00000000001;makerName=g1.t1.exon9
ptg000002l      AUGUSTUS        CDS     3255    3275    0.98    +       0       ID=NBISC00000000001;Parent=NBISM00000000001;makerName=g1.t1.CDS1;Dbxref=CDD:cd07067,Gene3D:G3DSA:3.40.50.1240,InterPro:IPR013078,InterPro:IPR029033,;Name=ARB_03491;Ontology_term=-;product=Probable phosphoglycerate mutase ARB_03491;uniprot_id=D4B4V1
ID=NBISE00000000009;Parent=NBISM00000000001;makerName=g1.t1.exon9
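
The requested behaviour can be sketched in a few lines of Python. This only illustrates the semantics asked for above and is not AGAT's implementation. The helper names `parse_attributes`, `format_attributes` and `copy_attributes` are hypothetical, and the choice to skip `ID`, `Parent` and `makerName` and to leave tags already present on the target untouched is an assumption made for the example.

```python
def parse_attributes(column9):
    """Parse a GFF3 attribute column into a dict of tag -> list of values."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue
        tag, _, value = pair.partition("=")
        # Keep every comma-separated item (even empty ones) so that
        # formatting the result reproduces the input exactly.
        attributes[tag] = value.split(",")
    return attributes


def format_attributes(attributes):
    """Serialise a tag -> values dict back into a GFF3 attribute column."""
    return ";".join(f"{tag}={','.join(values)}" for tag, values in attributes.items())


def copy_attributes(source, target, skip=("ID", "Parent", "makerName")):
    """Copy tags from source into target; skipped and existing tags are left alone.

    The skip list is an assumption for this sketch, not an AGAT default.
    """
    for tag, values in source.items():
        if tag in skip or tag in target:
            continue
        target[tag] = list(values)  # copy the whole list, not just the first value
    return target


mrna = parse_attributes(
    "ID=NBISM00000000001;Parent=NBISG00000000001;"
    "Dbxref=CDD:cd07067,Gene3D:G3DSA:3.40.50.1240,InterPro:IPR013078,InterPro:IPR029033,;"
    "Name=ARB_03491;Ontology_term=-;makerName=g1.t1;"
    "product=Probable phosphoglycerate mutase ARB_03491;uniprot_id=D4B4V1"
)
cds = parse_attributes("ID=NBISC00000000001;Parent=NBISM00000000001;makerName=g1.t1.CDS1")
print(format_attributes(copy_attributes(mrna, cds)))
```

Storing each value as a list, and copying the whole list, is what keeps every comma-separated Dbxref entry intact. Skipping tags that already exist on the target is what keeps the CDS's own ID, Parent and makerName unchanged.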

However, the current version of the script keeps only the first of the comma-separated Dbxref values, and it also appends g1.t1 to the makerName attribute:

ptg000002l      AUGUSTUS        CDS     3255    3275    0.98    +       0       ID=NBISC00000000001,NBISM00000000001;Parent=NBISM00000000001,NBISG00000000001;Dbxref=CDD:cd07067;Name=ARB_03491;Ontology_term=-;makerName=g1.t1.CDS1,g1.t1;product=Probable phosphoglycerate mutase ARB_03491;uniprot_id=D4B4V1
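
A discrepancy like this can be spotted mechanically by comparing the attribute column of the expected CDS line with the one actually produced. The following is a small Python sketch for that purpose. It is not part of AGAT, and `parse_attributes` and `diff_attributes` are hypothetical helper names.

```python
def parse_attributes(column9):
    """Parse a GFF3 attribute column into a dict of tag -> list of values."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue
        tag, _, value = pair.partition("=")
        attributes[tag] = value.split(",")
    return attributes


def diff_attributes(expected, actual):
    """Return {tag: (expected_values, actual_values)} for every tag that differs."""
    differences = {}
    # Walk the expected tags first, then any tag that only exists in the actual output.
    for tag in list(expected) + [t for t in actual if t not in expected]:
        if expected.get(tag) != actual.get(tag):
            differences[tag] = (expected.get(tag), actual.get(tag))
    return differences


expected = parse_attributes(
    "ID=NBISC00000000001;Parent=NBISM00000000001;makerName=g1.t1.CDS1;"
    "Dbxref=CDD:cd07067,Gene3D:G3DSA:3.40.50.1240,InterPro:IPR013078,InterPro:IPR029033,;"
    "Name=ARB_03491;Ontology_term=-;"
    "product=Probable phosphoglycerate mutase ARB_03491;uniprot_id=D4B4V1"
)
actual = parse_attributes(
    "ID=NBISC00000000001,NBISM00000000001;Parent=NBISM00000000001,NBISG00000000001;"
    "Dbxref=CDD:cd07067;Name=ARB_03491;Ontology_term=-;makerName=g1.t1.CDS1,g1.t1;"
    "product=Probable phosphoglycerate mutase ARB_03491;uniprot_id=D4B4V1"
)
for tag, (want, got) in diff_attributes(expected, actual).items():
    print(f"{tag}: expected {want}, got {got}")
```

On the two lines above, the comparison flags ID, Parent, makerName and Dbxref. The ID and Parent attributes of the CDS were therefore also extended with the mRNA's values, in addition to the two symptoms described.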

Due to our current tight time limitations, we will probably just add @LucileSol's script to the GAAS repo.

@Juke34
Collaborator Author

Juke34 commented Jan 10, 2024

“Due to our current tight time limitations, we will probably just add @LucileSol script to the GAAS repo.”

No problem, as you prefer.

This script is useful for AGAT anyway, so I will include it. I have fixed the bugs (it should now behave as you wish ^^).

@Juke34 Juke34 merged commit b8f36f6 into master Jan 10, 2024
5 checks passed
@MartinPippel

Thanks Jacques. I tested the new version and I am getting the following error.
Is my input file not following the AGAT standards?

Can't use string ("13557_t") as an ARRAY ref while "strict refs" in use at /projects/martin/prog/conda_envs/agat-1.2.0/lib/perl5/site_perl/AGAT/OmniscientTool.pm line 1272.

Here is the record that may be causing the problem:

ptg001613l      GeneMark.hmm3   gene    1413    2319    .       -       .       ID=NBISG00000015636;gene_id=13557_g;makerName=13557_g;transcript_id=13557_t
ptg001613l      GeneMark.hmm3   mRNA    1413    2319    .       -       .       ID=NBISM00000017245;Parent=NBISG00000015636;gene_id=13557_g;makerName=13557_t;product=hypothetical protein;transcript_id=13557_t
ptg001613l      GeneMark.hmm3   exon    1413    1541    .       -       .       ID=NBISE00000088938;Parent=NBISM00000017245;cds_type=Internal;count=3_3;gene_id=13557_g;makerName=nbis-exon-22462;transcript_id=13557_t
ptg001613l      GeneMark.hmm3   exon    1842    1989    .       -       .       ID=NBISE00000088939;Parent=NBISM00000017245;cds_type=Internal;count=3_3;gene_id=13557_g;makerName=nbis-exon-22463;transcript_id=13557_t
ptg001613l      GeneMark.hmm3   exon    2036    2319    .       -       .       ID=NBISE00000088940;Parent=NBISM00000017245;cds_type=Internal;count=3_3;gene_id=13557_g;makerName=nbis-exon-22464;transcript_id=13557_t
ptg001613l      GeneMark.hmm3   CDS     1413    1541    .       -       0       ID=NBISC00000017245;Parent=NBISM00000017245;cds_type=Internal;count=3_3;gene_id=13557_g;makerName=cds-79105;transcript_id=13557_t
ptg001613l      GeneMark.hmm3   CDS     1842    1989    .       -       1       ID=NBISC00000017245;Parent=NBISM00000017245;cds_type=Internal;count=2_3;gene_id=13557_g;makerName=cds-79106;transcript_id=13557_t
ptg001613l      GeneMark.hmm3   CDS     2036    2319    .       -       0       ID=NBISC00000017245;Parent=NBISM00000017245;cds_type=Initial;count=1_3;gene_id=13557_g;makerName=cds-79107;transcript_id=13557_t
ptg001613l      GeneMark.hmm3   intron  1542    1841    .       -       0       ID=NBISI00000071698;Parent=NBISM00000017245;gene_id=13557_g;makerName=intron-65549;transcript_id=13557_t
ptg001613l      GeneMark.hmm3   intron  1990    2035    .       -       2       ID=NBISI00000071699;Parent=NBISM00000017245;gene_id=13557_g;makerName=intron-65550;transcript_id=13557_t
ptg001613l      GeneMark.hmm3   start_codon     2317    2319    .       -       0       ID=NBISST00000017212;Parent=NBISM00000017245;gene_id=13557_g;makerName=start_codon-13548;transcript_id=13557_t

@Juke34
Collaborator Author

Juke34 commented Jan 11, 2024

Are you sure you are using the latest version? I had this problem in a previous commit and have since fixed it (the line $feature->add_tag_value($tag,@{$value}); in OmniscientTool.pm). I will give it a try.
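
The reported error means that a plain string ("13557_t") reached a place where an array reference was dereferenced with `@{$value}`. The general pattern, an attribute value that may arrive either as a single string or as a list of strings, can be illustrated with a short Python analogy. This is not the AGAT fix, and `as_value_list` and `add_tag_values` are hypothetical names used only for this sketch.

```python
def as_value_list(value):
    """Normalise an attribute value to a list of strings.

    A bare string is wrapped in a one-element list. Any other iterable
    is copied into a new list.
    """
    if isinstance(value, str):
        return [value]
    return list(value)


def add_tag_values(attributes, tag, value):
    """Append one or several values to a tag, whatever shape `value` has."""
    attributes.setdefault(tag, []).extend(as_value_list(value))
    return attributes


attributes = {}
add_tag_values(attributes, "transcript_id", "13557_t")           # single string
add_tag_values(attributes, "Dbxref", ["CDD:cd07067", "InterPro:IPR013078"])  # list
print(attributes)
```

Without the `isinstance` check, Python would not raise an error as Perl does under `strict refs`. It would instead silently split the string into single characters, which is why the value is normalised before it is stored.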

@Juke34
Collaborator Author

Juke34 commented Jan 11, 2024

Check done. Your example works fine on my side.

@Juke34 Juke34 deleted the ena branch January 11, 2024 13:37