Include corresponding gca/gcf accession in the metadata of each sample #433

Open
@anna-parker

Description

Is your feature request related to a problem? Please describe.

Thanks again for this amazing tool!

I download all influenza A sequences using

datasets download virus genome taxon 11320 --filename data.zip

I would like to group all influenza A segments that come from the same assembly/isolate. I have been using the isolate metadata field for this purpose, but that is a workaround and the field is not always available. I noticed that I could download all assemblies to see which samples are in each assembly (sadly, this information is not included in the assembly summary field), but I'm having issues with a download of this size. Having the gca accession in the sample metadata (data_report.jsonl) would really simplify this process and give me greater certainty that I have grouped segments together correctly.
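For reference, the isolate-based workaround described above can be sketched roughly as follows. This is a minimal sketch, not a confirmed schema: it assumes each line of data_report.jsonl is a JSON object with an `accession` key and a nested `isolate.name` key, and the helper name `group_by_isolate` and the sample values are purely illustrative.

```python
import json
from collections import defaultdict


def group_by_isolate(lines):
    """Group sequence accessions by isolate name.

    Assumed record layout (not confirmed against the real report):
    {"accession": "...", "isolate": {"name": "..."}}
    Records without an isolate name are collected under the key None.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # "isolate" may be missing or null, so fall back to an empty dict
        isolate = (record.get("isolate") or {}).get("name")
        groups[isolate].append(record.get("accession"))
    return dict(groups)


# Placeholder records for illustration only (not real accessions)
sample_lines = [
    '{"accession": "SEG_A1", "isolate": {"name": "isolate-A"}}',
    '{"accession": "SEG_A2", "isolate": {"name": "isolate-A"}}',
    '{"accession": "SEG_X1"}',
]
groups = group_by_isolate(sample_lines)
```

In practice the function could be passed an open file handle for data_report.jsonl instead of a list. The records that end up under `None` are the ones that cannot be grouped this way, which is the gap an assembly accession field would close.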

Describe the solution you'd like
Each assembly contains at most 8 influenza segments that were sequenced together. Each assembly has a unique ID. For each nucleotide sequence that is in an assembly, I would like to see the gca/gcf accession as a metadata field in data_report.jsonl so that I can use it for grouping samples.

Thank you

Thanks for your feedback--your feature requests help improve NCBI Datasets.

Activity

olearyna (Contributor) commented on Dec 5, 2024

Hi anna-parker,

Thank you for the great suggestion! I’ve created a ticket for this feature request and will work on it soon.

All the best,

Nuala

Include corresponding gca/gcf accession in the metadata of each sample · Issue #433 · ncbi/datasets