Description
Is your feature request related to a problem? Please describe.
Thanks again for this amazing tool!
I download all influenza A sequences using
datasets download virus genome taxon 11320 --filename data.zip
I would like to group all influenza A segments that come from the same assembly/isolate. I have been using the isolate metadata field for this purpose, but it is a workaround and the field is not always available. I noticed that I could download all assemblies to see which samples are in each assembly (sadly, this information is not included in the assembly summary), but I am having issues with a download of that size. Having the GCA accession in the sample metadata (data_report.jsonl) would really simplify this process and give me greater certainty that I have grouped segments together correctly.
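For context, here is a minimal sketch of the isolate-based workaround described above. It assumes each line of data_report.jsonl is a JSON object with an `accession` key and an optional nested `isolate.name` key. Those key names are my assumption about the report layout and may need adjusting. The function name `group_by_isolate` is just illustrative.

```python
import json
from collections import defaultdict


def group_by_isolate(lines):
    """Group sequence accessions by isolate name.

    Assumes each record has an "accession" key and an optional
    "isolate": {"name": ...} object (assumed key names).
    Returns (groups, ungrouped), where ungrouped lists the accessions
    of records that carry no isolate name.
    """
    groups = defaultdict(list)
    ungrouped = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        isolate_name = (record.get("isolate") or {}).get("name")
        if isolate_name:
            groups[isolate_name].append(record.get("accession"))
        else:
            ungrouped.append(record.get("accession"))
    return dict(groups), ungrouped
```

Passing an open file handle for data_report.jsonl as `lines` works, since the function only iterates line by line. The `ungrouped` list is exactly the set of sequences this workaround cannot place.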
Describe the solution you'd like
Each assembly contains at most 8 influenza segments that were sequenced together, and each assembly has a unique ID. For each nucleotide sequence that is part of an assembly, I would like to see the GCA/GCF accession as a metadata field in data_report.jsonl so that I can use it for grouping samples.
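To illustrate how I would use such a field, here is a rough sketch. The key name `assemblyAccession` is purely hypothetical (the field does not exist yet), as is the function name. The check against 8 segments reflects the expectation stated above.

```python
import json
from collections import defaultdict

MAX_SEGMENTS = 8  # an influenza A assembly should hold at most 8 segments


def group_by_assembly(lines, field="assemblyAccession"):
    """Group sequence accessions by an assembly accession field.

    `field` is a hypothetical key name for the requested metadata.
    Returns (groups, oversized), where oversized holds any group with
    more than MAX_SEGMENTS sequences, which would signal a grouping problem.
    """
    groups = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        key = record.get(field)
        if key is not None:
            groups[key].append(record.get("accession"))
    oversized = {k: v for k, v in groups.items() if len(v) > MAX_SEGMENTS}
    return dict(groups), oversized
```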
Thank you
Thanks for your feedback--your feature requests help improve NCBI Datasets.
Activity
olearyna commented on Dec 5, 2024
Hi anna-parker,
Thank you for the great suggestion! I’ve created a ticket for this feature request and will work on it soon.
All the best,
Nuala