[Parser Fix]: Change SRA parser handling of 'isBasedOn' values

Background: On Tuesday, October 17th, the Production API went down. According to @DylanWelzel's investigation, the cause was SRA's excessively large metadata records, many of which listed more than 1000 objects in the 'isBasedOn' field. @everaldorodrigo addressed the outage by adjusting the memory size, but the core issue remains: SRA metadata records are excessively large.

An SRA record is project- or study-based, so each record may reference thousands of runs, experiments, samples, etc. This causes memory issues when querying SRA records.

1. Revisit the metadata that is being parsed into the 'isBasedOn' property
2. Investigate potential changes to the parser that can address the core issue:
  * Parse multiple records of the same type into the same 'isBasedOn' object. Since the identifier field can be an array, it's possible to cut down the number of repetitive 'isBasedOn' objects that differ only by 'identifier'
  * If this doesn't work, set an upper limit on the number of 'isBasedOn' objects to parse, then add some sort of indicator that the user should visit SRA if they want to see more
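
A minimal sketch of the two parser options above, not the actual SRA parser code. It assumes each 'isBasedOn' entry is a plain dict and that entries which are identical apart from 'identifier' can be merged. The function names, the `additionalType` key in the usage example, the sample accession values, and the returned `truncated` flag are all hypothetical and only for illustration.

```python
import json


def consolidate_is_based_on(items):
    """Merge entries that differ only by 'identifier' into one entry per group."""
    groups = {}  # group key -> merged entry (dicts keep insertion order)
    seen = {}    # group key -> set of identifiers already merged
    for item in items:
        # Every field except 'identifier' defines which group the entry joins.
        rest = {k: v for k, v in item.items() if k != "identifier"}
        key = json.dumps(rest, sort_keys=True)
        ids = item.get("identifier") or []
        if isinstance(ids, str):
            ids = [ids]
        if key not in groups:
            groups[key] = {**rest, "identifier": []}
            seen[key] = set()
        for identifier in ids:
            # Skip duplicates while preserving first-seen order.
            if identifier not in seen[key]:
                seen[key].add(identifier)
                groups[key]["identifier"].append(identifier)
    return list(groups.values())


def cap_is_based_on(items, limit=100):
    """Fallback: keep at most `limit` entries and report whether any were dropped."""
    truncated = len(items) > limit
    return items[:limit], truncated
```

For example, with made-up entries:

```python
sample = [
    {"@type": "Dataset", "additionalType": "Run", "identifier": "SRR1"},
    {"@type": "Dataset", "additionalType": "Run", "identifier": "SRR2"},
    {"@type": "Dataset", "additionalType": "Sample", "identifier": "SRS1"},
]
merged = consolidate_is_based_on(sample)   # two entries instead of three
kept, truncated = cap_is_based_on(merged, limit=1)
```

When `truncated` is true, the parser could attach whatever indicator is chosen to point the user to SRA for the full list.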
 

[Parser Fix]: Change SRA parser handling of 'isBasedOn' values #112
