[performance] Improved data structure

Our current data processing strategy is poorly optimized.
Many steps are repeated, we loop over the data multiple times, and the data structure doesn't adapt well to the requested calculations.

In particular:
1. The library provides multiple ways to describe data: everything in a single spec, multiple specs, spec grouping, data split accessors, and y accessors. This increases the logic complexity in charts, because all of these must be aligned into a single set of "data tables".
2. Several data scans are wasteful: we scan the data multiple times to compute data extents or to fill in missing details. We can probably reduce the number of scans.
3. The way we describe categorical grouping (groupId, specId, splitAccessors, yAccessors) is not great; it increases both the complexity of, and the time spent on, composing and decomposing that grouping.
4. A lot of processing generates different variants of the same dataset, with no possibility of reusing them.
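As an illustration of point 2, here is a minimal sketch of how the x and y extents could be collected in a single pass instead of one scan per extent. The `Datum` and `Extents` types and the `computeExtents` function are hypothetical names used only for this example; they are not the library's current API.

```typescript
// Hypothetical datum shape: a plain record read through string accessors.
type Datum = Record<string, unknown>;

interface Extents {
  x: [number, number];
  y: [number, number];
}

// Computes the x extent and the combined y extent of all y accessors
// in one loop over the data. Non-numeric and non-finite values are skipped.
// For empty input the extents stay at [Infinity, -Infinity].
function computeExtents(data: Datum[], xAccessor: string, yAccessors: string[]): Extents {
  let xMin = Infinity;
  let xMax = -Infinity;
  let yMin = Infinity;
  let yMax = -Infinity;

  for (const datum of data) {
    const x = datum[xAccessor];
    if (typeof x === 'number' && Number.isFinite(x)) {
      if (x < xMin) xMin = x;
      if (x > xMax) xMax = x;
    }
    for (const accessor of yAccessors) {
      const y = datum[accessor];
      if (typeof y === 'number' && Number.isFinite(y)) {
        if (y < yMin) yMin = y;
        if (y > yMax) yMax = y;
      }
    }
  }

  return { x: [xMin, xMax], y: [yMin, yMax] };
}
```

Whether a single combined pass is actually faster for our chart cases is something the benchmarks would need to confirm.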

All these unoptimized calculations are probably wasting processing time and should be addressed. A few tasks to start with:
- collect all the processing requirements for cartesian charts (all the operations, statistics, and calculations applied to the data today)
- research and test an improved general data structure that reduces data access times, reduces memory usage by limiting the number of permutations and copies of our data points, and offers a simplified and optimized way to compute what we need.
- benchmark 4 or 5 different chart cases with the current setup and the alternative.
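One possible direction for the second task is a columnar layout, sketched below under stated assumptions: each field is stored once in a typed array and derived statistics such as extents are cached, so several consumers can share the same data without copying it. `ColumnTable` and its methods are hypothetical and only illustrate the idea; this is not a proposal for the final design.

```typescript
// Hypothetical columnar table: one Float64Array per field, built once
// from row-shaped input and shared by every consumer.
class ColumnTable {
  private readonly columns = new Map<string, Float64Array>();
  private readonly extentCache = new Map<string, [number, number]>();

  constructor(rows: Array<Record<string, number | null>>, fields: string[]) {
    for (const field of fields) {
      const column = new Float64Array(rows.length);
      for (let i = 0; i < rows.length; i++) {
        // Missing or null values are stored as NaN so the column stays numeric.
        column[i] = rows[i][field] ?? NaN;
      }
      this.columns.set(field, column);
    }
  }

  // Returns the stored column without copying it.
  column(field: string): Float64Array {
    const column = this.columns.get(field);
    if (!column) throw new Error(`Unknown field: ${field}`);
    return column;
  }

  // Scans the column at most once; later calls return the cached extent.
  extent(field: string): [number, number] {
    const cached = this.extentCache.get(field);
    if (cached) return cached;

    const column = this.column(field);
    let min = Infinity;
    let max = -Infinity;
    for (let i = 0; i < column.length; i++) {
      const value = column[i];
      if (Number.isNaN(value)) continue;
      if (value < min) min = value;
      if (value > max) max = value;
    }

    const extent: [number, number] = [min, max];
    this.extentCache.set(field, extent);
    return extent;
  }
}
```

A structure like this would be one of the alternatives to measure against the current setup in the benchmark task.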


[performance] Improved data structure #2561

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[performance] Improved data structure #2561

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions