[performance] Improved data structure #2561

Open
@markov00

Our current data processing strategy is not well optimized.
Many steps are repeated, the data is looped over multiple times, and the resulting structure doesn't adapt well to the requested calculations.

In particular:

  1. the library provides multiple ways to describe data: everything in a single spec, multiple specs, spec grouping, data split accessors, and y accessors. This increases the logic complexity in charts, because all of these need to be aligned into a single set of "data tables".
  2. there are multiple redundant data scans to compute data extents or to fill in missing details. We can probably reduce the number of these scans.
  3. the way we describe categorical grouping (groupId, specId, splitAccessors, yAccessors) is not great: it increases both the complexity and the time spent composing and decomposing that grouping.
  4. a lot of the processing generates different variants of the same dataset, none of which can be reused.
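
To make points 2 and 3 more concrete, here is a rough sketch of what a single pass could look like: one loop that groups rows under a composite series key and tracks the x and y extents at the same time. All names here (`processOnce`, `SeriesTable`, `ProcessedData`, the `|`-joined key) are made up for illustration and are not the library's current API.

```typescript
// Hypothetical types and names, for illustration only.
type Datum = Record<string, unknown>;

interface SeriesTable {
  key: string; // composite key: specId | yAccessor | split values
  x: number[]; // column of x values
  y: number[]; // column of y values
}

interface ProcessedData {
  series: Map<string, SeriesTable>;
  xExtent: [number, number];
  yExtent: [number, number];
}

function processOnce(
  specId: string,
  data: Datum[],
  xAccessor: string,
  yAccessors: string[],
  splitAccessors: string[],
): ProcessedData {
  const series = new Map<string, SeriesTable>();
  let xMin = Infinity;
  let xMax = -Infinity;
  let yMin = Infinity;
  let yMax = -Infinity;

  for (const datum of data) {
    const x = datum[xAccessor];
    if (typeof x !== 'number' || !Number.isFinite(x)) continue;
    // The split part of the key is computed once per row.
    const splitKey = splitAccessors.map((a) => String(datum[a])).join('|');

    for (const yAccessor of yAccessors) {
      const y = datum[yAccessor];
      // Rows without a finite y value are skipped in this sketch.
      if (typeof y !== 'number' || !Number.isFinite(y)) continue;

      const key = [specId, yAccessor, splitKey].join('|');
      let table = series.get(key);
      if (!table) {
        table = { key, x: [], y: [] };
        series.set(key, table);
      }
      table.x.push(x);
      table.y.push(y);

      // Extents are updated in the same loop, so no extra scan is needed.
      if (x < xMin) xMin = x;
      if (x > xMax) xMax = x;
      if (y < yMin) yMin = y;
      if (y > yMax) yMax = y;
    }
  }

  return { series, xExtent: [xMin, xMax], yExtent: [yMin, yMax] };
}
```

The column arrays per series are just one possible layout. Stacking, fitting missing values, and multiple group ids are left out here and would need their own handling.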

All these unoptimized calculations are probably wasting processing time and should be addressed. There are a few tasks to tackle:

  • collect all the processing requirements for cartesian charts (all the operations, statistics, and calculations applied to the data today)
  • research and test an improved general data structure that reduces data access time, reduces memory usage by limiting the number of permutations and copies of our data points, and offers a simplified and optimized way to compute what we need.
  • benchmark 4 or 5 different chart cases with the current setup and with the alternative.
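
For the benchmark task, a minimal harness could look something like the sketch below. It times two ways of computing an extent (two scans versus one) on synthetic data and reports the median. The names `bench`, `extentTwoScans`, and `extentOneScan` are hypothetical, the dataset is made up, and `Date.now()` only has millisecond resolution, so a real comparison would need a finer timer and actual chart cases.

```typescript
// Hypothetical harness, for illustration only.
interface BenchResult<T> {
  name: string;
  medianMs: number;
  lastValue: T;
}

function bench<T>(name: string, fn: () => T, runs = 15): BenchResult<T> {
  const timings: number[] = [];
  let lastValue = fn(); // warm-up run, not timed
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    lastValue = fn();
    timings.push(Date.now() - start);
  }
  timings.sort((a, b) => a - b);
  const medianMs = timings[Math.floor(timings.length / 2)] ?? 0;
  return { name, medianMs, lastValue };
}

// Strategy A: one scan for the minimum, a second scan for the maximum.
function extentTwoScans(values: number[]): [number, number] {
  const min = values.reduce((m, v) => (v < m ? v : m), Infinity);
  const max = values.reduce((m, v) => (v > m ? v : m), -Infinity);
  return [min, max];
}

// Strategy B: minimum and maximum tracked in the same scan.
function extentOneScan(values: number[]): [number, number] {
  let min = Infinity;
  let max = -Infinity;
  for (const v of values) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
  return [min, max];
}

// Synthetic, deterministic input; a real run would use actual chart data.
const values = Array.from({ length: 100000 }, (_, i) => Math.sin(i) * 100);

const twoScans = bench('two scans', () => extentTwoScans(values));
const oneScan = bench('one scan', () => extentOneScan(values));

console.log(`${twoScans.name}: ${twoScans.medianMs} ms (median)`);
console.log(`${oneScan.name}: ${oneScan.medianMs} ms (median)`);
```

Comparing `lastValue` between the two results is a cheap check that both strategies still produce the same output before any timings are trusted.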

Labels: :performance (Performance related issues), :xy (Bar/Line/Area chart related)