Add support for aggregating multiple values #285

@RamSaw

Description

Right now the API of the library does not support aggregating more than one value in one query.

The goal is to add this support. The API itself already has all the interfaces and data needed to enable it, but DpEngine does not support it. DpEngine is the internal core engine that builds the execution graph for calculating DP metrics.

There are two ways to modify DpEngine to support multiple values:

  • Natural one: modify every place in the core and proto packages that assumes a single value, and introduce a layer that stores multiple values or the parameters for their calculation. For example, DataExtractors.kt has to accept multiple ValueExtractors instead of one. AggregationParams has to change as well, with the value params repeated for each value. The value metrics in dpaggregates.proto also have to change, for example by being converted into a map. This in turn means we need a way to distinguish the values. We could distinguish them by ValueExtractor reference, but that is not very explicit and would make it hard to return the results of the computation in a convenient format. A better solution is probably to require a unique name (string) along with each ValueExtractor.
  • Another solution: make the DpEngine API more fine-grained and expose contribution bounding in the API. This API would return the bounded input together with partition (group) selection, and we would then reuse that result to compute the value aggregation metrics for each value. This would not require changing protos, AggregationParams, etc., but it is probably not as clean as the first solution. Therefore I prefer the first one.
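
To make the first option more concrete, here is a minimal sketch of what "a unique name along with each ValueExtractor" and "value params repeated per value" could look like. All types and functions below (NamedValueExtractor, ValueBounds, MultiValueParams, clampedSumsPerPartition, Purchase) are hypothetical and are not part of the PipelineDP4j API. The sketch only clamps and sums each named value per partition. It adds no noise and does no contribution bounding or partition selection, so it is not differentially private; it only illustrates how results could be keyed by value name.

```kotlin
// Hypothetical types for illustration only; none of these exist in PipelineDP4j.

// A value extractor paired with a unique name that identifies it in the results.
data class NamedValueExtractor<T>(val name: String, val extract: (T) -> Double)

// Per-value clamping bounds, standing in for value params repeated per value.
data class ValueBounds(val min: Double, val max: Double)

// Value params keyed by the same unique names as the extractors.
data class MultiValueParams(val boundsByName: Map<String, ValueBounds>)

// Example input row with two values to aggregate.
data class Purchase(val country: String, val amount: Double, val items: Int)

// Returns partition key -> (value name -> clamped sum).
// No noise is added: a real engine would also bound contributions,
// select partitions, and add DP noise to each sum.
fun <T> clampedSumsPerPartition(
    rows: List<T>,
    partitionKey: (T) -> String,
    extractors: List<NamedValueExtractor<T>>,
    params: MultiValueParams,
): Map<String, Map<String, Double>> {
    require(extractors.map { it.name }.toSet().size == extractors.size) {
        "Value names must be unique"
    }
    return rows.groupBy(partitionKey).mapValues { (_, group) ->
        extractors.associate { extractor ->
            val bounds = params.boundsByName.getValue(extractor.name)
            extractor.name to group.sumOf { row ->
                extractor.extract(row).coerceIn(bounds.min, bounds.max)
            }
        }
    }
}

fun main() {
    val rows = listOf(
        Purchase("US", 12.0, 2),
        Purchase("US", 500.0, 1),
        Purchase("DE", 7.5, 3),
    )
    val extractors = listOf(
        NamedValueExtractor<Purchase>("amount") { it.amount },
        NamedValueExtractor<Purchase>("items") { it.items.toDouble() },
    )
    val params = MultiValueParams(
        mapOf(
            "amount" to ValueBounds(0.0, 100.0),
            "items" to ValueBounds(0.0, 10.0),
        )
    )
    println(clampedSumsPerPartition(rows, { it.country }, extractors, params))
}
```

Keying both the params and the results by name lets the caller read each metric back by name, which is the explicitness the reference-based approach would lack.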

Metadata

Labels: pipelinedp4j (to label issues related to PipelineDP4j)