Open
Labels
pipelinedp4j
Description
Right now the API of the library does not support aggregating more than one value in one query.
The goal is to add this support. The API itself already has all the interfaces and data needed to enable it; however, DpEngine does not support it. DpEngine is the internal core engine that builds the execution graph to calculate DP metrics.
There are two ways to modify DpEngine to add support for multiple values:
- The natural one: modify every place in the core and proto packages that uses a single value, and introduce a layer that stores multiple values or the parameters for their calculation. For example, DataExtractors.kt has to be modified to accept not one `ValueExtractor` but several. `AggregationParams` has to be modified as well, with the value params repeated for each value. The value metrics in dpaggregates.proto also have to be modified and converted into a map, for example. This in turn means we need a way to distinguish the values. We could distinguish them via `ValueExtractor` references, but that is not very explicit and would make it hard to return the results of the computation in a convenient, explicit format. A better solution is probably to ask for a unique name (string) along with each `ValueExtractor`.
- Another solution is to make the DpEngine API more fine-grained and expose contribution bounding through the API. This API would return the bounded input together with the partition (group) selection. The returned result would then be reused multiple times to compute the value aggregation metrics for each value. This solution would not require changing protos, `AggregationParams`, etc., but it is probably not as clean as the first one.

Therefore I prefer the first one.
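To make the first option more concrete, here is a rough Kotlin sketch of what naming each value could look like. Everything in it is hypothetical: `NamedValueExtractor`, `ValueParams`, `aggregate`, `Visit` and `exampleResult` are illustration-only names, not existing PipelineDP4j classes or functions. It only clamps and sums each value per partition; it adds no noise and does no contribution bounding or partition selection, so it is not differentially private. The point is only to show per-value params and results keyed by a unique name.

```kotlin
// Hypothetical sketch: none of these types exist in PipelineDP4j.

/** One value to aggregate, identified by a unique name. */
data class NamedValueExtractor<T>(val name: String, val extract: (T) -> Double)

/** Per-value parameters, repeated once for each named value. */
data class ValueParams(val minValue: Double, val maxValue: Double)

/**
 * Clamps each value to its bounds and sums it per partition.
 * Returns partition key -> (value name -> clamped sum).
 * No noise is added here, so the output is NOT differentially private.
 */
fun <T> aggregate(
    rows: List<T>,
    partitionKey: (T) -> String,
    extractors: List<NamedValueExtractor<T>>,
    params: Map<String, ValueParams>,
): Map<String, Map<String, Double>> {
    val names = extractors.map { it.name }
    // Names are what distinguishes the values, so they must be unique.
    require(names.toSet().size == names.size) { "Value names must be unique" }
    require(names.all { it in params }) { "Every value needs its own params" }

    return rows.groupBy(partitionKey).mapValues { (_, group) ->
        extractors.associate { extractor ->
            val p = params.getValue(extractor.name)
            extractor.name to group.sumOf {
                extractor.extract(it).coerceIn(p.minValue, p.maxValue)
            }
        }
    }
}

// Example input type, also hypothetical.
data class Visit(val restaurant: String, val spend: Double, val minutes: Double)

fun exampleResult(): Map<String, Map<String, Double>> {
    val visits = listOf(
        Visit("A", 12.0, 30.0),
        Visit("A", 500.0, 45.0), // spend is clamped to 100.0
        Visit("B", 8.0, 20.0),
    )
    return aggregate(
        rows = visits,
        partitionKey = { it.restaurant },
        extractors = listOf(
            NamedValueExtractor<Visit>("spend") { it.spend },
            NamedValueExtractor<Visit>("minutes") { it.minutes },
        ),
        params = mapOf(
            "spend" to ValueParams(0.0, 100.0),
            "minutes" to ValueParams(0.0, 60.0),
        ),
    )
}
```

With names as keys, the caller reads a result as `result.getValue("A").getValue("spend")` instead of matching extractor references, which is the explicit output format the first option is after.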
sakkumar