[BUG] Search on roll up indices with avg is not working when we have empty buckets #451
Description
Describe the bug
Searching rollup indices with an avg aggregation fails when the result contains empty buckets.
For example, if data is rolled up into the rollup index at 60-minute granularity and the rollup index is then searched at 5-minute granularity, the error shown below can occur. This is not an ideal customer use case, but empty buckets still need to be handled.
Rollup policy we tried:
curl -XPUT "localhost:9200/_opendistro/_rollup/jobs/latest_stats_roll_up" -H 'Content-Type: application/json' -d'{"rollup":{"enabled":true,"schedule":{"interval":{"period":1,"unit":"Minutes"}},"description":"An example policy that rolls up the sample ecommerce data","source_index":"fmstats_2021-06-03*","target_index":"temp_stats_roll_1","page_size":1000,"delay":0,"continuous":false,"dimensions":[{"date_histogram":{"source_field":"timestamp","fixed_interval":"60m"}},{"terms":{"source_field":"portIdToClusterId"}},{"terms":{"source_field":"alias"}}],"metrics":[{"source_field":"port.rx.packets","metrics":[{"avg":{}},{"sum":{}},{"max":{}},{"min":{}},{"value_count":{}}]}]}}'
Search query we tried:
curl -XPOST "localhost:9200/temp_stats_roll_up/_search?pretty" -H 'Content-Type: application/json' -d'{"size":0,"aggregations":{"daily_numbers":{"terms":{"field":"portIdToClusterId"},"aggregations":{"Sub_dateHistogramAgg":{"date_histogram":{"field":"timestamp","missing":0,"fixed_interval":"5m","offset":0,"order":{"_key":"asc"},"keyed":false,"min_doc_count":0},"aggregations":{"avgAggportTxPackets":{"avg":{"field":"port.tx.packets"}}}}}}}}'
The problem here is "fixed_interval":"5m": the search interval should be greater than the interval of the rollup's date_histogram dimension (60m).
After adding the following change, the search started working. Can you please validate it and fix accordingly?
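To illustrate why a 5-minute search over 60-minute rollup data yields empty buckets, here is a minimal sketch in Java. The class and method names are hypothetical and are not part of the plugin; it only assumes one rolled-up document per hour and a histogram that emits every bucket between the first and last document, as "min_doc_count":0 requests.

```java
import java.util.TreeSet;

// Hypothetical helper, for illustration only; not part of the plugin.
final class BucketGapSketch {
    // Counts histogram buckets that contain no document between the
    // first and the last occupied bucket (inclusive).
    static long countEmptyBuckets(long[] docMinutes, long intervalMinutes) {
        TreeSet<Long> occupied = new TreeSet<>();
        for (long m : docMinutes) {
            // Round each timestamp down to the start of its bucket.
            occupied.add(Math.floorDiv(m, intervalMinutes) * intervalMinutes);
        }
        if (occupied.isEmpty()) {
            return 0;
        }
        long total = (occupied.last() - occupied.first()) / intervalMinutes + 1;
        return total - occupied.size();
    }

    public static void main(String[] args) {
        // Assumed rolled-up documents at minute 0, 60 and 120 (60m rollup).
        long[] rolledUpDocs = {0, 60, 120};
        System.out.println(countEmptyBuckets(rolledUpDocs, 5));  // 22
        System.out.println(countEmptyBuckets(rolledUpDocs, 60)); // 0
    }
}
```

With a 5-minute interval, 11 of every 12 buckets between consecutive rolled-up documents are empty, and the avg reduce script is presumably evaluated for those buckets as well.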
diff --git a/src/main/kotlin/com/amazon/opendistroforelasticsearch/indexmanagement/rollup/util/RollupUtils.kt b/src/main/kotlin/com/amazon/opendistroforelasticsearch/indexmanagement/rollup/util/RollupUtils.kt
index f3d4b75..937106e 100644
--- a/src/main/kotlin/com/amazon/opendistroforelasticsearch/indexmanagement/rollup/util/RollupUtils.kt
+++ b/src/main/kotlin/com/amazon/opendistroforelasticsearch/indexmanagement/rollup/util/RollupUtils.kt
@@ -254,7 +254,7 @@ fun Rollup.rewriteAggregationBuilder(aggregationBuilder: AggregationBuilder): Ag
.combineScript(Script(ScriptType.INLINE, Script.DEFAULT_SCRIPT_LANG,
"def d = new long[2]; d[0] = state.sums; d[1] = state.counts; return d", emptyMap()))
.reduceScript(Script(ScriptType.INLINE, Script.DEFAULT_SCRIPT_LANG,
- "double sum = 0; double count = 0; for (a in states) { sum += a[0]; count += a[1]; } return sum/count", emptyMap()))
+ "double sum = 0; double count = 0; for (a in states) { if(a!=null && a.length > 0) sum += a[0]; if(a!=null && a.length > 0) count += a[1]; } return count!=0?sum/count:0", emptyMap()))
}
is MaxAggregationBuilder -> {
MaxAggregationBuilder(aggregationBuilder.name)
Error returned for the avg case (with the original script):
[root@fmha1 opendistro-index-management]# curl -XPOST "localhost:9200/temp*/_search?pretty" -H 'Content-Type: application/json' -d'{"size":0,"aggregations":{"daily_numbers":{"terms":{"field":"portIdToClusterId"},"aggregations":{"Sub_dateHistogramAgg":{"date_histogram":{"field":"timestamp","missing":0,"fixed_interval":"5m","offset":0,"order":{"_key":"asc"},"keyed":false,"min_doc_count":0},"aggregations":{"maxAggportRxPackets":{"avg":{"field":"port.rx.packets"}}}}}}}}'
{
"error" : {
"root_cause" : [ ],
"type" : "search_phase_execution_exception",
"reason" : "",
"phase" : "fetch",
"grouped" : true,
"failed_shards" : [ ],
"caused_by" : {
"type" : "script_exception",
"reason" : "runtime error",
"script_stack" : [
"sum += a[0]; ",
"^---- HERE"
],
"script" : "double sum = 0; double count = 0; for (a in states) { sum += a[0]; count += a[1]; } return sum/count",
"lang" : "painless",
"position" : {
"offset" : 54,
"start" : 54,
"end" : 67
},
"caused_by" : {
"type" : "null_pointer_exception",
"reason" : null
}
}
},
"status" : 400
}
Expected behavior
When the date histogram contains empty buckets, we need to ignore them and return some value such as 0 or NA.
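The expected behavior can be sketched outside Elasticsearch as plain Java, mirroring the reduce script from the proposed change. The class name is hypothetical, and the list of `long[]` pairs is an assumption standing in for the `states` that the combine script returns (sum at index 0, count at index 1). Unlike the proposed script, this sketch requires two elements before reading `a[1]`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Painless reduce script; illustration only.
final class SafeAvgReduceSketch {
    // Averages over per-shard {sum, count} pairs, skipping null or
    // incomplete states, and returns 0 when nothing was counted.
    static double reduce(List<long[]> states) {
        double sum = 0;
        double count = 0;
        for (long[] a : states) {
            if (a == null || a.length < 2) {
                continue; // empty bucket: no state from this shard
            }
            sum += a[0];
            count += a[1];
        }
        return count != 0 ? sum / count : 0;
    }

    public static void main(String[] args) {
        List<long[]> mixed = new ArrayList<>();
        mixed.add(new long[] {10, 2});
        mixed.add(null);
        mixed.add(new long[] {20, 3});
        System.out.println(reduce(mixed)); // 6.0

        List<long[]> onlyEmpty = new ArrayList<>();
        onlyEmpty.add(null);
        System.out.println(reduce(onlyEmpty)); // 0.0
    }
}
```

Returning 0 for an empty bucket matches the proposed change. If an "NA" result is preferred, the final line would need to return a sentinel such as null instead, which is a design choice for the maintainers.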
OpenDistro version : 1.13.2.0