one collector per agg request instead of per bucket #2759

PSeitz-dd · 2025-12-04T03:52:34Z

In this refactoring, each collector knows which bucket of the parent aggregation its data belongs to. This makes it possible to replace the previous approach of one collector per bucket with one collector per request.
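To illustrate the idea, here is a minimal sketch of a single collector that keeps its state indexed by the parent's bucket id. The names `BucketId`, `AvgState` and `AvgCollector` are hypothetical and are not the types used in this PR; the sketch only assumes that bucket ids are small dense integers.

```rust
/// Hypothetical id of a bucket in the parent aggregation (assumed dense).
type BucketId = u32;

/// Running sum and count for one parent bucket.
#[derive(Default, Debug, Clone, Copy, PartialEq)]
struct AvgState {
    sum: f64,
    count: u64,
}

/// One collector for the whole request: state is indexed by parent bucket id
/// instead of instantiating one collector per bucket.
#[derive(Default)]
struct AvgCollector {
    per_bucket: Vec<AvgState>,
}

impl AvgCollector {
    /// Record `values` as belonging to `bucket` of the parent aggregation.
    fn collect(&mut self, bucket: BucketId, values: &[f64]) {
        let idx = bucket as usize;
        // Grow the state vector lazily when a new bucket id shows up.
        if idx >= self.per_bucket.len() {
            self.per_bucket.resize(idx + 1, AvgState::default());
        }
        let state = &mut self.per_bucket[idx];
        for &v in values {
            state.sum += v;
            state.count += 1;
        }
    }

    /// Average for one bucket, or `None` if the bucket received no values.
    fn average(&self, bucket: BucketId) -> Option<f64> {
        let state = self.per_bucket.get(bucket as usize)?;
        if state.count == 0 {
            None
        } else {
            Some(state.sum / state.count as f64)
        }
    }
}
```

With this layout the parent only passes a bucket id along with the docs or values, so no collector has to be cloned per bucket.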

Add PagedTermMap as another TermAggregationMap to reduce memory usage compared to a HashMap

It contains an optimization for low-cardinality bucket ids
Remove Clone on the collector (we only have one instance now)
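As a rough illustration of why a paged structure can use less memory than a HashMap, here is a sketch of a map from dense term ordinals to counts that allocates fixed-size pages lazily. This is not the PagedTermMap from this PR: `PagedCounts`, `PAGE_SIZE` and the layout are assumptions made for the example only.

```rust
/// Assumed page size for the sketch; not taken from the PR.
const PAGE_SIZE: usize = 1024;

/// Hypothetical paged map from dense term ordinals to counts.
/// Pages are allocated only when a term ordinal inside them is first seen.
#[derive(Default)]
struct PagedCounts {
    pages: Vec<Option<Box<[u64; PAGE_SIZE]>>>,
}

impl PagedCounts {
    /// Increment the count for `term_ord`, allocating its page if needed.
    fn increment(&mut self, term_ord: u64) {
        let page_idx = (term_ord as usize) / PAGE_SIZE;
        let slot = (term_ord as usize) % PAGE_SIZE;
        if page_idx >= self.pages.len() {
            self.pages.resize_with(page_idx + 1, || None);
        }
        let page = self.pages[page_idx].get_or_insert_with(|| Box::new([0u64; PAGE_SIZE]));
        page[slot] += 1;
    }

    /// Count for `term_ord`; 0 if its page was never allocated.
    fn get(&self, term_ord: u64) -> u64 {
        let page_idx = (term_ord as usize) / PAGE_SIZE;
        let slot = (term_ord as usize) % PAGE_SIZE;
        match self.pages.get(page_idx) {
            Some(Some(page)) => page[slot],
            _ => 0,
        }
    }

    /// Number of pages that actually hold data.
    fn allocated_pages(&self) -> usize {
        self.pages.iter().filter(|p| p.is_some()).count()
    }
}
```

Compared to a HashMap, such a layout stores no keys and does no hashing, at the cost of wasted slots when the touched ordinals are sparse within a page.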

Future Work

Fetch all values for all buckets once per collector (currently each collect call fetches its own data per bucket)
Improve the performance of grouping by bucket id in the caching layer
Improve low-cardinality detection
Remove PerRequestAggSegCtx; everything can now be stored in the collector
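The first future-work item could look roughly like the sketch below: values are fetched once for the whole block of docs, and the per-bucket state is updated from a parallel slice of bucket ids. The functions `fetch_values` and `sum_per_bucket` and the slice-backed column are hypothetical stand-ins, not tantivy APIs.

```rust
/// Hypothetical column accessor: the column is a plain slice indexed by doc id.
fn fetch_values(column: &[f64], docs: &[u32], out: &mut Vec<f64>) {
    out.clear();
    out.extend(docs.iter().map(|&d| column[d as usize]));
}

/// Sum per bucket using a single fetch for all docs of all buckets.
/// `docs[i]` belongs to parent bucket `bucket_ids[i]`.
fn sum_per_bucket(
    column: &[f64],
    docs: &[u32],
    bucket_ids: &[u32],
    num_buckets: usize,
) -> Vec<f64> {
    assert_eq!(docs.len(), bucket_ids.len());
    let mut values = Vec::new();
    // One fetch for the whole block instead of one fetch per bucket.
    fetch_values(column, docs, &mut values);
    let mut sums = vec![0.0; num_buckets];
    for (&bucket, &value) in bucket_ids.iter().zip(values.iter()) {
        sums[bucket as usize] += value;
    }
    sums
}
```

For example, with column `[1.0, 2.0, 3.0, 4.0]`, docs `[0, 1, 3]` and bucket ids `[1, 0, 1]`, bucket 0 sums to 2.0 and bucket 1 to 5.0.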

Performance

The heavy hitters use drastically less memory and CPU (e.g. terms_all_unique_with_avg_sub_agg: -39.11% memory, -74.43% average time).
Term aggs with many terms use a lot less memory.
Some additional buffers are used to pass docs, which increases memory consumption for some aggs.

The biggest regression is
terms_zipf_1000_with_avg_sub_agg Avg: 9.1190ms (+39.79%)
which should be fixed once we fetch all values for all buckets at once.

full
average_u64                                    Memory: 21.7 KB (-2.30%)      Avg: 2.9480ms (-3.64%)      Median: 2.9303ms (-3.69%)      [2.8115ms .. 3.1867ms]       
average_f64                                    Memory: 21.7 KB (-2.30%)      Avg: 3.1238ms (-3.17%)      Median: 3.1146ms (-2.94%)      [3.0220ms .. 3.3337ms]       
average_f64_u64                                Memory: 23.0 KB (-0.48%)      Avg: 5.7650ms (-1.43%)      Median: 5.7655ms (-0.97%)      [5.6324ms .. 6.0755ms]       
stats_f64                                      Memory: 21.8 KB (-2.29%)      Avg: 3.1220ms (-2.72%)      Median: 3.1131ms (-2.49%)      [3.0185ms .. 3.3488ms]       
extendedstats_f64                              Memory: 23.0 KB (+3.22%)      Avg: 3.2849ms (-1.79%)      Median: 3.2591ms (-1.38%)      [3.1558ms .. 3.5704ms]       
percentiles_f64                                Memory: 39.4 KB (+37.37%)     Avg: 6.9987ms (-3.80%)      Median: 6.9900ms (-2.82%)      [6.9396ms .. 7.1972ms]       
terms_7                                        Memory: 36.7 KB (+3.00%)      Avg: 2.3431ms (+0.96%)      Median: 2.3425ms (+1.63%)      [2.2808ms .. 2.4372ms]       
terms_all_unique                               Memory: 14.7 MB (-50.24%)     Avg: 6.9163ms (-58.72%)     Median: 6.8885ms (-57.92%)     [6.7612ms .. 7.3554ms]       
terms_150_000                                  Memory: 3.0 MB (-55.96%)      Avg: 6.2613ms (-37.68%)     Median: 6.2406ms (-35.99%)     [6.1016ms .. 6.4347ms]       
terms_many_top_1000                            Memory: 5.2 MB (-33.38%)      Avg: 9.3138ms (-29.50%)     Median: 9.2938ms (-27.81%)     [9.1127ms .. 9.8866ms]       
terms_many_order_by_term                       Memory: 3.0 MB (-55.96%)      Avg: 4.9930ms (-58.18%)     Median: 4.9816ms (-57.86%)     [4.8966ms .. 5.2057ms]       
terms_many_with_top_hits                       Memory: 50.0 MB (-11.50%)     Avg: 96.4471ms (-40.65%)    Median: 94.6858ms (-41.67%)    [91.5090ms .. 113.9275ms]    
terms_all_unique_with_avg_sub_agg              Memory: 56.7 MB (-39.11%)     Avg: 17.5857ms (-74.43%)    Median: 17.4918ms (-74.10%)    [17.0980ms .. 19.1589ms]     
terms_many_with_avg_sub_agg                    Memory: 13.5 MB (-34.50%)     Avg: 15.2693ms (-45.67%)    Median: 15.1739ms (-44.65%)    [14.9133ms .. 17.1725ms]     
terms_status_with_avg_sub_agg                  Memory: 101.3 KB (+65.18%)    Avg: 6.2418ms (+13.02%)     Median: 6.2257ms (+13.12%)     [6.1074ms .. 6.4710ms]       
terms_status_with_histogram                    Memory: 137.1 KB (+28.02%)    Avg: 6.1062ms (+12.70%)     Median: 6.0941ms (+13.20%)     [6.0223ms .. 6.3640ms]       
terms_zipf_1000                                Memory: 69.2 KB (-12.86%)     Avg: 2.2407ms (+5.06%)      Median: 2.2369ms (+5.27%)      [2.2020ms .. 2.3248ms]       
terms_zipf_1000_with_histogram                 Memory: 1.2 MB (+20.39%)      Avg: 24.0328ms (+5.64%)     Median: 23.9883ms (+5.50%)     [23.8058ms .. 25.1477ms]     
terms_zipf_1000_with_avg_sub_agg               Memory: 463.4 KB (+27.36%)    Avg: 9.1190ms (+39.79%)     Median: 9.0712ms (+41.72%)     [8.8888ms .. 9.8575ms]       
terms_many_json_mixed_type_with_avg_sub_agg    Memory: 20.6 MB (-20.72%)     Avg: 25.2443ms (-41.91%)    Median: 25.1315ms (-41.44%)    [24.6717ms .. 26.7138ms]     
cardinality_agg                                Memory: 3.7 MB (-0.01%)       Avg: 30.6559ms (+3.37%)     Median: 29.7482ms (+0.62%)     [28.5122ms .. 35.9346ms]     
terms_status_with_cardinality_agg              Memory: 5.5 MB (+0.78%)       Avg: 74.4157ms (+4.42%)     Median: 72.5500ms (+2.05%)     [69.0952ms .. 87.0695ms]     
range_agg                                      Memory: 25.2 KB (-5.33%)      Avg: 3.4807ms (+7.94%)      Median: 3.3977ms (+5.44%)      [3.2741ms .. 4.0961ms]       
range_agg_with_avg_sub_agg                     Memory: 95.5 KB (+81.82%)     Avg: 7.2579ms (+3.35%)      Median: 7.1267ms (+1.81%)      [6.9252ms .. 8.1604ms]       
range_agg_with_term_agg_status                 Memory: 109.4 KB (+62.59%)    Avg: 6.7002ms (-69.18%)     Median: 6.5399ms (-69.70%)     [6.3694ms .. 7.7095ms]       
range_agg_with_term_agg_many                   Memory: 6.9 MB (+0.32%)       Avg: 14.4014ms (-53.85%)    Median: 14.3027ms (-54.12%)    [13.6193ms .. 15.9501ms]     
histogram                                      Memory: 22.6 KB (+0.67%)      Avg: 3.1483ms (-2.77%)      Median: 3.1182ms (-2.91%)      [3.0445ms .. 3.3583ms]       
histogram_hard_bounds                          Memory: 20.6 KB (+3.46%)      Avg: 1.6982ms (+3.78%)      Median: 1.6709ms (+2.33%)      [1.6003ms .. 1.9916ms]       
histogram_with_avg_sub_agg                     Memory: 122.7 KB (+68.32%)    Avg: 9.6292ms (+5.66%)      Median: 9.5913ms (+6.64%)      [9.3812ms .. 9.9690ms]       
histogram_with_term_agg_status                 Memory: 492.5 KB (+4.78%)     Avg: 12.9491ms (-34.46%)    Median: 12.8700ms (-34.61%)    [12.6297ms .. 13.8100ms]     
avg_and_range_with_avg_sub_agg                 Memory: 82.5 KB (+96.81%)     Avg: 10.1590ms (+3.49%)     Median: 10.0878ms (+3.37%)     [9.9766ms .. 10.6046ms]      
filter_agg_all_query_count_agg                 Memory: 139.3 KB (+21.71%)    Avg: 4.5550ms (+16.74%)     Median: 4.5375ms (+16.39%)     [4.4388ms .. 4.8239ms]       
filter_agg_term_query_count_agg                Memory: 140.0 KB (+21.85%)    Avg: 7.0402ms (+9.54%)      Median: 7.0113ms (+12.72%)     [6.9111ms .. 7.3698ms]       
filter_agg_all_query_with_sub_aggs             Memory: 157.1 KB (+19.28%)    Avg: 9.6685ms (+7.96%)      Median: 9.6228ms (+8.09%)      [9.4179ms .. 10.1717ms]      
filter_agg_term_query_with_sub_aggs            Memory: 157.5 KB (+19.23%)    Avg: 12.1100ms (+5.88%)     Median: 12.0301ms (+5.82%)     [11.8841ms .. 12.4835ms]

In this refactoring, each collector knows which bucket of the parent aggregation its data belongs to. This makes it possible to replace the previous approach of one collector per bucket with one collector per request.

low-cardinality bucket optimization

use paged term map in term agg

use special no-sub-agg term map impl

increase cache to 2048

remove Clone

move data in term req

single-doc optimization for stats

fulmicoton · 2025-12-11T12:55:59Z