[processor/isolationforest] Add adaptive window sizing capability #42751

@VenuEmmadi

Description

Component(s)

processor/isolationforest

Is your feature request related to a problem? Please describe.

The current isolation forest processor uses fixed window sizes (e.g., training_window: 24h), which creates inefficiencies: memory waste during low-traffic periods, resource exhaustion during spikes, and suboptimal anomaly detection when data patterns change. Each environment and workload has to be tuned manually.

Describe the solution you'd like

Implement adaptive window sizing that automatically adjusts the training window based on data velocity, memory usage, and model stability.

Configuration:

# ─── NEW: adaptive window sizing ──────────────────────────────
    adaptive_window:
      # All parameters optional except 'enabled' - sensible defaults applied
      enabled:          true         # when false, uses static training_window: 24h
      min_window_size:  1000         # minimum samples to keep, should be ≥ min_samples for consistency
      max_window_size:  100000       # maximum samples (memory protection), auto-calculated from memory_limit if not specified
      memory_limit_mb:  256          # auto-shrink when exceeded, should be ≤ max_memory_mb (leave room for other components)
      adaptation_rate:  0.1          # adjustment speed (0.0-1.0)
      # Optional with defaults:
      velocity_threshold: 50        # default: grow when >50 samples/sec
      stability_check_interval: 5m  # default: check every 5 minutes
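
To make the constraints in the comments above concrete, here is a minimal Go sketch of how the proposed block might be represented and validated. The type name `AdaptiveWindowConfig`, its field names, and the helper `exampleAdaptiveWindowConfig` are hypothetical and are not part of the existing processor; the values are simply the sample values from the YAML above.

```go
package main

import (
	"errors"
	"time"
)

// AdaptiveWindowConfig mirrors the proposed adaptive_window block.
// Hypothetical: the type and field names are assumptions for illustration,
// not the processor's actual configuration API.
type AdaptiveWindowConfig struct {
	Enabled                bool          `mapstructure:"enabled"`
	MinWindowSize          int           `mapstructure:"min_window_size"`
	MaxWindowSize          int           `mapstructure:"max_window_size"`
	MemoryLimitMB          int           `mapstructure:"memory_limit_mb"`
	AdaptationRate         float64       `mapstructure:"adaptation_rate"`
	VelocityThreshold      float64       `mapstructure:"velocity_threshold"`
	StabilityCheckInterval time.Duration `mapstructure:"stability_check_interval"`
}

// exampleAdaptiveWindowConfig returns the sample values shown in the proposal.
func exampleAdaptiveWindowConfig() AdaptiveWindowConfig {
	return AdaptiveWindowConfig{
		Enabled:                true,
		MinWindowSize:          1000,
		MaxWindowSize:          100000,
		MemoryLimitMB:          256,
		AdaptationRate:         0.1,
		VelocityThreshold:      50,
		StabilityCheckInterval: 5 * time.Minute,
	}
}

// Validate checks the ranges stated in the proposal's comments.
// When adaptive sizing is disabled, the other fields are ignored.
func (c AdaptiveWindowConfig) Validate() error {
	if !c.Enabled {
		return nil
	}
	if c.MinWindowSize <= 0 {
		return errors.New("min_window_size must be positive")
	}
	if c.MaxWindowSize < c.MinWindowSize {
		return errors.New("max_window_size must be >= min_window_size")
	}
	if c.MemoryLimitMB <= 0 {
		return errors.New("memory_limit_mb must be positive")
	}
	if c.AdaptationRate < 0 || c.AdaptationRate > 1 {
		return errors.New("adaptation_rate must be within 0.0-1.0")
	}
	if c.VelocityThreshold <= 0 {
		return errors.New("velocity_threshold must be positive")
	}
	if c.StabilityCheckInterval <= 0 {
		return errors.New("stability_check_interval must be positive")
	}
	return nil
}
```

Cross-field checks against existing settings (min_samples, max_memory_mb) are left out here because they depend on the processor's current config struct.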

Core Algorithm:

  • Data velocity tracking: Measure samples/second to grow window during high traffic
  • Memory monitoring: Shrink window when approaching memory limits
  • Model stability: Expand window if anomaly detection accuracy drops
  • Gradual adjustment: Prevent thrashing with smooth size changes
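
The four points above could be combined roughly as follows. This is a sketch under stated assumptions, not a proposed implementation: the names `windowParams`, `windowSignals`, and `nextWindowSize` are hypothetical, memory pressure is assumed to take priority over growth signals, and "gradual adjustment" is assumed to mean moving a fraction (`adaptation_rate`) of the distance toward the target on each check.

```go
package main

import "math"

// windowParams holds the tuning knobs from the proposed adaptive_window block.
// Hypothetical type: names are illustrative only.
type windowParams struct {
	MinSize           int
	MaxSize           int
	MemoryLimitMB     float64
	AdaptationRate    float64 // fraction of the gap closed per check (0.0-1.0)
	VelocityThreshold float64 // samples/sec above which the window grows
}

// windowSignals is one observation taken at each stability check.
type windowSignals struct {
	SamplesPerSec float64
	MemoryUsedMB  float64
	ModelUnstable bool // assumed to be set when detection quality drops
}

// nextWindowSize returns the window size to use after one check.
func nextWindowSize(current int, s windowSignals, p windowParams) int {
	target := current
	switch {
	case s.MemoryUsedMB >= p.MemoryLimitMB:
		// Memory monitoring: shrink toward the minimum (assumed to win over growth).
		target = p.MinSize
	case s.SamplesPerSec > p.VelocityThreshold || s.ModelUnstable:
		// Data velocity or model instability: grow toward the maximum.
		target = p.MaxSize
	}

	// Gradual adjustment: close only a fraction of the gap per check.
	step := math.Round(p.AdaptationRate * float64(target-current))
	next := current + int(step)

	// Clamp to the configured bounds.
	if next < p.MinSize {
		next = p.MinSize
	}
	if next > p.MaxSize {
		next = p.MaxSize
	}
	return next
}
```

With the sample values from the configuration (min 1000, max 100000, rate 0.1, threshold 50), a window of 10000 samples seeing 100 samples/sec would grow to 19000 on the next check, and the same window over the memory limit would shrink to 9100.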

Describe alternatives you've considered

  • Manual configuration profiles (dev/staging/prod) - require ongoing maintenance
  • Time-based auto-scaling (hourly patterns) - less flexible than a data-driven approach
  • External memory monitoring - adds deployment complexity
  • Fixed larger buffers - wasteful and do not solve the core problem

Additional context

Research shows adaptive isolation forests achieve 99.22% accuracy vs static approaches. The sliding buffer technique with forget/learn mechanisms is proven effective for streaming anomaly detection. This enhancement builds on the existing comprehensive test suite and follows OpenTelemetry's configuration patterns.
