
[RFC] Add upper bound parameter for min_max normalization technique #1440

Open
@ryanbogan

Description


Introduction

This document discusses the design of the upper bound feature for min-max score normalization technique in OpenSearch's hybrid search capability, complementing the existing lower bound feature.

Problem Statement

The current min-max normalization can produce misleading relevancy scores when the theoretical maximum score is known but differs from the actual maximum score in the result set. In neural/k-NN search scenarios where scores have known theoretical bounds (e.g., [0.75, 1.0]), the current normalization can overstate document relevance by normalizing to the actual maximum score rather than the theoretical maximum. Users who need more precise control over score normalization can use the upper bound feature to improve the relevance of their results.

Requirements

Functional Requirements

  1. Support configurable upper bounds at the sub-query level
  2. Provide a way to define an upper bound score, which can be ignored if needed
  3. Allow independent upper bound configuration for each sub-query
  4. Ensure proper interaction with the lower bound feature while maintaining its existing behavior

Non-Functional Requirements

  1. Minimal performance impact on score normalization

Current State

The min-max normalization technique currently:

  • Uses actual retrieved scores to find minimum and maximum scores for normalization
  • Has a lower bound feature implemented through LowerBound class with an inner Mode enum (APPLY, CLIP, IGNORE)
  • Contains bound-related logic directly within the normalization class

Current Score Calculation Formula

normalized_score = (score - min_score) / (max_score - min_score)

Note: the effective min_score depends on the LowerBound.Mode in use

Example


Consider a scenario where scores theoretically range from 0.0 to 1.0. When a query returns scores [0.75, 0.76, 0.77], the current normalization process treats:

  • 0.75 as the minimum, normalizing it to 0.0
  • 0.77 as the maximum, normalizing it to 1.0
  • 0.76 as the midpoint, normalizing it to 0.5

While the existing lower bound feature can address score distortion at the lower end by setting a minimum threshold, there is no equivalent mechanism for the upper end. This creates a significant distortion in relevancy representation: although all scores are clustered between 0.75 and 0.77, normalization spreads them across the entire range from 0.0 to 1.0, suggesting much larger relevancy differences than actually exist. The current implementation cannot contextualize these scores within their theoretical range, where they all represent highly relevant documents close to the maximum possible value of 1.0.
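The distortion is easy to reproduce with a small sketch (plain Python, illustrative only; the actual implementation lives in the Java MinMaxScoreNormalizationTechnique class):

```python
def min_max_normalize(scores, min_score=None, max_score=None):
    """Min-max normalize, optionally against supplied bounds
    instead of the observed minimum/maximum."""
    lo = min(scores) if min_score is None else min_score
    hi = max(scores) if max_score is None else max_score
    return [(s - lo) / (hi - lo) for s in scores]

scores = [0.75, 0.76, 0.77]

# Current behavior: observed min/max stretch the cluster across [0, 1].
print(min_max_normalize(scores))  # [0.0, 0.5, 1.0]

# Against the theoretical range [0.0, 1.0], closeness to the maximum is preserved.
print(min_max_normalize(scores, min_score=0.0, max_score=1.0))  # [0.75, 0.76, 0.77]
```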

Solution HLD

Proposed Solution


The proposed solution introduces an upper bound feature to complement the existing lower bound functionality in the min-max score normalization technique. This will be achieved through the following architectural changes:

  1. Abstract Base Class: Create a new ScoreBound abstract class to encapsulate common behavior for both upper and lower bounds.
  2. Bound Mode Enum: Extract the existing LowerBound.Mode into a standalone BoundMode enum to be used by both bound types.
  3. Upper Bound Implementation: Introduce a new UpperBound class extending ScoreBound to handle upper bound logic.
  4. Refactor Existing Lower Bound: Modify the LowerBound class to extend ScoreBound and use the new BoundMode enum.
  5. Enhanced Normalization Technique: Update MinMaxScoreNormalizationTechnique to support both upper and lower bounds using a common interface.
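A rough sketch of the proposed hierarchy (shown in Python for brevity; the actual implementation would be Java classes in the plugin, and the method/field names here are illustrative, not the real API):

```python
from abc import ABC
from dataclasses import dataclass
from enum import Enum


class BoundMode(Enum):
    """Standalone mode enum shared by both bound types
    (extracted from the existing LowerBound.Mode)."""
    APPLY = "apply"
    CLIP = "clip"
    IGNORE = "ignore"


@dataclass
class ScoreBound(ABC):
    """Common behavior for upper and lower bounds."""
    mode: BoundMode = BoundMode.IGNORE
    bound_score: float = 0.0


class LowerBound(ScoreBound):
    """Refactored to extend ScoreBound and use BoundMode; behavior unchanged."""


class UpperBound(ScoreBound):
    """New class mirroring LowerBound for the upper end of the score range."""
```

The symmetry means MinMaxScoreNormalizationTechnique can treat both bounds through the shared ScoreBound interface rather than duplicating bound-handling logic.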

API Configuration

{
  "normalization": {
    "technique": "min_max",
    "parameters": {
      "lower_bounds": [
        { 
          "mode": "apply",
          "min_score": 0.0
        },
        { 
          "mode": "clip",
          "min_score": 0.0
        },
        {
          "mode": "ignore"
        }
      ],
      "upper_bounds": [
        {
          "mode": "apply",
          "max_score": 1.0
        },
        {
          "mode": "clip",
          "max_score": 1.0
        },
        {
          "mode": "ignore"
        }
      ]
    }
  }
}
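Each entry in lower_bounds/upper_bounds corresponds positionally to one sub-query. A minimal parsing sketch (Python, illustrative only; field names follow the JSON above, and the fallback defaults shown are assumptions, not confirmed behavior):

```python
def parse_upper_bounds(params):
    """Parse the upper_bounds list from the technique parameters.

    Assumed defaults (illustrative): mode falls back to "ignore" and
    max_score to 1.0 when omitted.
    """
    bounds = []
    for entry in params.get("upper_bounds", []):
        mode = entry.get("mode", "ignore")
        if mode not in ("apply", "clip", "ignore"):
            raise ValueError(f"unknown bound mode: {mode}")
        bounds.append((mode, entry.get("max_score", 1.0)))
    return bounds

params = {
    "upper_bounds": [
        {"mode": "apply", "max_score": 1.0},
        {"mode": "clip", "max_score": 1.0},
        {"mode": "ignore"},
    ]
}
print(parse_upper_bounds(params))
# [('apply', 1.0), ('clip', 1.0), ('ignore', 1.0)]
```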

Key Design Decisions

Standalone Bound Mode Enum

  • Decision: Extract Mode from LowerBound into a separate BoundMode enum
  • Rationale: Allows shared use between upper and lower bounds, improving consistency and maintainability

Symmetrical Upper Bound Implementation

  • Decision: Implement UpperBound similarly to LowerBound
  • Rationale: Provides a consistent API and behavior for users, simplifying understanding and usage

Minimal Changes to Existing API

  • Decision: Extend the current configuration structure by adding upper_bounds alongside lower_bounds, without modifying the existing lower_bounds structure or behavior
  • Rationale: Addresses the functional requirement to maintain current functionality for lower bounds. Ensures proper interaction between upper and lower bounds while preserving existing lower bound behavior, allowing users to adopt the new feature without impacting their current queries

Bound Processing in Normalization Technique

  • Decision: Process both bounds within the normalizeSingleScore method
  • Rationale: Centralizes bound logic, ensuring correct interaction between upper and lower bounds

Solution LLD


New Score Calculation Formula

normalized_score = (score - effective_min_score) / (effective_max_score - effective_min_score)
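How the effective bounds are derived depends on each bound's mode. A sketch of one plausible semantics (Python, illustrative; the exact mode behavior here — "apply" substituting the bound into the formula, "clip" additionally clamping the raw score — is an assumption modeled on the existing lower bound feature):

```python
def normalize_single_score(score, min_score, max_score,
                           lower=("ignore", 0.0), upper=("ignore", 1.0)):
    """Normalize one score, taking both bound modes into account.

    lower/upper are (mode, bound_score) pairs; min_score/max_score are
    the observed minimum and maximum of the sub-query's result set.
    """
    lo_mode, lo_bound = lower
    hi_mode, hi_bound = upper

    # "clip" clamps the raw score into the bounded range first.
    if lo_mode == "clip":
        score = max(score, lo_bound)
    if hi_mode == "clip":
        score = min(score, hi_bound)

    # "apply" and "clip" substitute the bound as the effective min/max;
    # "ignore" keeps the observed value.
    effective_min = lo_bound if lo_mode in ("apply", "clip") else min_score
    effective_max = hi_bound if hi_mode in ("apply", "clip") else max_score

    return (score - effective_min) / (effective_max - effective_min)

# With theoretical bounds [0.0, 1.0] applied, clustered scores stay clustered.
scores = [0.75, 0.76, 0.77]
normalized = [normalize_single_score(s, min(scores), max(scores),
                                     lower=("apply", 0.0), upper=("apply", 1.0))
              for s in scores]
print(normalized)  # [0.75, 0.76, 0.77]
```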

Preliminary Benchmarking

Initial benchmarking shows improvements in relevance metrics when using bounds in some scenarios. Here are two examples:

Example 1: Upper Bounds (nfcorpus dataset)

| Metric   | Default | With Upper Bound | Improvement |
|----------|---------|------------------|-------------|
| NDCG@5   | 0.3343  | 0.3379           | +1.10%      |
| NDCG@10  | 0.3030  | 0.3017           | -0.40%      |
| NDCG@100 | 0.2671  | 0.2691           | +0.70%      |

Example 2: Combined Lower/Upper Bounds (TREC-COVID dataset)

| Metric   | Default | With Bounds | Improvement |
|----------|---------|-------------|-------------|
| NDCG@5   | 0.6025  | 0.6707      | +11.30%     |
| NDCG@10  | 0.5518  | 0.6218      | +12.70%     |
| NDCG@100 | 0.3859  | 0.4318      | +11.90%     |

Note: These results are from specific test configurations. Results may vary depending on the nature of queries, index settings, and characteristics of the dataset.

Testing

Unit Tests:

  • Upper bound configuration parsing
  • Score normalization with different modes
  • Integration with lower bounds
  • Edge cases and error conditions

Integration Tests:

  • All three upper bound modes
  • Integration with lower bounds

Community Feedback

We appreciate all feedback from the community on this RFC. In addition, we are particularly interested in your thoughts on the following questions:

  1. Would you prefer additional configuration options beyond what's proposed?
  2. How should the system behave when both upper and lower bounds are specified in potentially conflicting ways?
  3. How would you combine this with other scoring techniques in your current implementations?
  4. What types of examples would help you understand when and how to use upper bounds effectively?

Please share your feedback through comments on this RFC, GitHub issues, or pull requests with proposed changes.
