[Feature Request]: Change default behavior of ranker from intersection to union

### Describe the problem

```python
r1 = Knn(query=query)
r2 = Knn(query=query)
search = Search(
    rank=r1 + r2,
)
```

Each Knn operator finds its own set of K documents, and in general the two sets differ. Depending on the value of `default` in the Knn operator, when evaluating `r1 + r2`, the executor returns either the union or the intersection of these two sets. By default, `default=None`, which makes the executor return the intersection.
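To make the two modes concrete, here is a minimal pure-Python sketch of the behavior as I understand it. It does not call Chroma: `combine` and the score dicts are hypothetical stand-ins for the executor, and for simplicity both rankers share a single `default`.

```python
from typing import Optional

# Hypothetical stand-in for the executor: each ranker yields {doc_id: score}.
def combine(a: dict[str, float], b: dict[str, float],
            default: Optional[float] = None) -> dict[str, float]:
    """Add two score maps the way `r1 + r2` is described above."""
    if default is None:
        # Intersection: a document must be scored by both rankers.
        ids = a.keys() & b.keys()
        return {i: a[i] + b[i] for i in ids}
    # Union: a document missing from one ranker gets `default` there.
    ids = a.keys() | b.keys()
    return {i: a.get(i, default) + b.get(i, default) for i in ids}

r1 = {"doc1": 0.1, "doc2": 0.2, "doc3": 0.3}
r2 = {"doc2": 0.2, "doc4": 0.4}

print(sorted(combine(r1, r2)))                # only the shared document
print(sorted(combine(r1, r2, default=10.0)))  # every document from either ranker
```

With `default=None` only `doc2` survives, even though each ranker returned documents of its own.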

I propose that we make union the default behavior.

## It's easy to stumble into this behavior

Suppose someone starts with a normal Knn search, and then later adds another ranker to the expression:

```python
search = Search(
    # rank=Knn(query=query),
    rank=Knn(query=query) + .3 * Knn(query=query, key="sparse_embedding"),
)
```

All of a sudden, the number of documents returned in the search keeps changing! Is there something wrong with my sparse vectors? It can be very confusing!

Imagine another scenario where I want to implement query expansion. This is what would happen:

```python
queries = llm_expand_query(query)
search = Search(
    rank=sum(Knn(query=q) for q in queries),
)
```

All of a sudden, the search API is returning 0 documents!
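The reason is easy to see with plain sets. The sketch below uses made-up top-3 id sets for four expanded queries (not real Chroma output) and compares intersecting them with taking their union.

```python
from functools import reduce

# Hypothetical top-3 result ids for each expanded query.
results_per_query = [
    {"a", "b", "c"},
    {"b", "c", "d"},
    {"c", "d", "e"},
    {"e", "f", "g"},
]

# Intersection keeps only ids present in every result set.
intersection = reduce(set.intersection, results_per_query)
# Union keeps ids present in at least one result set.
union = reduce(set.union, results_per_query)

print(len(intersection))  # 0
print(len(union))         # 7
```

Every extra query can only shrink the intersection, so the more expansions I add, the fewer documents come back.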

## Other

* The intersection default assumes precision >> recall. Intersection is good if you only want to show documents deemed relevant by multiple sources. But Chroma is designed primarily for AI applications: LLMs are generally pretty lenient about precision nowadays, while recall is _very_ important. There is a large risk that relevant documents are left out of the intersected set.
* Intersection is only theoretically good for hybrid search. In practice, I've found very little overlap between dense and sparse search results. It also does not make sense outside of hybrid search: for example, you would not expect overlap if you used it for query expansion.
* I was a little suspicious that commutativity and associativity might be broken because of this. In my tests they don't seem to be, but will our users worry about the order in which these rankers are combined?
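For the last point, this is roughly the kind of check I mean, written against a toy model rather than the real executor. `Ranked` is a hypothetical class: a ranker with `default=None` restricts the result to its own documents, otherwise missing documents get its default. Scores are small whole numbers so that floating-point rounding cannot affect the comparison.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Optional

@dataclass
class Ranked:
    """Toy model of a ranker: per-document scores plus an optional default."""
    scores: dict
    default: Optional[float]

    def __add__(self, other: "Ranked") -> "Ranked":
        ids = set(self.scores) | set(other.scores)
        # A side with default=None only allows documents it scored itself.
        if self.default is None:
            ids &= set(self.scores)
        if other.default is None:
            ids &= set(other.scores)
        merged = {
            i: self.scores.get(i, self.default) + other.scores.get(i, other.default)
            for i in ids
        }
        # The sum has a default only if both sides have one.
        if self.default is None or other.default is None:
            default = None
        else:
            default = self.default + other.default
        return Ranked(merged, default)

a = Ranked({"d1": 1.0, "d2": 2.0}, default=None)
b = Ranked({"d2": 3.0, "d3": 4.0}, default=8.0)
c = Ranked({"d3": 5.0, "d4": 6.0}, default=16.0)

# Try every ordering and both groupings of the three rankers.
results = []
for x, y, z in permutations([a, b, c]):
    results.append((x + y) + z)
    results.append(x + (y + z))

all_equal = all(r == results[0] for r in results)
print(all_equal)
```

In this model every ordering and grouping gives the same result, but that says nothing about whether users will trust it without documentation.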

### Describe the proposed solution

1. I wish we could simply change the default value of `Knn(default=None)` to the "lowest value", but that value depends on context: when `use_rank=True` it should be `f32.MAX`, and when `use_rank=False` it should be `f32.MIN`. A single static default therefore does not work.
2. This is important enough to me that I would change the behavior silently: disable "intersection" mode and only allow union mode. When `default=None`, the executor would treat `None` as the "lowest" value for the context.
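A rough sketch of what option 2 could look like, in Python for illustration only. `resolve_default` is a hypothetical helper, not an existing Chroma function, and I am assuming `f32.MAX` and `f32.MIN` mean the largest finite 32-bit float and its negation.

```python
import struct
from typing import Optional

# Largest finite 32-bit float, decoded from its bit pattern; MIN is its negation.
F32_MAX = struct.unpack("<f", struct.pack("<I", 0x7F7FFFFF))[0]
F32_MIN = -F32_MAX

def resolve_default(default: Optional[float], use_rank: bool) -> float:
    """Hypothetical helper: map `default=None` to the context's "lowest" value."""
    if default is not None:
        return default  # an explicit default always wins
    return F32_MAX if use_rank else F32_MIN

print(resolve_default(None, use_rank=True))
print(resolve_default(None, use_rank=False))
print(resolve_default(1.5, use_rank=True))
```

The point is that `None` is resolved at evaluation time, when the executor knows which mode it is in, instead of being fixed in the `Knn` signature.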

### Alternatives considered

_No response_

### Importance

would make my life easier

### Additional Information

_No response_

[Feature Request]: Change default behavior of ranker from intersection to union #5852
