
Integrate SparseEncoder model from SentenceTransformers v5 #2873

@arthurbr11

Description


Hello!

Context

We released Sentence Transformers v5 yesterday, introducing a new SparseEncoder model category. It is a subclass of SentenceTransformer that outputs sparse vectors instead of dense embeddings.
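For reference, a minimal usage sketch (the model name is taken from the v5 release examples; exact defaults may differ):

```python
from sentence_transformers import SparseEncoder

# Load a sparse retrieval model (example model from the v5 release notes)
model = SparseEncoder("naver/splade-cocondenser-ensembledistil")

sentences = ["The weather is lovely today.", "It's so sunny outside!"]

# encode() returns sparse tensors instead of dense embeddings
embeddings = model.encode(sentences)

# The model ships its own similarity function that understands sparse tensors
similarities = model.similarity(embeddings, embeddings)
```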

Feature Description

Add support for SparseEncoder models in MTEB. I tested the current version and found one main blocking issue in the SentenceTransformerWrapper that prevents compatibility:

```python
embeddings = embeddings.cpu().detach().float().numpy()
```

The `.numpy()` call breaks when applied to a sparse tensor, and densifying the embeddings at this point is not desirable anyway.
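One possible direction, sketched here purely as an illustration (convert_embeddings is a hypothetical helper, not existing MTEB code), would be to skip the NumPy conversion when the tensor is sparse:

```python
import torch

def convert_embeddings(embeddings: torch.Tensor):
    # Hypothetical replacement for the unconditional conversion above:
    # only call .numpy() on dense tensors.
    embeddings = embeddings.cpu().detach()
    if embeddings.is_sparse:
        # .numpy() raises on sparse layouts; keep the torch sparse tensor
        # so the model's own similarity function can consume it directly.
        return embeddings
    return embeddings.float().numpy()
```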

Questions

1. Is there a specific reason for the `.numpy()` conversion? Could it be made conditional for sparse models (along the lines of the sketch above)?
2. For similarity computation, sparse models ship an optimized similarity function that should be used instead of MTEB's standard similarity functions. Will that always be possible? If the dense similarity path is used, it will either break or be very slow (see the sketch after this list).
3. Since MTEB has no sparse indexes, encoding in chunks would be the best way to keep the similarity computation reasonable. Is there a plan to handle other index types?
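To make questions 2 and 3 concrete, here is a rough sketch of chunked encoding and scoring; chunked_search and chunk_size are illustrative names, not existing MTEB API:

```python
import torch
from sentence_transformers import SparseEncoder

def chunked_search(model: SparseEncoder, queries: list[str],
                   corpus: list[str], chunk_size: int = 1024) -> torch.Tensor:
    # Illustrative only: encode and score the corpus chunk by chunk so the
    # intermediate similarity matrices stay small, without a sparse index.
    query_embeddings = model.encode(queries)
    score_chunks = []
    for start in range(0, len(corpus), chunk_size):
        chunk_embeddings = model.encode(corpus[start:start + chunk_size])
        # Use the model's sparse-aware similarity rather than MTEB's dense
        # path, which would break on sparse tensors or be very slow.
        score_chunks.append(model.similarity(query_embeddings, chunk_embeddings))
    return torch.cat(score_chunks, dim=1)  # (num_queries, len(corpus))
```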

This would probably be part of a bigger refactor of the Sentence Transformers handling, together with #2871 and other possible features.

Would love to hear your thoughts on this!

cc @tomaarsen

Arthur Bresnu
