Describe the bug
I just commented out the stage 1 code to get rerank results from the CrossEncoder directly, but the rerank results are very poor. The code looks like this:
from mteb import MTEB
import mteb
from sentence_transformers import CrossEncoder, SentenceTransformer

dual_encoder = SentenceTransformer("all-MiniLM-L6-v2")
cross_encoder = CrossEncoder("cross-encoder/ms-marco-TinyBERT-L-2-v2")

tasks = mteb.get_tasks(tasks=["NFCorpus"], languages=["eng"])
subset = "default"  # subset name used in the NFCorpus dataset
eval_splits = ["test"]

evaluation = MTEB(tasks=tasks)

# evaluation.run(  # no stage 1
#     dual_encoder,
#     eval_splits=eval_splits,
#     save_predictions=True,
#     output_folder="results/stage1",
# )

evaluation.run(
    cross_encoder,
    eval_splits=eval_splits,
    # top_k=500,
    save_predictions=True,
    output_folder="results/stage2",
    # previous_results=f"results/stage1/NFCorpus_{subset}_predictions.json",  # get results directly by CrossEncoder
)
Here are some of the stage 1 results from before I made this change:
"recall_at_1": 0.04323,
"recall_at_3": 0.09053,
"recall_at_5": 0.11959,
"recall_at_10": 0.15499,
"recall_at_20": 0.18891,
"recall_at_100": 0.31151,
"recall_at_1000": 0.63211,
"precision_at_1": 0.41486,
"precision_at_3": 0.34881,
"precision_at_5": 0.30279,
"precision_at_10": 0.24334,
"precision_at_20": 0.17817,
"precision_at_100": 0.07981,
"precision_at_1000": 0.02096,
And these are the results from running the CrossEncoder directly:
"recall_at_1": 0.00784,
"recall_at_3": 0.01523,
"recall_at_5": 0.01939,
"recall_at_10": 0.02783,
"recall_at_20": 0.03382,
"recall_at_100": 0.05049,
"recall_at_1000": 0.12993,
"precision_at_1": 0.13313,
"precision_at_3": 0.11765,
"precision_at_5": 0.10774,
"precision_at_10": 0.08545,
"precision_at_20": 0.0596,
"precision_at_100": 0.02201,
"precision_at_1000": 0.00699,
Running a single CrossEncoder, the results should be much better than with a single SentenceTransformer, but they are not. What is the problem?
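For reference, a CrossEncoder only scores explicit (query, passage) pairs via predict(); it has no retrieval index of its own. A minimal standalone sketch of that pair-scoring API (the model name is taken from this issue; the query and passages are made-up examples, not from NFCorpus):

from sentence_transformers import CrossEncoder

cross_encoder = CrossEncoder("cross-encoder/ms-marco-TinyBERT-L-2-v2")

# Hypothetical query and passages, only to illustrate pair scoring.
query = "treatment of vitamin d deficiency"
passages = [
    "Vitamin D supplementation is commonly used to correct deficiency.",
    "The study examined dietary fiber intake in adolescents.",
]

# predict() returns one relevance score per (query, passage) pair.
scores = cross_encoder.predict([(query, p) for p in passages])
for passage, score in sorted(zip(passages, scores), key=lambda x: -x[1]):
    print(f"{score:.4f}  {passage}")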
Here is my environment:
sentence-transformers 5.0.0
mteb 1.38.35
To reproduce
To reproduce, I used the official rerank code and commented out stage 1 (as described above). Here is the official rerank code:
from mteb import MTEB
import mteb
from sentence_transformers import CrossEncoder, SentenceTransformer

cross_encoder = CrossEncoder("cross-encoder/ms-marco-TinyBERT-L-2-v2")
dual_encoder = SentenceTransformer("all-MiniLM-L6-v2")

tasks = mteb.get_tasks(tasks=["NFCorpus"], languages=["eng"])
subset = "default"  # subset name used in the NFCorpus dataset
eval_splits = ["test"]

evaluation = MTEB(tasks=tasks)

evaluation.run(
    dual_encoder,
    eval_splits=eval_splits,
    save_predictions=True,
    output_folder="results/stage1",
)

evaluation.run(
    cross_encoder,
    eval_splits=eval_splits,
    top_k=5,
    save_predictions=True,
    output_folder="results/stage2",
    previous_results=f"results/stage1/NFCorpus_{subset}_predictions.json",
)
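As a debugging aid (not part of the original report): a small sketch for inspecting the saved stage 1 predictions file that previous_results points to, assuming the JSON maps each query id to a {doc_id: score} dict; that format is an assumption on my part and not verified against mteb 1.38.35.

import json

# Path matches the stage 1 output used above; the format is an assumption.
with open("results/stage1/NFCorpus_default_predictions.json") as f:
    predictions = json.load(f)

print(f"queries with stage 1 candidates: {len(predictions)}")
some_qid = next(iter(predictions))
candidates = predictions[some_qid]
print(f"query {some_qid}: {len(candidates)} candidate documents")

# Show the 5 highest-scoring stage 1 candidates for that query.
top5 = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)[:5]
for doc_id, score in top5:
    print(f"  {doc_id}: {score:.4f}")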
Additional information
No response
Are you interested in contributing a fix for this bug?
No