Fine-tuned cross-encoder scores are lower than ms-marco zero-shot scores? #156

@cramraj8

Description

@cramraj8

Hi @thakur-nandan, @nreimers

I am fine-tuning either the cross-encoder/ms-marco-electra-base or the cross-encoder/ms-marco-MiniLM-L-12-v2 model on other IR collections (TREC-COVID or NQ), but the fine-tuned model's scores are lower than the zero-shot scores. I wonder whether this is a domain shift in my custom datasets, or whether I am doing the training wrong. I am using the sentence-transformers CrossEncoder API for training.

Since these pre-trained models were trained in a particular setting (hyperparameters, model architecture, and loss), are they sensitive to those same settings during fine-tuning as well?
