Support for Qwen3 embedding and reranker models and chat template #3346

@Quallyjiang

Qwen has just published embedding and reranker models. Their performance is very good and they support multiple languages quite well. When could these models be supported in OpenVINO Model Server?

Also, both models support a chat template and require instructions, while the current embedding and rerank APIs do not accept these inputs and only take raw strings. That means applications have to apply the chat template themselves, which couples the application tightly to the model even though a model server sits between them. Could the model server take care of applying the chat template instead?
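To illustrate the coupling, here is a minimal sketch of what an application has to do today before calling the embeddings API. The instruction template, the task text, and the model name are assumptions made for illustration; the exact format should be taken from the model card. `build_embedding_request` is a hypothetical helper, not part of any server API.

```python
import json

# Assumed instruction template for illustration only; the real format
# must be checked against the model card.
INSTRUCT_TEMPLATE = "Instruct: {task}\nQuery: {query}"


def build_embedding_request(model, task, queries, documents):
    """Build an OpenAI-style embeddings payload with the instruction
    applied on the client side (hypothetical helper)."""
    # Only queries get the instruction prefix; documents are sent as raw strings.
    formatted_queries = [
        INSTRUCT_TEMPLATE.format(task=task, query=q) for q in queries
    ]
    return {"model": model, "input": formatted_queries + list(documents)}


payload = build_embedding_request(
    model="Qwen3-Embedding",  # placeholder model name
    task="Given a web search query, retrieve relevant passages",
    queries=["What is the capital of China?"],
    documents=["The capital of China is Beijing."],
)
print(json.dumps(payload, indent=2))
```

If the server applied the template itself, the application could send the raw query and the task description separately and would not need to know the model-specific format.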
