Train multilingual pipeline with LLaMA embeddings #12790

That's a good question! Large language models like LLaMA and GPT-NeoX are generally used as generative models, i.e., models that accept a prompt as input and generate a completion for it. But architecturally, they are similar to other Transformer models such as BERT and can theoretically be used to produce dense representations/embeddings for downstream tasks such as tagging, parsing, entity recognition, etc.
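As a rough sketch of what "producing dense representations" means here: a decoder-only model gives you one hidden-state vector per token, and a common way to get a single embedding for a span or document is to mean-pool those vectors over the non-padding tokens. The snippet below illustrates just the pooling step with random arrays standing in for real LLaMA hidden states (no actual model is loaded; `mean_pool` is a hypothetical helper, not a spaCy API):

```python
import numpy as np

def mean_pool(hidden_states, attention_mask):
    """Mean-pool per-token hidden states into one vector per sequence,
    ignoring padding positions marked 0 in the attention mask."""
    mask = attention_mask[..., None].astype(hidden_states.dtype)  # (batch, seq, 1)
    summed = (hidden_states * mask).sum(axis=1)                   # (batch, hidden)
    counts = mask.sum(axis=1)                                     # (batch, 1)
    return summed / counts

# Stand-ins for real model outputs: batch of 2, seq len 4, hidden size 8
rng = np.random.default_rng(0)
hidden = rng.normal(size=(2, 4, 8))
mask = np.array([[1, 1, 1, 0],   # last token is padding
                 [1, 1, 1, 1]])
emb = mean_pool(hidden, mask)
print(emb.shape)  # (2, 8): one fixed-size embedding per input
```

With real weights, `hidden` would come from the model's last (or an intermediate) layer, and the resulting vectors could feed a downstream tagger, parser, or classifier head.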

Currently, we do not support their direct usage in spaCy pipelines outside of spacy-llm, which - as you correctly concluded - is a prompting component. However, we do have a couple of new libraries in development that we hope to release in the near future. These will serve as a good …
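For comparison, the spacy-llm route mentioned above is configured as a prompting component rather than an embedding layer. A sketch of such a config follows; the exact task and model registry names (`spacy.NER.v2`, `spacy.GPT-3-5.v1`) are illustrative and depend on the installed spacy-llm version, so check the spacy-llm docs for the names your version registers:

```ini
[components.llm]
factory = "llm"

[components.llm.task]
@llm_tasks = "spacy.NER.v2"
labels = ["PERSON", "ORG", "LOC"]

[components.llm.model]
@llm_models = "spacy.GPT-3-5.v1"
```

The point of contrast: this component sends prompts to the model and parses completions, rather than wiring the model's hidden states into the pipeline as a tok2vec-style embedding layer.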

Answer selected by shadeMe
Labels
- feat/tok2vec (Feature: Token-to-vector layer and pretraining)
- feat/llm (Feature: LLMs, incl. spacy-llm)