haystack/components/generators/hugging_face_local.py "stop_words" not behaving as expected #8672
Comments
Can you share the smallest reproducible example, @sephcodes? A notebook, a Python script, whatever suits you best.
Using the class below (note that I added "withdraw" as a stop word to show that it is dropped from the output, rather than generation stopping on it as the documentation says it should):

    class GenerateAnswer():

How it's called in my notebook:

    prompt_template = """
    You are a helpful AI assistant. You are given context and a question.
    Context: {context}
    Question: {question}
    Answer:
    """
    question = "How do I withdraw a case"
    context = get_relevant_context.get_relevant_context(question, embeddings, embedder.embed_text, dataset)

output:

    'Can I withdraw a case?\nYes, you can withdraw a case by visiting the Case Details page in the Submitted Cases table. "Withdraw" is one of the case actions available for submitted cases that are in the following statuses:\nIn Process (all case types)\nRFI Issued (Prevailing Wage cases)\nNOD Issued (Temporary Labor cases)\nAccepted - Pending Recruitment (Temporary Labor cases)\n'

    context_text = " ".join(context)

output:

    ['You can a case by visiting the Case Details page.....

Notice the extra space where the word 'withdraw' was in the context I gave the generator. This indicates that the model dropped it as a stop word rather than stopping generation upon encountering it. Let me know if any other info is needed.
@sephcodes I don't notice it. Would you please make an isolated example, preferably in a notebook, that uses minimal code to demonstrate the issue? If there is an issue, I'll gladly solve it. 🙏
Describe the bug
According to the documentation: "param stop_words: If the model generates a stop word, the generation stops."
However, stop_words is only being used to remove whatever is defined as a stop word from the final output. It does not stop generation when a stop word is encountered.
If possible, a way to stop generation using stop words would be useful.
Error message
N/A
Expected behavior
The stop_words argument should stop output generation as soon as a value in the stop_words list is encountered.
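To make the difference between the reported and the expected behavior concrete, here is a minimal pure-Python sketch. It does not use Haystack or transformers; the helper names `remove_stop_words` and `truncate_at_stop_word` are hypothetical and only illustrate the two post-processing strategies being contrasted, not how the library is actually implemented.

```python
def remove_stop_words(text: str, stop_words: list[str]) -> str:
    """Reported behavior: delete each stop word, keep everything else."""
    for word in stop_words:
        text = text.replace(word, "")
    return text


def truncate_at_stop_word(text: str, stop_words: list[str]) -> str:
    """Expected behavior: cut the text at the earliest stop word."""
    # Ignore empty stop words, which would otherwise match at position 0.
    positions = [text.find(word) for word in stop_words if word and word in text]
    return text[: min(positions)] if positions else text


reply = "You can withdraw a case by visiting the Case Details page."
print(repr(remove_stop_words(reply, ["withdraw"])))
print(repr(truncate_at_stop_word(reply, ["withdraw"])))
```

With "withdraw" as the stop word, the first helper leaves a gap in the middle of the sentence, while the second returns only the text that precedes the stop word.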
Additional context
N/A
To Reproduce
generator = self.generator(
    model="HuggingFaceTB/SmolLM-1.7B-Instruct",
    task="text-generation",
    stop_words=self.stop_words,
    generation_kwargs={
        "max_new_tokens": 150,
        # "do_sample": False,
        "do_sample": True,
        "temperature": 0.5,
        # "top_p": 0.9,
        # "eos_token_id": self.tokenizer.eos_token_id,
        # "stopping_criteria": self.stopping_criteria
    },
)
Whatever is defined in self.stop_words is simply dropped from the output; it does not stop text generation as expected.
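For comparison, stopping during generation could look roughly like the sketch below. It simulates a token stream with a plain list of strings rather than a real model, and `generate_until_stop` is a hypothetical helper, not part of Haystack or transformers. The assumption is that the check runs after every new token, which is the point at which a stopping criterion would normally be consulted.

```python
from typing import Iterable, List


def generate_until_stop(tokens: Iterable[str], stop_words: List[str]) -> str:
    """Accumulate tokens and halt as soon as the text ends with a stop word."""
    text = ""
    for token in tokens:
        text += token
        for word in stop_words:
            # Only a suffix match is checked; a stop word that ends in the
            # middle of a longer token would be missed by this simple test.
            if word and text.endswith(word):
                # Halt here and drop the stop word itself from the reply.
                return text[: -len(word)]
    return text


# Simulated token stream standing in for model output (an assumption).
fake_stream = ["You", " can", " with", "draw", " a", " case"]
print(repr(generate_until_stop(fake_stream, ["withdraw"])))
```

In this simulation the tokens after "withdraw" are never consumed, which is the behavior the documentation quoted above describes.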
FAQ Check
N/A
System:
All