haystack/components/generators/hugging_face_local.py "stop_words" not behaving as expected #8672

Open
sephcodes opened this issue Dec 24, 2024 · 3 comments
Labels: P2 Medium priority, add to the next sprint if no P1 available

Comments

@sephcodes

Describe the bug
According to the documentation: "param stop_words: If the model generates a stop word, the generation stops."
In practice, however, stop_words only removes the listed words from the final output; it does not stop generation when a stop word is encountered.
If possible, a way to actually stop generation on stop words would be useful.
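
For reference, here is a minimal sketch of the behavior the documentation describes, written against plain transformers with a custom StoppingCriteria (an illustrative sketch, not Haystack's actual implementation; the model name and the "withdraw" stop word are borrowed from the reproduction below):

```python
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    StoppingCriteria,
    StoppingCriteriaList,
)


class StopOnWords(StoppingCriteria):
    """Stop generation as soon as the output ends with any stop-word token sequence."""

    def __init__(self, stop_ids_list):
        self.stop_ids_list = stop_ids_list  # one token-id list per stop word

    def __call__(self, input_ids, scores, **kwargs):
        return any(
            input_ids[0, -len(ids):].tolist() == ids for ids in self.stop_ids_list
        )


tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM-1.7B-Instruct")
model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM-1.7B-Instruct")

stop_words = ["withdraw"]
stop_ids_list = [tokenizer(w, add_special_tokens=False).input_ids for w in stop_words]

inputs = tokenizer("How do I withdraw a case?", return_tensors="pt")
output = model.generate(
    **inputs,
    max_new_tokens=150,
    stopping_criteria=StoppingCriteriaList([StopOnWords(stop_ids_list)]),
)
# Generation halts at the stop word instead of merely stripping it from the output.
print(tokenizer.decode(output[0], skip_special_tokens=True))
```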

Error message
N/A

Expected behavior
The stop_words argument should stop generation of output as soon as a value from the stop_words list is produced.

Additional context
N/A

To Reproduce

```python
generator = self.generator(
    model="HuggingFaceTB/SmolLM-1.7B-Instruct",
    task="text-generation",
    stop_words=self.stop_words,
    generation_kwargs={
        "max_new_tokens": 150,
        # "do_sample": False,
        "do_sample": True,
        "temperature": 0.5,
        # "top_p": 0.9,
        # "eos_token_id": self.tokenizer.eos_token_id,
        # "stopping_criteria": self.stopping_criteria
    },
)
```

Whatever is defined in self.stop_words is simply dropped from the output; it does not stop text generation as expected.
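
As a possible workaround sketch (assuming generation_kwargs are forwarded unchanged to model.generate(), which the commented-out "stopping_criteria" line above hints at but which is not confirmed here), a custom StoppingCriteriaList could be passed directly; StopStringCriteria requires a recent transformers release:

```python
from transformers import AutoTokenizer, StoppingCriteriaList, StopStringCriteria
from haystack.components.generators import HuggingFaceLocalGenerator

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM-1.7B-Instruct")
# StopStringCriteria matches stop strings at the text level, which sidesteps
# tokenization mismatches between a bare word and the same word in running text.
criteria = StoppingCriteriaList(
    [StopStringCriteria(tokenizer=tokenizer, stop_strings=["withdraw"])]
)

generator = HuggingFaceLocalGenerator(
    model="HuggingFaceTB/SmolLM-1.7B-Instruct",
    task="text-generation",
    generation_kwargs={
        "max_new_tokens": 150,
        # Assumption: this kwarg reaches model.generate() unchanged.
        "stopping_criteria": criteria,
    },
)
generator.warm_up()
print(generator.run("How do I withdraw a case?")["replies"])
```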

FAQ Check
N/A

System:
All

julian-risch added the "P2 Medium priority, add to the next sprint if no P1 available" label on Jan 2, 2025
@vblagoje (Member)

Can you share the smallest reproducible example, @sephcodes? A notebook, Python script, whatever suits you best.

@sephcodes (Author) commented Jan 16, 2025

Using the class below (note that I added "withdraw" as a stop word to show the effect of it being dropped rather than generation stopping on it, as the documentation says it should):

```python
class GenerateAnswer:
    def __init__(self, time, generator, tokenizer, stop_words=None):
        self.time = time
        self.generator = generator
        self.tokenizer = tokenizer
        self.stop_words = stop_words if stop_words is not None else [
            "withdraw", "system", "Human:", "Assistant:", "\n\n", "Question", "Answer"
        ]
        # Token-id sequences per stop word (computed here but never passed to
        # the generator below)
        self.stop_ids = [
            self.tokenizer(stop_word, add_special_tokens=False).input_ids
            for stop_word in self.stop_words
        ]

    def generate_answer(self, prompt_template, question, context):
        start_time = self.time.time()
        prompt = prompt_template.format(context=context, question=question)
        checkpoint_1 = self.time.time() - start_time
        print(f"Checkpoint 1: {checkpoint_1:.4f} seconds")

        start_time = self.time.time()
        generator = self.generator(
            model="HuggingFaceTB/SmolLM-1.7B-Instruct",
            task="text-generation",
            stop_words=self.stop_words,
            generation_kwargs={
                "max_new_tokens": 150,
                "do_sample": True,
                "temperature": 0.5,
            },
        )
        checkpoint_2 = self.time.time() - start_time
        print(f"Checkpoint 2: {checkpoint_2:.4f} seconds")

        start_time = self.time.time()
        generator.warm_up()
        checkpoint_3 = self.time.time() - start_time
        print(f"Checkpoint 3: {checkpoint_3:.4f} seconds")

        start_time = self.time.time()
        response = generator.run(prompt)
        checkpoint_4 = self.time.time() - start_time
        print(f"Checkpoint 4: {checkpoint_4:.4f} seconds")

        replies = response["replies"]

        generate_time = sum([checkpoint_1, checkpoint_2, checkpoint_3, checkpoint_4])
        print(f"Total answer generation time: {generate_time:.4f} seconds")
        print(replies)
        return replies
```

How it's called in my notebook:

```python
import time

from transformers import AutoTokenizer
from haystack.components.generators import HuggingFaceLocalGenerator

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-small-en")
generate_answer = GenerateAnswer(time, HuggingFaceLocalGenerator, tokenizer)

prompt_template = """
You are a helpful AI assistant. You are given context and a question.
You must answer the question briefly and concisely using the information given in the context only as reference.
Summarize the answer IN YOUR OWN WORDS and STOP IMMEDIATELY after answering the question.
If you do not know the answer say 'I do not know!'.

Context: {context}

Question: {question}

Answer:
"""

question = "How do I withdraw a case"

# get_relevant_context, embeddings, embedder, and dataset are defined elsewhere in the notebook
context = get_relevant_context.get_relevant_context(question, embeddings, embedder.embed_text, dataset)
```

Output:

'Can I withdraw a case?\nYes, you can withdraw a case by visiting the Case Details page in the Submitted Cases table. "Withdraw" is one of the case actions available for submitted cases that are in the following statuses:\nIn Process (all case types)\nRFI Issued (Prevailing Wage cases)\nNOD Issued (Temporary Labor cases)\nAccepted - Pending Recruitment (Temporary Labor cases)\n'

```python
context_text = " ".join(context)
answer = generate_answer.generate_answer(prompt_template, question, context_text)
```

Output:

['You can a case by visiting the Case Details page.....

Notice the extra space where the word 'withdraw' was in the context I gave the generator. This indicates that the model dropped it as a stop word rather than stopping generation upon encountering it.
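
A diagnostic that may help narrow this down (a minimal sketch; the model name matches the reproduction above): subword tokenizers usually encode a word differently at the start of a string than after a space, so stop-word matching based on token ids, like the stop_ids computed in __init__ above, can miss the in-text form:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM-1.7B-Instruct")
# The leading space typically yields different token ids with BPE tokenizers.
print(tok("withdraw", add_special_tokens=False).input_ids)
print(tok(" withdraw", add_special_tokens=False).input_ids)
```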

Let me know if any other info is needed.

@vblagoje (Member) commented Jan 17, 2025

@sephcodes I don't see it. Would you please put together an isolated example, preferably in a notebook, that uses minimal code to demonstrate the issue? If there is an issue, I'll gladly solve it. 🙏
