I'm using a modified version of the example code provided on the Hugging Face website.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

step = 0
while True:
    # encode the new user input, add the eos_token and return a tensor in PyTorch
    new_user_input_ids = tokenizer.encode(input("You: ") + tokenizer.eos_token, return_tensors='pt')
    # append the new user input tokens to the chat history
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids
    # generate a response while limiting the total chat history to 1000 tokens
    chat_history_ids = model.generate(
        bot_input_ids,
        pad_token_id=tokenizer.eos_token_id,
        max_length=1000,
        do_sample=True,  # generate() names the sampling flag do_sample, not sample
        top_k=50,
        top_p=0.95,
        repetition_penalty=1.35
    )
    # pretty print the last output tokens from the bot
    print(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True))
    step += 1
After a few turns, the responses become shorter and shorter until the model doesn't output anything at all.
You: hello
Hello! :D
You: coffee tastes so good lol
It does. I'm going to have a coffee tomorrow morning, and it's gonna be delicious haha
You: what's your type of coffee?
I don't drink coffee but if you want some I'll make one for ya! It will probably taste like a cup of tea though...
You: Yes please make me one
Sure thing!
You: I prefer light coffee alright?
Yeah that sounds great!
You: to be honest i just drink coffee for the sake of tasting good lol
Haha okay then ill try my best with this one too XD
You: cool
Sounds awesome ltsss
You: do you prefer your coffee dark or light
ooooo
You: what?
yayyy
You: what are you saying?
You:
I tried changing max_length to about 5000, and it doesn't seem to do anything. I also tried getting rid of the history (i.e., passing only the new_user_input_ids variable), which seemed to fix the issue, but that obviously leads to very random-seeming responses, since the model has no context about what we're talking about.
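A possible middle ground between keeping the full history and dropping it entirely would be to keep only the last few turns. The sketch below is an illustration under stated assumptions, not a confirmed fix: trim_history and max_turns are hypothetical names, the token IDs are made up, and plain Python lists stand in for the tensors used in the script above.

```python
def trim_history(turns, max_turns=3):
    """Keep the most recent `max_turns` turns and flatten them into one list of token IDs.

    `turns` is a list of turns, each turn being a list of token IDs that ends
    with the EOS token (hypothetical layout, chosen for illustration only).
    """
    # Guard against max_turns == 0, because turns[-0:] would return every turn.
    recent = turns[-max_turns:] if max_turns > 0 else []
    return [token for turn in recent for token in turn]


# Made-up token IDs; EOS is a placeholder for tokenizer.eos_token_id.
EOS = 50256
turns = [[1, 2, EOS], [3, EOS], [4, 5, EOS], [6, EOS]]

# Only the last two turns remain as context for the next generate() call.
print(trim_history(turns, max_turns=2))  # [4, 5, 50256, 6, 50256]
```

In the actual loop, the trimmed IDs would need to be turned back into a tensor of shape (1, n) before being concatenated with new_user_input_ids. Whether a shorter context stops the replies from collapsing is something I have not verified.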