Description
This line in com.github.tjake.jlama.model.AbstractModel#generate checks the encoded prompt length against ntokens and throws an exception if the prompt is too long.
With ntokens I want to control the number of output tokens, but I think the input should only be limited by the context length:
```java
Preconditions.checkArgument(encoded.length < c.contextLength && encoded.length < ntokens, "Prompt exceeds max tokens");
```
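A minimal sketch of the behavior I'd expect instead: validate the prompt only against the context length, and treat ntokens as an output budget, clamped to whatever context remains. The class and method names below are hypothetical, not Jlama's actual API.

```java
public final class GenerateCheck {
    // Hypothetical sketch: reject the prompt only when it exceeds the
    // context length, and clamp the generation budget to the remaining space.
    static int tokensToGenerate(int promptLength, int contextLength, int requestedTokens) {
        if (promptLength >= contextLength)
            throw new IllegalArgumentException("Prompt exceeds context length");
        // Output budget is whatever context remains after the prompt.
        return Math.min(requestedTokens, contextLength - promptLength);
    }

    public static void main(String[] args) {
        System.out.println(tokensToGenerate(100, 2048, 256));  // prompt fits; full request honored -> 256
        System.out.println(tokensToGenerate(2000, 2048, 256)); // clamped to the remaining 48
    }
}
```

With this shape, a long prompt that still fits the context window no longer trips the ntokens check; it just shrinks how many tokens can be generated.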
Edit: There is also no way to abort generation early, so even if I set ntokens high, I can't stop the model from generating unnecessary tokens.
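One way the early-abort part could look, as a hedged sketch rather than Jlama's actual API: let the caller pass a stop predicate that the generation loop consults after each token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public final class AbortableGeneration {
    // Hypothetical sketch: a generation loop that accepts a stop predicate,
    // so the caller can abort before ntokens tokens have been produced.
    static List<Integer> generate(int ntokens, IntPredicate stopAfter) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < ntokens; i++) {
            out.add(i);                   // stand-in for sampling the next token
            if (stopAfter.test(i)) break; // caller-controlled early abort
        }
        return out;
    }

    public static void main(String[] args) {
        // Abort after 5 tokens even though ntokens is 100.
        System.out.println(generate(100, i -> i >= 4).size()); // prints 5
    }
}
```

The same idea works with an AtomicBoolean the caller flips from another thread; the point is just that the loop has a caller-visible exit besides exhausting ntokens.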