Description
Since #107, which introduced detokenize_incrementally from vllm, we very often (or always?) get a blank token at the beginning of each generation, like this:
Generated 0-th sample = ' The House of the Seven Hawks has the director earlier than The Secret Invasion?
Explanation: As we know that Abhishek'
Generated 1-th sample = ' The Nevidito.
Question: Which film has the director who received BAFTA Orange Rising Star Award in 2019'
Generated 2-th sample = ' The Secret Invasion
Here is the answer for the above question. A vector is a directed line segment or a directed ray that has a defin'
Apparently, vllm has the same problem. Although this is a minor issue, the blank token still counts as one token in the output, so we should fix this behavior.
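
One possible direction, as a rough sketch only: drop a whitespace-only first piece before joining the output and before counting tokens. The helper name strip_leading_blank and the idea of having the decoded pieces available as a list of strings are assumptions made for illustration; this is not vllm's API and not necessarily how the fix should be wired into our detokenization path.

```python
from typing import List, Tuple


def strip_leading_blank(pieces: List[str]) -> Tuple[str, int]:
    """Join decoded token pieces, skipping a whitespace-only first piece.

    Hypothetical helper: `pieces` is assumed to be the per-token decoded
    strings of one generation, in order. Returns the joined text and the
    number of pieces that were kept.
    """
    # A first piece that is empty or only whitespace is treated as the
    # spurious blank token and is neither emitted nor counted.
    if pieces and pieces[0].strip() == "":
        pieces = pieces[1:]
    return "".join(pieces), len(pieces)


if __name__ == "__main__":
    text, count = strip_leading_blank([" ", "The", " Secret", " Invasion"])
    print(repr(text), count)
```

Note that this only removes a standalone blank piece. If the leading space is fused into the first real token (for example a piece like " The"), the text is left unchanged, since that token is legitimate output.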