
As this figure shows, you asked GPT to return an integer score between 0 and 5. However, in the example you gave, the score is 4.8, which is a floating-point number. Is that okay? My second question: when evaluating on the four datasets MSVD, MSRVTT, ActivityNet, and TGIF, is it necessary to call the OpenAI API with an OpenAI API key? Can I use Qwen-Max as an alternative?