Your current environment
The output of `python collect_env.py`
How would you like to use vllm
I know that the Metrics variable in the output of a model generation is deprecated, and it is now always None. However, I absolutely need these metrics. Is there any way to still get them? I need metrics such as throughput and time to first token, and they must be reported for each generation request rather than as general profiling.
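To make the request concrete, here is a minimal sketch of the per-request numbers I mean, measured client-side around a token stream. `RequestTiming` and `timed_stream` are hypothetical names of my own, not vLLM APIs, and the sketch assumes the generated tokens arrive as a plain Python iterable.

```python
import time
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class RequestTiming:
    """Hypothetical per-request metrics container (not a vLLM class)."""

    start: float                          # when the request was submitted
    first_token: Optional[float] = None   # when the first token arrived
    end: Optional[float] = None           # when the stream finished
    num_tokens: int = 0

    @property
    def time_to_first_token(self) -> Optional[float]:
        # Seconds between submission and the first token, if any arrived.
        if self.first_token is None:
            return None
        return self.first_token - self.start

    @property
    def tokens_per_second(self) -> Optional[float]:
        # Throughput over the whole request; None until the stream ends.
        if self.end is None or self.end <= self.start:
            return None
        return self.num_tokens / (self.end - self.start)


def timed_stream(tokens: Iterable[str], timing: RequestTiming) -> Iterator[str]:
    """Yield tokens unchanged while recording timestamps into `timing`."""
    for tok in tokens:
        now = time.perf_counter()
        if timing.first_token is None:
            timing.first_token = now
        timing.num_tokens += 1
        yield tok
    timing.end = time.perf_counter()
```

Each request would get its own `RequestTiming(start=time.perf_counter())`, so the values stay tied to that request. Something equivalent exposed on the generation output itself is what I am looking for.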
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.