Open
Description
Let's collect the remaining sampler performance issues we are aware of:
- Small regression (a drop of 1 req/sec in `benchmark_throughput.py`) after Sampler Throughput Optimization #192 when only greedy sampling is used.
- Logprobs and JSON are extremely slow.
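For context on the greedy-only case, here is a minimal sketch of why that path is usually expected to be the cheapest one. It is not the project's sampler: `sample_batch` and its parameters are hypothetical names, and treating a temperature of 0 as "greedy" is an assumption made for illustration. The idea is that a batch containing only greedy requests can be served with a plain argmax and skip the softmax and random draw entirely.

```python
import math
import random
from typing import List, Optional


def sample_batch(logits: List[List[float]],
                 temperatures: List[float],
                 rng: Optional[random.Random] = None) -> List[int]:
    """Pick one token id per request (hypothetical helper, not the project's API).

    Assumption: temperature == 0.0 marks a request as greedy.
    """
    def argmax(row: List[float]) -> int:
        # Index of the largest logit; the first index wins on ties.
        return max(range(len(row)), key=row.__getitem__)

    # Fast path: every request is greedy, so no softmax or RNG is needed.
    if all(t == 0.0 for t in temperatures):
        return [argmax(row) for row in logits]

    # General path: handle each request according to its own temperature.
    rng = rng or random.Random()
    picked = []
    for row, t in zip(logits, temperatures):
        if t == 0.0:
            picked.append(argmax(row))
            continue
        # Temperature-scaled softmax weights, shifted by the max for stability.
        scaled = [x / t for x in row]
        top = max(scaled)
        weights = [math.exp(x - top) for x in scaled]
        picked.append(rng.choices(range(len(row)), weights=weights, k=1)[0])
    return picked
```

If the real sampler routes greedy-only batches through the general path, that extra work would be one plausible place to look for the regression, though this is a guess rather than a diagnosis.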