 - Latency (ms/token)
 - Time to First Token (TTFT)
 - Time Between Output Tokens (TBOT)
+
+Pulled from the lightning-thunder repo. Reference:
+https://github.com/Lightning-AI/lightning-thunder/blob/4d3a3c3a7481efdc6a23cdeea99c3ffd31af5e78/thunder/benchmarks/benchmark_inference.py
 """

 # fmt: off

 # SPDX-License-Identifier: BSD-3-Clause
 #
 # NOTE: `pytorch_nvfp4_quantize` and `linear_to_swizzled_128_4` are copied from NVIDIA's Fuser's test code.
+#
+# Pulled from the lightning-thunder repo. Reference:
+# https://github.com/Lightning-AI/lightning-thunder/blob/4d3a3c3a7481efdc6a23cdeea99c3ffd31af5e78/thunder/benchmarks/layers_for_inference_benchmark.py

 # fmt: off
