[Feature]: Compute and log the serving FLOPs #3490
Comments
Happy to take this up!
@ayusher Just out of curiosity, how would you go about doing this? Let's take the simplest case of a single non-sharded model running on one machine. I assume we want to log the actual FLOPs / MACs (multiply-accumulates) that the hardware performs, not an estimate derived from the model's modules (e.g. via profiling or theoretical per-module counting), since that does not account for the kernel implementation of each module (different CUDA implementations of attention can require different amounts of FLOPs, right?). What would be the right way of doing this without introducing too much overhead?
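(For comparison, not an answer to the hardware-counter question: PyTorch's profiler can attach shape-based FLOP estimates to recorded operators via `with_flops=True`. It still counts analytically per operator, so it has exactly the limitation described above, but it is cheap to wire in. A minimal sketch, assuming a CUDA-capable setup:)

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Stand-in workload: a single fp16 linear layer on GPU.
model = torch.nn.Linear(4096, 4096).cuda().half()
x = torch.randn(1, 128, 4096, device="cuda", dtype=torch.half)

with profile(
    activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
    with_flops=True,  # FLOPs are estimated from operator shapes, not hardware counters
) as prof:
    with torch.no_grad():
        model(x)

# Sum the per-operator estimates; ops without a FLOP model report 0/None.
total_flops = sum(evt.flops for evt in prof.key_averages() if evt.flops)
print(f"estimated FLOPs: {total_flops:.3e}")
```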
Yeah, after further research this looks like a more involved problem than I initially anticipated. I'm not sure of the best approach or whether it's worth pursuing right now.
Yeah, I agree. I know some NVIDIA GPU libraries let you read FLOPs, but it does not seem like a trivial problem (not a good first issue, I guess, lol). I'd love to help if we find a manageable way, though!
Thank you all! Do the FLOPs need to be exact, or is a close estimate acceptable? As others in this thread have discussed, calculating FLOPs precisely isn't straightforward without digging into internals, and doing so may add overhead during inference. If a close estimate is acceptable, I can try to implement a basic FLOPs calculator.
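For a sense of what such an estimate could look like: a common analytic approximation counts only the matmul FLOPs per generated token from the model's shapes. The sketch below assumes a plain decoder-only transformer (no GQA, no gated MLP; norms and rotary embeddings ignored), and the `ModelShape` fields are illustrative, not vLLM config names.

```python
from dataclasses import dataclass

@dataclass
class ModelShape:
    num_layers: int
    hidden_size: int
    intermediate_size: int
    vocab_size: int

def flops_per_token(shape: ModelShape, context_len: int) -> float:
    """Analytic FLOPs for decoding one token (1 MAC = 2 FLOPs), matmuls only."""
    h, ffn = shape.hidden_size, shape.intermediate_size
    per_layer = (
        2 * h * 3 * h                # QKV projections
        + 2 * h * h                  # attention output projection
        + 2 * 2 * context_len * h    # QK^T scores + weighted sum over the KV cache
        + 2 * 2 * h * ffn            # MLP up- and down-projection (ungated)
    )
    return shape.num_layers * per_layer + 2 * h * shape.vocab_size  # + LM head

# Hypothetical 7B-class config, purely for illustration.
llama_like = ModelShape(num_layers=32, hidden_size=4096,
                        intermediate_size=11008, vocab_size=32000)
print(f"{flops_per_token(llama_like, context_len=1024):.3e} FLOPs/token")
```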
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
This issue has been automatically closed due to inactivity. Please feel free to reopen if you feel it is still relevant. Thank you!
Can I take this up?
Hi @zhuohan123, is this issue still open to work on?
@zhuohan123 @robertgshaw2-redhat @WoosukKwon Hi, is this issue still open? I'd like to take this if it is still needed.
@duhaode520 Do you have any ideas as to how to approach this?
I have tried addressing this in my PR. It would be nice if someone could take a look.
🚀 The feature, motivation and pitch
vLLM should output the serving FLOPs. This would be helpful for debugging performance and checking GPU utilization.
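One way logged FLOPs tie back to GPU utilization is model FLOPs utilization (MFU): achieved FLOP/s divided by the hardware's theoretical peak. A minimal sketch (the function name is made up here; 312 TFLOP/s is the A100 dense FP16/BF16 tensor-core peak):

```python
def model_flops_utilization(total_flops: float, elapsed_s: float,
                            num_gpus: int, peak_tflops_per_gpu: float) -> float:
    """MFU: achieved FLOP/s divided by the GPUs' theoretical peak FLOP/s."""
    achieved = total_flops / elapsed_s
    peak = num_gpus * peak_tflops_per_gpu * 1e12
    return achieved / peak

# e.g. 2e14 FLOPs served in 2 s on one A100 -> roughly 32% MFU
print(f"MFU: {model_flops_utilization(2e14, 2.0, 1, 312):.1%}")
```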
Alternatives
No response
Additional context
No response