fix metrics #4410

Open
CUHKSZzxy wants to merge 5 commits into InternLM:main from CUHKSZzxy:fix-metrics

CUHKSZzxy (Collaborator) commented Mar 13, 2026

Fine-grained metrics calculation:

Dataflow: client --> API server --> Engine core

API server request states (axis view):
|<──────────────────────────────── total ────────────────────────────────>|
|<──────────── completed ─────────────>|<────── uncompleted ─────────────>|
|<─ success ─>|<──────── fail ────────>|<─ routed ─>|<───── waiting ─────>|
              |<cancel>|<abort>|<error>|

Engine core request states (axis view):
|<────────────────── routed ──────────────────>|
|<───── running ──────>|<────── waiting ──────>|
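The API server axis view can be read as a partition of requests by state. As a rough illustration only, the sketch below tallies per-request states into those buckets. The `ReqState` enum and the `tally` helper are hypothetical names for this example and are not part of the PR.

```python
from collections import Counter
from enum import Enum


class ReqState(Enum):
    # Hypothetical labels mirroring the leaves of the API server axis view.
    SUCCESS = 'success'
    CANCEL = 'cancel'
    ABORT = 'abort'
    ERROR = 'error'
    ROUTED = 'routed'
    WAITING = 'waiting'


# "fail" is the union of cancel, abort and error in the axis view.
FAIL_STATES = (ReqState.CANCEL, ReqState.ABORT, ReqState.ERROR)


def tally(states):
    """Group per-request states into the axis-view buckets."""
    counts = Counter(states)
    fail = sum(counts[s] for s in FAIL_STATES)
    completed = counts[ReqState.SUCCESS] + fail
    uncompleted = counts[ReqState.ROUTED] + counts[ReqState.WAITING]
    return {
        'total': completed + uncompleted,
        'completed': completed,
        'uncompleted': uncompleted,
        'success': counts[ReqState.SUCCESS],
        'fail': fail,
        'routed': counts[ReqState.ROUTED],
        'waiting': counts[ReqState.WAITING],
    }
```

On the engine side the same idea applies one level down: the routed bucket splits into running and waiting.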

The log line now looks like:

[2026-03-13 14:54:59 DP0] Avg thr (in/out): 1445.8 / 68.8 tokens/s, Server (succeeded/failed/routed/waiting): 102 / 100 / 0 / 0, Engine (running/waiting): 0 / 0, KV cache: 17.5%,
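For readers who want to reproduce the layout, a line of this shape could be assembled roughly as follows. This is a sketch: `format_metrics_line` and its parameters are hypothetical and do not come from loggers.py, and any fields after the KV cache percentage are left out.

```python
def format_metrics_line(timestamp, dp_rank, in_thr, out_thr,
                        succeeded, failed, routed, api_waiting,
                        running, engine_waiting, kv_cache_pct):
    """Build a log line in the layout shown above (illustrative only)."""
    return (
        f'[{timestamp} DP{dp_rank}] '
        f'Avg thr (in/out): {in_thr:.1f} / {out_thr:.1f} tokens/s, '
        f'Server (succeeded/failed/routed/waiting): '
        f'{succeeded} / {failed} / {routed} / {api_waiting}, '
        f'Engine (running/waiting): {running} / {engine_waiting}, '
        f'KV cache: {kv_cache_pct:.1f}%'
    )


line = format_metrics_line('2026-03-13 14:54:59', 0, 1445.8, 68.8,
                           102, 100, 0, 0, 0, 0, 17.5)
```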

Copilot AI review requested due to automatic review settings March 13, 2026 06:50
@CUHKSZzxy CUHKSZzxy requested a review from lvhan028 March 13, 2026 06:51
Copilot AI (Contributor) left a comment

Pull request overview

This PR refactors the metrics system to provide fine-grained request state tracking. Instead of a single "completed" counter, requests are now categorized as succeeded, cancelled, or aborted. The num_api_waiting_reqs is now a computed property derived from total - completed - routed rather than being manually calculated in multiple places.

Changes:

  • Replaces num_completed_reqs with num_succeeded_reqs, num_cancelled_reqs, and num_aborted_reqs in SchedulerStats, with computed properties for derived metrics (num_failed_reqs, num_completed_reqs, num_uncompleted_reqs, num_api_waiting_reqs).
  • Updates MetricsProcessor to expose increase_succeeded_requests, increase_cancelled_requests, and increase_aborted_requests methods, and updates call sites in async_engine.py.
  • Updates Prometheus gauges and log messages to report the new fine-grained metrics.
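The counter-plus-property layout described in the first bullet might look roughly like the sketch below. It follows the names in this summary where they are given; `num_total_reqs` and `num_routed_reqs` are assumed field names, the class name is made up for the example, and the real SchedulerStats in stats.py may differ (for instance, this sketch has no separate error counter).

```python
from dataclasses import dataclass


@dataclass
class SchedulerStatsSketch:
    # Stored counters. num_total_reqs and num_routed_reqs are assumed names.
    num_total_reqs: int = 0
    num_succeeded_reqs: int = 0
    num_cancelled_reqs: int = 0
    num_aborted_reqs: int = 0
    num_routed_reqs: int = 0

    @property
    def num_failed_reqs(self) -> int:
        # Failed requests are those that were cancelled or aborted.
        return self.num_cancelled_reqs + self.num_aborted_reqs

    @property
    def num_completed_reqs(self) -> int:
        # Completed covers both succeeded and failed requests.
        return self.num_succeeded_reqs + self.num_failed_reqs

    @property
    def num_uncompleted_reqs(self) -> int:
        return self.num_total_reqs - self.num_completed_reqs

    @property
    def num_api_waiting_reqs(self) -> int:
        # Derived as total - completed - routed, never stored directly.
        return self.num_uncompleted_reqs - self.num_routed_reqs
```

Deriving the waiting count from the stored counters means there is only one place where it can drift out of sync, which is the motivation given above for replacing the manual calculations.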

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| lmdeploy/metrics/stats.py | Replaces num_completed_reqs field with fine-grained counters and computed properties |
| lmdeploy/metrics/metrics_processor.py | Adds increase_succeeded_requests, increase_cancelled_requests, increase_aborted_requests methods |
| lmdeploy/serve/core/async_engine.py | Updates call sites: cancelled in exception handler, aborted for pre-start abort, succeeded after generator loop |
| lmdeploy/metrics/loggers.py | Updates log format and Prometheus gauges for new metric names |
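The three call sites listed for async_engine.py can be pictured with a small stand-alone sketch. Only the three `increase_*` method names come from this review. `MetricsSketch`, `generate`, the `aborted_before_start` flag, and the choice of `asyncio.CancelledError` as the handled exception are all assumptions made for illustration, not the actual engine code.

```python
import asyncio


class MetricsSketch:
    """Stand-in for MetricsProcessor with just the three counters."""

    def __init__(self):
        self.succeeded = 0
        self.cancelled = 0
        self.aborted = 0

    def increase_succeeded_requests(self):
        self.succeeded += 1

    def increase_cancelled_requests(self):
        self.cancelled += 1

    def increase_aborted_requests(self):
        self.aborted += 1


async def generate(metrics, token_source, aborted_before_start=False):
    # Pre-start abort: the request never reaches the generation loop.
    if aborted_before_start:
        metrics.increase_aborted_requests()
        return
    try:
        async for token in token_source:
            yield token
    except asyncio.CancelledError:
        # Cancellation is counted in the exception handler, then re-raised.
        metrics.increase_cancelled_requests()
        raise
    # Reached only when the loop finishes without an exception.
    metrics.increase_succeeded_requests()
```

Placing the success increment after the loop rather than in a `finally` block keeps a cancelled request from also being counted as succeeded.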
