Skip to content

Conversation

@jumpei527
Copy link
Contributor

@jumpei527 jumpei527 commented Sep 16, 2025

Description

This PR implements a new gRPC API for retrieving system resource statistics (CPU and memory usage) to enable better monitoring of Vald components.

What Changed
This implementation adds a ResourceStats gRPC service within the internal/net/grpc/stats package, providing comprehensive resource monitoring capabilities.

Related Issue

#3215

Versions

  • Vald Version: v1.7.17
  • Go Version: v1.24.5
  • Rust Version: v1.88.0
  • Docker Version: v28.3.2
  • Kubernetes Version: v1.33.3
  • Helm Version: v3.18.4
  • NGT Version: v2.4.3
  • Faiss Version: v1.11.0

Checklist

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Stats RPC service with ResourceStats method for querying CPU and memory resource statistics
    • New resource metrics payload including CPU limit/usage and memory limit/usage information
  • Documentation

    • Updated API documentation to reflect the new resource statistics service and data structures

@jumpei527 jumpei527 self-assigned this Sep 16, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 16, 2025

📝 Walkthrough

Walkthrough

A new Stats module is added to expose resource statistics (CPU and memory cgroup metrics) via gRPC service. The implementation spans proto definitions for payload types and the Stats service, Go code for cgroup measurement across v1/v2 modes, generated gRPC/Rust bindings, integration with the server, error definitions, tests, documentation, and CI workflow trigger updates.

Changes

Cohort / File(s) Summary
Proto API Definitions
apis/proto/v1/payload/payload.proto, apis/proto/v1/rpc/stats/stats.proto
Added ResourceStats and CgroupStats messages to payload; created new stats.proto with Stats gRPC service and ResourceStats RPC method returning resource usage data (CPU, memory limits and usage)
Generated gRPC Artifacts (Go)
apis/grpc/v1/rpc/stats/stats.pb.go, apis/grpc/v1/rpc/stats/stats_vtproto.pb.go
Generated Protocol Buffer and vtproto marshaling code for Stats service Go bindings
Swagger Documentation
apis/swagger/v1/rpc/stats/stats.swagger.json
Generated Swagger 2.0 definition for Stats RPC with InfoCgroupStats, v1InfoResourceStats, protobufAny, and rpcStatus models
Public API Documentation
apis/docs/v1/docs.md, apis/docs/v1/payload.md.tmpl
Added documentation for CgroupStats and ResourceStats payload types and Stats service with ResourceStats method; updated navigation and Table of Contents
Internal Go Implementation
internal/errors/stats.go
Added 14 public error constructors for cgroup-related failures (mode detection, sampling, metric parsing across v1/v2)
gRPC Stats Module (Go)
internal/net/grpc/stats/doc.go, internal/net/grpc/stats/stats.go, internal/net/grpc/stats/stats_test.go
Implemented Stats gRPC service with ResourceStats RPC; added cgroup metric measurement (v1/v2 detection, CPU/memory sampling, calculation); comprehensive test suite with 8 test functions
Server Integration
internal/servers/server/server.go
Added stats module import and call to stats.Register() during gRPC server initialization
Rust Bindings
rust/libs/proto/src/payload.v1.rs, rust/libs/proto/src/rpc.v1.tonic.rs
Generated Rust payload types (ResourceStats, CgroupStats) with prost Name trait implementations; generated Tonic gRPC client and server stubs for Stats service with compression and message size configuration
GitHub Actions Workflow Triggers
.github/workflows/dockers-*.yaml (16 files)
Added apis/grpc/v1/rpc/stats/*.go and related paths to push, pull_request, and pull_request_target triggers across agent/index/gateway/benchmark/discoverer/manager/readreplica workflows

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant StatsServer as Stats gRPC Server
    participant CgroupMetrics as Cgroup Measurement
    participant Response
    
    Client->>StatsServer: ResourceStats(Empty)
    activate StatsServer
    
    rect rgb(200, 220, 240)
    Note over StatsServer,CgroupMetrics: Collect Resource Stats
    StatsServer->>CgroupMetrics: measureCgroupStats()
    activate CgroupMetrics
    
    CgroupMetrics->>CgroupMetrics: detectCgroupMode()
    alt CGv2 Detected
        CgroupMetrics->>CgroupMetrics: readCgroupV2Metrics()
        Note over CgroupMetrics: Parse memory.current/max<br/>cpu.stat/cpu.max
    else CGv1 Detected
        CgroupMetrics->>CgroupMetrics: readCgroupV1Metrics()
        Note over CgroupMetrics: Parse memory_usage_in_bytes<br/>cpuacct.usage, cfs_quota_us
    end
    
    CgroupMetrics->>CgroupMetrics: Sample 1: collect metrics
    CgroupMetrics->>CgroupMetrics: Sleep 100ms
    CgroupMetrics->>CgroupMetrics: Sample 2: collect metrics
    CgroupMetrics->>CgroupMetrics: calculateCpuUsageCores(sample1, sample2)
    
    deactivate CgroupMetrics
    end
    
    StatsServer->>Response: Populate Info_ResourceStats<br/>name, ip, cgroupStats
    StatsServer-->>Client: Return ResourceStats
    deactivate StatsServer
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Breakdown: Generated proto/gRPC artifacts and workflow updates are low-effort repetitive changes. Core implementation complexity is moderate: the stats.go file contains logic for cgroup v1/v2 detection, dual-sample CPU measurement, metric parsing, and error handling. The test suite is comprehensive with 8 test functions covering detection, calculation, and async scenarios. The heterogeneity of files (Go, Rust, proto, workflows, docs) requires context switching but changes are cohesive within each layer.

Possibly related PRs

  • #2538: Modifies the same GitHub Actions workflow files to adjust CI trigger path filters, directly related to workflow path updates in this PR.

Suggested reviewers

  • kpango
  • datelier

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "Add resource stats gRPC API" directly and accurately describes the primary objective of the changeset. The title clearly identifies what is being added (a resource stats gRPC API) and reflects the main purpose of the work, which is to implement a new gRPC service for retrieving system resource statistics including CPU and memory usage. The title is concise, specific, and uses meaningful language that communicates the key change to developers reviewing the history, without being vague or overly broad. While the changeset includes related changes such as payload type additions, Rust bindings, and workflow updates, the title appropriately focuses on the central feature—the gRPC API—which is the primary deliverable.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/internal/add-resource-stats-grpc-api

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8110c16 and a249751.

📒 Files selected for processing (1)
  • .gitfiles (6 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (46)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
  • GitHub Check: build / build (arm64, ubuntu-24.04-arm)
🔇 Additional comments (1)
.gitfiles (1)

209-210: LGTM! Stats feature files properly added to the manifest.

The new file entries are correctly organized, follow the project's naming conventions, and are properly sorted within their respective directory sections. The additions span the complete implementation stack: proto definitions, generated code, implementation, tests, documentation, and Rust bindings.

Also applies to: 239-239, 258-258, 1171-1171, 1400-1402, 2314-2314


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (7)
.github/workflows/dockers-gateway-mirror-image.yaml (1)

219-219: PR target: internal stats — duplicate of generator note

Same generator-update verification as noted in agent-ngt workflow.

.github/workflows/dockers-agent-faiss-image.yaml (1)

219-219: PR target: internal stats — duplicate of generator note

Matches earlier workflows; ensure generator template was updated.

.github/workflows/dockers-readreplica-rotate-image.yaml (1)

205-205: PR target: internal stats glob — duplicate of generator note

Confirm hack/docker/gen/main.go includes these paths.

.github/workflows/dockers-gateway-lb-image.yaml (1)

216-216: PR target: internal stats — duplicate of generator note

Same generator verification as above.

.github/workflows/dockers-index-creation-image.yaml (1)

207-207: PR target: internal stats glob — duplicate of generator note

Confirm the generator includes these entries.

.github/workflows/dockers-index-save-image.yaml (1)

207-207: PR target: internal stats glob — duplicate of generator note

Same generator-update verification applies.

.github/workflows/dockers-index-operator-image.yaml (1)

207-207: PR target: internal stats — duplicate of generator note

Ensure hack/docker/gen/main.go changes are committed to keep these globs on regeneration.

🧹 Nitpick comments (5)
internal/net/grpc/stats/doc.go (1)

17-18: Nit: capitalize “gRPC” in the package comment.

Tiny doc polish for GoDoc consistency.

-// Package stats provides generic functionality for grpc stats.
+// Package stats provides generic functionality for gRPC stats.
apis/proto/v1/payload/payload.proto (1)

651-657: Align field naming and add minimal validation to new ResourceStats.

Proto style in this file favors snake_case (e.g., app_name, cluster_ip). Consider renaming fields and adding basic validations while the API is new.

Apply:

 message ResourceStats {
-  string name = 1;
-  string ip = 2 [(buf.validate.field).string.ipv4 = true];
-  double cpuUsage = 3;
-  double memoryUsage = 4;
+  string name = 1 [(buf.validate.field).string.min_len = 1];
+  string ip = 2 [(buf.validate.field).string.ipv4 = true];
+  double cpu_usage = 3 [(buf.validate.field).double.gte = 0];     // define as ratio or cores; adjust if percent
+  double memory_usage = 4 [(buf.validate.field).double.gte = 0];  // define unit (bytes or ratio); adjust as needed
 }

If you intend percent values, also cap at 100 with lte = 100. If you prefer richer detail, you could reuse existing CPU/Memory messages to stay consistent with Info.Pod.

Please confirm the intended units (ratio/percent/cores; bytes/ratio) so we can finalize validation and docs updates (apis/docs, swagger, Go/Rust bindings).

apis/docs/v1/payload.md.tmpl (1)

1583-1601: Document units and semantics for ResourceStats fields

Please add concise descriptions so users know these are percentages and what the IP represents.

Apply this diff to enrich the field table:

 {{- define "_field:payload.v1.Info.ResourceStats" }}
   - Info.ResourceStats

     | field | type | label | description |
     | :---: | :--- | :---- | :---------- |
-    | name | string |  |  |
-    | ip | string |  |  |
-    | cpuUsage | double |  |  |
-    | memoryUsage | double |  |  |
+    | name | string |  | Hostname of the node/pod reporting stats. |
+    | ip | string |  | Local IP address (IPv4 or IPv6) of the node/pod; may be "unknown" if not resolvable. |
+    | cpuUsage | double |  | Host CPU utilization percentage [0-100], sampled over ~100 ms. |
+    | memoryUsage | double |  | System memory utilization percentage [0-100], computed as (MemTotal - MemAvailable)/MemTotal. |
 {{- end -}}
apis/docs/v1/docs.md (1)

44-45: Clarify Info.ResourceStats field descriptions and units

Add brief, user‑facing descriptions so consumers know these are host‑level percentages and what “ip” represents.

Apply this diff:

 ### Info.ResourceStats

-Represent the resource stats
+Represent host resource utilization percentages.

 | Field       | Type              | Label | Description |
 | ----------- | ----------------- | ----- | ----------- |
-| name        | [string](#string) |       |             |
-| ip          | [string](#string) |       |             |
-| cpuUsage    | [double](#double) |       |             |
-| memoryUsage | [double](#double) |       |             |
+| name        | [string](#string) |       | Hostname of the reporting node/pod. |
+| ip          | [string](#string) |       | Local IP address (IPv4/IPv6); may be "unknown". |
+| cpuUsage    | [double](#double) |       | CPU utilization [%] sampled over ~100 ms. |
+| memoryUsage | [double](#double) |       | System memory utilization [%] = (MemTotal - MemAvailable)/MemTotal. |

Also applies to: 567-579

apis/swagger/v1/rpc/stats/stats.swagger.json (1)

72-91: Add property descriptions to v1InfoResourceStats

Small UX win: describe each field and clarify units.

Apply this diff:

   "v1InfoResourceStats": {
     "type": "object",
     "properties": {
       "name": {
-        "type": "string"
+        "type": "string",
+        "description": "Hostname of the reporting node/pod."
       },
       "ip": {
-        "type": "string"
+        "type": "string",
+        "description": "Local IP address (IPv4 or IPv6); may be \"unknown\"."
       },
       "cpuUsage": {
         "type": "number",
-        "format": "double"
+        "format": "double",
+        "description": "CPU utilization percentage [0-100], sampled over ~100 ms."
       },
       "memoryUsage": {
         "type": "number",
-        "format": "double"
+        "format": "double",
+        "description": "System memory utilization percentage [0-100] = (MemTotal - MemAvailable)/MemTotal."
       }
     },
     "title": "Represent the resource stats"
   }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ffcca50 and ba082d5.

⛔ Files ignored due to path filters (23)
  • apis/grpc/v1/agent/core/agent.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/agent/sidecar/sidecar.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/agent/sidecar/sidecar_vtproto.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/discoverer/discoverer.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/filter/egress/egress_filter.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/filter/ingress/ingress_filter.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/meta/meta.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/mirror/mirror.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/payload/payload.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/payload/payload.pb.json.go is excluded by !**/*.pb.json.go
  • apis/grpc/v1/payload/payload_vtproto.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/rpc/errdetails/error_details.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/rpc/stats/stats.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/rpc/stats/stats_vtproto.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/vald/filter.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/vald/flush.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/vald/index.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/vald/insert.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/vald/object.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/vald/remove.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/vald/search.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/vald/update.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
  • apis/grpc/v1/vald/upsert.pb.go is excluded by !**/*.pb.go, !**/*.pb.go
📒 Files selected for processing (30)
  • .gitfiles (5 hunks)
  • .github/workflows/dockers-agent-faiss-image.yaml (4 hunks)
  • .github/workflows/dockers-agent-ngt-image.yaml (4 hunks)
  • .github/workflows/dockers-agent-sidecar-image.yaml (4 hunks)
  • .github/workflows/dockers-benchmark-job-image.yaml (4 hunks)
  • .github/workflows/dockers-benchmark-operator-image.yaml (4 hunks)
  • .github/workflows/dockers-discoverer-k8s-image.yaml (4 hunks)
  • .github/workflows/dockers-gateway-filter-image.yaml (4 hunks)
  • .github/workflows/dockers-gateway-lb-image.yaml (4 hunks)
  • .github/workflows/dockers-gateway-mirror-image.yaml (4 hunks)
  • .github/workflows/dockers-index-correction-image.yaml (4 hunks)
  • .github/workflows/dockers-index-creation-image.yaml (4 hunks)
  • .github/workflows/dockers-index-deletion-image.yaml (4 hunks)
  • .github/workflows/dockers-index-exportation-image.yaml (4 hunks)
  • .github/workflows/dockers-index-operator-image.yaml (4 hunks)
  • .github/workflows/dockers-index-save-image.yaml (4 hunks)
  • .github/workflows/dockers-manager-index-image.yaml (4 hunks)
  • .github/workflows/dockers-readreplica-rotate-image.yaml (4 hunks)
  • CHANGELOG.md (4 hunks)
  • apis/docs/v1/docs.md (5 hunks)
  • apis/docs/v1/payload.md.tmpl (2 hunks)
  • apis/proto/v1/payload/payload.proto (1 hunks)
  • apis/proto/v1/rpc/stats/stats.proto (1 hunks)
  • apis/swagger/v1/rpc/stats/stats.swagger.json (1 hunks)
  • charts/vald/templates/gateway/ing.yaml (1 hunks)
  • internal/net/grpc/stats/doc.go (1 hunks)
  • internal/net/grpc/stats/stats.go (1 hunks)
  • internal/servers/server/server.go (2 hunks)
  • rust/libs/proto/src/payload.v1.rs (1 hunks)
  • rust/libs/proto/src/rpc.v1.tonic.rs (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2024-06-10T19:25:49.832Z
Learnt from: kpango
PR: vdaas/vald#2491
File: pkg/gateway/lb/handler/grpc/handler.go:306-316
Timestamp: 2024-06-10T19:25:49.832Z
Learning: User: kpango
PR: vdaas/vald#2491
File: pkg/gateway/lb/handler/grpc/handler.go:306-316
Timestamp: 2024-05-07T04:15:11.886Z
Learning: In this project, vtprotobuf is used, which provides enhanced methods for protobuf objects: CloneVT (object cloning), EqualVT (object comparison), SizeVT (returns object size), MarshalVT (faster version of proto.Marshal), and UnmarshalVT (faster version of proto.Unmarshal). These methods offer better performance than usual protobuf usage.

Applied to files:

  • .github/workflows/dockers-gateway-lb-image.yaml
  • .github/workflows/dockers-gateway-mirror-image.yaml
  • .gitfiles
  • .github/workflows/dockers-agent-ngt-image.yaml
  • .github/workflows/dockers-gateway-filter-image.yaml
📚 Learning: 2025-05-01T08:44:07.726Z
Learnt from: kpango
PR: vdaas/vald#0
File: :0-0
Timestamp: 2025-05-01T08:44:07.726Z
Learning: PR #2927 addressed gRPC connection panic issues by removing the singleflight.Group[pool.Conn] field from the gRPCClient struct in internal/net/grpc/client.go and introducing a centralized handleError function for consistent error processing in both Connect and Disconnect methods.

Applied to files:

  • .github/workflows/dockers-benchmark-job-image.yaml
🧬 Code graph analysis (3)
internal/servers/server/server.go (2)
internal/net/grpc/stats/stats.go (1)
  • Register (42-45)
internal/net/grpc/health/health.go (1)
  • Register (27-35)
internal/net/grpc/stats/stats.go (4)
internal/servers/server/server.go (2)
  • Server (47-52)
  • New (138-265)
internal/os/hostname.go (1)
  • Hostname (49-54)
internal/net/net.go (1)
  • LoadLocalIP (362-377)
internal/errors/errors.go (2)
  • Wrapf (112-120)
  • Wrap (99-107)
rust/libs/proto/src/payload.v1.rs (1)
rust/libs/proto/src/rpc.v1.rs (28)
  • full_name (48-48)
  • full_name (72-72)
  • full_name (87-87)
  • full_name (131-131)
  • full_name (136-136)
  • full_name (175-175)
  • full_name (180-180)
  • full_name (242-242)
  • full_name (247-247)
  • full_name (265-265)
  • full_name (295-295)
  • full_name (324-324)
  • full_name (329-329)
  • full_name (347-347)
  • type_url (48-48)
  • type_url (72-72)
  • type_url (87-87)
  • type_url (131-131)
  • type_url (136-136)
  • type_url (175-175)
  • type_url (180-180)
  • type_url (242-242)
  • type_url (247-247)
  • type_url (265-265)
  • type_url (295-295)
  • type_url (324-324)
  • type_url (329-329)
  • type_url (347-347)
🪛 Checkov (3.2.334)
apis/swagger/v1/rpc/stats/stats.swagger.json

[high] 1-93: Ensure that security requirement defined in securityDefinitions - version 2.0 files

(CKV_OPENAPI_6)


[high] 1-93: Ensure that the global security field has rules defined

(CKV_OPENAPI_4)


[high] 1-93: Ensure that security operations is not empty.

(CKV_OPENAPI_5)


[medium] 61-68: Ensure that arrays have a maximum number of items

(CKV_OPENAPI_21)


[high] 1-93: Ensure that securityDefinitions is defined and not empty - version 2.0 files

(CKV_OPENAPI_1)

🔇 Additional comments (49)
charts/vald/templates/gateway/ing.yaml (1)

333-333: No-op formatting change.

Trailing newline added; fine to merge.

.github/workflows/dockers-agent-sidecar-image.yaml (1)

41-41: CI triggers expanded appropriately for Stats and Payload.

Globs added for apis/grpc/v1/payload and stats plus internal/net/grpc/stats in both PR and PR_TARGET; looks good.

Also applies to: 43-43, 111-111, 179-179, 181-181, 249-249

.github/workflows/dockers-index-exportation-image.yaml (1)

42-42: CI path filters updated for Stats; LGTM.

Both stats paths are covered in PR and PR_TARGET blocks.

Also applies to: 96-96, 153-153, 207-207

.github/workflows/dockers-manager-index-image.yaml (1)

44-44: Stats path globs added consistently.

Change matches the pattern used in other docker workflows.

Also applies to: 100-100, 165-165, 221-221

.github/workflows/dockers-benchmark-operator-image.yaml (1)

40-40: Benchmark operator CI now watches Stats and Payload; OK.

Additions mirror other workflows.

Also applies to: 42-42, 97-97, 158-158, 160-160, 215-215

.github/workflows/dockers-discoverer-k8s-image.yaml (1)

43-43: Discoverer CI triggers include Stats paths; OK.

Consistent with the rest of the repo updates.

Also applies to: 99-99, 163-163, 219-219

.github/workflows/dockers-gateway-filter-image.yaml (1)

44-44: Confirm both stats globs appear twice in every dockers- workflow*

New stats path globs were added for pull_request and pull_request_target — ensure each dockers-*.yaml contains >=2 occurrences of both:

  • apis/grpc/v1/rpc/stats/*.go
  • internal/net/grpc/stats/*.go
#!/bin/bash
set -euo pipefail
cd "$(git rev-parse --show-toplevel 2>/dev/null || printf '%s' .)"
shopt -s nullglob
missing=0
for f in .github/workflows/dockers-*.yaml .github/workflows/dockers-*.yml; do
  [ -f "$f" ] || continue
  c_rpc=$(rg -nF 'apis/grpc/v1/rpc/stats/*.go' "$f" || true | wc -l | tr -d ' ')
  c_int=$(rg -nF 'internal/net/grpc/stats/*.go' "$f" || true | wc -l | tr -d ' ')
  if [ "$c_rpc" -lt 2 ] || [ "$c_int" -lt 2 ]; then
    echo "WARN: $f -> rpc/stats: $c_rpc, internal/net/grpc/stats: $c_int"
    missing=1
  fi
done
exit $missing
.github/workflows/dockers-agent-ngt-image.yaml (4)

43-43: Add stats RPC glob — LGTM

Including apis/grpc/v1/rpc/stats/*.go ensures image builds trigger when the new gRPC surface changes.


99-99: Watch internal stats implementation — LGTM

Adding internal/net/grpc/stats/*.go covers server-side updates for the new API.


168-168: Mirror PR target trigger for stats RPC — LGTM

Keeps PR/PR_TARGET filters consistent.


224-224: Mirror PR target trigger for internal stats — LGTM; ensure generator updated

Since this file is generated, confirm hack/docker/gen/main.go includes these new globs so future regenerations don’t drop them.

.github/workflows/dockers-gateway-mirror-image.yaml (3)

43-43: Add stats RPC glob — LGTM


99-99: Add internal stats glob — LGTM


163-163: PR target: stats RPC — LGTM

.github/workflows/dockers-agent-faiss-image.yaml (3)

43-43: Add stats RPC glob — LGTM


97-97: Add internal stats glob — LGTM


165-165: PR target: stats RPC — LGTM

.github/workflows/dockers-readreplica-rotate-image.yaml (5)

40-40: Add payload glob — good catch

ResourceStats additions in payloads will now trigger this image build.


42-42: Add stats RPC glob — LGTM


95-95: Add internal stats glob — LGTM


150-150: PR target: payload glob — LGTM


152-152: PR target: stats RPC glob — LGTM

.github/workflows/dockers-gateway-lb-image.yaml (3)

43-43: Add stats RPC glob — LGTM


98-98: Add internal stats glob — LGTM


161-161: PR target: stats RPC — LGTM

.github/workflows/dockers-index-creation-image.yaml (3)

44-44: Add stats RPC glob — LGTM


96-96: Add internal stats glob — LGTM


155-155: PR target: stats RPC glob — LGTM

.github/workflows/dockers-index-save-image.yaml (3)

44-44: Add stats RPC glob — LGTM


96-96: Add internal stats glob — LGTM


155-155: PR target: stats RPC glob — LGTM

.github/workflows/dockers-index-operator-image.yaml (5)

40-40: Add payload glob — LGTM

Covers changes to new/updated payload types (e.g., ResourceStats).


42-42: Add stats RPC glob — LGTM


96-96: Add internal stats glob — LGTM


151-151: PR target: payload glob — LGTM


153-153: PR target: stats RPC — LGTM

CHANGELOG.md (2)

429-446: Changelog formatting tweak is harmless.

No content changes; safe to keep.


511-548: OK to keep autogenerated “Summary by CodeRabbit” spacing edits.

No action needed.

.github/workflows/dockers-index-deletion-image.yaml (1)

43-43: Additions match the new Stats surface; good to go.

Both globs (apis/grpc/v1/rpc/stats/.go and internal/net/grpc/stats/.go) appear in both pull_request and pull_request_target sections — occurrences at lines 43, 96, 154, 207. Same note on generator parity and repo-wide consistency for all docker workflows.

.github/workflows/dockers-benchmark-job-image.yaml (1)

42-42: CI path filters: globs added across dockers workflows; generator must provide missing files

Both globs (apis/grpc/v1/rpc/stats/.go and internal/net/grpc/stats/.go) are present across .github/workflows/dockers-*.yaml; the referenced source directories/files do not exist in the repo — confirm the generator will produce these paths or commit the generated files.

internal/servers/server/server.go (2)

38-39: Import of stats package — correct placement and naming.


247-249: Register Stats before Health — correct ordering.

Registering stats prior to health ensures Health sets SERVING for the Stats service via GetServiceInfo().

rust/libs/proto/src/payload.v1.rs (1)

1357-1373: ResourceStats message: prost::Name metadata verified — no issues.

impl ::prost::Name for ResourceStats contains full_name "payload.v1.Info.ResourceStats" and type_url "/payload.v1.Info.ResourceStats"; field types match the proto. Regenerate this file from the proto (do not edit manually).

apis/proto/v1/rpc/stats/stats.proto (1)

17-38: Do not change this single route to /v1/resource/stats — follow the repo's unversioned REST convention

  • Repo uses unversioned top-level HTTP bindings; examples: apis/proto/v1/vald/object.proto:122 -> /object/list, apis/proto/v1/vald/index.proto:35 -> /index/info, apis/proto/v1/agent/core/agent.proto:38 -> /index/save, and apis/swagger/v1/rpc/stats/stats.swagger.json:15 -> /resource/stats. If you want versioned paths, do a repo-wide migration (update proto HTTP bindings + generated swagger) rather than changing just this endpoint.
  • Optional refactor: replace payload.v1.Empty with a dedicated request message to allow future filters (scope/format); update apis/proto/v1/rpc/stats/stats.proto accordingly.

Likely an incorrect or invalid review comment.

.gitfiles (1)

212-214: No action needed — .gitfiles is alphabetically sorted.

Verification: the provided script printed "OK: .gitfiles is sorted". The "/bin/bash: line 5: !: command not found" is an execution-environment artifact and can be ignored.

apis/docs/v1/payload.md.tmpl (1)

267-272: OK to wire ResourceStats into the docs templates

The scheme/field template hooks look consistent with the surrounding patterns. No issues.

apis/docs/v1/docs.md (1)

144-146: Include the new Stats service in ToC and keep the wording tight

Wording is fine; ensure docs generation keeps this section stable when templates regenerate.

If docs.md is generated, please re-run docs generation after template changes to keep them in sync.

Also applies to: 1642-1657

rust/libs/proto/src/rpc.v1.tonic.rs (1)

100-124: Generated gRPC client/server surface looks good

Client/server scaffolding for ResourceStats is standard Prost/Tonic output. No changes needed.

Also applies to: 217-299

apis/swagger/v1/rpc/stats/stats.swagger.json (1)

15-35: Add security to this operation or explicitly mark it public

I searched apis/swagger/v1 and found no securityDefinitions/security entries. If this endpoint requires auth, add a root-level securityDefinitions (Swagger 2.0) and include a "security" array on paths./resource/stats (GET) in apis/swagger/v1/rpc/stats/stats.swagger.json; otherwise document the endpoint as public.

@cloudflare-workers-and-pages
Copy link

cloudflare-workers-and-pages bot commented Sep 17, 2025

Deploying vald with  Cloudflare Pages  Cloudflare Pages

Latest commit: a249751
Status: ✅  Deploy successful!
Preview URL: https://f02ff1ae.vald.pages.dev
Branch Preview URL: https://feature-internal-add-resourc.vald.pages.dev

View logs

@codecov
Copy link

codecov bot commented Sep 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 18.53%. Comparing base (e7f6cf3) to head (a249751).
⚠️ Report is 8 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3243      +/-   ##
==========================================
+ Coverage   17.74%   18.53%   +0.79%     
==========================================
  Files         126      128       +2     
  Lines       13647    12193    -1454     
==========================================
- Hits         2421     2260     -161     
+ Misses      10949     9640    -1309     
- Partials      277      293      +16     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jumpei527 jumpei527 changed the title Add resource stats gRPC API [WIP] Add resource stats gRPC API Sep 17, 2025
@vdaas-ci
Copy link
Collaborator

Profile Report

typevald-agent-ngtvald-lb-gatewayvald-discoverervald-manager-index
cpu
heap
other images

@Matts966 Matts966 dismissed stale reviews from coderabbitai[bot] and themself via d1e4666 October 22, 2025 07:38
@Matts966 Matts966 force-pushed the feature/internal/add-resource-stats-grpc-api branch from d1e4666 to 5f7e780 Compare October 22, 2025 07:40
coderabbitai[bot]
coderabbitai bot previously approved these changes Oct 22, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
.github/workflows/e2e.v2.yaml (2)

95-97: Minor: Redundant null check in combo function.

The check o.environment === null (line 96) is unnecessary because baseInclude consistently uses string literals ("null", "management", etc.) rather than actual null values. Consider simplifying to return o.scenario + '-' + o.deployment + '-' + o.cluster + '-' + o.environment; or adding a clarifying comment if the null check is intended for forward compatibility.


262-262: Note: Debug kubectl logs command with --tail=-1 may produce large output.

The kubectl logs -l app.kubernetes.io/component=helm-operator --tail=-1 command at line 262 will fetch all logs (no limit), which could produce significant output if the helm-operator has been running for a while. This is acceptable for debugging, but consider monitoring CI log size limits if this becomes problematic in practice.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5f7e780 and 1f6cd7f.

📒 Files selected for processing (1)
  • .github/workflows/e2e.v2.yaml (2 hunks)
🔇 Additional comments (3)
.github/workflows/e2e.v2.yaml (3)

46-115: Well-implemented dynamic matrix generation with robust fallback logic.

The build-matrix job design is solid. The substring matching approach (line 101) provides flexible label-based filtering, and the fallback mechanism (line 105-107) ensures the workflow never gets stuck with an empty matrix. The label extraction safely handles undefined values (line 84), and the resulting JSON is correctly formatted for consumption by fromJson().


44-44: Condition update properly gates detect-ci-container to workflow triggers.

The updated condition correctly checks for label prefix 'actions/e2e-v2' alongside existing tag and dispatch triggers. The short-circuit AND logic at line 44 prevents evaluation of undefined github.event.label.name on non-labeled triggers.


120-123: Dynamic matrix dependency chain is correctly configured.

The e2e job properly depends on both detect-ci-container (for image TAG) and build-matrix (for matrix definition), and the fromJson() parsing of the string output is appropriate given the result-encoding: string setting on the upstream github-script step.

coderabbitai[bot]
coderabbitai bot previously approved these changes Oct 22, 2025
@Matts966 Matts966 added the actions/build/arm64 Build ARM64 Docker Images label Oct 22, 2025
coderabbitai[bot]
coderabbitai bot previously approved these changes Oct 24, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants