Contributor

@nikhilmantri0902 nikhilmantri0902 commented Oct 7, 2025

📄 Summary

Revamp Services Page API to be wired up via querier using query builder V5
Closes #9251


✅ Changes

  • Feature: Brief description
  • Bug fix: Brief description

🏷️ Required: Add Relevant Labels

⚠️ Manually add appropriate labels in the PR sidebar
Please select one or more labels (as applicable):

ex:

  • backend

👥 Reviewers

Tag the relevant teams for review:

  • frontend / backend / devops

🧪 How to Test

  1. ...
  2. ...
  3. ...

🔍 Related Issues

Closes #


📸 Screenshots / Screen Recording (if applicable / mandatory for UI related changes)


📋 Checklist

  • Dev Review
  • Test cases added (Unit/ Integration / E2E)
  • Manually tested the changes

👀 Notes for Reviewers


Important

Adds a new /api/v1/services_qbv5 API endpoint using Query Builder V5 for the services page, with new modules and handlers integrated into the existing system.

  • Behavior:
    • Adds new /api/v1/services_qbv5 endpoint in http_handler.go using Query Builder V5.
    • Implements Get method in implservicesqb/handler.go to handle requests to the new endpoint.
  • Modules:
    • Introduces servicesqb module with Request, ResponseItem, and TagFilterItem structs in dto.go.
    • Implements Module and Handler interfaces in implservicesqb/module.go and implservicesqb/handler.go.
  • Integration:
    • Registers ServicesQB handler in signoz/handler.go and signoz/module.go.
    • Updates RegisterRoutes in http_handler.go to include the new endpoint.

This description was created by Ellipsis for 66f6b07. You can customize this summary. It will automatically update as commits are pushed.

@nikhilmantri0902 nikhilmantri0902 marked this pull request as draft October 7, 2025 12:59
@github-actions github-actions bot added the enhancement New feature or request label Oct 7, 2025
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Caution

Changes requested ❌

Reviewed everything up to 66f6b07 in 2 minutes and 41 seconds. Click for details.
  • Reviewed 231 lines of code in 7 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. pkg/signoz/module.go:45
  • Draft comment:
    The ServicesQB module is added and instantiated via implservicesqb.NewModule(querier). Note that its Get method currently returns an empty slice. Please ensure that the QBV5 implementation is completed so that service names (including those starting with '@') are correctly handled to resolve issue #9251.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% This comment is informative and suggests ensuring the completion of the QBV5 implementation. It does not provide a specific code suggestion or ask for a test to be written. It also asks the author to "ensure" something, which is against the rules.
2. pkg/modules/servicesqb/implservicesqb/module.go:20
  • Draft comment:
    Typographical note: The comment mentions "QBv5" while the commit title refers to "QBV5". Consider standardizing the casing for clarity.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% This comment is purely about documentation formatting and doesn't affect code functionality. It's a very minor stylistic issue about capitalization consistency. The rules state not to make purely informative comments or comments that are obvious/unimportant. This seems to fall into that category. Perhaps inconsistent documentation styling could lead to confusion for developers? Maybe there's an established style guide that makes this important? Even if there is a style guide, this is too minor of an issue to warrant a PR comment. It doesn't affect functionality and doesn't require a code change. Delete this comment as it's purely about documentation formatting and doesn't meet the threshold of importance for a PR comment.
3. pkg/modules/servicesqb/servicesqb.go:13
  • Draft comment:
    Inconsistent naming: the comment here uses "services_qbv5" while the Module interface comment (line 18) uses "services QBv5". Consider using a consistent term (e.g. "services QBV5") throughout the file.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% While the comment is technically correct about the inconsistency, it's a very minor documentation issue. The rules state to only keep comments that clearly require code changes and to not make purely informative comments. This seems more like a style nitpick that doesn't impact functionality or code quality significantly. The inconsistency could potentially cause confusion for developers searching for references to this service in the codebase. Documentation consistency does have some value. However, the impact is minimal, and this kind of minor documentation inconsistency doesn't meet the threshold of "clearly a code change required" as specified in the rules. The comment should be deleted as it violates the rule about not making purely informative comments and doesn't represent a significant enough issue to warrant a code change.
4. pkg/query-service/app/http_handler.go:540
  • Draft comment:
    Typo detected in the comment. It says "QBv5-based services endpoint" but to maintain consistency with the commit title and naming convention, it should likely be "QBV5-based services endpoint".
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% 1. The comment is about a minor capitalization issue in a code comment 2. Code comments are documentation, so consistency in terminology can be helpful 3. However, this is an extremely minor issue that doesn't affect functionality 4. The rules say to avoid unimportant comments and to only make comments that require code changes 5. A capitalization change in a comment is not a meaningful code change The comment is technically correct about maintaining consistency, but is it worth the overhead of having the PR author make this trivial change? While consistency is good, this is too minor of an issue to warrant a PR comment. It creates unnecessary back-and-forth for negligible benefit. Delete this comment. It's about an extremely minor capitalization issue in a code comment that doesn't affect functionality. The overhead of addressing it outweighs the minimal benefit.

Workflow ID: wflow_wXnVufL6U82uRcAn

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

@nikhilmantri0902
Contributor Author

nikhilmantri0902 commented Oct 10, 2025

@srikanthccv

{
  "timestamp": "2025-10-10T10:42:17.847495+05:30",
  "level": "DEBUG",
  "code": {
    "function": "github.com/SigNoz/signoz/pkg/telemetrystore/telemetrystorehook.(*logging).AfterQuery",
    "file": "/Users/nikhilmantrisignoz/signozprojects/signoz/pkg/telemetrystore/telemetrystorehook/logging.go",
    "line": 47
  },
  "msg": "::TELEMETRYSTORE-QUERY::",
  "logger": "github.com/SigNoz/signoz/pkg/telemetrystore/telemetrystorehook",
  "db.query.text": "WITH __resource_filter AS (SELECT fingerprint FROM signoz_traces.distributed_traces_v3_resource WHERE ((((simpleJSONExtractString(labels, 'deployment.environment') = ?) AND labels LIKE ? AND (labels LIKE ?))) AND ((true OR true))) AND seen_at_ts_bucket_start >= ? AND seen_at_ts_bucket_start <= ?) SELECT toString(multiIf(multiIf(resource.`service.name` IS NOT NULL, resource.`service.name`::String, `resource_string_service$$name_exists`==true, `resource_string_service$$name`, NULL) IS NOT NULL, multiIf(resource.`service.name` IS NOT NULL, resource.`service.name`::String, `resource_string_service$$name_exists`==true, `resource_string_service$$name`, NULL), mapContains(attributes_string, 'service.name') = ?, attributes_string['service.name'], NULL)) AS `service.name`, quantile(0.99)(multiIf(duration_nano <> ?, duration_nano, NULL)) AS __result_0, avg(multiIf(duration_nano <> ?, duration_nano, NULL)) AS __result_1, count() AS __result_2, countIf(toFloat64(status_code) = ?) AS __result_3, countIf((toFloat64OrNull(response_status_code) >= ? AND toFloat64OrNull(response_status_code) < ?)) AS __result_4 FROM signoz_traces.distributed_signoz_index_v3 WHERE resource_fingerprint GLOBAL IN (SELECT fingerprint FROM __resource_filter) AND ((((multiIf(resource.`deployment.environment` IS NOT NULL, resource.`deployment.environment`::String, mapContains(resources_string, 'deployment.environment'), resources_string['deployment.environment'], NULL) = ?) AND multiIf(resource.`deployment.environment` IS NOT NULL, resource.`deployment.environment`::String, mapContains(resources_string, 'deployment.environment'), resources_string['deployment.environment'], NULL) IS NOT NULL)) AND ((parent_span_id = '' OR ((name, resource_string_service$$name) GLOBAL IN (SELECT DISTINCT name, serviceName from signoz_traces.distributed_top_level_operations)) AND parent_span_id != ''))) AND timestamp >= ? AND timestamp < ? AND ts_bucket_start >= ? AND ts_bucket_start <= ? 
GROUP BY `service.name` ORDER BY __result_0 DESC",
  "db.query.args": [
    "nikhil-local",
    "%deployment.environment%",
    "%deployment.environment\":\"nikhil-local%",
    1760067710,
    1760073110,
    true,
    0,
    0,
    2,
    400,
    500,
    "nikhil-local",
    "1760069510363000000",
    "1760073110363000000",
    1760067710,
    1760073110
  ],
  "db.duration": "80.334875ms"
}

Above is the log for the query. It contains the query text as well as its arguments. The same query in a beautified format is below:

WITH __resource_filter
AS (
	SELECT fingerprint
	FROM signoz_traces.distributed_traces_v3_resource
	WHERE (
			(
				(
					(simpleJSONExtractString(labels, 'deployment.environment') = ?)
					AND labels LIKE ?
					AND (labels LIKE ?)
					)
				)
			AND (
				(
					true
					OR true
					)
				)
			)
		AND seen_at_ts_bucket_start >= ?
		AND seen_at_ts_bucket_start <= ?
	)
SELECT toString(multiIf(multiIf(resource.`service.name` IS NOT NULL, resource.`service.name`::String, `resource_string_service$$name_exists` == true, `resource_string_service$$name`, NULL) IS NOT NULL, multiIf(resource.`service.name` IS NOT NULL, resource.`service.name`::String, `resource_string_service$$name_exists` == true, `resource_string_service$$name`, NULL), mapContains(attributes_string, 'service.name') = ?, attributes_string ['service.name'], NULL)) AS `service.name`
	,quantile(0.99) (multiIf(duration_nano <> ?, duration_nano, NULL)) AS __result_0
	,avg(multiIf(duration_nano <> ?, duration_nano, NULL)) AS __result_1
	,count() AS __result_2
	,countIf(toFloat64(status_code) = ?) AS __result_3
	,countIf((
			toFloat64OrNull(response_status_code) >= ?
			AND toFloat64OrNull(response_status_code) < ?
			)) AS __result_4
FROM signoz_traces.distributed_signoz_index_v3
WHERE resource_fingerprint GLOBAL IN (
		SELECT fingerprint
		FROM __resource_filter
		)
	AND (
		(
			(
				(multiIf(resource.`deployment.environment` IS NOT NULL, resource.`deployment.environment`::String, mapContains(resources_string, 'deployment.environment'), resources_string ['deployment.environment'], NULL) = ?)
				AND multiIf(resource.`deployment.environment` IS NOT NULL, resource.`deployment.environment`::String, mapContains(resources_string, 'deployment.environment'), resources_string ['deployment.environment'], NULL) IS NOT NULL
				)
			)
		AND (
			(
				parent_span_id = ''
				OR (
					(
						name
						,resource_string_service$$name
						) GLOBAL IN (
						SELECT DISTINCT name
							,serviceName
						FROM signoz_traces.distributed_top_level_operations
						)
					)
				AND parent_span_id != ''
				)
			)
		)
	AND timestamp >= ?
	AND timestamp < ?
	AND ts_bucket_start >= ?
	AND ts_bucket_start <= ?
GROUP BY `service.name`
ORDER BY __result_0 DESC
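
When reading a log entry like the one above, it helps to check that the `?` placeholders line up with `db.query.args` and to substitute them for debugging. The logged query has 16 placeholders and 16 arguments. The following is a minimal sketch in Go, not part of this PR: `countPlaceholders`, `interpolate`, and `formatArg` are hypothetical helper names, and the scanner assumes quotes inside string literals are not backslash-escaped. The output is for reading only and should not be sent to the database.

```go
package main

import (
	"fmt"
	"strings"
)

// walk visits every rune in query and reports whether it is a '?' placeholder,
// i.e. a '?' outside single-quoted literals and backtick-quoted identifiers.
// Assumption: quoted regions contain no backslash-escaped quotes.
func walk(query string, onRune func(r rune, placeholder bool)) {
	var quote rune // 0 while outside any quoted region
	for _, r := range query {
		switch {
		case quote != 0:
			if r == quote {
				quote = 0
			}
			onRune(r, false)
		case r == '\'' || r == '`':
			quote = r
			onRune(r, false)
		default:
			onRune(r, r == '?')
		}
	}
}

// countPlaceholders returns the number of '?' placeholders in query.
func countPlaceholders(query string) int {
	n := 0
	walk(query, func(_ rune, placeholder bool) {
		if placeholder {
			n++
		}
	})
	return n
}

// formatArg renders one logged argument as a SQL-looking literal.
func formatArg(a any) string {
	if s, ok := a.(string); ok {
		return "'" + strings.ReplaceAll(s, "'", "''") + "'"
	}
	return fmt.Sprint(a)
}

// interpolate replaces each placeholder with the matching argument, in order.
// It fails when the placeholder count and argument count differ.
func interpolate(query string, args []any) (string, error) {
	if n := countPlaceholders(query); n != len(args) {
		return "", fmt.Errorf("query has %d placeholders but %d args were logged", n, len(args))
	}
	var b strings.Builder
	next := 0
	walk(query, func(r rune, placeholder bool) {
		if !placeholder {
			b.WriteRune(r)
			return
		}
		b.WriteString(formatArg(args[next]))
		next++
	})
	return b.String(), nil
}

func main() {
	// A shortened, made-up query in the same style as the logged one.
	query := "SELECT count() FROM spans WHERE env = ? AND parent_span_id = '' AND ts_bucket_start >= ?"
	out, err := interpolate(query, []any{"nikhil-local", 1760067710})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Note that the nanosecond timestamps appear as strings in `db.query.args`, so this sketch would render them quoted.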

@srikanthccv
Member

Without looking at the results, can we say why this query will produce incorrect or correct results?

@nikhilmantri0902 nikhilmantri0902 marked this pull request as ready for review October 10, 2025 06:11
@srikanthccv
Member

@nikhilmantri0902 writing down the checklist we discussed; keep it in mind for any change that involves queries.

Given the table schemas and their relations:

  1. Prove, in theory, that the query works and produces correct results.
  2. Verify it in practice against real data, because each database has its own quirks.
  3. Prepare a list of cases where it could break in theory.
  4. Verify whether each of those cases actually breaks in practice, given the way the system is built to operate (something could break in theory but never happen because the system as designed never reaches that state).

Once correctness is established, we need to look into performance (it is an independent piece on its own). For the scope of this PR, please do the above exercise.
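
One way to approach steps 1 and 3 is to restate a predicate from the query as plain code and enumerate its cases. The sketch below models only the span filter `parent_span_id = '' OR ((name, service) GLOBAL IN top_level_operations AND parent_span_id != '')`. Because AND binds tighter than OR, and the second branch is reached only when `parent_span_id` is non-empty, this reduces to "root span, or a known top-level operation". The `span` and `opKey` types and `isEntrySpan` are hypothetical names for illustration, not code from this PR, and the sketch assumes `parent_span_id` is never NULL.

```go
package main

import "fmt"

// span carries only the columns the predicate reads.
type span struct {
	ParentSpanID string
	Name         string
	ServiceName  string
}

// opKey mirrors the (name, serviceName) tuple selected from
// distributed_top_level_operations.
type opKey struct {
	Name        string
	ServiceName string
}

// isEntrySpan restates the WHERE-clause filter: a span is counted when it is
// a root span, or when its (name, service) pair is a known top-level operation.
func isEntrySpan(s span, topLevel map[opKey]bool) bool {
	if s.ParentSpanID == "" {
		return true
	}
	return topLevel[opKey{Name: s.Name, ServiceName: s.ServiceName}]
}

func main() {
	topLevel := map[opKey]bool{
		{Name: "home", ServiceName: "demo-app"}: true,
	}
	cases := []struct {
		label string
		s     span
	}{
		{"root span, operation known", span{"", "home", "demo-app"}},
		{"root span, operation unknown", span{"", "internal", "demo-app"}},
		{"child span, operation known", span{"abc", "home", "demo-app"}},
		{"child span, operation unknown", span{"abc", "internal", "demo-app"}},
		{"child span, same name in another service", span{"abc", "home", "customer"}},
	}
	for _, c := range cases {
		fmt.Printf("%-42s counted=%v\n", c.label, isEntrySpan(c.s, topLevel))
	}
}
```

Each row of the case table is a candidate for step 4: check against real data whether that combination can occur and whether the query treats it the same way.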

@nikhilmantri0902
Contributor Author

Hi @srikanthccv, I have pushed a commit that now passes FieldContextResource to the group-by on service.name. The query now looks something like this:

WITH __resource_filter
AS (
	SELECT fingerprint
	FROM signoz_traces.distributed_traces_v3_resource
	WHERE (
			(
				(
					(simpleJSONExtractString(labels, 'deployment.environment') = ?)
					AND labels LIKE ?
					AND (labels LIKE ?)
					)
				)
			AND (
				(
					true
					OR true
					)
				)
			)
		AND seen_at_ts_bucket_start >= ?
		AND seen_at_ts_bucket_start <= ?
	)
SELECT toString(multiIf(multiIf(resource.`service.name` IS NOT NULL, resource.`service.name`::String, mapContains(resources_string, 'service.name'), resources_string ['service.name'], NULL) IS NOT NULL, multiIf(resource.`service.name` IS NOT NULL, resource.`service.name`::String, mapContains(resources_string, 'service.name'), resources_string ['service.name'], NULL), NULL)) AS `service.name`
	,quantile(0.99) (multiIf(duration_nano <> ?, duration_nano, NULL)) AS __result_0
	,avg(multiIf(duration_nano <> ?, duration_nano, NULL)) AS __result_1
	,count() AS __result_2
	,countIf(toFloat64(status_code) = ?) AS __result_3
	,countIf((
			toFloat64OrNull(response_status_code) >= ?
			AND toFloat64OrNull(response_status_code) < ?
			)) AS __result_4
FROM signoz_traces.distributed_signoz_index_v3
WHERE resource_fingerprint GLOBAL IN (
		SELECT fingerprint
		FROM __resource_filter
		)
	AND (
		(
			(
				(multiIf(resource.`deployment.environment` IS NOT NULL, resource.`deployment.environment`::String, mapContains(resources_string, 'deployment.environment'), resources_string ['deployment.environment'], NULL) = ?)
				AND multiIf(resource.`deployment.environment` IS NOT NULL, resource.`deployment.environment`::String, mapContains(resources_string, 'deployment.environment'), resources_string ['deployment.environment'], NULL) IS NOT NULL
				)
			)
		AND (
			(
				parent_span_id = ''
				OR (
					(
						name
						,resource_string_service$$name
						) GLOBAL IN (
						SELECT DISTINCT name
							,serviceName
						FROM signoz_traces.distributed_top_level_operations
						)
					)
				AND parent_span_id != ''
				)
			)
		)
	AND timestamp >= ?
	AND timestamp < ?
	AND ts_bucket_start >= ?
	AND ts_bucket_start <= ?
GROUP BY `service.name`
ORDER BY __result_0 DESC

This now uses resources_string to resolve service.name.

@srikanthccv
Member

@nikhilmantri0902 please resolve the conflicts.

@nikhilmantri0902
Contributor Author

@srikanth we will also need a small frontend change here. The response is now nested inside a data field:

{
  "status": "success",
  "data": [
    {
      "serviceName": "demo-app",
      "p99": 6929960842.199999,
      "avgDuration": 4826518611.155125,
      "numCalls": 361,
      "callRate": 0.20055555555555554,
      "numErrors": 0,
      "errorRate": 0,
      "num4XX": 0,
      "fourXXRate": 0,
      "dataWarning": {
        "topLevelOps": [
          "overflow_operation",
          "home"
        ]
      }
    }
  ]
}

This is because I am using the render.Success method for the response.

The earlier API response structure was:

[
  {
    "serviceName": "customer",
    "p99": 635039347.6400002,
    "avgDuration": 324454105.8835821,
    "numCalls": 1005,
    "callRate": 0.5583333333333333,
    "numErrors": 0,
    "errorRate": 0,
    "num4XX": 0,
    "fourXXRate": 0,
    "dataWarning": {
      "topLevelOps": [
        "overflow_operation",
        "/customer"
      ]
    }
  }
]

It was written using the ah.WriteJSON method.
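
To illustrate the difference between the two shapes, here is a minimal sketch of a Go client that accepts either one. It is only an illustration: `serviceItem`, `dataWarning`, `envelope`, and `decodeServices` are hypothetical names (the JSON keys are taken from the two payloads above), and the actual frontend change would be made in the UI code rather than in Go.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// dataWarning and serviceItem mirror the JSON keys shown in the payloads.
type dataWarning struct {
	TopLevelOps []string `json:"topLevelOps"`
}

type serviceItem struct {
	ServiceName string      `json:"serviceName"`
	P99         float64     `json:"p99"`
	AvgDuration float64     `json:"avgDuration"`
	NumCalls    uint64      `json:"numCalls"`
	CallRate    float64     `json:"callRate"`
	NumErrors   uint64      `json:"numErrors"`
	ErrorRate   float64     `json:"errorRate"`
	Num4XX      uint64      `json:"num4XX"`
	FourXXRate  float64     `json:"fourXXRate"`
	DataWarning dataWarning `json:"dataWarning"`
}

// envelope is the new wrapper: {"status": "...", "data": [...]}.
type envelope struct {
	Status string        `json:"status"`
	Data   []serviceItem `json:"data"`
}

// decodeServices accepts either the earlier bare array or the new envelope.
func decodeServices(body []byte) ([]serviceItem, error) {
	trimmed := bytes.TrimSpace(body)
	if len(trimmed) > 0 && trimmed[0] == '[' {
		// Earlier shape: the list of services is the whole body.
		var items []serviceItem
		if err := json.Unmarshal(trimmed, &items); err != nil {
			return nil, err
		}
		return items, nil
	}
	// New shape: the list sits under the "data" key.
	var env envelope
	if err := json.Unmarshal(trimmed, &env); err != nil {
		return nil, err
	}
	if env.Status != "success" {
		return nil, fmt.Errorf("unexpected status %q", env.Status)
	}
	return env.Data, nil
}

func main() {
	oldBody := []byte(`[{"serviceName":"customer","numCalls":1005}]`)
	newBody := []byte(`{"status":"success","data":[{"serviceName":"demo-app","numCalls":361}]}`)
	for _, body := range [][]byte{oldBody, newBody} {
		items, err := decodeServices(body)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(items[0].ServiceName, items[0].NumCalls)
	}
}
```

Accepting both shapes is only useful while the old and new endpoints coexist; once callers move to the new endpoint, reading `data` directly is simpler.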

Development

Successfully merging this pull request may close these issues.

@ at the start of the service name breaks the Services page

3 participants