
Infinitespark 96/issue 2309 add filtering sorting statistics to llm count #2502


Open · infinitespark-96 wants to merge 29 commits into base: main

Conversation

@infinitespark-96 (Contributor) commented Jun 17, 2025

Important note

This branch was created from the branch below, since I needed the sorting changes I recently made for span count:
infinitespark-96/issue-2234-add-sorting-to-all-columns-in-traces-view
There is an open pull request for it: #2424

Details

In this PR I added filtering and sorting for llm_span_count.
I also added a statistic (the average) for this field.
Demo: (screenshots omitted)
Screencast: https://github.com/user-attachments/assets/baf16b81-30c3-4bcc-b85b-de1d9d9cb696

Testing

Added test coverage for sorting by llm_span_count in TracesResourceTest:

  • Extended getTracesByProject__whenSortingByValidFields__thenReturnTracesSorted to support llm_span_count (a rough sketch of the kind of sorting assertion involved is shown below)

Added test coverage for filtering by llm_span_count in TracesResourceTest:

  • whenFilterLlmSpanCountGreaterThanEqual__thenReturnTracesFiltered
  • whenFilterLlmSpanCountLessThanEqual__thenReturnTracesFiltered
  • whenFilterLlmSpanCountEqual__thenReturnTracesFiltered

Added test coverage for the average llm_span_count in StatsMapperTest:

  • Extended the mapProjectStats_returnsTotals test to include the new average
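
For context, the sorting check has roughly this shape (a minimal sketch, not the actual TracesResourceTest code; the `actualTraces` variable and the `Trace::llmSpanCount` accessor are assumptions):

```java
import static org.assertj.core.api.Assertions.assertThat;

import java.util.Comparator;
import java.util.List;

// Minimal sketch: verify that traces returned with sorting on llm_span_count (DESC)
// are ordered by their LLM span count. `actualTraces` stands for the page of traces
// returned by the endpoint; Trace::llmSpanCount is an assumed accessor.
void assertSortedByLlmSpanCountDesc(List<Trace> actualTraces) {
    var expectedOrder = actualTraces.stream()
            .sorted(Comparator.comparing(Trace::llmSpanCount).reversed())
            .toList();

    assertThat(actualTraces).containsExactlyElementsOf(expectedOrder);
}
```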

YarivHashaiComet and others added 23 commits May 23, 2025 08:53
…ll-columns-in-traces-view

# Conflicts:
#	apps/opik-frontend/src/lib/sorting.ts
… when zero values are passed from the frontend
@infinitespark-96 requested a review from a team as a code owner, June 17, 2025 18:03
@andrescrz (Collaborator) commented Jun 18, 2025

@infinitespark-96 the other PR was merged. Please fix the conflicts by rebasing this one on top of the latest main.

…rting-statistics-to-llm-count

# Conflicts:
#	apps/opik-backend/src/main/java/com/comet/opik/api/sorting/SortableFields.java
#	apps/opik-backend/src/main/java/com/comet/opik/api/sorting/TraceSortingFactory.java
#	apps/opik-backend/src/main/java/com/comet/opik/domain/TraceDAO.java
#	apps/opik-backend/src/test/java/com/comet/opik/api/resources/v1/priv/TracesResourceTest.java
@andrescrz (Collaborator) left a comment

Thanks for your contribution. I have some questions about the production code, mostly on the DB queries. Additionally, there are some details to be polished in the tests for maintainability.

Otherwise, this is generally going in the right direction.

@@ -325,7 +325,7 @@ ORDER BY (workspace_id, project_id, id) DESC, last_updated_at DESC
sumMap(s.usage) as usage,
sum(s.total_estimated_cost) as total_estimated_cost,
COUNT(s.id) AS span_count,
- countIf(s.type = 'llm') AS llm_span_count,
+ toInt64(countIf(s.type = 'llm')) AS llm_span_count,
@andrescrz (Collaborator):

Can you explain why you need the explicit toInt64 conversion? I'd rather avoid this unless strictly needed.

Applies to all occurrences of this change in this PR.

@infinitespark-96 (Contributor, Author):

If we don't have this conversion and a negative value is passed from the frontend, we get a 500 error from the server; the actual error is:
(error screenshots omitted)

I understand it might be better to handle this at another level of the backend to prevent the 500 error in cases like this (a negative value passed from the frontend), along with a UI/UX improvement that only allows positive values for this field in the filters modal input, since logically we cannot have a negative number of spans.
What do you think? @andrescrz
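
For illustration, handling it at the API layer could look something like this (a minimal sketch; the record and field names are hypothetical, not existing Opik code, and only assume the filter value is bound to a request object validated by Bean Validation):

```java
import jakarta.validation.constraints.PositiveOrZero;

// Hypothetical request-level guard: Bean Validation rejects a negative
// llm_span_count filter value with a 400 before the query runs, so the
// ClickHouse comparison never sees a negative literal and the toInt64()
// cast in the SQL would no longer be needed just to avoid a 500.
public record LlmSpanCountFilterValue(@PositiveOrZero long value) {
}
```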

Comment on lines 141 to 153
var actualMap = actualStats.stream()
.collect(java.util.stream.Collectors.toMap(ProjectStatItem::getName,
java.util.function.Function.identity()));

assertThat(actualStats)
var expectedMap = expectedStats.stream()
.collect(java.util.stream.Collectors.toMap(ProjectStatItem::getName,
java.util.function.Function.identity()));

assertThat(actualMap.keySet()).containsAll(expectedMap.keySet());

expectedMap.forEach((name, expected) -> assertThat(actualMap.get(name))
.usingRecursiveComparison(StatsUtils.getRecursiveComparisonConfiguration())
.isEqualTo(expectedStats);
.isEqualTo(expected));
@andrescrz (Collaborator):

What are you attempting to accomplish with this assertion approach? The original approach was more straightforward. Additionally, there could be an issue when iterating here if the expectedMap is not well formed. Was there any problem with the original approach?
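
For reference, the original, more straightforward assertion being referred to looks roughly like this (reconstructed from the removed lines in the snippet above):

```java
// Original approach: compare the whole stats list in one recursive comparison,
// using the project's shared comparison configuration from StatsUtils.
assertThat(actualStats)
        .usingRecursiveComparison(StatsUtils.getRecursiveComparisonConfiguration())
        .isEqualTo(expectedStats);
```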

@infinitespark-96 (Contributor, Author):

You're absolutely correct; I've reverted this change. Currently, I'm facing a problem, and I'm unsure of the best approach to take.
After adding an average for llmSpansCount and making these changes in the StatsUtils.java file:
(screenshots omitted)

lots of tests are failing for one reason: when creating traces, we don't specify the actual number of LLM spans created for a particular trace (e.g., .llmSpanCount(0) when no actual spans are created).
They all fail on the stats endpoint:
(screenshot omitted)
The error is the same in all the failed tests:
the actual number of spans is 0, so an average of 0 is returned, but the expected value is value=1.53735375E8, since llmSpanCount was auto-generated by Podam and we don't go through the real cycle of writing to the DB, fetching the actual page, and reading the traces from there with the correct values for spanCount and llmSpanCount.
(screenshot omitted)
If we manually set the correct values, everything is fine, no error:
(screenshot omitted)
I'm a bit confused about what the best strategy would be here:

  1. Set the correct value for llmSpanCount manually in each failing test.
  2. Or create/modify an existing helper method and use it for trace creation, since many tests repeat the code for setting these values (screenshot omitted; see the sketch after this list).
  3. Or maybe something better?

Can you help, please? @andrescrz
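
A sketch of what option 2 could look like (a hypothetical helper, not existing Opik code; it only assumes the Trace builder fields already used in this PR and Podam's manufacturePojo):

```java
import uk.co.jemos.podam.api.PodamFactory;

// Hypothetical test helper: manufacture a trace whose spanCount/llmSpanCount
// match the spans the test will actually insert, defaulting both to zero when
// no spans are created, so the stats endpoint's averages line up with the DB.
final class TraceTestDataUtils {

    private TraceTestDataUtils() {
    }

    static Trace traceWithSpanCounts(PodamFactory factory, int spanCount, int llmSpanCount) {
        return factory.manufacturePojo(Trace.class).toBuilder()
                .spanCount(spanCount)
                .llmSpanCount(llmSpanCount)
                .build();
    }
}
```

Failing tests would then call something like traceWithSpanCounts(factory, 0, 0) instead of repeating the builder calls.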

@infinitespark-96 (Contributor, Author):

You can check my latest commit as a reference for what I mean here.

Comment on lines 2595 to 2598
int llmSpanCount = 3;
List<Span> spans = new ArrayList<>();

for (int i = 0; i < llmSpanCount; i++) {
@andrescrz (Collaborator):

For this, it's better to use the existing convention with PodamFactoryUtils.manufacturePojoList(factory, Span.class).
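
Roughly, the suggested convention looks like this (a sketch only; SpanType.llm, projectName, and trace.id() are assumptions based on the fields used elsewhere in this PR):

```java
// Manufacture a batch of spans with Podam and mark them all as LLM spans,
// instead of building them one by one inside a for loop.
List<Span> llmSpans = PodamFactoryUtils.manufacturePojoList(factory, Span.class).stream()
        .map(span -> span.toBuilder()
                .projectName(projectName)
                .traceId(trace.id())
                .type(SpanType.llm)
                .build())
        .toList();
```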

@infinitespark-96 (Contributor, Author):

Done

.build());
}

for (Trace trace : traces.subList(1, traces.size())) {
@andrescrz (Collaborator):

Use the existing convention with var otherSpans = traces.stream().skip(1)
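
That is, something along these lines (a sketch; variable names assumed):

```java
// Skip the first trace with a stream instead of iterating over
// traces.subList(1, traces.size()).
var otherTraces = traces.stream()
        .skip(1)
        .toList();
```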

@infinitespark-96 (Contributor, Author):

Fixed

int llmSpanCount = 1;
List<Span> spans = new ArrayList<>();

for (int i = 0; i < llmSpanCount; i++) {
@andrescrz (Collaborator):

Same.

@infinitespark-96 (Contributor, Author):

Done

.build());
}

for (Trace trace : traces.subList(1, traces.size())) {
@andrescrz (Collaborator):

Same.

@infinitespark-96 (Contributor, Author):

Done

.usage(null)
.totalEstimatedCost(null)
.build());
spans.add(factory.manufacturePojo(Span.class).toBuilder()
@andrescrz (Collaborator):

What do you intend with this extra addition?

@infinitespark-96 (Contributor, Author):

Changed it; it's implemented differently now.

int llmSpanCount = 2;
List<Span> spans = new ArrayList<>();

for (int i = 0; i < llmSpanCount; i++) {
@andrescrz (Collaborator):

Same.

@infinitespark-96 (Contributor, Author):

Done

}

int llmCountUnexpected = 3;

@andrescrz (Collaborator):

Same.

@infinitespark-96 (Contributor, Author):

Done

Comment on lines 103 to 108
var llmSpanStat = stats.stats().stream()
.filter(s -> StatsMapper.LLM_SPAN_COUNT.equals(s.getName()))
.findFirst()
.orElseThrow();

assertThat(llmSpanStat).isInstanceOf(ProjectStats.AvgValueStat.class);
@andrescrz (Collaborator):

Not needed, please remove.

@infinitespark-96 (Contributor, Author):

Done

@infinitespark-96 (Contributor, Author) commented:

Thanks a lot for the review @andrescrz. I should have updated you on this: it's not the final version. After reviewing your commits, which clearly showed me the correct way of doing things, I realized what I should improve and change in my code. Most of your comments just confirmed that my thoughts were correct. I'll update the code in a few hours.

@andrescrz (Collaborator) commented:

@infinitespark-96 thanks for the clarification. I will wait for the next revision then. Cheers.

@infinitespark-96 (Contributor, Author) commented:

The latest version of the code has been pushed with all the required improvements and fixes.

In my latest commit, I addressed issues related to the average llmSpanCount in the stats endpoint. Specifically, I set:

.llmSpanCount(0)
.spanCount(0)

I’ll need to make sure the correct llmSpanCount is set everywhere to resolve the remaining failed tests related to the stats endpoint, as described in my previous question.

Other than those outstanding updates, the changes are ready for review again.
Any feedback or help would be appreciated—thanks in advance!
@andrescrz
