-
Notifications
You must be signed in to change notification settings - Fork 692
Infinitespark 96/issue 2309 add filtering sorting statistics to llm count #2502
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ting-to-all-columns-in-traces-view
…ting-to-all-columns-in-traces-view
…ting-to-all-columns-in-traces-view
…ll-columns-in-traces-view # Conflicts: # apps/opik-frontend/src/lib/sorting.ts
…re-join pattern, add additional tests
…nd usage fields, some code revert
…ting-to-all-columns-in-traces-view
…ting-to-all-columns-in-traces-view
…tering-sorting-statistics-to-llm-count
… when zero values are passed from the frontend
…m_span_count_avg as AvgValueStat
@infinitespark-96 the other PR was merged. Please fix conflicts by moving this one on top of latest main. |
…rting-statistics-to-llm-count # Conflicts: # apps/opik-backend/src/main/java/com/comet/opik/api/sorting/SortableFields.java # apps/opik-backend/src/main/java/com/comet/opik/api/sorting/TraceSortingFactory.java # apps/opik-backend/src/main/java/com/comet/opik/domain/TraceDAO.java # apps/opik-backend/src/test/java/com/comet/opik/api/resources/v1/priv/TracesResourceTest.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution. I have some questions about the production code, mostly on the DB queries. Additionally, there are some details to be polished in the tests for maintainability.
Otherwise, this is generally in the good direction.
@@ -325,7 +325,7 @@ ORDER BY (workspace_id, project_id, id) DESC, last_updated_at DESC | |||
sumMap(s.usage) as usage, | |||
sum(s.total_estimated_cost) as total_estimated_cost, | |||
COUNT(s.id) AS span_count, | |||
countIf(s.type = 'llm') AS llm_span_count, | |||
toInt64(countIf(s.type = 'llm')) AS llm_span_count, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why you need the explicit toInt64
conversion? I'd rather avoid this unless strictly needed.
Applies to all occurrences of this change in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't have this conversion and pass a negative value from the frontend, we get a 500 error from the server, and the actual error is:
I understand that maybe it's better to make the change on some other level of the backend app to prevent 500 error in cases like that (negative value from the frontend passed), as well with the UI/UX improvement to allow only positive values for this field in filters modal input since logically we cannot have negative amount of spans.
What do you think? @andrescrz
var actualMap = actualStats.stream() | ||
.collect(java.util.stream.Collectors.toMap(ProjectStatItem::getName, | ||
java.util.function.Function.identity())); | ||
|
||
assertThat(actualStats) | ||
var expectedMap = expectedStats.stream() | ||
.collect(java.util.stream.Collectors.toMap(ProjectStatItem::getName, | ||
java.util.function.Function.identity())); | ||
|
||
assertThat(actualMap.keySet()).containsAll(expectedMap.keySet()); | ||
|
||
expectedMap.forEach((name, expected) -> assertThat(actualMap.get(name)) | ||
.usingRecursiveComparison(StatsUtils.getRecursiveComparisonConfiguration()) | ||
.isEqualTo(expectedStats); | ||
.isEqualTo(expected)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are you attempting to accomplish with this assertion approach? The original approach was more straight forward. Additionally, there could be an issue when iterating here if the expectedMap is not well formed. Was there any problem i the original approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're absolutely correct; I've reverted this change. Currently, I'm facing a problem, and I'm unsure of the best approach to take.
After adding an average for llmSpansCount, and making these changes int he StatsUtils.java file:
I'm having lots of tests failing because of one reason: when creating traces and don't specify the actual value of llm spans that we create for a particular trace, e.g., .llmSpanCount(0), when no actual spans are created.
All fail on stats endpoint:
Error is the same in all the failed tests:
Actual amount of spans are 0, and average is 0 is returned, but in the expected value we have value=1.53735375E8 since llmSpanCount was auto-generated by Podam, and we don't do the real cycle of putting in the db, getting the actual page, and the traces from there with correct values for spanCount and llmSpanCount.
And if we manually add these correct values, all good, no error:
I'm a bit confused, what would be the best strategy here..
- Set manually in each failed test the correct value for llmSpanCount
- Or maybe create/modify some existing helper method and use it for trace creation since in many tests we have the repeated code for setting these values:
- Or maybe something better?
Can you help, please? @andrescrz
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can check my latest commit as a reference to what I mean here
int llmSpanCount = 3; | ||
List<Span> spans = new ArrayList<>(); | ||
|
||
for (int i = 0; i < llmSpanCount; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this, better use the existing convention with PodamFactoryUtils.manufacturePojoList(factory, Span.class)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
.build()); | ||
} | ||
|
||
for (Trace trace : traces.subList(1, traces.size())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the existing convention with var otherSpans = traces.stream().skip(1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
int llmSpanCount = 1; | ||
List<Span> spans = new ArrayList<>(); | ||
|
||
for (int i = 0; i < llmSpanCount; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
.build()); | ||
} | ||
|
||
for (Trace trace : traces.subList(1, traces.size())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
.usage(null) | ||
.totalEstimatedCost(null) | ||
.build()); | ||
spans.add(factory.manufacturePojo(Span.class).toBuilder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you intend with this extra addition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it, it's implemented differently now
int llmSpanCount = 2; | ||
List<Span> spans = new ArrayList<>(); | ||
|
||
for (int i = 0; i < llmSpanCount; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
} | ||
|
||
int llmCountUnexpected = 3; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
var llmSpanStat = stats.stats().stream() | ||
.filter(s -> StatsMapper.LLM_SPAN_COUNT.equals(s.getName())) | ||
.findFirst() | ||
.orElseThrow(); | ||
|
||
assertThat(llmSpanStat).isInstanceOf(ProjectStats.AvgValueStat.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed, please remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Thanks a lot for the review @andrescrz, I should have updated you on this, it's not the final version since after I reviewed your commits, which clearly show me the correct way of doing things, I realized what I should improve and change in my code. Most of your comments just confirmed that my thoughts were correct. I'll update the code in a few hours. |
@infinitespark-96 thanks for the clarification. I will wait for the next revision then. Cheers. |
…Count for all tests
… getTracesByProject__whenSortingByValidFields__thenReturnTracesSorted
… in mapProjectStats
…rror because of an incorrect llmSpanCountValue
The latest version of the code has been pushed with all the required improvements and fixes. In my latest commit, I addressed issues related to the average llmSpanCount in the stats endpoint. Specifically, I set:
I’ll need to make sure the correct llmSpanCount is set everywhere to resolve the remaining failed tests related to the stats endpoint, as described in my previous question. Other than those outstanding updates, the changes are ready for review again. |
Important note
This branch was created from this one since I needed the new changes for sorting I recently made for span count:
infinitespark-96/issue-2234-add-sorting-to-all-columns-in-traces-view
There is an open Pull Request for this: #2424
Details
In this PR I added filtering, and sorting for llm_span_count.


Also, I added statistics (average) for this field
Demo:
Screencast: https://github.com/user-attachments/assets/baf16b81-30c3-4bcc-b85b-de1d9d9cb696
Testing
Added test coverage for sorting by llm_span_count in TracesResourceTest:
Added test coverage for filtering by llm_span_count in TracesResourceTest:
Added test coverage for average llm_span_count in StatsMapperTest