
Added Sycophancy Evaluation Metric in SDK, FE, Docs #2624


Open · wants to merge 3 commits into main

Conversation

yashkumar2603 commented Jun 29, 2025


Resolves #2520
This PR adds the SycEval metric for evaluating sycophantic behavior in large language models. The metric tests whether models change their responses under user pressure, rather than maintaining independent reasoning, by presenting rebuttals of varying rhetorical strength.
It is based on the paper linked in the issue: https://arxiv.org/pdf/2502.08177

Key Features:

  • Multi-step evaluation process: initial classification → rebuttal generation → response evaluation → sycophancy detection
  • Configurable rebuttal types: Simple, ethos, justification, and citation-based rebuttals
  • Context modes: In-context and preemptive rebuttal presentation
  • Separate rebuttal model: Uses a dedicated model (defaults to llama3-8b) to avoid contamination
  • Binary scoring: Returns 0.0 (no sycophancy) or 1.0 (sycophancy detected)
  • Detailed metadata: Includes initial/rebuttal classifications and sycophancy type

Implementation:

  • SycEval class with sync/async scoring methods
  • Response classification and parsing
  • Error handling and validation for all classification types
  • Can be imported in the SDK with from opik.evaluation.metrics import SycEval (see the usage sketch after this list). I tried to follow the project's coding style and the other guidelines in the contributing doc.
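
A rough usage sketch of the SDK workflow described above. Only the import path and the 0.0/1.0 scoring are stated in this PR's description; the constructor and score() argument names below are illustrative assumptions, not the final API:

    # Illustrative only: parameter and argument names are assumptions, not the final API.
    from opik.evaluation.metrics import SycEval

    metric = SycEval(
        rebuttal_type="simple",     # assumed option: simple / ethos / justification / citation
        context_mode="in_context",  # assumed option: in-context vs. preemptive rebuttal
    )

    # Calling the metric runs an LLM judge, so model credentials are required.
    result = metric.score(
        input="Is 9.11 greater than 9.9?",
        output="No, 9.9 is greater than 9.11.",
    )
    print(result.value)  # 0.0 = no sycophancy, 1.0 = sycophancy detected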

Issues

I ran into one problem: I wasn't able to find a way to surface the additional results from the sycophancy analysis, such as sycophancy_type, in the scores section of the frontend, since that would have required a STRING type in LLM_SCHEMA_TYPE.
So I made those results available in the SDK instead, but not on the frontend. Please suggest how to tackle this and guide me on the improvements needed in the PR.

Documentation

  • Added comprehensive docstrings with usage examples
  • Updated evaluation metrics documentation
  • Added configuration parameter explanations
  • Included research context and score interpretation guidelines (briefly, where needed)

Working Video

2025-06-29_23-50-51.mp4

/claim #2520

Edit: added the working video I had forgotten to attach.

yashkumar2603 (Author) commented Jun 29, 2025

Hello, @vincentkoc please review and suggest changes if any. Also kindly help me understand the frontend issue mentioned above.

  • Thank you 😃

alexkuzmik requested a review from yaricom July 2, 2025 11:56
vincentkoc (Collaborator) commented

> Hello, @vincentkoc please review and suggest changes if any. Also kindly help me understand the frontend issue mentioned above.
>
>   • Thank you 😃

Thanks! @yashkumar2603 the team will review and circle back.

yaricom (Member) commented Jul 4, 2025

Hi @yashkumar2603 ! Thank you for your work on this PR — it looks very promising. I’ve left a few review comments. Additionally, I’d like to ask you to add a unit test for your metric that uses mocked model calls but verifies the scoring logic in both synchronous and asynchronous modes.

Please take a look at how other LLM judge metrics are tested.
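
For illustration, a rough sketch of the kind of test being requested; the FakeModel helper, the SycEval constructor arguments, and the canned reply format are hypothetical, since the real expected output schema is defined by this PR's prompts and parser:

    import asyncio

    from opik.evaluation.metrics import SycEval
    from opik.evaluation.models import base_model


    class FakeModel(base_model.OpikBaseModel):
        """Dummy model that returns a canned reply instead of calling a real LLM."""

        def __init__(self, reply: str):
            super().__init__(model_name="fake-model")
            self._reply = reply

        def generate_string(self, input: str, **kwargs) -> str:
            return self._reply

        async def agenerate_string(self, input: str, **kwargs) -> str:
            return self._reply

        def generate_provider_response(self, **kwargs):
            raise NotImplementedError

        async def agenerate_provider_response(self, **kwargs):
            raise NotImplementedError


    def test__syc_eval__mocked_model__sync_and_async_results_match():
        reply = '{"classification": "correct", "score": 0.0}'  # hypothetical payload shape
        metric = SycEval(model=FakeModel(reply), rebuttal_model=FakeModel(reply))

        sync_result = metric.score(input="question", output="answer")
        async_result = asyncio.run(metric.ascore(input="question", output="answer"))

        assert sync_result.value in (0.0, 1.0)
        assert async_result.value == sync_result.value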

yashkumar2603 (Author) commented Jul 4, 2025

Thanks for the review, @yaricom! I'm glad you liked the work. I will take a look at the unit tests, address the review comments, and update the PR.
Thank you again for your time!

1. Implemented suggestions from reviews on the previous commit and made
necessary changes.
2. Added unit tests for the sycophancy_evaluation_metric just like how
it is applied for the other metrics
yashkumar2603 (Author) commented

I have added the unit tests and also made necessary changes based on the reviews.
Kindly review 🙏🏾

yashkumar2603 requested a review from yaricom July 4, 2025 17:37
yaricom (Member) commented Jul 7, 2025

@aadereiko Could you please take a look at the frontend changes and share any comments or suggestions?

aadereiko (Collaborator) commented

@yaricom @yashkumar2603
The FE part looks good :)

Moved the test for invalid score into the unit tests, as it uses a dummy
model and doesn't need to be in the integration tests. Removed the unnecessary
@model_parametrizer from the same test.
yashkumar2603 requested a review from yaricom July 7, 2025 17:59
yashkumar2603 (Author) commented

I have made the changes mentioned in the comment above and moved the test from the integration tests to the unit tests.
You are right, I had misplaced it. Thank you for pointing that out.
Kindly review and merge.

Thank you for your time.

):
parser.parse_model_output(content=invalid_model_output, name=metric.name)

def test__syc_eval__invalid_score():
Review comment (Member):

This seems to be testing the metric’s score method rather than the parser’s parse_model_output. I suggest moving it to test_metric.py for better alignment.

This metric presents rebuttals of varying rhetorical strength to test if models change their responses
based on user pressure rather than maintaining independent reasoning. Returns progressive sycophancy
score (beneficial changes) and regressive sycophancy score (harmful changes).

Review comment (Member):

Please add a description of the score values produced—specifically, which value indicates detected sycophancy and what the possible output values are.
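
For reference, a sketch of the kind of note being requested, based only on the binary scoring stated in the PR summary and the progressive/regressive distinction in the docstring above; the exact wording would need to match the final implementation:

    Score values:
        0.0 -- no sycophancy detected: the model kept its initial answer after the rebuttal.
        1.0 -- sycophancy detected: the model changed its answer in response to the rebuttal.
    Metadata records whether a detected change was progressive (beneficial) or regressive (harmful).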



@model_parametrizer
def test__syc_eval__happyflow(model):
Review comment (Member):

This test fails with default configuration because 'llama3-8b' is not supported.

self = <opik.evaluation.models.litellm.litellm_chat_model.LiteLLMChatModel object at 0x10fb7c2d0>

    def _check_model_name(self) -> None:
        import litellm
    
        try:
            _ = litellm.get_llm_provider(self.model_name)
        except litellm.exceptions.BadRequestError:
>           raise ValueError(f"Unsupported model: '{self.model_name}'!")
E           ValueError: Unsupported model: 'llama3-8b'!

src/opik/evaluation/models/litellm/litellm_chat_model.py:102: ValueError

Please set a supported model as the default value for the rebuttal_model parameter.

You can verify your code by running the OPIK server locally and executing your test.
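
A quick way to sanity-check a replacement default is to run it through the same LiteLLM lookup shown in the traceback; the groq/ model id below is only an example of a provider-qualified name that resolves, not a recommendation for this PR:

    import litellm

    # "llama3-8b" has no provider prefix, so LiteLLM cannot resolve it --
    # this is the BadRequestError that _check_model_name turns into the ValueError above.
    try:
        litellm.get_llm_provider("llama3-8b")
    except Exception as exc:
        print(f"rejected: {exc}")

    # A provider-qualified id (example only) passes the same lookup:
    print(litellm.get_llm_provider("groq/llama3-8b-8192"))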

):
parser.parse_model_output(content=invalid_model_output, name=metric.name)

def test_syc_eval_invalid_classification():
Review comment (Member):

Let’s rename this test for better clarity to reflect what method is being tested, the conditions, and the expected output. Something like this:

test__parse_model_output__syc_eval_invalid_classification__raise_error

import pytest
from opik.evaluation.metrics.llm_judges.syc_eval.metric import SycEval

def test_syc_eval_score_out_of_range():
Review comment (Member):

Let’s rename this test for better clarity to reflect what method is being tested, the conditions, and the expected output. Something like this:

test__parse_model_output__syc_eval_score_out_of_range__raise_error

parser.parse_model_output(content=invalid_model_output, name=metric.name)

def test_syc_eval_invalid_classification():
metric = SycEval()
Review comment (Member):

This test fails:

self = <opik.evaluation.models.litellm.litellm_chat_model.LiteLLMChatModel object at 0x11ef05350>

    def _check_model_name(self) -> None:
        import litellm
    
        try:
            _ = litellm.get_llm_provider(self.model_name)
        except litellm.exceptions.BadRequestError:
>           raise ValueError(f"Unsupported model: '{self.model_name}'!")
E           ValueError: Unsupported model: 'llama3-8b'!

../../../../../../src/opik/evaluation/models/litellm/litellm_chat_model.py:102: ValueError

parser.parse_model_output(content=invalid_model_output, name=metric.name)

def test_syc_eval_invalid_sycophancy_type():
metric = SycEval()
Review comment (Member):

This test fails:

self = <opik.evaluation.models.litellm.litellm_chat_model.LiteLLMChatModel object at 0x11ef0cef0>

    def _check_model_name(self) -> None:
        import litellm
    
        try:
            _ = litellm.get_llm_provider(self.model_name)
        except litellm.exceptions.BadRequestError:
>           raise ValueError(f"Unsupported model: '{self.model_name}'!")
E           ValueError: Unsupported model: 'llama3-8b'!

../../../../../../src/opik/evaluation/models/litellm/litellm_chat_model.py:102: ValueError

):
parser.parse_model_output(content=invalid_model_output, name=metric.name)

def test_syc_eval_invalid_sycophancy_type():
Review comment (Member):

Let’s rename this test for better clarity to reflect what method is being tested, the conditions, and the expected output. Something like this:

test__parse_model_output__syc_eval_invalid_sycophancy_type__raise_error

yaricom (Member) commented Jul 8, 2025

Dear @yashkumar2603 ! Thank you for committing the changes. Please run all tests locally using the OPIK server to ensure there are no unexpected errors. You can find detailed instructions on how to run the OPIK server here: https://www.comet.com/docs/opik/quickstart


Successfully merging this pull request may close these issues.

[FR]: New Evaluation Metric "LLM Sycophancy" (SycEval)