Agentic Optimizer #2489

Open · wants to merge 48 commits into main from dsb/agentic-optimizer

Conversation

dsblank
Contributor

@dsblank dsblank commented Jun 16, 2025


Previously, prompts could only be optimized against a generic LiteLLM model. This PR adds support for optimizing chat prompts on any agent framework.

Here is an example of the changes in the simplest case:

BEFORE:

```python
from typing import Any, Dict
from opik.evaluation.metrics import LevenshteinRatio
from opik.evaluation.metrics.score_result import ScoreResult
from opik_optimizer.datasets import hotpot_300
from opik_optimizer import (
    ChatPrompt,
    FewShotBayesianOptimizer,
)

dataset = hotpot_300()

def levenshtein_ratio(dataset_item: Dict[str, Any], llm_output: str) -> ScoreResult:
    metric = LevenshteinRatio()
    return metric.score(reference=dataset_item["answer"], output=llm_output)

project_name = "optimize-few-shot-bayesian-hotpot"

prompt = ChatPrompt(system="Answer the question.", user="{question}")

optimizer = FewShotBayesianOptimizer(
    model="openai/gpt-4o-mini",
    project_name=project_name,
    min_examples=3,
    max_examples=8,
    n_threads=16,
    seed=42,
)

optimization_result = optimizer.optimize_prompt(
    prompt=prompt,
    dataset=dataset,
    metric=levenshtein_ratio,
    n_trials=10,
    n_samples=150,
)

optimization_result.display()
```
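As a rough illustration of what the `levenshtein_ratio` metric above measures (a string-similarity score where 1.0 means the model output matches the reference exactly), here is a self-contained sketch using the standard library's `difflib`. Note this is an approximation for illustration only, not opik's actual `LevenshteinRatio` implementation:

```python
from difflib import SequenceMatcher

def approx_levenshtein_ratio(reference: str, output: str) -> float:
    """Rough stand-in for a Levenshtein-style ratio: 1.0 means identical.

    difflib's ratio is based on matching blocks rather than true edit
    distance, so treat this only as an illustration of the metric's shape.
    """
    return SequenceMatcher(None, reference, output).ratio()

print(approx_levenshtein_ratio("Paris", "Paris"))   # identical strings -> 1.0
print(approx_levenshtein_ratio("Paris", "London"))  # dissimilar strings -> near 0.0
```

The optimizer maximizes this score over the dataset, so any callable with the `(dataset_item, llm_output) -> score` shape works as a metric.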

AFTER:

```python
from typing import Dict, Any
from opik.evaluation.metrics import LevenshteinRatio
from opik.evaluation.metrics.score_result import ScoreResult
from opik_optimizer.datasets import hotpot_300
from opik_optimizer import (
    OptimizableAgent,
    ChatPrompt,
    FewShotBayesianOptimizer,
    AgentConfig,
)

dataset = hotpot_300()

def levenshtein_ratio(dataset_item: Dict[str, Any], llm_output: str) -> ScoreResult:
    metric = LevenshteinRatio()
    return metric.score(reference=dataset_item["answer"], output=llm_output)

class LiteLLMAgent(OptimizableAgent):
    model = "openai/gpt-4o-mini"
    project_name = "optimize-few-shot-bayesian-hotpot"

agent_config = AgentConfig(
    chat_prompt=ChatPrompt(system="Answer the question.", user="{question}")
)

optimizer = FewShotBayesianOptimizer(
    model="openai/gpt-4o-mini",
    min_examples=3,
    max_examples=8,
    n_threads=16,
    seed=42,
)

optimization_result = optimizer.optimize_agent(
    agent_class=LiteLLMAgent,
    agent_config=agent_config,
    dataset=dataset,
    metric=levenshtein_ratio,
    n_trials=10,
    n_samples=50,
)
optimization_result.display()
```

The main difference is replacing:

```python
project_name = "optimize-few-shot-bayesian-hotpot"

prompt = ChatPrompt(system="Answer the question.", user="{question}")
```

with:

```python
class LiteLLMAgent(OptimizableAgent):
    model = "openai/gpt-4o-mini"
    project_name = "optimize-few-shot-bayesian-hotpot"

agent_config = AgentConfig(
    chat_prompt=ChatPrompt(system="Answer the question.", user="{question}")
)
```

and then passing `agent_class` and `agent_config` to the `optimize_agent()` method instead of passing a `prompt` to `optimize_prompt()`.
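To make the shape of this decoupling concrete, here is a minimal, self-contained sketch of the pattern in plain Python. The names `invoke` and `chat_prompt` here are assumptions for illustration — the real `OptimizableAgent` interface lives in `opik_optimizer` and may differ:

```python
from dataclasses import dataclass

@dataclass
class AgentConfig:
    # Hypothetical stand-in: the real AgentConfig wraps a ChatPrompt.
    chat_prompt: dict

class OptimizableAgent:
    model = "unset"
    project_name = "unset"

    def __init__(self, config: AgentConfig) -> None:
        self.config = config

    def invoke(self, dataset_item: dict) -> str:
        # Subclasses would call their agent framework here; this stub
        # just fills the user prompt template from the dataset item.
        return self.config.chat_prompt["user"].format(**dataset_item)

def optimize_agent(agent_class, agent_config, dataset, metric):
    # The optimizer only needs a constructor and an invoke() hook,
    # which is why any agent framework can be wrapped this way.
    agent = agent_class(agent_config)
    return [metric(item, agent.invoke(item)) for item in dataset]

scores = optimize_agent(
    OptimizableAgent,
    AgentConfig(chat_prompt={"system": "Answer.", "user": "{question}"}),
    [{"question": "Q1", "answer": "Q1"}],
    lambda item, out: float(item["answer"] == out),
)
print(scores)  # -> [1.0]
```

The point of the sketch: because the optimizer holds only an `agent_class` and an `agent_config` rather than a bound prompt, swapping in a LangChain, CrewAI, or custom agent is just a matter of subclassing.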

@dsblank dsblank marked this pull request as ready for review June 18, 2025 22:59
@dsblank dsblank requested review from vincentkoc and a team as code owners June 18, 2025 22:59
@dsblank dsblank force-pushed the dsb/agentic-optimizer branch from 3cba5db to 16a7bac on June 18, 2025 23:05