[Feature Request] Add support for evaluating baseline

Currently, evals can only be run by creating a new experiment. However, in some cases, such as [running backtests on a new version of an agent](https://docs.smith.langchain.com/evaluation/tutorials/backtesting#evaluate-baseline), evals need to be run on an existing experiment that doesn't have evaluation results yet.

Supporting evaluations on a specific existing experiment would allow them to be run against a baseline experiment, which is useful when converting production traces to an experiment.

This can be achieved by allowing an experiment name to be provided in a config file instead of a dataset, target function, experiment prefix, etc.
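To make the proposal concrete, here is a rough sketch of how a config could accept either an existing experiment name or the usual new-experiment fields. The `EvalConfig` class and every field name in it (`experiment`, `dataset`, `target`, `experiment_prefix`, `evaluators`) are hypothetical and only illustrate the idea; they are not this project's actual config schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EvalConfig:
    """Hypothetical config shape; field names are assumptions, not the real schema."""

    evaluators: List[str]
    # Name of an existing (baseline) experiment to evaluate in place.
    experiment: Optional[str] = None
    # Fields used today to create a brand-new experiment.
    dataset: Optional[str] = None
    target: Optional[str] = None
    experiment_prefix: Optional[str] = None

    def mode(self) -> str:
        """Return "existing" or "new", rejecting ambiguous or incomplete configs."""
        new_fields = (self.dataset, self.target, self.experiment_prefix)
        if self.experiment is not None:
            # An experiment name replaces the dataset/target/prefix fields.
            if any(value is not None for value in new_fields):
                raise ValueError(
                    "'experiment' cannot be combined with dataset, target, "
                    "or experiment_prefix"
                )
            return "existing"
        if self.dataset is None or self.target is None:
            raise ValueError(
                "provide either 'experiment' or both 'dataset' and 'target'"
            )
        return "new"


# Evaluate a baseline experiment that has no evaluation results yet.
baseline = EvalConfig(evaluators=["correctness"], experiment="baseline-experiment")
print(baseline.mode())
```

Treating the two modes as mutually exclusive keeps existing configs working unchanged, while a config that only names an experiment would skip the target function and attach evaluator results to the runs already in that experiment.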


[Feature Request] Add support for evaluating baseline #36
