Currently, evaluations can only be run by creating a new experiment. However, in some cases, such as running backtests on a new version of an agent, it is necessary to run evals on an existing experiment that doesn't have evaluation results yet.
By adding support for evaluations to be run on a specific existing experiment, evaluations can be run against a baseline experiment, which is useful when converting production traces to an experiment.
This can be achieved by allowing an experiment name to be provided in a config file instead of a dataset, target function, experiment prefix, etc.
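As a rough illustration of the proposal, the sketch below shows how a config could accept either an existing experiment name or the fields needed to create a new experiment. Every name here (`EvalConfig`, `experiment_name`, `dataset`, `target`, `experiment_prefix`, `evaluators`, `mode`) is hypothetical and not taken from the project's actual config schema; it only demonstrates the "one or the other" validation the request implies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EvalConfig:
    """Hypothetical config shape; field names are assumptions for illustration."""

    evaluators: List[str]
    # Proposed: name of an existing experiment to evaluate.
    experiment_name: Optional[str] = None
    # Current behaviour: fields used to create a new experiment.
    dataset: Optional[str] = None
    target: Optional[str] = None
    experiment_prefix: Optional[str] = None

    def mode(self) -> str:
        """Return "existing" or "new", rejecting ambiguous or incomplete configs."""
        new_fields = (self.dataset, self.target, self.experiment_prefix)
        if self.experiment_name is not None:
            if any(field is not None for field in new_fields):
                raise ValueError(
                    "experiment_name cannot be combined with dataset, "
                    "target, or experiment_prefix"
                )
            return "existing"
        if self.dataset is None or self.target is None:
            raise ValueError(
                "provide either experiment_name, or both dataset and target"
            )
        return "new"


# Evaluate a baseline experiment that has no evaluation results yet.
backtest = EvalConfig(evaluators=["correctness"], experiment_name="baseline-v1")

# Create a new experiment, as is possible today.
fresh = EvalConfig(
    evaluators=["correctness"],
    dataset="prod-traces",
    target="my_agent.run",
    experiment_prefix="agent-v2",
)
```

Treating the two modes as mutually exclusive keeps the config unambiguous: a run either attaches evaluation results to an existing experiment or creates a new one, never both.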