You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+26-6Lines changed: 26 additions & 6 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -1,7 +1,13 @@
1
1
2
-
# LLM Attacker - AWS
2
+
# MIPSEval
3
3
4
-
LLM Attacker is a modular framework for simulating and evaluating the behavior of Large Language Models (LLMs) in adversarial or structured multi-turn conversational scenarios. It supports both OpenAI-hosted models and locally hosted models.
4
+
Multi-turn Injection Planning System for LLM Evaluation
5
+
6
+
MIPSEval is a modular framework for simulating and evaluating the behavior of Large Language Models (LLMs) in adversarial or structured multi-turn conversational scenarios. It supports both OpenAI-hosted models and locally hosted models.
7
+
8
+
MIPSEval uses LLMs to design a conversation strategy as well as execute it, making it fully automated. The strategy can further be adapted by the LLM, based on the ongoing conversation. The successful strategies are saved so that they can be automatically run multiple times to check if they are common pitfalls for the LLM being tested.
Default OpenAI models used to run MIPSEval are gpt-4o for Planner and gpt-4o-mini for executioner. This can be changed in ```setup.py``` for executioner and ```llm_planner.py``` for planner (in get_step_for_evaluator function). Testing was done with the default models used.
0 commit comments