feat: multi-turn / reasoning loops + parallel tool calling #370
Conversation
Have a look at https://github.com/piotrostr/listen/blob/main/listen-kit/src/reasoning_loop/gemini.rs. I tried something similar to this PR, but the way rig's traits are structured makes it really difficult; a higher-level struct makes things easier. A generic struct with a public method, otherwise it gets very hectic. So instead of impl traits, just have a wider `ReasoningLoop` struct that accepts any model: https://github.com/piotrostr/listen/blob/main/listen-kit/src/reasoning_loop/mod.rs
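The wider-struct idea suggested here can be sketched as a tiny, self-contained toy. Everything below (`Model`, `Response`, the method names) is illustrative and is neither rig's nor listen-kit's actual API:

```rust
// Hypothetical sketch of a reasoning loop as one generic struct over any
// model, instead of per-model trait impls. All names are illustrative.

enum Response {
    ToolCall { name: String, args: String },
    Text(String),
}

trait Model {
    fn complete(&self, history: &[String]) -> Response;
    fn call_tool(&self, name: &str, args: &str) -> String;
}

struct ReasoningLoop<M: Model> {
    model: M,
    max_turns: usize,
}

impl<M: Model> ReasoningLoop<M> {
    fn new(model: M, max_turns: usize) -> Self {
        Self { model, max_turns }
    }

    /// Keep calling the model, feeding each tool result back into the
    /// history, until it answers with text or the turn budget runs out.
    fn run(&self, prompt: &str) -> Option<String> {
        let mut history = vec![prompt.to_string()];
        for _ in 0..self.max_turns {
            match self.model.complete(&history) {
                Response::ToolCall { name, args } => {
                    history.push(self.model.call_tool(&name, &args));
                }
                Response::Text(text) => return Some(text),
            }
        }
        None // budget exhausted without a final answer
    }
}
```

The appeal of this shape is that the loop logic is written once, generically, and any model satisfying a single trait bound can be dropped in, rather than re-implementing the loop per provider.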
@0xMochan what's the status of this PR?
Ready for review! I think there's something wrong with my docstrings; not sure how to fix it.
You might need to reference using … The link should make it evident how to resolve this.
```rust
// We use `UserContent::document` for those who handle it directly!
let messages = self
    .documents
    .iter()
```
Thanks for this PR, can't wait for reasoning loops to be merged!

I tried this PR with AWS Bedrock and there is one subtle breaking change: the previous function `prompt_with_context` merged all documents into a single attachment, while the new version creates a separate document for each. That wouldn't be a problem if models didn't have a hard limit on the number of attachments; for AWS Bedrock in particular, it's 5 (AWS doc).

Not sure about other providers, but I think there are similar restrictions...
Good catch! Presumably this can be handled in the bedrock integration module, since this is usually where provider-specific limitations are handled.
Yeah, great job! I really disliked how `prompt_with_context` worked in general, so I wanted to find a better solution. I'll see if I can make a specific exception for Bedrock, or if you have a code suggestion, I'm all ears!
Since all docs are TXT, can we just fuse them like before, but wrap the result inside `UserContent::document`?

```rust
// ...
let messages = self
    .documents
    .iter()
    .map(|doc| doc.to_string())
    .collect::<Vec<_>>()
    .join(" | ");
let message = UserContent::document(
    messages,
    Some(ContentFormat::String),
    Some(DocumentMediaType::TXT),
);
Some(Message::User {
    content: OneOrMany::one(message),
})
```
I just tried that with Bedrock and it works as expected.
cvauclair left a comment
Excellent work! A couple of comments, but otherwise this is solid!
Carlos contributed to the original spec of multi-tool calling.

Co-authored-by: carlos-verdes <[email protected]>
cvauclair left a comment
Looking good!
@cvauclair is it time to merge 👀
The Deepseek provider needs to be fixed to support the current PR. Its first argument should be … (see rig/rig-core/src/providers/deepseek.rs, lines 342 to 346 in 2d45ad5).
The current error result context is as follows:

```json
[
  {
    "index": 0,
    "id": "call_0_7d51f346-b324-4c7e-a328-d76b40a4cb4a",
    "type": "function",
    "function": {
      "name": "add",
      "arguments": "{\"x\": 15, \"y\": 25}"
    }
  },
  {
    "index": 1,
    "id": "call_1_b8844bf0-4431-4f13-a3c1-ca7644f17d11",
    "type": "function",
    "function": {
      "name": "subtract",
      "arguments": "{\"x\": 100, \"y\": 50}"
    }
  },
  {
    "index": 2,
    "id": "call_2_8e271cdb-4079-4639-bee5-875e4d8a4c2c",
    "type": "function",
    "function": {
      "name": "add",
      "arguments": "{\"x\": 10, \"y\": 10}"
    }
  }
]
```

The second request will use the wrong id:

```json
[
  {
    "content": "40",
    "role": "tool",
    "tool_call_id": "add"
  },
  {
    "content": "50",
    "role": "tool",
    "tool_call_id": "subtract"
  },
  {
    "content": "20",
    "role": "tool",
    "tool_call_id": "add"
  }
]
```

You will get an error:

```json
{"error":{"message":"Duplicate value for 'tool_call_id' of add in message[3]","type":"invalid_request_error","param":null,"code":"invalid_request_error"}}
```

The repaired request looks like this:

```json
[
  {
    "content": "40",
    "role": "tool",
    "tool_call_id": "call_0_eabd8f36-c51b-4c54-8b9c-578c63347442"
  },
  {
    "content": "50",
    "role": "tool",
    "tool_call_id": "call_1_c4396b60-8971-48e3-9f10-507fc872a3bb"
  },
  {
    "content": "20",
    "role": "tool",
    "tool_call_id": "call_2_394f0078-2ed9-42b7-9121-dfd0431e6ac6"
  }
]
```
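The repair described here boils down to echoing the provider-issued call id into each tool-result message instead of the function name, since several calls in a parallel batch can share a name but ids are unique. A minimal sketch with illustrative stand-in types (not rig's actual deepseek provider types):

```rust
// Illustrative stand-ins for the provider's message types.
struct ToolCall {
    id: String,        // e.g. "call_0_7d51f346-..."
    name: String,      // e.g. "add"
    arguments: String, // JSON-encoded arguments
}

struct ToolMessage {
    role: &'static str,
    content: String,
    tool_call_id: String,
}

/// Build the follow-up `role: "tool"` messages. The crucial detail is that
/// `tool_call_id` must echo the id from the model's tool call, not the
/// function name; duplicate names otherwise trigger the
/// "Duplicate value for 'tool_call_id'" error shown above.
fn tool_results_to_messages(calls: &[ToolCall], results: &[String]) -> Vec<ToolMessage> {
    calls
        .iter()
        .zip(results)
        .map(|(call, result)| ToolMessage {
            role: "tool",
            content: result.clone(),
            tool_call_id: call.id.clone(), // NOT call.name
        })
        .collect()
}
```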
This is a series of problems. I'm not sure if I should open a separate issue to discuss them, so I'll write them here for now:

Tested with DeepSeek-V3-0324.

Note: for issue 3, when I use static tools it works fine. I found that Cherry Studio uses a prompt plus a custom parser to implement streaming tool calls (and it works well), but it does not implement non-streaming tool calls. 😂
Will be bringing this up internally so we can sync and move quickly on a further course of action, thank you again! (In the meantime, if you're using …
@byeblack
Here is a simple example that you can easily reproduce: https://github.com/byeblack/rig-multi-turn-demo

update:

Off-topic:
Yes, this is a limitation of our dynamic tool set, and honestly an awkward one. I'll see whether I can transfer dynamic tools to the downstream calling, but a better approach is likely introducing a RAG tool rather than the explicit RAG layer we have. This is part of a large rework of how agents will work via the middleware agent approach (#346).
If you are interested in keeping a closer tab on this, please join our Discord; I'd love to chat more about how we can evolve this!
multi_turn and PromptRequest

This PR expands on the `Prompt` trait by enabling configurable prompt methods. By tweaking `Prompt` and `Chat` to return `IntoFuture` instead of `Future`, agents can implement a specialized version that returns `PromptRequest`, a fluent type-state builder that implements `IntoFuture`, allowing for configurable, type-safe prompting.

Usage
The main new introduction is the `multi_turn` method, which configures the prompt to perform a loop that continuously calls tools until the agent is satisfied. This also ensures the model always returns an agentic response at the end, instead of a raw tool response as in the earlier example.

Using `.with_history` lets you pass in a borrowed vector of messages that multi-turn can append to as needed. This allows for more natural usage patterns as well as better ordering of messages, since multi-turn ensures the prompt, tool calls, and tool results are ordered correctly.
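The fluent-builder-that-implements-`IntoFuture` idea described above can be illustrated with a self-contained toy (this is not rig's actual `PromptRequest`; fields and signatures are simplified assumptions):

```rust
use std::future::{Future, IntoFuture};
use std::pin::Pin;

// Toy version of the PromptRequest pattern: a fluent builder that
// implements IntoFuture, so it can be configured and then `.await`ed.
// Names and fields are illustrative, not rig's real API.
struct PromptRequest {
    prompt: String,
    max_turns: usize,
}

impl PromptRequest {
    fn new(prompt: &str) -> Self {
        Self { prompt: prompt.to_string(), max_turns: 0 }
    }

    /// Configure how many tool-calling turns the loop may take.
    fn multi_turn(mut self, turns: usize) -> Self {
        self.max_turns = turns;
        self
    }
}

impl IntoFuture for PromptRequest {
    type Output = String;
    type IntoFuture = Pin<Box<dyn Future<Output = String>>>;

    fn into_future(self) -> Self::IntoFuture {
        // A real implementation would run the multi-turn loop here.
        Box::pin(async move {
            format!("{} (up to {} tool turns)", self.prompt, self.max_turns)
        })
    }
}
```

With this shape, `PromptRequest::new("...").multi_turn(3)` can be `.await`ed directly in an async context, which is the ergonomic the PR description is after.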
Caveats/Breaking

Because of the existing behavior of the `Prompt` trait, normal usage without `.multi_turn` will still suffer from a lack of parallel tool calls AND return direct tool responses. This is due to things like extractors that rely on that behavior. I presume this may change when we get full middleware tech.

Additionally, the `Chat` trait still uses an owned `chat_history`, which requires clones. This was difficult to change because adding borrow requirements to the trait badly broke every usage of it. Currently, we use the `Chat` trait as an example of creating agent bundles or super agents via a struct of multiple tiny agents that can be used through the same interface. I presume these usage patterns will be replaced by middleware-layer tech as the primary way of customizing multi-agent patterns.

This change is 100% client-side compatible. It does, however, slightly alter the type signatures of the `Prompt` and `Chat` traits, which means anyone with custom implementations of these traits will need adjustments (unless they were using `#[allow(refining_impl_trait)]`).

Another change removes `prompt` from the `CompletionRequest` struct, since actual providers don't differentiate the prompt from the latest message in `chat_history`. This lets us order things properly in `chat_history`, and it also allowed removing the confusing `CompletionRequest::prompt_with_documents`, which has been swapped for `CompletionRequest::normalized_documents`, making document handling more streamlined (documents never get added to `chat_history`, to avoid duplication).

Open Questions
The `PromptRequest` builder typestate encapsulates a lot of configuration for prompting, but the way tool loops are handled is still less than satisfactory. The normal method only exists to appease extractors (while middleware gets put together), so it feels odd for the default behavior to remain unsatisfying.

Should `.multi_turn(1)` be the default, with extractors using a special bypass method for `short_circuit` or `raw`?