Conversation

0xMochan (Contributor) commented on Apr 1, 2025

multi_turn and PromptRequest

Supersedes #290 and #224.

This PR expands the `Prompt` trait by enabling configurable prompt methods. By tweaking `Prompt` and `Chat` to return `IntoFuture` instead of `Future`, `Agent` can implement a specialized version that returns `PromptRequest`, a fluent type-state builder that implements `IntoFuture`, allowing for configurable, type-safe prompting.

Usage

let agent = client
    .agent(anthropic::CLAUDE_3_5_SONNET)
    .preamble("...")
    .build();

// existing usage still works
agent.prompt("how tall is michael jordan").await?;
agent.chat("how tall is michael jordan", vec![]).await?;

// new usage lets you work with existing chat histories
let mut chat_history = vec![];

agent
    .prompt("how tall is michael jordan")
    .with_history(&mut chat_history)
    .await?;

agent
    .prompt("Calculate 5 - 2 = ?. Describe the result to me.")
    .with_history(&mut chat_history)
    .multi_turn(20)
    .await?;

The main new introduction is the `multi_turn` method, which configures the prompt to run a loop that continuously calls tools until the agent is satisfied. This also ensures the model always returns an agentic response at the end, instead of a raw tool response like in the earlier example.

[Diagram of the multi-turn tool-call loop; image referenced from #290.]

Using `.with_history` lets you pass in a mutable borrow of a vector of messages, allowing multi-turn to append to it as needed. This allows for more natural usage patterns as well as better ordering of messages, since multi-turn ensures the prompt, tool calls, and tool results are ordered correctly.
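As a rough, self-contained sketch of the idea (hypothetical stand-in types, not rig's actual internals): keep calling the model while it replies with tool calls, append each call and its result to the borrowed history, and stop once it replies with plain text or the turn budget runs out.

```rust
// Hypothetical stand-in for a model reply; rig's real message types differ.
enum ModelReply {
    ToolCall { name: String, args: String },
    Text(String),
}

// Fake completion call: answers with a tool call until a tool result is in history.
fn fake_completion(history: &[String]) -> ModelReply {
    if history.iter().any(|m| m.starts_with("tool:")) {
        ModelReply::Text("5 - 2 = 3; the result is three.".into())
    } else {
        ModelReply::ToolCall {
            name: "subtract".into(),
            args: r#"{"x": 5, "y": 2}"#.into(),
        }
    }
}

fn main() {
    let max_turns = 20; // analogous to .multi_turn(20)
    let mut chat_history =
        vec!["user: Calculate 5 - 2 = ?. Describe the result to me.".to_string()];

    for _ in 0..max_turns {
        match fake_completion(&chat_history) {
            ModelReply::ToolCall { name, args } => {
                // The tool call and its result are both appended to the borrowed
                // history, so the ordering stays prompt, then call, then result.
                chat_history.push(format!("assistant: call {name}({args})"));
                chat_history.push(format!("tool: {name} -> 3"));
            }
            ModelReply::Text(answer) => {
                // Agent is satisfied: the final message is an agentic text response.
                chat_history.push(format!("assistant: {answer}"));
                break;
            }
        }
    }

    println!("{chat_history:#?}");
}
```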

Caveats/Breaking

Because of the existing behavior of the `Prompt` trait, normal usage without `.multi_turn` will still suffer from a lack of parallel tool calls AND return direct tool responses. This is because things like extractors rely on that behavior. I presume this may change when we get full middleware tech.

Additionally, the `Chat` trait still takes an owned `chat_history`, which requires clones. This was difficult to change because adding borrow requirements to the trait badly broke every existing usage of it. Currently, we use the `Chat` trait as an example of creating agent bundles or super-agents via a struct of multiple tiny agents that can be used through the same interface. I presume these usage patterns would be replaced by middleware-layer tech as the primary go-to way of customizing multi-agent patterns.

This change is 100% client-side compatible. It does, unfortunately, slightly alter the type signatures of the `Prompt` and `Chat` traits, which means anyone with custom implementations of these traits will need adjustments (unless they were using `#[allow(refining_impl_trait)]`).

Another change removes `prompt` from the `CompletionRequest` struct, since actual providers don't differentiate the prompt from the latest message in `chat_history`. This allows us to order things properly in `chat_history` and also allowed me to remove the confusing `CompletionRequest::prompt_with_documents`. It has been swapped for `CompletionRequest::normalized_documents`, which makes document handling more streamlined (documents never get added to `chat_history`, to avoid duplication).
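As a rough illustration of the new shape (hypothetical stand-in types, not rig's actual structs): the prompt is now simply the last entry of `chat_history`, and documents are exposed through a separate accessor in the spirit of `normalized_documents`, so providers can attach them once per request without ever writing them into the history.

```rust
// Hypothetical stand-in types, not rig's actual structs.
#[derive(Debug)]
struct Msg {
    role: &'static str,
    content: String,
}

#[derive(Debug)]
struct Request {
    // No separate `prompt` field: the prompt is the last message in chat_history.
    chat_history: Vec<Msg>,
    // Documents live here and are never copied into chat_history.
    documents: Vec<String>,
}

impl Request {
    // Rough analogue of `normalized_documents`: fold all documents into a single
    // user message that a provider can attach to the outgoing call.
    fn normalized_documents(&self) -> Option<Msg> {
        if self.documents.is_empty() {
            return None;
        }
        Some(Msg {
            role: "user",
            content: self.documents.join("\n"),
        })
    }
}

fn main() {
    let req = Request {
        chat_history: vec![Msg {
            role: "user",
            content: "how tall is michael jordan".into(),
        }],
        documents: vec!["doc 1 text".into(), "doc 2 text".into()],
    };

    // A provider would typically send the normalized documents plus the history.
    println!("{:?}", req.normalized_documents());
    println!("{:?}", req.chat_history.last());
}
```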

Open Questions

The `PromptRequest` builder typestate is able to encapsulate a lot of configuration for prompting, but the way tool loops are handled is still less than satisfactory. The normal (non-multi-turn) path only exists to appease extractors (while middleware gets put together), so it feels odd for the default behavior to remain less than satisfactory.

Should `.multi_turn(1)` be the default, with extractors using a special bypass method (e.g. `short_circuit` or `raw`)?

piotrostr (Contributor) commented on Apr 3, 2025

Have a look at https://github.com/piotrostr/listen/blob/main/listen-kit/src/reasoning_loop/gemini.rs. I tried something similar to this PR, but the way the rig traits are structured makes it really difficult; a higher-level struct makes things easier: a generic struct with a public `stream` method and a match arm for picking the model. Otherwise it gets very hectic.

So instead of impl traits, just have a wider `ReasoningLoop` struct that accepts any model: https://github.com/piotrostr/listen/blob/main/listen-kit/src/reasoning_loop/mod.rs

cvauclair (Contributor)

@0xMochan what's the status of this PR?

0xMochan marked this pull request as ready for review on April 11, 2025 01:31.
0xMochan requested a review from cvauclair on April 11, 2025 01:31.
0xMochan (Contributor, Author)

> @0xMochan what's the status of this PR?

Ready for review. I think there's something wrong with my docstrings; not sure how to fix it.

joshua-mo-143 (Collaborator)

> Ready for review. I think there's something wrong with my docstrings; not sure how to fix it.

You might need to reference items using the `crate::foo::Bar` format if the struct links don't resolve. The failing link should make it evident how to resolve it.

0xMochan requested a review from joshua-mo-143 on April 16, 2025 18:41.
A Contributor commented on the following diff:

// We use `UserContent::document` for those who handle it directly!
let messages = self
    .documents
    .iter()

Thanks to this PR, can't wait for reasoning loops to be merged!

I tried this PR with AWS Bedrock and there is one subtle breaking change. The previous function `prompt_with_context` merged all documents into a single attachment, while the new version creates a separate document for each. That wouldn't be a problem if models didn't have a hard limit on the number of attachments; for AWS Bedrock in particular, it's 5 (aws doc).
Not sure about other providers, but I think there are similar restrictions...

cvauclair (Contributor):

Good catch! Presumably this can be handled in the Bedrock integration module, since this is usually where provider-specific limitations are handled.

0xMochan (Contributor, Author):

Yea, great job! I really disliked how `prompt_with_context` worked in general, so I wanted to find a better solution. I'll see if I can make a specific exception for Bedrock; or if you have a code suggestion, I'm all ears!

Contributor:

Since all docs are TXT, can we just fuse them like before but wrap the result inside `UserContent::document`?

...
        let messages = self
            .documents
            .iter()
            .map(|doc| doc.to_string())
            .collect::<Vec<_>>()
            .join(" | ");

        let message = UserContent::document(
            messages,
            Some(ContentFormat::String),
            Some(DocumentMediaType::TXT),
        );

        Some(Message::User {
            content: OneOrMany::one(message),
        })

I just tried that with Bedrock and it works as expected.

cvauclair (Contributor) left a review:

Excellent work! Couple of comments, but otherwise this is solid!

0xMochan and others added 2 commits April 17, 2025 15:36
Carlos contributed to the original spec of multi-tool calling.

Co-authored-by: carlos-verdes <[email protected]>
0xMochan requested a review from cvauclair on April 18, 2025 23:48.
cvauclair (Contributor) left a review:

Looking good!

0xMochan (Contributor, Author)

@cvauclair is it time to merge 👀

0xMochan merged commit 2d45ad5 into main on Apr 22, 2025 (5 checks passed).
0xMochan deleted the fix/multiple-tool-calling branch on April 22, 2025 20:38.
The github-actions bot mentioned this pull request on Apr 22, 2025.
byeblack commented on Apr 22, 2025

The Deepseek provider needs to be fixed to support this PR: the first argument below should be `call.id`.

completion::AssistantContent::tool_call(
    &call.function.name,
    &call.function.name,
    call.function.arguments.clone(),
)
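For illustration, the corrected call would presumably pass the provider-supplied call id as the first argument, along these lines (same surrounding context as the snippet above):

```rust
completion::AssistantContent::tool_call(
    &call.id,                        // the tool_call_id the provider expects back
    &call.function.name,
    call.function.arguments.clone(),
)
```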

The context for the current error is as follows. The response:

[
  {
    "index": 0,
    "id": "call_0_7d51f346-b324-4c7e-a328-d76b40a4cb4a",
    "type": "function",
    "function": {
      "name": "add",
      "arguments": "{\"x\": 15, \"y\": 25}"
    }
  },
  {
    "index": 1,
    "id": "call_1_b8844bf0-4431-4f13-a3c1-ca7644f17d11",
    "type": "function",
    "function": {
      "name": "subtract",
      "arguments": "{\"x\": 100, \"y\": 50}"
    }
  },
  {
    "index": 2,
    "id": "call_2_8e271cdb-4079-4639-bee5-875e4d8a4c2c",
    "type": "function",
    "function": {
      "name": "add",
      "arguments": "{\"x\": 10, \"y\": 10}"
    }
  }
]

The second request then uses the wrong IDs (tool names instead of call IDs):

[
  {
    "content": "40",
    "role": "tool",
    "tool_call_id": "add"
  },
  {
    "content": "50",
    "role": "tool",
    "tool_call_id": "subtract"
  },
  {
    "content": "20",
    "role": "tool",
    "tool_call_id": "add"
  }
]

You will get an error:

{"error":{"message":"Duplicate value for 'tool_call_id' of add in message[3]","type":"invalid_request_error","param":null,"code":"invalid_request_error"}}

The repaired request looks like this:

[
  {
    "content": "40",
    "role": "tool",
    "tool_call_id": "call_0_eabd8f36-c51b-4c54-8b9c-578c63347442"
  },
  {
    "content": "50",
    "role": "tool",
    "tool_call_id": "call_1_c4396b60-8971-48e3-9f10-507fc872a3bb"
  },
  {
    "content": "20",
    "role": "tool",
    "tool_call_id": "call_2_394f0078-2ed9-42b7-9121-dfd0431e6ac6"
  }
]

joshua-mo-143 (Collaborator)

> The Deepseek provider needs to be fixed to support this PR: the first argument should be `call.id` [...]

#414

byeblack commented on Apr 23, 2025

This is a series of problems. I'm not sure if I should open a separate issue to discuss them, so I'll write them here for now:

1. Multi-turn may fail when the user uses a non-English language (you will often get results like the following):

<|tool▁calls▁begin|><|tool▁call▁begin|>function<|tool▁sep|>send_message_text
```json
{"message":"xxxxxxx"}
```<|tool▁call▁end|><|tool▁calls▁end|>

2. Dynamic tools cannot switch between multi-turn contexts (when you have dozens of tools, they can't coordinate).
3. Even with multi-turn set up, simple tasks still cannot be completed (I'm considering writing a simple complex example 🤯). The same prompt takes only 2-3 requests to complete in Cherry Studio, but rig can't complete the task; I need to spend time studying it.

Tested with DeepSeek-V3-0324.

Note:
For issue 1, that's just a symptom of missing tools: with dynamic tools, the tool list is currently only sent on the first request. When using static tools, the tool list is always sent.

For issue 3, when I use static tools, it works fine. I found that Cherry Studio uses a prompt plus a custom parser to implement streaming tool calls (and it works well), but does not implement non-streaming tool calls. 😂

joshua-mo-143 (Collaborator) commented on Apr 23, 2025

> This is a series of problems. [...] Tested with DeepSeek-V3-0324

Will be bringing this up internally so we can sync and move quickly on a further course of action. Thank you again!

(In the meantime, if you're using the main branch in your project, you might need to avoid multi-turn for now and use manual turns until a fix can get merged in.)

0xMochan (Contributor, Author) commented on Apr 23, 2025

> This is a series of problems. [...] Tested with DeepSeek-V3-0324 [...]

@byeblack I'm trying to parse through and reproduce the issues described here. Can you make an explicit new issue with steps so I can reproduce?

byeblack commented on Apr 23, 2025

> @byeblack I'm trying to parse through and reproduce the issues described here. Can you make an explicit new issue with steps so I can reproduce?

Here is a simple example that you can easily reproduce: https://github.com/byeblack/rig-multi-turn-demo

Update: by logging, you will find that all follow-up requests have lost the tool list.
I found that Cherry Studio's follow-up mechanism is to follow up only if a tool was called, and otherwise output the content directly. If rig had a mechanism to detect tool calls, I think I would take this approach.

Off-topic:

1. Reasoning loops through extractors are also not practical for me, because I need to manually add tool context to let the LLM know how to better assign tasks.

2. If possible, I also hope there is some means to manually intervene in reasoning loops, for two reasons: one, we can evaluate whether the process is correct; the other, we can add more context. This way, we don't have to waste tokens and customer time. Multiple tools in parallel are good, but the wrong direction will only waste resources.

0xMochan (Contributor, Author)

> By logging, you will find that all follow-up requests have lost the tool list.
> I found that Cherry Studio's follow-up mechanism is to follow up only if a tool was called, and otherwise output the content directly. If rig had a mechanism to detect tool calls, I think I would take this approach.

Yes, this is a limitation of our dynamic tool set, and honestly a real limitation here. I'll see whether I can transfer dynamic tools to the downstream calls, but more likely a better approach is introducing a RAG tool rather than the explicit RAG layer we have. This is part of a larger rework of how agents will work via the middleware agent approach (#346).

> Reasoning loops through extractors are also not practical for me, because I need to manually add tool context to let the LLM know how to better assign tasks.
> If possible, I also hope there is some means to manually intervene in reasoning loops [...]

The reasoning_loop.rs serves as an example that should be useful to build off of. It's not designed to be used directly, hence why it's not in our repo directly and is just an example. I agree! I think a proper reasoning loop needs ways for the user/client to intervene, with code like error recovery and such. This is the beginning of reasoning loops and multi-turn in rig, and I think it'll only get better from here.

If you are interested in keeping a closer tab on this, please join our Discord; I'd love to chat more about how we can evolve this!
