Trying out Gemini's `Tool` API #718

DonggeLiu · 2024-11-18T04:45:55Z

Trying out the Tool API on a new agent (Repairer) as a PoC.
This is an alternative to system instructions (#717, which had an unreproducible error) to address the malformatted response issue.
If this works well, I will use Tool in Prototyper too.

Repairer focuses on fixing the fuzz target (and build script) when Prototyper encounters a build error. It comes with different prompts (and likely examples later).

DonggeLiu · 2024-11-18T04:46:26Z

/gcbrun exp -n dg -ag

DonggeLiu · 2024-11-18T05:20:20Z

Hi @mihaimaruseac I am unsure what's the best way to let an agent focus on one tool in our use case. Could you please let me know if I missed (or am wrong about) anything?

Given the goal is to generate a buildable fuzz target, the agent has two main tasks:

1. Understand the build error of a given fuzz target, then generate a revised one.
2. Collect information from the build environment (e.g., inspect source code).

Correspondingly, there are two tools (or FunctionCalls):

1. A compile function to build the fuzz target when the LLM generates a new one, and show build errors (if any).
2. A Bash function to find or view source code.

I did not separate them into two agents, because I think task 1 needs both tools: In order to understand a build error (e.g., wrong type), the agent needs to cross-check the related source code (e.g., type/function definition).

Also I am unsure if I used the Tool API in the best way.
The API does not support using multiple tools (and passing their function parameter definitions). My mitigation is to define an overall parameter as a struct/object:

function_call: {
  overall_parameter: {
    args: {
      tool1_params: {
        args: {...}
      }
      tool2_params: {
        args: {...}
      }
    }
  }
}

Then I call tool1/tool2 with the params from tool1_params/tool2_params. I can also define which params are mandatory.
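A minimal sketch of the nested-struct idea described above, assuming an OpenAPI-style parameter schema. The names `bash_params`, `compile_params`, `command`, `fuzz_target`, `build_script`, and the stand-in handlers are hypothetical and are not taken from this PR; the point is only that each nested object can carry its own `required` list.

```python
from typing import Any, Callable

# Hypothetical schema: one function whose single parameter object nests
# one sub-object per tool. Each sub-object declares its own required fields.
NESTED_PARAMETERS: dict[str, Any] = {
    "type": "object",
    "properties": {
        "bash_params": {
            "type": "object",
            "properties": {"command": {"type": "string"}},
            "required": ["command"],
        },
        "compile_params": {
            "type": "object",
            "properties": {
                "fuzz_target": {"type": "string"},
                "build_script": {"type": "string"},
            },
            "required": ["fuzz_target"],
        },
    },
}


def run_bash(command: str) -> str:
    """Stand-in for the real Bash tool (assumption: it takes one command)."""
    return f"ran: {command}"


def run_compile(fuzz_target: str, build_script: str = "") -> str:
    """Stand-in for the real compile tool."""
    return f"compiled {len(fuzz_target)} bytes"


HANDLERS: dict[str, Callable[..., str]] = {
    "bash_params": run_bash,
    "compile_params": run_compile,
}


def dispatch(args: dict[str, Any]) -> dict[str, str]:
    """Calls each tool whose nested params are present in the function call."""
    results: dict[str, str] = {}
    for key, handler in HANDLERS.items():
        params = args.get(key)
        if params is None:
            continue  # The model did not ask for this tool.
        required = NESTED_PARAMETERS["properties"][key].get("required", [])
        missing = [name for name in required if name not in params]
        if missing:
            results[key] = f"error: missing required params {missing}"
            continue
        results[key] = handler(**params)
    return results
```

For example, `dispatch({"bash_params": {"command": "ls /src"}})` only invokes the Bash stand-in, while a `compile_params` object without `fuzz_target` yields an error string that could be fed back to the model.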

I thought about a flattened struct:

function_call: {
  args:{
    tool1_param_1
    tool1_param_2
    tool2_param_1
    tool2_param_2
  }
}

But the LLM can mix up or omit some params, and I cannot define which params are mandatory because I don't know in advance which tool it will use.

Thanks!

DonggeLiu · 2024-11-19T02:53:26Z

/gcbrun exp -n dg -ag

DonggeLiu · 2024-11-19T04:11:06Z

/gcbrun exp -n dg -ag

DonggeLiu · 2024-11-19T05:09:00Z

Apparently the Tool API also hits the same input token limit error (32768).
https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-11-19-718-dg-comparison/sample/output-ada-url-ada_can_parse_with_base/09.html

DonggeLiu · 2024-11-19T05:38:12Z

/gcbrun exp -n dg -ag

mihaimaruseac · 2024-11-19T19:17:36Z

For the agents part I think it all makes sense.

On the Tool API, from the description it looks like it should support multiple functions. Is it that the agent gets confused about which parameters belong to which function? Or the API actually doesn't support multiple functions, although documentation says it does?

DonggeLiu · 2024-11-19T23:47:00Z

Thanks @mihaimaruseac

> Or the API actually doesn't support multiple functions, although documentation says it does?

Yep, it's a limitation of the API, even though the description and definition suggest otherwise.

I recall I discovered this from an exception thrown by vertex when I defined 2 tools.

mihaimaruseac · 2024-11-19T23:52:14Z

Yeah, in this case the nested structs work

oliverchang · 2024-11-22T04:59:08Z

> This is an alternative to system instructions (#717, which had an unreproducible error) to address the malformatted response issue.

System instructions and tools serve different purposes right?

There's still value to having system instructions for describing the general task (I assume this is better than just us prepending this in our first prompt)?

oliverchang · 2024-11-22T05:03:55Z

@DonggeLiu can you explain the limitations here a bit more?

We can only have 1 tool... but each tool supports multiple functions right? Can't we still achieve what we want with this?

i.e.

tool {
  function_declarations [ bash, compile ]
}

(from https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling)
?
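To make the suggestion above concrete, here is a rough sketch of one tool that declares two functions, written as plain dictionaries so it runs without any SDK. As far as I understand the linked documentation, the Vertex AI Python SDK expresses the same shape with `Tool(function_declarations=[...])` and `FunctionDeclaration` objects. The parameter names (`command`, `fuzz_target`) and the `validate_call` helper are assumptions for illustration, not code from this PR.

```python
from typing import Any

# One tool, two function declarations. Each function keeps its own
# "required" list, so no nesting is needed to mark mandatory params.
TOOL: dict[str, Any] = {
    "function_declarations": [
        {
            "name": "bash",
            "description": "Run a shell command to find or view source code.",
            "parameters": {
                "type": "object",
                "properties": {"command": {"type": "string"}},
                "required": ["command"],
            },
        },
        {
            "name": "compile",
            "description": "Build the fuzz target and report build errors.",
            "parameters": {
                "type": "object",
                "properties": {"fuzz_target": {"type": "string"}},
                "required": ["fuzz_target"],
            },
        },
    ]
}


def validate_call(name: str, args: dict[str, Any]) -> list[str]:
    """Returns a list of problems with a function call; empty means OK."""
    for declaration in TOOL["function_declarations"]:
        if declaration["name"] == name:
            required = declaration["parameters"].get("required", [])
            return [
                f"missing required parameter: {param}"
                for param in required
                if param not in args
            ]
    return [f"unknown function: {name}"]
```

With this layout the model's response names exactly one function per call, so the caller can dispatch on the function name instead of inspecting which nested parameter object was filled in.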

DonggeLiu · 2024-11-23T21:22:49Z

> System instructions and tools serve different purposes right?
> There's still value to having system instructions for describing the general task (I assume this is better than just us prepending this in our first prompt)?

Yep, they have different purposes, and system instructions can be used to describe general tasks.

When I created #717 and this PR, the main goal was to ensure the LLM responds in a predefined format (XML-style tags) so that we can parse the tool commands (e.g., bash, compile). The instruction content in #717 for this purpose is no longer needed because this PR shows that the Tool API already ensures all LLM responses are well-formed JSON.

I plan to cherry-pick some commits from #717 and this PR so that we use system instructions with new content, focusing on:

- the overall goal
- some guidelines for using tools.
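For context on the XML-style tag format mentioned above, the sketch below shows roughly what parsing such free-text responses involves, and why a malformed response is a problem. The tag names `<bash>` and `<compile>` are assumed from the tool names in this thread, and this is not the project's actual parser.

```python
import re

# Matches <bash>...</bash> or <compile>...</compile>; the backreference \1
# requires the closing tag to match the opening tag.
TAG_PATTERN = re.compile(r"<(bash|compile)>(.*?)</\1>", re.DOTALL)


def parse_tagged_commands(response: str) -> list[tuple[str, str]]:
    """Extracts (tool, body) pairs from a free-text LLM response."""
    return [(tag, body.strip()) for tag, body in TAG_PATTERN.findall(response)]
```

A response with an unclosed or misspelled tag produces no match at all, so the caller cannot tell which tool was intended. A structured function call avoids that failure mode because the tool name and arguments arrive as separate fields.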

DonggeLiu · 2024-11-23T21:34:38Z

https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/function-calling

Thanks, maybe I defined multiple tools instead of multiple functions in one tool and misremembered?
I will re-implement the function declaration as in the doc and verify it here.

In hindsight, I should have committed and pushed the old code so that the code and its error were documented in this PR for us to double-check.

DonggeLiu added 10 commits November 16, 2024 09:27

Copied from prototyper

ea77426

Update aiplatform API to avoid escape error in responses

9b077ae

A PoC repairer

dcda9b0

Set tools for GenerativeModel

da43d0c

LLM response can be GenerationResponse type when using tools

ffd6d80

Use Repairer in prototyper

eb8c5c4

Support having function-under-test in priming

08fcc75

A PoC template builder for Repairer

2601ea9

better usage of Tool

5dbfd96

lint

beaac01

DonggeLiu marked this pull request as draft November 18, 2024 04:46

DonggeLiu requested a review from mihaimaruseac November 18, 2024 04:52

Build agent base image in cloud build

1bf44f3

DonggeLiu added 3 commits November 19, 2024 14:49

Retry on TooManyRequests error

f10733c

Minor bug fix

1016109

Minor prompt and comment changes

7eab405

DonggeLiu mentioned this pull request Nov 19, 2024

Remove northamerica-northeast1 from Vertex AI regions #719

Merged

DonggeLiu force-pushed the fuzz-target-repairer branch from 7eab405 to 7bcbe5a Compare November 19, 2024 05:37

lint

5a775f6

DonggeLiu force-pushed the fuzz-target-repairer branch from 7bcbe5a to 5a775f6 Compare November 19, 2024 09:53

DonggeLiu mentioned this pull request Nov 22, 2024

Agent improvements: Adopt system instructions and allow multiple command executions #717

Draft
