Agent improvements: Adopt system instructions and allow multiple command executions #717

Draft · wants to merge 29 commits into main

Conversation

DonggeLiu (Collaborator)

  1. Allow passing system instructions to the LLM (see the first sketch after this list)
  2. Allow executing multiple bash commands in one response (see the second sketch after this list)
  3. Prompt fixes
  4. Minor corrections and bug fixes
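
For illustration, a minimal sketch of what feature 1 could look like with the Vertex AI Python SDK; the model name, instruction text, and prompt are placeholders rather than this PR's actual values:

```python
from vertexai.generative_models import GenerativeModel

# Placeholder system instruction; the PR's real prompt text differs.
SYSTEM_INSTRUCTION = (
    'You are a security engineer writing fuzz targets. '
    'Wrap each bash command you want executed in <bash></bash> tags.'
)

model = GenerativeModel(
    'gemini-1.5-pro-002',  # placeholder model name
    system_instruction=[SYSTEM_INSTRUCTION],
)
chat = model.start_chat()
response = chat.send_message('Write a fuzz target for project xs.')
```

And for feature 2, a sketch of collecting every command from a single response; the <bash> tag name is an assumption about the prompt format, not necessarily what this PR uses:

```python
import re

# Find all <bash>...</bash> blocks so one response can carry
# multiple commands instead of only the first one.
commands = re.findall(r'<bash>(.*?)</bash>', response.text, re.DOTALL)
for command in commands:
    print('Would execute:', command.strip())
```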

@DonggeLiu (Collaborator Author)

DonggeLiu commented Nov 13, 2024

In addition to the new features, this also generated buildable fuzz targets for project xs in local experiments for the first time (IIRC):

2024-11-13 14:33:05 [Trial ID: 01] INFO [logger.info]: ===== ROUND 10 Recompile =====
2024-11-13 14:33:11 [Trial ID: 01] DEBUG [logger.debug]: ROUND 10 compilation time: 0:00:06.169302
2024-11-13 14:33:11 [Trial ID: 01] DEBUG [logger.debug]: ROUND 10 Fuzz target compiles: True
2024-11-13 14:33:12 [Trial ID: 01] DEBUG [logger.debug]: ROUND 10 Final fuzz target binary exists: True
2024-11-13 14:33:13 [Trial ID: 01] DEBUG [logger.debug]: ROUND 10 Final fuzz target function referenced: True

Past reports for comparison:

  1. (non-agent) https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-11-08-709-ochang-mp-comparison/index.html
  2. (agent) https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-11-10-716-dg-comparison/index.html

@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

@DonggeLiu (Collaborator Author)

DonggeLiu commented Nov 13, 2024

Report: https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-11-13-717-dg-comparison/index.html

Seeing many errors like:

File "/usr/local/lib/python3.11/dist-packages/google/api_core/grpc_helpers.py", line 76, in error_remapped_callable
return callable_(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/grpc/_channel.py", line 1181, in __call__
return _end_unary_response_blocking(state, call, False, None)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/grpc/_channel.py", line 1006, in _end_unary_response_blocking
raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Unable to submit request because the input token count is 35103 but model only supports up to 32768. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models"
debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.72.170:443 {grpc_message:"Unable to submit request because the input token count is 35103 but model only supports up to 32768. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models", grpc_status:3, created_time:"2024-11-13T04:04:17.961683542+00:00"}"

This is likely due to the newly added system instructions; I will lower the input size limit accordingly.
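
One way to enforce that limit is to count tokens before sending and trim the oldest prompt content (typically stale build-log output) until the request fits. The helper below is a sketch under that assumption; its name and the 10% trim step are arbitrary choices for illustration:

```python
from vertexai.generative_models import GenerativeModel

MAX_INPUT_TOKENS = 32768  # the limit reported in the error above


def trim_to_token_limit(model: GenerativeModel, prompt: str) -> str:
    """Truncates the prompt from the front until it fits the token limit."""
    while model.count_tokens(prompt).total_tokens > MAX_INPUT_TOKENS:
        # Drop the oldest 10% of the prompt, assumed to be stale log output.
        prompt = prompt[len(prompt) // 10:]
    return prompt
```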

The good news is that we finally got a non-zero build rate on both benchmarks from xs:
[image: report screenshot showing non-zero build rates for the xs benchmarks]

@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg1 -ag

@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

3 similar comments
@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

@DonggeLiu (Collaborator Author)

DonggeLiu commented Nov 15, 2024

Hi @mihaimaruseac, could you please help me check whether I did something wrong in this PR that could cause the following error when invoking the chat model's send_message()?
I keep getting the following over-long-input error (example) in cloud experiments, but I cannot reproduce it locally:

Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/google/api_core/grpc_helpers.py", line 76, in error_remapped_callable
return callable_(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/grpc/_channel.py", line 1181, in __call__
return _end_unary_response_blocking(state, call, False, None)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/grpc/_channel.py", line 1006, in _end_unary_response_blocking
raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Unable to submit request because the input token count is 58602 but model only supports up to 32768. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models"
debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.188.234:443 {grpc_message:"Unable to submit request because the input token count is 58602 but model only supports up to 32768. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models", grpc_status:3, created_time:"2024-11-14T11:52:11.683245084+00:00"}"
>

IIUC, the message says the input has 58602 tokens, exceeding the 32768 token limit.
I guess this relates to using system_instruction because I did not see this error before, but I find it confusing for two reasons:

  1. Re-running the same steps locally does not trigger this error. The token count there is ~57719 (close to 58602, which makes sense) and above the 32768 limit, yet I got a valid response from the LLM and no error.
  2. I cannot find where the 32768 limit comes from. I noticed several places mentioning this number (e.g., context caching), but since we do not use those features explicitly, I reckon they are not relevant to us?

Thanks!

@DonggeLiu (Collaborator Author)

DonggeLiu commented Nov 15, 2024

Meanwhile, I will attempt to use the built-in Tool API to replace the current manual XML parsing.
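
For reference, a rough sketch of what that might look like with the Vertex AI SDK's function calling; the run_bash_command declaration and its schema are hypothetical stand-ins for whatever tool definition the new branch ends up using:

```python
from vertexai.generative_models import (
    FunctionDeclaration,
    GenerativeModel,
    Tool,
)

# Hypothetical tool declaration; the real schema would mirror the
# fields currently extracted from the XML-tagged responses.
run_bash = FunctionDeclaration(
    name='run_bash_command',
    description='Execute one bash command in the build container.',
    parameters={
        'type': 'object',
        'properties': {
            'command': {'type': 'string', 'description': 'Bash command to run.'},
        },
        'required': ['command'],
    },
)

model = GenerativeModel(
    'gemini-1.5-pro-002',  # placeholder model name
    tools=[Tool(function_declarations=[run_bash])],
)
chat = model.start_chat()
response = chat.send_message('Recompile the fuzz target and report errors.')

# Structured output arrives as function calls instead of free text,
# so no XML parsing is needed.
for fc in response.candidates[0].function_calls:
    print(fc.name, dict(fc.args))
```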

@mihaimaruseac (Member)

I think this is because the error log from the compilation is too long?

The model itself has a limit of tokens, afaik, and we might be hitting that?

@DonggeLiu (Collaborator Author)

I think this is because the error log from the compilation is too long?

Hmm... If that's the case, shouldn't we be able to reproduce the error locally?
When running it locally, I did not get any error even when the input had 57719 tokens, which is much larger than the 32768 limit mentioned in the error message from the cloud experiment.

The model itself has a limit of tokens, afaik, and we might be hitting that?

Please correct me if I am wrong, but I thought gemini-1.5-pro's input token limit is much larger:
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-1.5-pro:~:text=Max%20input%20tokens%3A%202%2C097%2C152

@mihaimaruseac (Member)

Oh, I was wrong. I'll think about it, but currently I don't have an idea, sorry

@DonggeLiu (Collaborator Author)

Oh, I was wrong. I'll think about it, but currently I don't have an idea, sorry

Thanks!
If you have some theories later, I am happy to try them out in this PR : )

@oliverchang (Collaborator)

I believe the token limit issues are because of this:

https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#ml_processing

[image: table of ML processing regions and supported context lengths from the Vertex AI docs]

Not all regions support the longer context for gemini-1.5-pro-002.
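
If that is the cause, one possible mitigation is to pin the SDK to a region that does support the longer context. A minimal sketch, with a placeholder project ID and a region that would need to be verified against the table above:

```python
import vertexai

# Placeholder project ID; choose a region whose gemini-1.5-pro-002
# context limit covers our prompts (see the linked docs).
vertexai.init(project='my-gcp-project', location='us-central1')
```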

@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

1 similar comment
@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

DonggeLiu marked this pull request as draft on November 22, 2024, 02:35.
@DonggeLiu (Collaborator Author)

Converting this to a draft because the Tool API is better at generating outputs with the required type/structure.
I will manually cherry-pick commits from this PR (#717) and #718 into a new branch.
