Agent improvements: Adopt system instructions and allow multiple command executions #717

Draft · wants to merge 29 commits into main

Conversation

DonggeLiu (Collaborator)

  1. Allow passing system instructions to the LLM (see the first sketch after this list)
  2. Allow executing multiple bash commands in one response (see the second sketch after this list)
  3. Prompt fixes
  4. Minor corrections and bug fixes
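
For illustration, a minimal sketch of what feature 1 could look like with the Vertex AI Python SDK; the model name, instruction text, and prompt are placeholders rather than this PR's actual values:

```python
from vertexai.generative_models import GenerativeModel

# Placeholder system instruction; the PR's real prompt text differs.
SYSTEM_INSTRUCTION = (
    'You are a security engineer writing fuzz targets. '
    'Wrap each bash command you want executed in <bash></bash> tags.'
)

model = GenerativeModel(
    'gemini-1.5-pro-002',  # placeholder model name
    system_instruction=[SYSTEM_INSTRUCTION],
)
chat = model.start_chat()
response = chat.send_message('Write a fuzz target for project xs.')
```

And for feature 2, a sketch of collecting every command from a single response; the <bash> tag name is an assumption about the prompt format, not necessarily what this PR uses:

```python
import re

# Find all <bash>...</bash> blocks so one response can carry
# multiple commands instead of only the first one.
commands = re.findall(r'<bash>(.*?)</bash>', response.text, re.DOTALL)
for command in commands:
    print('Would execute:', command.strip())
```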

@DonggeLiu (Collaborator Author)

DonggeLiu commented Nov 13, 2024

In addition to the new features, this also generated buildable fuzz targets for project xs in local experiments for the first time (IIRC):

2024-11-13 14:33:05 [Trial ID: 01] INFO [logger.info]: ===== ROUND 10 Recompile =====
2024-11-13 14:33:11 [Trial ID: 01] DEBUG [logger.debug]: ROUND 10 compilation time: 0:00:06.169302
2024-11-13 14:33:11 [Trial ID: 01] DEBUG [logger.debug]: ROUND 10 Fuzz target compiles: True
2024-11-13 14:33:12 [Trial ID: 01] DEBUG [logger.debug]: ROUND 10 Final fuzz target binary exists: True
2024-11-13 14:33:13 [Trial ID: 01] DEBUG [logger.debug]: ROUND 10 Final fuzz target function referenced: True

Past reports for comparison:

  1. (non-agent) https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-11-08-709-ochang-mp-comparison/index.html
  2. (agent) https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-11-10-716-dg-comparison/index.html

@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

@DonggeLiu (Collaborator Author)

DonggeLiu commented Nov 13, 2024

Report: https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-11-13-717-dg-comparison/index.html

Seeing many errors like:

File "/usr/local/lib/python3.11/dist-packages/google/api_core/grpc_helpers.py", line 76, in error_remapped_callable
return callable_(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/grpc/_channel.py", line 1181, in __call__
return _end_unary_response_blocking(state, call, False, None)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/grpc/_channel.py", line 1006, in _end_unary_response_blocking
raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Unable to submit request because the input token count is 35103 but model only supports up to 32768. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models"
debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.72.170:443 {grpc_message:"Unable to submit request because the input token count is 35103 but model only supports up to 32768. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models", grpc_status:3, created_time:"2024-11-13T04:04:17.961683542+00:00"}"

This is likely due to the newly added system instructions; I will lower the input size limit accordingly.
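
One way to enforce that limit is to count tokens before sending and trim the oldest prompt content (typically stale build-log output) until the request fits. The helper below is a sketch under that assumption; its name and the 10% trim step are arbitrary choices for illustration:

```python
from vertexai.generative_models import GenerativeModel

MAX_INPUT_TOKENS = 32768  # the limit reported in the error above


def trim_to_token_limit(model: GenerativeModel, prompt: str) -> str:
    """Truncates the prompt from the front until it fits the token limit."""
    while model.count_tokens(prompt).total_tokens > MAX_INPUT_TOKENS:
        # Drop the oldest 10% of the prompt, assumed to be stale log output.
        prompt = prompt[len(prompt) // 10:]
    return prompt
```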

The good news is that we finally got a non-zero build rate on both benchmarks from xs:
[image: report screenshot showing non-zero build rates for the xs benchmarks]

@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg1 -ag

@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

3 similar comments
@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

@DonggeLiu (Collaborator Author)

DonggeLiu commented Nov 15, 2024

Hi @mihaimaruseac, could you please help me check whether I did something wrong in this PR that could cause the following error when invoking the chat model's send_message()?
I keep getting the following over-long-input error (example) in cloud experiments, but I cannot reproduce it locally:

Traceback (most recent call last):
File "/usr/local/lib/python3.11/dist-packages/google/api_core/grpc_helpers.py", line 76, in error_remapped_callable
return callable_(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/grpc/_channel.py", line 1181, in __call__
return _end_unary_response_blocking(state, call, False, None)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/grpc/_channel.py", line 1006, in _end_unary_response_blocking
raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.INVALID_ARGUMENT
details = "Unable to submit request because the input token count is 58602 but model only supports up to 32768. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models"
debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.188.234:443 {grpc_message:"Unable to submit request because the input token count is 58602 but model only supports up to 32768. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models", grpc_status:3, created_time:"2024-11-14T11:52:11.683245084+00:00"}"
>

IIUC, the message says the input has 58602 tokens, exceeding the 32768 token limit.
I guess this relates to using system_instruction because I did not see this error before, but I find it confusing for two reasons:

  1. Re-running the same steps locally does not trigger this error. The token count there is ~57719 (close to 58602, which makes sense) and above the 32768 limit, yet I got a valid response from the LLM and no error.
  2. I cannot find where the 32768 limit comes from. I noticed several places mentioning this number (e.g., context caching), but since we do not use those features explicitly, I reckon they are not relevant to us?

Thanks!

@DonggeLiu (Collaborator Author)

DonggeLiu commented Nov 15, 2024

Meanwhile, I will attempt to use the built-in Tool API to replace the current manual XML parsing.
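
For reference, a rough sketch of what that might look like with the Vertex AI SDK's function calling; the run_bash_command declaration and its schema are hypothetical stand-ins for whatever tool definition the new branch ends up using:

```python
from vertexai.generative_models import (
    FunctionDeclaration,
    GenerativeModel,
    Tool,
)

# Hypothetical tool declaration; the real schema would mirror the
# fields currently extracted from the XML-tagged responses.
run_bash = FunctionDeclaration(
    name='run_bash_command',
    description='Execute one bash command in the build container.',
    parameters={
        'type': 'object',
        'properties': {
            'command': {'type': 'string', 'description': 'Bash command to run.'},
        },
        'required': ['command'],
    },
)

model = GenerativeModel(
    'gemini-1.5-pro-002',  # placeholder model name
    tools=[Tool(function_declarations=[run_bash])],
)
chat = model.start_chat()
response = chat.send_message('Recompile the fuzz target and report errors.')

# Structured output arrives as function calls instead of free text,
# so no XML parsing is needed.
for fc in response.candidates[0].function_calls:
    print(fc.name, dict(fc.args))
```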

@mihaimaruseac (Member)

I think this is because the error log from the compilation is too long?

The model itself has a limit of tokens, afaik, and we might be hitting that?

@DonggeLiu (Collaborator Author)

I think this is because the error log from the compilation is too long?

Hmm... If that's the case, shouldn't we be able to reproduce the error locally?
When running it locally, I did not get any error even when the input had 57719 tokens, which is much larger than the 32768 limit mentioned in the error message from the cloud experiment.

The model itself has a limit of tokens, afaik, and we might be hitting that?

Please correct me if I am wrong, but I thought gemini-1.5-pro's input token limit is much larger:
https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models#gemini-1.5-pro:~:text=Max%20input%20tokens%3A%202%2C097%2C152

@mihaimaruseac (Member)

Oh, I was wrong. I'll think about it, but currently I don't have an idea, sorry

@DonggeLiu (Collaborator Author)

Oh, I was wrong. I'll think about it, but currently I don't have an idea, sorry

Thanks!
If you have some theories later, I am happy to try them out in this PR : )

@oliverchang (Collaborator)

I believe the token limit issues are because of this:

https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations#ml_processing

[image: table of ML processing regions and supported context lengths from the Vertex AI docs]

Not all regions support the longer context for gemini-1.5-pro-002.
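
If that is the cause, one possible mitigation is to pin the SDK to a region that does support the longer context. A minimal sketch, with a placeholder project ID and a region that would need to be verified against the table above:

```python
import vertexai

# Placeholder project ID; choose a region whose gemini-1.5-pro-002
# context limit covers our prompts (see the linked docs).
vertexai.init(project='my-gcp-project', location='us-central1')
```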

@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

1 similar comment
@DonggeLiu (Collaborator Author)

/gcbrun exp -n dg -ag

DonggeLiu marked this pull request as draft on November 22, 2024, 02:35.
@DonggeLiu (Collaborator Author)

Converting this to a draft because the Tool API is better at generating outputs with the required type/structure.
I will manually cherry-pick commits from this PR (#717) and #718 into a new branch.
