Agent improvements: Adopt system instructions and allow multiple command executions #717

Open
wants to merge 28 commits into base: main

DonggeLiu
Collaborator

  1. Allow passing system instructions to LLM
  2. Allow executing multiple bash commands in one response
  3. Prompt fixes
  4. Minor corrections and bug fixes
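
To illustrate item 2, here is a minimal sketch of pulling several bash commands out of one LLM response and running them in order. It is not the code in this PR: the `<bash>...</bash>` wrapper, the function names, and the result layout are all assumptions made for the example.

```python
import re
import subprocess

# Hypothetical convention: the model wraps each command in <bash>...</bash>.
BASH_BLOCK = re.compile(r"<bash>(.*?)</bash>", re.DOTALL)


def extract_commands(response: str) -> list[str]:
    """Return every non-empty command found in the response, in order."""
    return [m.strip() for m in BASH_BLOCK.findall(response) if m.strip()]


def run_commands(response: str, timeout: int = 30) -> list[dict]:
    """Run each extracted command and collect its result for the next prompt."""
    results = []
    for command in extract_commands(response):
        proc = subprocess.run(
            ["bash", "-c", command],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,  # a failing command is reported, not raised
        )
        results.append({
            "command": command,
            "returncode": proc.returncode,
            "stdout": proc.stdout,
            "stderr": proc.stderr,
        })
    return results
```

Each command runs in its own shell, so state such as `cd` does not carry over between commands; a real agent would likely need a persistent session or a container for that, and would also have to handle `subprocess.TimeoutExpired`.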

@DonggeLiu
Collaborator Author

DonggeLiu commented Nov 13, 2024

In addition to the new features, this also generated buildable fuzz targets for project xs in local experiments for the first time (IIRC):

2024-11-13 14:33:05 [Trial ID: 01] INFO [logger.info]: ===== ROUND 10 Recompile =====
2024-11-13 14:33:11 [Trial ID: 01] DEBUG [logger.debug]: ROUND 10 compilation time: 0:00:06.169302
2024-11-13 14:33:11 [Trial ID: 01] DEBUG [logger.debug]: ROUND 10 Fuzz target compiles: True
2024-11-13 14:33:12 [Trial ID: 01] DEBUG [logger.debug]: ROUND 10 Final fuzz target binary exists: True
2024-11-13 14:33:13 [Trial ID: 01] DEBUG [logger.debug]: ROUND 10 Final fuzz target function referenced: True

Past reports, for comparison:

  1. (non-agent) https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-11-08-709-ochang-mp-comparison/index.html
  2. (agent) https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-11-10-716-dg-comparison/index.html

@DonggeLiu
Collaborator Author

/gcbrun exp -n dg -ag

@DonggeLiu
Collaborator Author

DonggeLiu commented Nov 13, 2024

Report: https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-11-13-717-dg-comparison/index.html

Seeing many errors like:

  File "/usr/local/lib/python3.11/dist-packages/google/api_core/grpc_helpers.py", line 76, in error_remapped_callable
    return callable_(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/grpc/_channel.py", line 1181, in __call__
    return _end_unary_response_blocking(state, call, False, None)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/grpc/_channel.py", line 1006, in _end_unary_response_blocking
    raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "Unable to submit request because the input token count is 35103 but model only supports up to 32768. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models"
    debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.72.170:443 {grpc_message:"Unable to submit request because the input token count is 35103 but model only supports up to 32768. Reduce the input token count and try again. You can also use the CountTokens API to calculate prompt token count and billable characters. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/learn/models", grpc_status:3, created_time:"2024-11-13T04:04:17.961683542+00:00"}"

This is likely due to the newly added system instructions; I will lower the input size limit accordingly.
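
For reference, the request in the error above is 35103 - 32768 = 2335 tokens over the limit. A rough sketch of how the prompt budget could account for the system instruction follows. It is not what this PR implements: `estimate_tokens` is a crude 4-characters-per-token stand-in for a real counter such as the CountTokens API, and `fit_prompt`, `margin`, and the keep-the-prefix policy are assumptions made for illustration.

```python
from typing import Callable

MODEL_INPUT_LIMIT = 32768  # limit reported in the error message above


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: assume about 4 characters per token."""
    return (len(text) + 3) // 4


def fit_prompt(
    system_instruction: str,
    prompt: str,
    limit: int = MODEL_INPUT_LIMIT,
    count: Callable[[str], int] = estimate_tokens,
    margin: int = 256,  # hypothetical safety margin for counting error
) -> str:
    """Truncate the prompt so that system instruction + prompt fit the limit."""
    budget = limit - margin - count(system_instruction)
    if budget <= 0:
        raise ValueError("System instruction alone exceeds the input limit.")
    if count(prompt) <= budget:
        return prompt

    # Binary search for the longest prefix that fits.
    # Assumes count() never decreases as the prefix grows.
    lo, hi = 0, len(prompt)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if count(prompt[:mid]) <= budget:
            lo = mid
        else:
            hi = mid - 1
    return prompt[:lo]
```

The key point is that the system instruction is subtracted from the budget before the prompt is measured, so adding or lengthening it no longer pushes requests over the model limit silently.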

The good news is that we finally got a non-zero build rate on both benchmarks from xs:
[image: build rate results for the xs benchmarks]

@DonggeLiu
Collaborator Author

/gcbrun exp -n dg1 -ag

@DonggeLiu
Collaborator Author

/gcbrun exp -n dg -ag

@DonggeLiu
Collaborator Author

/gcbrun exp -n dg -ag

@DonggeLiu
Collaborator Author

/gcbrun exp -n dg -ag

@DonggeLiu
Collaborator Author

/gcbrun exp -n dg -ag
