test-to-harness: initial set up #511

Merged (13 commits) on Aug 2, 2024

Conversation

Collaborator

@DavidKorczynski DavidKorczynski commented Jul 26, 2024

Ref: #494

Some more comments on this PR in #511 (comment)

@DavidKorczynski
Collaborator Author

Will add a benchmark corpus before making this ready for review

Signed-off-by: David Korczynski <[email protected]>
Signed-off-by: David Korczynski <[email protected]>
Signed-off-by: David Korczynski <[email protected]>
Signed-off-by: David Korczynski <[email protected]>
Signed-off-by: David Korczynski <[email protected]>
@DavidKorczynski
Collaborator Author

/gcbrun exp -n dk-test-to-harenss-1 -m vertex_ai_gemini-1-5 -b comparison from-test-small

Signed-off-by: David Korczynski <[email protected]>
Signed-off-by: David Korczynski <[email protected]>
@DavidKorczynski
Collaborator Author

/gcbrun exp -n dk-test-to-harenss-2 -m vertex_ai_gemini-1-5 -b from-test-small

@DavidKorczynski
Collaborator Author

/gcbrun exp -n dk-1231231432-m vertex_ai_gemini-1-5 -b comparison

@DavidKorczynski
Collaborator Author

/gcbrun exp -n dk-comparison-jj1 -m vertex_ai_gemini-1-5 -b comparison

@DavidKorczynski
Collaborator Author

/gcbrun skip

@DavidKorczynski DavidKorczynski marked this pull request as ready for review July 27, 2024 13:50
@DavidKorczynski
Collaborator Author

This is ready for review. The experiment at https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2024-07-27-511-dk-test-to-harenss-2-from-test-small/index.html shows improvements in several projects where we previously had no gains; for comparison, see https://llm-exp.oss-fuzz.com/Result-reports/scheduled/2024-07-06-weekly-all/index.html

I think this is a promising direction, not least because we're already seeing improvements and there is plenty of room for technical refinement: at the moment we more or less just copy-paste the test into the prompt. That said, I think the PR is in a state where we can build on it with incremental improvements.

In particular, I think improvements are needed in (1) the architecture around benchmarks; (2) more context around tests; and (3) more experiments around tests. For example, we currently copy whole files in, which we could probably refine (e.g. where a file contains multiple tests, we can extract the individual tests).
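
To make point (3) concrete, here is a rough sketch of what extracting individual tests from a file could look like. It is not code from this PR: `extract_tests` is a hypothetical helper, it assumes GoogleTest-style `TEST`/`TEST_F`/`TEST_P` macros, and the brace matching is deliberately naive (it ignores braces inside strings and comments).

```python
import re

# Hypothetical helper, not part of this PR. Assumes GoogleTest-style macros.
TEST_MACRO = re.compile(r'\bTEST(?:_F|_P)?\s*\(\s*(\w+)\s*,\s*(\w+)\s*\)')


def extract_tests(source: str) -> dict:
    """Map 'Suite.Name' to the source text of each test found in `source`.

    Naive brace counting: braces inside string literals or comments
    are not handled and would confuse the matching.
    """
    tests = {}
    for match in TEST_MACRO.finditer(source):
        open_idx = source.find('{', match.end())
        if open_idx == -1:
            continue  # Macro without a body; skip it.
        depth = 0
        for idx in range(open_idx, len(source)):
            if source[idx] == '{':
                depth += 1
            elif source[idx] == '}':
                depth -= 1
                if depth == 0:
                    name = f'{match.group(1)}.{match.group(2)}'
                    tests[name] = source[match.start():idx + 1]
                    break
    return tests
```

Each extracted test could then be placed in the prompt on its own instead of the whole file, which is one of the experiments suggested above.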

cppify_headers=cppify_headers,
commit=commit,
use_context=use_context,
function_dict=function))
Collaborator

OK for now, but could you please merge the duplicated code in the if/else branches later to reduce repetition?
Thanks

Collaborator Author

Yep, will do
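
For reference, one possible shape of that follow-up refactor is sketched below. This is only an illustration under assumptions: the `Benchmark` dataclass is a simplified stand-in, and `build_benchmarks`, `name`, `test_file_path`, and the `data` dictionary keys are hypothetical. Only the keyword names `cppify_headers`, `commit`, `use_context`, and `function_dict` come from the snippet under review.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Benchmark:
    """Simplified, hypothetical stand-in for the real benchmark class."""
    name: str
    cppify_headers: bool
    commit: Optional[str]
    use_context: bool
    function_dict: Optional[dict] = None
    test_file_path: str = ''


def build_benchmarks(data: dict) -> List[Benchmark]:
    # Arguments shared by both branches are collected once.
    common = dict(
        cppify_headers=data.get('cppify_headers', False),
        commit=data.get('commit'),
        use_context=data.get('use_context', False),
    )
    # Only the arguments that differ remain inside the if/else.
    if 'test_files' in data:
        variants = [
            dict(name=t['test_file_path'], test_file_path=t['test_file_path'])
            for t in data['test_files']
        ]
    else:
        variants = [
            dict(name=f['name'], function_dict=f)
            for f in data.get('functions', [])
        ]
    return [Benchmark(**common, **variant) for variant in variants]
```

The idea is simply that the shared keyword arguments live in one place, so adding a new shared option later only touches a single line.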

Signed-off-by: David Korczynski <[email protected]>
@DavidKorczynski
Collaborator Author

/gcbrun exp -n dk-comparisonasdf12 -m vertex_ai_gemini-1-5 -b minor-for-ci

@DavidKorczynski
Collaborator Author

/gcbrun exp -n dk-comparisonasfdf12 -m vertex_ai_gemini-1-5 -b minor-for-ci

@DavidKorczynski DavidKorczynski merged commit 5b5ee46 into main Aug 2, 2024
6 checks passed
@DavidKorczynski DavidKorczynski deleted the test-to-harness-migration-init branch August 2, 2024 21:40
DavidKorczynski pushed a commit that referenced this pull request Sep 10, 2024
This PR adds JVM project support for the test-to-harness approach
initiated in #511. This PR also adds a new benchmark set using the
test-to-harness approach on Java projects.

---------

Signed-off-by: Arthur Chan <[email protected]>