Open
Description
The default C++ logic has some limitations for C projects that cause failures, e.g. during builds. A C-specific flow would help alleviate these and would also open the door to adding more harness-generation workflows. Most of this will need changes in prompt_builder, mainly by adding a new prompt class.
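As a rough illustration of what "adding a new prompt class" could look like, here is a minimal sketch. All names in it (`PromptBuilder`, `DefaultBuilder`, `CBuilder`, `get_builder`) and the prompt wording are hypothetical and chosen for this example only; the actual interface in prompt_builder may differ.

```python
from abc import ABC, abstractmethod


class PromptBuilder(ABC):
    """Hypothetical base class; the real prompt_builder interface may differ."""

    @abstractmethod
    def build(self, function_signature: str) -> str:
        """Return the prompt text for one target function."""


class DefaultBuilder(PromptBuilder):
    """Stand-in for the existing default (C++-oriented) flow."""

    def build(self, function_signature: str) -> str:
        return ("Write a C++ fuzz harness (LLVMFuzzerTestOneInput) for:\n"
                f"{function_signature}\n")


class CBuilder(PromptBuilder):
    """Hypothetical C-specific flow with C-only instructions."""

    def build(self, function_signature: str) -> str:
        return ("Write a fuzz harness in plain C (no C++ headers or "
                "features) that defines LLVMFuzzerTestOneInput for:\n"
                f"{function_signature}\n")


# Registry of available flows; "default" stays the default so that
# existing experiments are unaffected unless they opt in to "c".
BUILDERS = {"default": DefaultBuilder, "c": CBuilder}


def get_builder(name: str = "default") -> PromptBuilder:
    """Instantiate the builder registered under `name`."""
    return BUILDERS[name]()
```

Keeping the existing builder as the default selection is one way to make the new flow non-intrusive: an experiment only changes behaviour if it explicitly asks for `"c"`.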
I'd like to take the following high-level steps to achieve this:
- Add a new flow with C-specific logic that fits into the current system without being intrusive (i.e. existing experiments continue to behave the same), and show that the C-specific flow improves on the existing default builder in local runs.
- Integrate in the CI so we can run experiments with the C-specific flow.
- Continue expanding on the C-specific flow.
- Migrate so that we can run multiple different prompts in the same experiment. This will be interesting: I expect there will be no clear overall winner, but rather that each prompt will have its own set of targets it performs well on. We can use this to guide further research. I think the comparison should cover a larger spread, since comparing prompts is not necessarily one-dimensional (which is better or worse) but multi-dimensional (x performs better in scenarios m, z, v and y performs better in a, b, c).
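
The multi-dimensional comparison in the last step could be sketched as a per-target breakdown instead of a single aggregate score. The function name, the data layout, and the numbers below are made up for illustration; they are not experiment results, and "score" stands in for whatever metric the experiments report.

```python
from collections import defaultdict


def best_prompt_per_target(results):
    """Group targets by the prompt that scored highest on each.

    `results` maps prompt name -> {target name -> score}.
    On a tie, the prompt listed first in `results` wins.
    """
    targets = sorted({t for scores in results.values() for t in scores})
    wins = defaultdict(list)
    for target in targets:
        # Only consider prompts that actually ran on this target.
        candidates = [name for name, scores in results.items()
                      if target in scores]
        best = max(candidates, key=lambda name: results[name][target])
        wins[best].append(target)
    return dict(wins)


# Placeholder scores, purely illustrative.
example = {
    "default": {"a": 0.4, "b": 0.7},
    "c": {"a": 0.6, "b": 0.5},
}
print(best_prompt_per_target(example))
```

A breakdown like this makes it visible when one prompt wins on some targets and another prompt wins on others, which an overall average would hide.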