Review experiment status & potential next steps #226

@Locke0

Description

Review the finetuning experiments we've done from the relevant notes in this order:

Review papers:

Identify the next steps for improving 7B CUA models on our perturbation evaluation (especially on the spatial relational instructions). I have listed some directions, such as improving the training data mix and using RL training methods instead of SFT with LoRA.

Investigate all possible directions (breadth first), and gather evidence, justifications, and uncertainties for each so that we can prioritize them based on risk and time required.

The next step after this ticket will be to investigate the top-priority direction in more depth and design minimal, rapid experiments to validate specific hypotheses (e.g., "doing this in this way will improve the 7B model on the spatial relational instructions of our perturbation eval").

Metadata

Labels

research (research, experiments, papers)
