A modular, future-proof experiment framework for research and development on Vision-Language Models (VLMs). Built with Hydra, Weights & Biases, and extensible YAML-based configurations, it is well suited to prompting experiments, fine-tuning comparisons, training-image impact studies, and more.
```text
vlm-research/
├── main.py                    # Lightweight experiment dispatcher
├── experiment_registry.yaml   # Maps experiment name -> experiments/<name>/run.py
│
├── experiments/               # Each experiment is isolated and self-contained
│   ├── peft_vs_full/
│   │   ├── run.py             # Entry point for this experiment
│   │   ├── logic.py           # Core training/eval logic
│   │   ├── config.yaml        # Local override config (Hydra)
│   │   ├── outputs/
│   │   └── logs/              # Hydra logs and model artifacts
│   │
│   └── prompt_eval/
│       ├── run.py
│       ├── evaluator.py
│       ├── config.yaml
│       ├── outputs/
│       └── logs/              # Hydra logs and model artifacts
│
├── configs/                   # Global Hydra configs (model, dataset, sweep)
│   ├── config.yaml
│   ├── model/
│   ├── training/
│   ├── dataset/
│   └── sweep/
│
├── core/                      # Shared utilities
│   ├── wandb_utils.py
│   ├── loaders.py
│   ├── metrics.py
│   └── registry.py            # Loads experiment modules dynamically
│
├── README.md
└── requirements.txt
```
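A minimal `experiment_registry.yaml` might look like the sketch below. The mapping format is an assumption inferred from the comment in the tree above; only the two experiments shown there are listed:

```yaml
# experiment_registry.yaml — maps each experiment name to its entry module.
# Illustrative sketch; paths follow the layout shown above.
peft_vs_full: experiments/peft_vs_full/run.py
prompt_eval: experiments/prompt_eval/run.py
```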
```bash
git clone https://github.com/yourusername/vlm-research.git
cd vlm-research
pip install -r requirements.txt
```
```bash
python main.py experiment=peft_vs_full model=qwen7b dataset=car_damage
```
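The dispatcher itself can stay small: it looks the experiment name up in the registry and imports that experiment's `run.py` dynamically. The sketch below is a hypothetical illustration of this pattern (the `load_experiment` helper and its signature are assumptions, not the repo's actual code):

```python
import importlib.util
from pathlib import Path


def load_experiment(name: str, registry: dict) -> object:
    """Resolve an experiment name to its run.py module via the registry mapping."""
    run_path = Path(registry[name])  # e.g. experiments/peft_vs_full/run.py
    spec = importlib.util.spec_from_file_location(f"experiments.{name}", run_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # executes run.py's top level
    return module


if __name__ == "__main__":
    # Demo with a throwaway experiment written to a temp directory.
    import tempfile

    tmp = Path(tempfile.mkdtemp())
    (tmp / "run.py").write_text('def main(cfg):\n    return f"ran with {cfg}"\n')
    registry = {"demo": str(tmp / "run.py")}
    mod = load_experiment("demo", registry)
    print(mod.main({}))
```

Loading via `importlib` keeps `main.py` agnostic to the set of experiments: adding one never requires touching the dispatcher, only the registry.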
```bash
python main.py -m sweep.lr=1e-5,5e-5,1e-4 sweep.batch_size=4,8,16
```

Or with a YAML sweep config:

```bash
python main.py -m +sweep=lr_vs_batch
```
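Such a sweep config could live at `configs/sweep/lr_vs_batch.yaml`. The file below is a hedged sketch using Hydra's basic sweeper `params` syntax; the exact keys the repo expects are an assumption:

```yaml
# configs/sweep/lr_vs_batch.yaml — illustrative multirun sweep definition.
# @package _global_
hydra:
  sweeper:
    params:
      sweep.lr: 1e-5,5e-5,1e-4
      sweep.batch_size: 4,8,16
```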
Each run is automatically logged with:

- Metrics
- Config values
- Grouping by experiment type
- A custom run name (e.g., `qwen7b_1e-5_8batch`)
W&B project and experiment name are customizable via config.
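Run names like the example above can be composed from config values before being passed to `wandb.init`. The helper below is a hypothetical sketch (its name and format are assumptions, not the repo's actual `core/wandb_utils.py`):

```python
def make_run_name(model: str, lr, batch_size: int) -> str:
    """Compose a W&B run name like 'qwen7b_1e-5_8batch' from config values.

    lr is used as it appears in the config (string or float), so a config
    value of "1e-5" round-trips into the name unchanged.
    """
    return f"{model}_{lr}_{batch_size}batch"


# Sketch of how it might feed wandb.init (project/group/name/config are
# real wandb.init parameters; the cfg attribute paths are assumptions):
#   wandb.init(
#       project=cfg.wandb.project,
#       group=cfg.experiment,  # groups runs by experiment type
#       name=make_run_name(cfg.model.name, cfg.sweep.lr, cfg.sweep.batch_size),
#       config=dict(cfg),
#   )

print(make_run_name("qwen7b", "1e-5", 8))  # -> qwen7b_1e-5_8batch
```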
To add a new experiment:

- Add a new YAML config to `configs/experiment/`
- Register the experiment in `experiment_registry.yaml`
- Add the handling logic to the experiment's `run.py`
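A new experiment's entry module only needs to expose something the dispatcher can call. The skeleton below is a minimal sketch; the `main(cfg)` signature and return value are assumptions about the dispatcher's contract, not the repo's actual interface:

```python
# experiments/my_experiment/run.py — hypothetical skeleton for a new experiment.


def main(cfg: dict) -> dict:
    """Entry point invoked by the dispatcher with the merged Hydra config."""
    # ... load model and dataset from cfg, run training/evaluation ...
    return {
        "experiment": cfg.get("experiment", "my_experiment"),
        "status": "ok",
    }


if __name__ == "__main__":
    # Smoke-test the entry point with a stub config.
    print(main({"experiment": "my_experiment"}))
```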