[Pretraining] - Multimodal model training (VILA step-2) #205

# Overview

Train a multimodal model, following [llm-jp-VILA](https://github.com/llm-jp/llm-jp-VILA).

# Details

Model card PR: https://github.com/llm-jp/model-cards/pull/{id}

Apply instruction tuning to the model that has already been trained through step-0 and step-1 of [llm-jp-VILA](https://github.com/llm-jp/llm-jp-VILA).

* Model

  Component | Model / Architecture
  -- | --
  Vision Encoder | [google/siglip-so400m-patch14-384](https://huggingface.co/google/siglip-so400m-patch14-384)
  Projector | `mlp2x_gelu`
  LLM | [llm-jp/llm-jp-3-13b-instruct](https://huggingface.co/llm-jp/llm-jp-3-13b-instruct)

* Data
  * Same data as training step-2 of [llm-jp-VILA](https://github.com/llm-jp/llm-jp-VILA).
  * However, among the Japanese data, [llava-instruct-ja](https://huggingface.co/datasets/llm-jp/llava-instruct-ja) and [japanese-photos-conv](https://huggingface.co/datasets/llm-jp/japanese-photos-conversation) are replaced with data synthesized by open models instead of gpt-4o: [Qwen/Qwen2.5-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-32B-Instruct) and [Qwen/Qwen2.5-VL-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-32B-Instruct), respectively.
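
For reference, the model and data setup above can be summarized in a small Python sketch. This is not code from llm-jp-VILA: the class `Step2Setup`, its field names, and the helpers `projector_layers` and `parameter_count` are hypothetical names introduced here for illustration. The projector layout assumes the LLaVA-style convention for `mlpNx_gelu` (one linear layer from the vision width to the LLM width, followed by GELU + linear repeated N-1 times), which should be checked against the training repository.

```python
import re
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Step2Setup:
    """Components named in this issue (class and field names are hypothetical)."""

    vision_encoder: str = "google/siglip-so400m-patch14-384"
    projector: str = "mlp2x_gelu"
    llm: str = "llm-jp/llm-jp-3-13b-instruct"
    # Japanese dataset -> open model used to synthesize it in place of gpt-4o.
    synthesized_with: dict = field(default_factory=lambda: {
        "llm-jp/llava-instruct-ja": "Qwen/Qwen2.5-32B-Instruct",
        "llm-jp/japanese-photos-conversation": "Qwen/Qwen2.5-VL-32B-Instruct",
    })


def projector_layers(name, vision_dim, llm_dim):
    """Return (in_features, out_features) for each linear layer of the projector.

    Assumption: LLaVA-style `mlpNx_gelu`, i.e. Linear(vision_dim, llm_dim)
    followed by (GELU, Linear(llm_dim, llm_dim)) repeated N-1 times.
    """
    match = re.fullmatch(r"mlp(\d+)x_gelu", name)
    if match is None:
        raise ValueError(f"unsupported projector type: {name}")
    depth = int(match.group(1))
    if depth < 1:
        raise ValueError("projector depth must be at least 1")
    return [(vision_dim, llm_dim)] + [(llm_dim, llm_dim)] * (depth - 1)


def parameter_count(layers):
    """Count weights plus biases over a list of (in, out) linear layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in layers)


setup = Step2Setup()
# Assumed hidden sizes; confirm them in each model's config.json before relying on them.
layers = projector_layers(setup.projector, vision_dim=1152, llm_dim=5120)
print(layers, parameter_count(layers))
```

The hidden sizes passed in the last lines are assumptions used only to make the sketch runnable, so the printed parameter count is indicative at best.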

# Resources

* **Compute**
  * Cluster: mdx (llm-jp-nvlink)
  * Node type: gpu (A100x8)
  * Number of nodes: 4 - 8
* **Code**
  * Repository: **FIXME** https://github.com/{org}/{repo}
  * Commit: **FIXME** xxxxxx
* **Input data**:
  * `llm-jp-nvlink:/data/kei0917/VILA-open-ja/playground/data`
* **Output data**:
  * Location: `llm-jp-nvlink:/data/experiments/0205_vila_step2`
  * Breakdown:
    * checkpoints: 150GB (including buffer capacity)
* **W&B logs**:
  * https://wandb.ai/{team}/{project} **FIXME**
* **Start date**: YYYY-MM-DD
* **Expected end date**: YYYY-MM-DD (including buffer period)


