Commit ed6a068

update
1 parent 88ef49f commit ed6a068

2 files changed, +12 -6 lines changed

CITATION.cff

+1 -1
@@ -2,7 +2,7 @@
 # Visit https://bit.ly/cffinit to generate yours today!
 
 cff-version: 1.2.0
-title: Instance Capability Tagger
+title: 'Instance Capability Tagger: Enhancing Multimodal Data Efficiency for Model Training'
 message: >-
   If you use this model, please cite it using the metadata from this file.
 type: software

README.md

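+11 -5
@@ -1,12 +1,18 @@
 # PP-InsCapTagger
 
-## Model
+## Updates
+- \[2024.10.10\] Code and model are available at [this](https://github.com/PaddlePaddle/PaddleMIX/tree/develop/paddlemix/datacopilot/example/pp_inscaptagger).
+
+
+## TODO
+- [x] Release code and model
+- [ ] English version README
+- [ ] Upload arXiv paper
 
-Code and model are available in [this](https://github.com/PaddlePaddle/PaddleMIX/tree/develop/paddlemix/datacopilot/example/pp_inscaptagger).
 
 ## Overview
 
-PP-InsCapTagger (Instance Capability Tagger) is a dataset behavior tagging model implemented by DataCopilot on top of PaddleMIX. It tags the instance capabilities of multimodal data; optimizing a dataset according to its instance capability distribution can improve model training efficiency and offers an efficient approach to dataset analysis and evaluation.
+PP-InsCapTagger (Instance Capability Tagger) is a data capability tagging model implemented by DataCopilot on top of PaddleMIX. It tags the instance capabilities of multimodal data; optimizing a dataset according to its instance capability distribution can improve model training efficiency and offers an efficient approach to dataset analysis and evaluation.
 Using the model's tagging results to optimize the LLaVA SFT dataset can **improve training efficiency in the SFT stage of the LLaVA model by 50%**
 
 Data instance capability tags: in multimodal tasks, each data sample can be abstracted into one or more capabilities, and during training the model learns from these samples and strengthens its corresponding capabilities, as shown in the figure below. To evaluate and optimize a dataset, we can use a model to tag each multimodal sample with the instance capabilities it contributes during training, then optimize the dataset according to the distribution of instance capabilities in the tagging results, which in turn improves model training efficiency.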
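
The README paragraph above describes tagging each sample with capability tags and then optimizing the dataset by the tag distribution. A minimal sketch of that idea follows. It is not the authors' implementation: the sample structure, the tag names, the helper functions, and the filtering rule (keep a sample if at least one of its tags is shared by `min_count` or more samples) are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def tag_distribution(samples):
    """Count how many samples carry each capability tag (hypothetical helper)."""
    counts = Counter()
    for sample in samples:
        # set() so a tag repeated within one sample is counted once
        counts.update(set(sample["tags"]))
    return counts


def filter_by_tag_frequency(samples, min_count):
    """Keep samples with at least one tag that appears in >= min_count samples.

    This rule is an illustrative assumption, not the rule used by PP-InsCapTagger.
    """
    counts = tag_distribution(samples)
    return [
        s for s in samples
        if any(counts[t] >= min_count for t in s["tags"])
    ]


# Toy data; ids and tag names are made up for the example.
samples = [
    {"id": 0, "tags": ["ocr", "counting"]},
    {"id": 1, "tags": ["ocr"]},
    {"id": 2, "tags": ["chart reading"]},
    {"id": 3, "tags": ["counting", "spatial reasoning"]},
]
kept = filter_by_tag_frequency(samples, min_count=2)
```

With this toy data, the sample whose only tag appears once is dropped. A real pipeline would take the tags from the model's inference output and pick a threshold from the observed distribution.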
@@ -106,7 +112,7 @@ During SFT-stage training of the LLaVA v1.5 model, the instruction fine-tuning dataset used is [LLaVA-Ins
 
 We ran the llava-1.5-7b SFT stage on both the llava_v1_5_mix665k dataset and the filtered dataset; the results are compared in the table below:
 
-| Version | ScienceQA | TextVQA | VQAv2 | GQA | mmmu | mme |
+| Model | ScienceQA | TextVQA | VQAv2 | GQA | mmmu | mme |
 |:----------------------:|:-----------:|:---------:|:-------:|:-------:|:-------:|:----------------:|
 | llava-1.5-7b <br> (origin) | 66.8 | 58.2 | 78.5 | 62.0 | - | - |
 | llava-1.5-7b <br> (rerun) | 69.01 | 57.6 | 79.0 | 62.95 | 36.89 | 1521 <br> 323 |
@@ -128,7 +134,7 @@ If you use `PP-InsCapTagger` in your work, please consider citing the following
 author = {Lv, Wenyu and Huang, Kui and Zhao, Yian},
 license = {Apache-2.0},
 month = oct,
-title = {{Instance Capability Tagger}},
+title = {{Instance Capability Tagger: Enhancing Multimodal Data Efficiency for Model Training}},
 url = {https://github.com/lyuwenyu/PP-InsCapTagger},
 version = {1.0},
 year = {2024}
