Commit 3b2ed2c

Merge AnalysisPTQ & AnalysisQAT to Analysis (#1692)
1 parent 2bb09da commit 3b2ed2c

14 files changed: 909 additions & 1134 deletions

docs/zh_cn/tutorials/quant/AnalysisPTQ.md renamed to docs/zh_cn/tutorials/quant/Analysis.md

Lines changed: 14 additions & 18 deletions
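@@ -1,4 +1,4 @@
-# Detailed Tutorial for the PTQ (Post Training Quantization) Quantization Analysis Tool
+# Detailed Tutorial for the Quantization Analysis Tool
 
 ## 1. Features of the Quantization Analysis Tool
 1. Statistical analysis (statistical_analyse):
@@ -13,17 +13,18 @@
 - Given an expected accuracy, directly produce a quantized model that meets it.
 
 
-## 2. Parameters accepted by paddleslim.quant.AnalysisPTQ
+## 2. Parameters accepted by paddleslim.quant.Analysis
 | **Parameter** | **Description** |
 |-----------------------------|-----------------------------------------|
-| model_dir | Required. Path to the model files; may be a directory. For an ONNX model, pass the '.onnx' model file name directly |
+| float_model_dir | Required. Path to the model files; may be a directory. For an ONNX model, pass the '.onnx' model file name directly |
+| quant_model_dir | Defaults to None. Path to the quantized model files; may be a directory. For an ONNX model, pass the '.onnx' model file name directly. If omitted, the analysis tool quantizes the model with PTQ and then analyzes it |
 | model_filename | Defaults to None. If model_dir is a directory, the model file name ending in '.pdmodel' must be passed; if model_dir is an '.onnx' model file name, it is not needed |
 | params_filename | Defaults to None. If model_dir is a directory, the file name ending in '.pdiparams' must be passed; if model_dir is an '.onnx' model file name, it is not needed |
 | eval_function | To verify accuracy, pass a custom evaluation function; if omitted, the accuracy error analysis is computed from Cosine Similarity |
 | data_loader | Data used for model calibration. The DataLoader inherits from `paddle.io.DataLoader`. You can use the DataLoader from a model suite directly, or define your own following [paddle.io.DataLoader](https://www.paddlepaddle.org.cn/documentation/docs/zh/api/paddle/io/DataLoader_cn.html#dataloader) |
 | save_dir | Directory where analysis outputs such as model accuracy results or PDF files are saved; defaults to `analysis_results` |
 | resume | Whether to load intermediate analysis files; defaults to False |
-| ptq_config | Post-training quantization parameters that can be passed in; see the [post-training quantization documentation](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/quant/quant_post) for details |
+| quant_config | Post-training quantization parameters that can be passed in; see the [post-training quantization documentation](https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/quant/quant_post) for details |
 
 
 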
@@ -45,7 +46,7 @@ import paddle
 from PIL import Image
 from paddle.vision.datasets import DatasetFolder
 from paddle.vision.transforms import transforms
-from paddleslim.quant.analysis_ptq import AnalysisPTQ
+from paddleslim.quant.analysis import Analysis
 paddle.enable_static()
 
 class ImageNetDataset(DatasetFolder):
@@ -72,12 +73,12 @@ image = paddle.static.data(
 train_loader = paddle.io.DataLoader(
     train_dataset, feed_list=[image], batch_size=8, return_list=False)
 
-analyzer = AnalysisPTQ(
-    model_dir="./MobileNetV1_infer",
+analyzer = Analysis(
+    float_model_dir="./MobileNetV1_infer",
     model_filename="inference.pdmodel",
     params_filename="inference.pdiparams",
     save_dir="MobileNetV1_analysis",
-    ptq_config={
+    quant_config={
         'quantizable_op_type': ["conv2d", "depthwise_conv2d"],
         'weight_quantize_type': 'abs_max',
         'activation_quantize_type': 'moving_average_abs_max',
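
For scripts written against `AnalysisPTQ`, the two renames visible in this diff (`model_dir` to `float_model_dir`, `ptq_config` to `quant_config`) can be applied mechanically. The following is a minimal sketch, assuming the keyword arguments are held in a plain dict; `migrate_analysis_kwargs` and `RENAMED_KWARGS` are hypothetical names for illustration and are not part of PaddleSlim.

```python
# Hypothetical helper (not part of PaddleSlim): renames the keyword
# arguments that this commit changed when AnalysisPTQ became Analysis.
RENAMED_KWARGS = {
    "model_dir": "float_model_dir",
    "ptq_config": "quant_config",
}


def migrate_analysis_kwargs(old_kwargs):
    """Return a copy of old_kwargs with AnalysisPTQ-era names updated."""
    new_kwargs = {}
    for name, value in old_kwargs.items():
        new_name = RENAMED_KWARGS.get(name, name)
        if new_name in new_kwargs:
            # Both the old and the new spelling were supplied.
            raise ValueError(f"duplicate argument after rename: {new_name}")
        new_kwargs[new_name] = value
    return new_kwargs


old = {
    "model_dir": "./MobileNetV1_infer",
    "model_filename": "inference.pdmodel",
    "ptq_config": {"weight_quantize_type": "abs_max"},
}
print(migrate_analysis_kwargs(old))
```

Arguments that were not renamed pass through unchanged, so the result could be unpacked into the new constructor with `**`.
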
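@@ -124,22 +125,17 @@ analyzer.statistical_analyse()
 ```shell
 analyzer.metric_error_analyse()
 ```
-Calling this interface iterates over the layers of the model, quantizing one layer at a time, and computes the model's loss after quantizing that layer. An Eval Function must be provided when calling it. It produces an accuracy ranking of all models with only one layer quantized, saved by default to `./analysis_results/analysis.txt`.
+If quant_model_dir is not passed, calling this interface iterates over the layers of the model, quantizing one layer at a time, and computes the model's loss after quantizing that layer. An Eval Function must be provided when calling it. It produces an accuracy ranking of all models with only one layer quantized, saved by default to `./analysis_results/analysis.txt`.
+
+If quant_model_dir is passed, calling this interface iterates over every layer of the quantized model, removes that layer's quantization nodes, and computes the accuracy of the model with that layer left unquantized. An Eval Function must be provided when calling it. It produces an accuracy ranking of all models with one layer's quantization removed, saved by default to `./analysis_results/analysis.txt`. For usage, see the [GPT quantization-aware training sensitivity analysis demo](../../../../example/quantization_analysis/GPT/README.md)
+
 
 
 
 **Directly produce a target quantized model that meets the expected accuracy**
 ```shell
-analyzer.get_target_quant_model(target_metric=70.0)
+analyzer.get_target_quant_model(target_metric=0.70)
 ```
 
 ## 4. Run post-training quantization based on the analysis results
 After running the quantization analysis tool, use the accuracy ranking in `analysis.txt` to exclude poorly performing layers from quantization: when calling `paddleslim.quant.quant_post_static`, add the `skip_tensor_list` parameter and pass in the layers to exclude.
-
-
-## FAQ:
-- Difference from the QAT (Quantization-Aware Training) analysis tool: unlike the QAT analysis tool, the PTQ analysis tool loads the original model to be quantized and quantizes its layers in turn, one layer at a time, validating each to obtain the accuracy error analysis. The QAT analysis tool instead loads the quantized model produced by quantization-aware training, iterates over all quantized layers, removes each quantized layer in turn, loads the Float model's parameters, and validates to obtain the accuracy error analysis.
-
-- Why the PTQ analysis tool is designed this way: it quantizes each layer of the model in turn, rather than removing quantized layers in turn, because PTQ itself is efficient. Quantizing one layer at a time and validating makes the loss in model accuracy easy to see.
-
-- Why the analysis tool distinguishes PTQ from QAT: experiments show that the sensitive layers of PTQ and QAT quantized models are not entirely the same; keeping the two algorithms separate makes the sensitivity analysis results more accurate.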
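
Section 4 of the tutorial describes choosing layers to exclude based on the accuracy ranking. The following is a minimal sketch of that selection step, assuming the ranking has already been read into a dict that maps a layer (tensor) name to the metric measured with only that layer quantized. The layout of `analysis.txt` is not shown in this commit, so parsing is left out; `pick_skip_tensors`, the layer names, and the numbers are all hypothetical.

```python
def pick_skip_tensors(single_layer_metrics, float_metric, max_drop):
    """Return layers whose metric falls more than max_drop below float_metric.

    single_layer_metrics maps a layer name to the metric obtained when only
    that layer is quantized (an assumed in-memory form of the ranking).
    The worst layer comes first in the returned list.
    """
    too_sensitive = [
        (metric, name)
        for name, metric in single_layer_metrics.items()
        if float_metric - metric > max_drop
    ]
    return [name for metric, name in sorted(too_sensitive)]


# Made-up numbers for illustration only.
ranking = {"conv2d_0": 0.70, "conv2d_5": 0.62, "depthwise_conv2d_3": 0.69}
skip = pick_skip_tensors(ranking, float_metric=0.71, max_drop=0.05)
print(skip)

# Per the tutorial text, the list would then be passed as
# skip_tensor_list=skip when calling paddleslim.quant.quant_post_static.
```

The threshold `max_drop` is a design choice of this sketch; the tutorial itself only says to remove layers that perform poorly.
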

docs/zh_cn/tutorials/quant/AnalysisQAT.md

Lines changed: 0 additions & 98 deletions
This file was deleted.

example/post_training_quantization/detection/analysis.py

Lines changed: 17 additions & 17 deletions
@@ -23,7 +23,7 @@
 from ppdet.metrics import COCOMetric, VOCMetric, KeyPointTopDownCOCOEval
 from keypoint_utils import keypoint_post_process
 from post_process import PPYOLOEPostProcess
-from paddleslim.quant.analysis_ptq import AnalysisPTQ
+from paddleslim.quant.analysis import Analysis
 
 
 def argsparser():
@@ -87,10 +87,11 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
                 elif isinstance(config['input_list'], dict):
                     if k in config['input_list'].keys():
                         data_input[config['input_list'][k]] = np.array(v)
-            outs = exe.run(compiled_test_program,
-                           feed=data_input,
-                           fetch_list=test_fetch_list,
-                           return_numpy=False)
+            outs = exe.run(
+                compiled_test_program,
+                feed=data_input,
+                fetch_list=test_fetch_list,
+                return_numpy=False)
             res = {}
             if 'arch' in config and config['arch'] == 'keypoint':
                 res = keypoint_post_process(data, data_input, exe,
@@ -115,8 +116,7 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
     metric.log()
     map_res = metric.get_results()
     metric.reset()
-    map_key = 'keypoint' if 'arch' in config and config[
-        'arch'] == 'keypoint' else 'bbox'
+    map_key = 'keypoint' if 'arch' in config and config['arch'] == 'keypoint' else 'bbox'
     return map_res[map_key][0]
 
 
@@ -127,9 +127,8 @@ def main():
     ptq_config = config['PTQ']
 
     # val dataset is sufficient for PTQ
-    data_loader = create('EvalReader')(config['EvalDataset'],
-                                       config['worker_num'],
-                                       return_list=True)
+    data_loader = create('EvalReader')(
+        config['EvalDataset'], config['worker_num'], return_list=True)
     ptq_data_loader = reader_wrapper(data_loader, config['input_list'])
 
     # fast_val_anno_path, such as annotation path of several pictures can accelerate analysis
@@ -139,10 +138,11 @@ def main():
     global val_loader
     _eval_batch_sampler = paddle.io.BatchSampler(
         dataset, batch_size=config['EvalReader']['batch_size'])
-    val_loader = create('EvalReader')(dataset,
-                                      config['worker_num'],
-                                      batch_sampler=_eval_batch_sampler,
-                                      return_list=True)
+    val_loader = create('EvalReader')(
+        dataset,
+        config['worker_num'],
+        batch_sampler=_eval_batch_sampler,
+        return_list=True)
     global metric
     if config['metric'] == 'COCO':
         clsid2catid = {v: k for k, v in dataset.catid2clsid.items()}
@@ -161,14 +161,14 @@ def main():
     else:
         raise ValueError("metric currently only supports COCO and VOC.")
 
-    analyzer = AnalysisPTQ(
-        model_dir=config["model_dir"],
+    analyzer = Analysis(
+        float_model_dir=config["model_dir"],
         model_filename=config["model_filename"],
         params_filename=config["params_filename"],
         eval_function=eval_function,
         data_loader=ptq_data_loader,
         save_dir=config['save_dir'],
-        ptq_config=ptq_config,
+        quant_config=ptq_config,
         resume=True, )
 
     analyzer.statistical_analyse()

example/post_training_quantization/pytorch_yolo_series/analysis.py

Lines changed: 11 additions & 9 deletions
@@ -21,7 +21,7 @@
 from post_process import YOLOPostProcess, coco_metric
 from dataset import COCOValDataset, COCOTrainDataset
 from paddleslim.common import load_config, load_onnx_model
-from paddleslim.quant.analysis_ptq import AnalysisPTQ
+from paddleslim.quant.analysis import Analysis
 
 
 def argsparser():
@@ -41,7 +41,8 @@ def argsparser():
         '--resume',
         type=bool,
         default=False,
-        help="When break off while ananlyzing, could resume analysis program and load already analyzed information."
+        help=
+        "When break off while ananlyzing, could resume analysis program and load already analyzed information."
     )
     return parser
 
@@ -54,10 +55,11 @@ def eval_function(exe, compiled_test_program, test_feed_names, test_fetch_list):
             ncols=80) as t:
         for data in val_loader:
             data_all = {k: np.array(v) for k, v in data.items()}
-            outs = exe.run(compiled_test_program,
-                           feed={test_feed_names[0]: data_all['image']},
-                           fetch_list=test_fetch_list,
-                           return_numpy=False)
+            outs = exe.run(
+                compiled_test_program,
+                feed={test_feed_names[0]: data_all['image']},
+                fetch_list=test_fetch_list,
+                return_numpy=False)
             res = {}
             postprocess = YOLOPostProcess(
                 score_threshold=0.001, nms_threshold=0.65, multi_label=True)
@@ -103,15 +105,15 @@ def main():
     load_onnx_model(config["model_dir"])
     inference_model_path = config["model_dir"].rstrip().rstrip(
         '.onnx') + '_infer'
-    analyzer = AnalysisPTQ(
-        model_dir=inference_model_path,
+    analyzer = Analysis(
+        float_model_dir=inference_model_path,
         model_filename='model.pdmodel',
         params_filename='model.pdiparams',
         eval_function=eval_function,
         data_loader=data_loader,
         save_dir=config['save_dir'],
         resume=FLAGS.resume,
-        ptq_config=ptq_config)
+        quant_config=ptq_config)
 
     analyzer.statistical_analyse()
     analyzer.metric_error_analyse()
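
The unchanged context lines in the last hunk build the path passed as `float_model_dir` with `str.rstrip('.onnx')`. `rstrip` treats its argument as a set of characters rather than a suffix, so a model name that ends in any of `.`, `o`, `n`, `x` before the extension loses extra characters. This commit does not touch that line; the sketch below only illustrates the behavior, and `strip_onnx_suffix` is a hypothetical helper, not something in the repository.

```python
def strip_onnx_suffix(path):
    """Remove one trailing '.onnx' suffix, leaving the rest of the name intact."""
    path = path.rstrip()  # drop trailing whitespace, as the original code does
    suffix = ".onnx"
    return path[:-len(suffix)] if path.endswith(suffix) else path


# Character-set stripping removes more than the extension here.
print("yolox.onnx".rstrip(".onnx"))     # -> yol
# Suffix stripping keeps the model name whole.
print(strip_onnx_suffix("yolox.onnx"))  # -> yolox
```

Names such as `yolov5s.onnx` are unaffected because `s` is not in the stripped character set, which is why the issue can go unnoticed.
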

example/quantization_analysis/GPT/analysis.py

Lines changed: 3 additions & 5 deletions
@@ -21,7 +21,7 @@
 
 import paddle
 from paddleslim.common import load_config as load_slim_config
-from paddleslim.quant.analysis_qat import AnalysisQAT
+from paddleslim.quant.analysis import Analysis
 from ppfleetx.data import build_dataloader
 from ppfleetx.distributed.apis import env
 from utils import parse_config
@@ -164,17 +164,15 @@ def main():
     global eval_loader
     eval_loader = eval_reader_wrapper(valid_data_loader)
 
-    analyzer = AnalysisQAT(
+    analyzer = Analysis(
         quant_model_dir=global_config["quant_model_dir"],
         float_model_dir=global_config["float_model_dir"],
         model_filename=global_config["model_filename"],
         params_filename=global_config["params_filename"],
-        quantizable_op_type=global_config['quantizable_op_type'],
-        qat_metric=global_config['qat_metric']
-        if 'qat_metric' in global_config else None,
         eval_function=eval_function,
         data_loader=eval_loader,
         save_dir=FLAGS.save_dir,
+        quant_config=all_config['quant_config'],
         resume=global_config['resume'], )
     analyzer.metric_error_analyse()
 
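
This call passes `quant_model_dir`, so according to the tutorial text changed in this commit, `metric_error_analyse()` removes quantization from one layer at a time instead of quantizing one layer at a time. The toy sketch below contrasts the two modes on a made-up model. It does not use PaddleSlim: `rank_layers`, the layer names, the penalty numbers, and the sort order are all assumptions chosen for illustration.

```python
def rank_layers(layers, evaluate, quant_model_given):
    """Toy version of the two analysis modes, most sensitive layer first.

    evaluate(quantized) takes the set of quantized layer names and returns a
    metric where higher is better.
    """
    scores = {}
    for name in layers:
        if quant_model_given:
            # Start from the fully quantized model and restore one layer.
            quantized = set(layers) - {name}
        else:
            # Start from the float model and quantize only one layer.
            quantized = {name}
        scores[name] = evaluate(quantized)
    # Restoring a sensitive layer gives the highest metric; quantizing a
    # sensitive layer gives the lowest. Sort so sensitive layers come first.
    return sorted(scores, key=scores.get, reverse=quant_model_given)


# Made-up accuracy model: each quantized layer costs a fixed number of points.
PENALTY = {"conv1": 1, "conv2": 10, "fc": 3}


def toy_evaluate(quantized):
    return 80 - sum(PENALTY[name] for name in quantized)


print(rank_layers(list(PENALTY), toy_evaluate, quant_model_given=False))
print(rank_layers(list(PENALTY), toy_evaluate, quant_model_given=True))
```

In this additive toy model both modes agree on the order. The FAQ removed by this commit notes that on real models the sensitive layers found for PTQ and QAT are not always the same.
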

example/quantization_analysis/GPT/configs/gpt_345M_analysis.yaml

Lines changed: 7 additions & 2 deletions
@@ -5,11 +5,16 @@ Global:
   float_model_dir: ./GPT_345M_Baseline
   model_filename: model.pdmodel
   params_filename: model.pdiparams
-  quantizable_op_type: ["mul", "matmul", "matmul_v2"]
   resume: False
   reader_config: ./configs/gpt_reader.yaml
   cloze_eval: True # True for LAMBADA Dataset; False for WikiText
 
-
+quant_config:
+  quantizable_op_type: ["mul", "matmul", "matmul_v2"]
+  weight_quantize_type: 'abs_max'
+  activation_quantize_type: 'moving_average_abs_max'
+  is_full_quantize: False
+  batch_size: 8
+  batch_nums: 10
 
 
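
Since `analysis.py` now reads `all_config['quant_config']`, the new section appears to be a top-level sibling of `Global` rather than a key nested inside it. The sketch below writes the visible part of the config as a Python dict and adds a small layout check. Keys outside the hunk are omitted, and `get_quant_config` is a hypothetical helper, not part of the example.

```python
# Dict form of the YAML keys visible in this hunk (other keys omitted).
all_config = {
    "Global": {
        "float_model_dir": "./GPT_345M_Baseline",
        "model_filename": "model.pdmodel",
        "params_filename": "model.pdiparams",
        "resume": False,
        "reader_config": "./configs/gpt_reader.yaml",
        "cloze_eval": True,
    },
    "quant_config": {
        "quantizable_op_type": ["mul", "matmul", "matmul_v2"],
        "weight_quantize_type": "abs_max",
        "activation_quantize_type": "moving_average_abs_max",
        "is_full_quantize": False,
        "batch_size": 8,
        "batch_nums": 10,
    },
}


def get_quant_config(config):
    """Return the top-level quant_config, rejecting the pre-commit layout."""
    if "quantizable_op_type" in config.get("Global", {}):
        raise KeyError("move quantizable_op_type under top-level quant_config")
    if "quant_config" not in config:
        raise KeyError("missing top-level quant_config section")
    return config["quant_config"]


print(get_quant_config(all_config)["quantizable_op_type"])
```

A check like this fails early on a config file that still uses the old layout, instead of failing later inside the analysis run.
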
