
Commit dbfc8c9

[MOT] fix deepsort yolov3 infer and deploy (PaddlePaddle#4277)
* add general yolov3 for deepsort
* refine format
* fix no target infer
* fix readme and conflict
1 parent 6bf1b44 commit dbfc8c9

13 files changed, +278 -73 lines changed

configs/mot/deepsort/README.md

+31-11
Original file line number | Diff line number | Diff line change
@@ -55,7 +55,7 @@ If you use a stronger detection model, you can get better results. Each txt is t
5555
- `width,height` is the pixel width and height
5656
- `conf` is the object score with default value `1` (the results had been filtered out according to the detection score threshold)
5757

58-
- 2.Load the detection model and the ReID model at the same time. Here, the JDE version of YOLOv3 is selected. For more detail of configuration, see `configs/mot/deepsort/_base_/deepsort_yolov3_darknet53_pcb_pyramid_r101.yml`.
58+
- 2. Load the detection model and the ReID model at the same time. Here, the JDE version of YOLOv3 is selected. For configuration details, see `configs/mot/deepsort/_base_/deepsort_jde_yolov3_darknet53_pcb_pyramid_r101.yml`. To load another general detection model, refer to `configs/mot/deepsort/_base_/deepsort_yolov3_darknet53_pcb_pyramid_r101.yml`.
5959

6060
## Getting Started
6161

@@ -65,40 +65,60 @@ If you use a stronger detection model, you can get better results. Each txt is t
6565
# Load the result file and ReID model to get the tracking result
6666
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_pcb_pyramid_r101.yml --det_results_dir {your detection results}
6767

68-
# Load the detection model and ReID model to get the tracking results
69-
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml
68+
# Load JDE YOLOv3 detector and ReID model to get the tracking results
69+
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_jde_yolov3_pcb_pyramid_r101.yml
70+
71+
# or load general YOLOv3 detector and ReID model to get the tracking results
72+
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml --scaled=True
7073
```
74+
**Notes:**
75+
The JDE YOLOv3 pedestrian detector is trained on the same MOT dataset as JDE and FairMOT. The biggest difference between this model and the general YOLOv3 model is that it uses JDEBBoxPostProcess post-processing, and its output coordinates are not scaled back to the original image.
76+
The general YOLOv3 pedestrian detector is not trained on the MOT dataset, so its performance is lower, but its output coordinates are scaled back to the original image.
77+
`--scaled` indicates whether the coordinates output by the detector have already been scaled back to the original image: False for JDE YOLOv3, True for a general detector.
7178

7279
### 2. Inference
7380

7481
Run inference on a video on a single GPU with the following command:
7582

7683
```bash
77-
# inference on video and save a video
78-
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml --video_file={your video name}.mp4 --save_videos
84+
# load JDE YOLOv3 pedestrian detector and ReID model to get tracking results
85+
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/deepsort/deepsort_jde_yolov3_pcb_pyramid_r101.yml --video_file={your video name}.mp4 --save_videos
86+
87+
# or load general YOLOv3 pedestrian detector and ReID model to get tracking results
88+
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml --video_file={your video name}.mp4 --scaled=True --save_videos
7989
```
8090
**Notes:**
8191
Please make sure that [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first. On Linux (Ubuntu), you can install it directly with the following command: `apt-get update && apt-get install -y ffmpeg`.
92+
`--scaled` indicates whether the coordinates output by the detector have already been scaled back to the original image: False for JDE YOLOv3, True for a general detector.
8293

8394
### 3. Export model
8495

8596
```bash
86-
1.export detection model
97+
# 1.export detection model
98+
# export JDE YOLOv3 pedestrian detector
8799
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/jde_yolov3_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_yolov3_darknet53_30e_1088x608.pdparams
88100

89-
2.export ReID model
90-
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort_pcb_pyramid_r101.pdparams
91-
or
101+
# or export general YOLOv3 pedestrian detector
102+
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/pedestrian/pedestrian_yolov3_darknet.yml -o weights=https://paddledet.bj.bcebos.com/models/pedestrian_yolov3_darknet.pdparams
103+
104+
105+
# 2. export ReID model
92106
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/deepsort_pcb_pyramid_r101.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort_pcb_pyramid_r101.pdparams
93107
```
94108

95109
### 4. Using exported model for python inference
96110

97111
```bash
98-
python deploy/python/mot_sde_infer.py --model_dir=output_inference/jde_yolov3_darknet53_30e_1088x608/ --reid_model_dir=output_inference/deepsort_yolov3_pcb_pyramid_r101/ --video_file={your video name}.mp4 --device=GPU --save_mot_txts
112+
# using exported JDE YOLOv3 pedestrian detector
113+
python deploy/python/mot_sde_infer.py --model_dir=output_inference/jde_yolov3_darknet53_30e_1088x608/ --reid_model_dir=output_inference/deepsort_pcb_pyramid_r101/ --video_file={your video name}.mp4 --device=GPU --save_mot_txts
114+
115+
# or using exported general YOLOv3 pedestrian detector
116+
python deploy/python/mot_sde_infer.py --model_dir=output_inference/pedestrian_yolov3_darknet/ --reid_model_dir=output_inference/deepsort_pcb_pyramid_r101/ --video_file={your video name}.mp4 --device=GPU --scaled=True --save_mot_txts
99117
```
100118
**Notes:**
101-
The tracking model is used to predict the video, and does not support the prediction of a single image. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the txt result file, or `--save_images` to save the visualization images.
119+
The tracking model runs prediction on videos and does not support single-image prediction. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` (save one txt per video) or `--save_mot_txt_per_img` (save one txt per image) to save the txt result files, or `--save_images` to save the visualization images.
120+
`--scaled` indicates whether the coordinates output by the detector have already been scaled back to the original image: False for JDE YOLOv3, True for a general detector.
121+
102122

103123
## Citations
104124
```
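
The README notes in this file describe `--scaled` as saying whether detector coordinates are already in original-image pixels. As a rough illustration of what "scaling back" could involve, here is a minimal sketch. It assumes the detector input was produced by an aspect-preserving letterbox resize with symmetric padding, which this commit does not show, and `scale_boxes_to_original` is a hypothetical helper, not a PaddleDetection function.

```python
def scale_boxes_to_original(boxes, input_hw, orig_hw):
    """Map [x1, y1, x2, y2] boxes from network-input pixels to original-image pixels.

    Assumption (not taken from the commit): the image was resized by a single
    ratio and centered with equal padding on both sides (letterbox resize).
    """
    in_h, in_w = input_hw
    orig_h, orig_w = orig_hw
    ratio = min(in_h / orig_h, in_w / orig_w)
    pad_w = (in_w - orig_w * ratio) / 2.0
    pad_h = (in_h - orig_h * ratio) / 2.0

    scaled = []
    for x1, y1, x2, y2 in boxes:
        # Undo the padding first, then undo the resize.
        x1, x2 = (x1 - pad_w) / ratio, (x2 - pad_w) / ratio
        y1, y2 = (y1 - pad_h) / ratio, (y2 - pad_h) / ratio
        # Clip to the bounds of the original image.
        x1, x2 = min(max(x1, 0.0), orig_w), min(max(x2, 0.0), orig_w)
        y1, y2 = min(max(y1, 0.0), orig_h), min(max(y2, 0.0), orig_h)
        scaled.append([x1, y1, x2, y2])
    return scaled
```

Under these assumptions, a detector whose outputs are not yet scaled (the JDE YOLOv3 case in the notes) would need a step like this before tracking, while a detector that already returns original-image coordinates would skip it.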

configs/mot/deepsort/README_cn.md

+30-11
@@ -56,7 +56,7 @@ wget https://dataset.bj.bcebos.com/mot/det_results_dir.zip
5656
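- `width,height` is the actual pixel width and height
5757
- `conf` is the object score, set to `1` (the detection results have already been filtered by the detection score threshold)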
5858

59-
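- The second way is to load the detection model and the ReID model at the same time. Here, the JDE version of YOLOv3 is selected. For configuration details, see `configs/mot/deepsort/_base_/deepsort_yolov3_darknet53_pcb_pyramid_r101.yml`
59+
- The second way is to load the detection model and the ReID model at the same time. Here, the JDE version of YOLOv3 is selected. For configuration details, see `configs/mot/deepsort/_base_/deepsort_jde_yolov3_darknet53_pcb_pyramid_r101.yml`. To load another general detection model, refer to `configs/mot/deepsort/_base_/deepsort_yolov3_darknet53_pcb_pyramid_r101.yml` and modify it accordingly.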
6060

6161
## Getting Started
6262

@@ -66,42 +66,61 @@ wget https://dataset.bj.bcebos.com/mot/det_results_dir.zip
6666
# Load the detection result file and the ReID model to get the tracking results
6767
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_pcb_pyramid_r101.yml --det_results_dir {your detection results}
6868

69-
# Load the detection model and the ReID model to get the tracking results
70-
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml
69+
# Load the JDE YOLOv3 pedestrian detection model and the ReID model to get the tracking results
70+
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_jde_yolov3_pcb_pyramid_r101.yml
71+
72+
# Or load the general YOLOv3 pedestrian detection model and the ReID model to get the tracking results
73+
CUDA_VISIBLE_DEVICES=0 python tools/eval_mot.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml --scaled=True
7174
```
75+
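**Notes:**
76+
The JDE YOLOv3 pedestrian detection model is trained on the same MOT dataset as JDE and FairMOT. The biggest difference between this model and the general YOLOv3 model is that it uses JDEBBoxPostProcess post-processing, and its output coordinates are not scaled back to the original image.
77+
The general YOLOv3 pedestrian detection model is not trained on the MOT dataset, so its accuracy is lower; its output coordinates are scaled back to the original image.
78+
`--scaled` indicates whether the coordinates in the model output have already been scaled back to the original image: False if the detection model is the JDE YOLOv3, True if a general detection model is used.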
7279

7380
### 2. Inference
7481

7582
Run inference on a video with a single GPU using the following command, and save the result as a video
7683

7784
```bash
78-
# Load the detection model and the ReID model to get the tracking results
79-
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml --video_file={your video name}.mp4 --save_videos
85+
# Load the JDE YOLOv3 pedestrian detection model and the ReID model, and save the result as a video
86+
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/deepsort/deepsort_jde_yolov3_pcb_pyramid_r101.yml --video_file={your video name}.mp4 --save_videos
87+
88+
# Or load the general YOLOv3 pedestrian detection model and the ReID model, and save the result as a video
89+
CUDA_VISIBLE_DEVICES=0 python tools/infer_mot.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml --video_file={your video name}.mp4 --scaled=True --save_videos
8090
```
8191

8292
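**Notes:**
8393
Please make sure that [ffmpeg](https://ffmpeg.org/ffmpeg.html) is installed first. On Linux (Ubuntu), it can be installed directly with the following command: `apt-get update && apt-get install -y ffmpeg`
94+
`--scaled` indicates whether the coordinates in the model output have already been scaled back to the original image: False if the detection model is the JDE YOLOv3, True if a general detection model is used.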
95+
8496

8597
### 3. Export the inference model
8698

8799
```bash
88-
1. First export the detection model
100+
# 1. First export the detection model
101+
# Export the JDE YOLOv3 pedestrian detection model
89102
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/jde_yolov3_darknet53_30e_1088x608.yml -o weights=https://paddledet.bj.bcebos.com/models/mot/jde_yolov3_darknet53_30e_1088x608.pdparams
90103

91-
2. Then export the ReID model
92-
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort_pcb_pyramid_r101.pdparams
104+
# Or export the general YOLOv3 pedestrian detection model
105+
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/pedestrian/pedestrian_yolov3_darknet.yml -o weights=https://paddledet.bj.bcebos.com/models/pedestrian_yolov3_darknet.pdparams
106+
93107

94-
108+
# 2. Then export the ReID model
95109
CUDA_VISIBLE_DEVICES=0 python tools/export_model.py -c configs/mot/deepsort/deepsort_pcb_pyramid_r101.yml -o reid_weights=https://paddledet.bj.bcebos.com/models/mot/deepsort_pcb_pyramid_r101.pdparams
96110
```
97111

98112
### 4. Python inference with the exported model
99113

100114
```bash
101-
python deploy/python/mot_sde_infer.py --model_dir=output_inference/jde_yolov3_darknet53_30e_1088x608/ --reid_model_dir=output_inference/deepsort_yolov3_pcb_pyramid_r101/ --video_file={your video name}.mp4 --device=GPU --save_mot_txts
115+
# Use the exported JDE YOLOv3 pedestrian detection model
116+
python deploy/python/mot_sde_infer.py --model_dir=output_inference/jde_yolov3_darknet53_30e_1088x608/ --reid_model_dir=output_inference/deepsort_pcb_pyramid_r101/ --video_file={your video name}.mp4 --device=GPU --save_mot_txts
117+
118+
# Or use the exported general YOLOv3 pedestrian detection model
119+
python deploy/python/mot_sde_infer.py --model_dir=output_inference/pedestrian_yolov3_darknet/ --reid_model_dir=output_inference/deepsort_pcb_pyramid_r101/ --video_file={your video name}.mp4 --device=GPU --scaled=True --save_mot_txts
102120
```
103121
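**Notes:**
104-
The tracking model runs prediction on videos and does not support single-image prediction. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` to save the tracking results as a txt file, or `--save_images` to save visualization images of the tracking results.
122+
The tracking model runs prediction on videos and does not support single-image prediction. The visualization video of the tracking results is saved by default. You can add `--save_mot_txts` (save one txt per video) or `--save_mot_txt_per_img` (save one txt per image) to save the tracking results as txt files, or `--save_images` to save visualization images of the tracking results.
123+
`--scaled` indicates whether the coordinates in the model output have already been scaled back to the original image: False if the detection model is the JDE YOLOv3, True if a general detection model is used.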
105124

106125
## Citations
107126
```
@@ -0,0 +1,59 @@
1+
architecture: DeepSORT
2+
pretrain_weights: None
3+
4+
DeepSORT:
5+
detector: YOLOv3 # JDE version
6+
reid: PCBPyramid
7+
tracker: DeepSORTTracker
8+
9+
PCBPyramid:
10+
num_conv_out_channels: 128
11+
num_classes: 751
12+
13+
DeepSORTTracker:
14+
budget: 100
15+
max_age: 70
16+
n_init: 3
17+
metric_type: cosine
18+
matching_threshold: 0.2
19+
max_iou_distance: 0.9
20+
motion: KalmanFilter
21+
22+
23+
# JDE version YOLOv3 detector for MOT dataset.
24+
# The most obvious difference is JDEBBoxPostProcess and the bboxes coordinates
25+
# output are not scaled to the original image.
26+
YOLOv3:
27+
backbone: DarkNet
28+
neck: YOLOv3FPN
29+
yolo_head: YOLOv3Head
30+
post_process: JDEBBoxPostProcess
31+
32+
DarkNet:
33+
depth: 53
34+
return_idx: [2, 3, 4]
35+
freeze_norm: True
36+
37+
YOLOv3FPN:
38+
freeze_norm: True
39+
40+
YOLOv3Head:
41+
anchors: [[128,384], [180,540], [256,640], [512,640],
42+
[32,96], [45,135], [64,192], [90,271],
43+
[8,24], [11,34], [16,48], [23,68]]
44+
anchor_masks: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
45+
loss: JDEDetectionLoss
46+
47+
JDEBBoxPostProcess:
48+
decode:
49+
name: JDEBox
50+
conf_thresh: 0.3
51+
downsample_ratio: 32
52+
nms:
53+
name: MultiClassNMS
54+
keep_top_k: 500
55+
score_threshold: 0.01
56+
nms_threshold: 0.5
57+
nms_top_k: 2000
58+
normalized: true
59+
return_idx: false
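
The `YOLOv3Head` block in this config lists 12 anchors and three `anchor_masks` of four indices each. The sketch below shows how such masks pick anchors per output level. It assumes the usual YOLOv3 convention that each mask is a list of indices into `anchors`; `anchors_per_level` is a hypothetical helper written for illustration, not part of PaddleDetection.

```python
# Values copied from the JDE YOLOv3Head config in this diff.
anchors = [[128, 384], [180, 540], [256, 640], [512, 640],
           [32, 96], [45, 135], [64, 192], [90, 271],
           [8, 24], [11, 34], [16, 48], [23, 68]]
anchor_masks = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]


def anchors_per_level(anchors, anchor_masks):
    """Return, for each output level, the anchors selected by its mask."""
    return [[anchors[i] for i in mask] for mask in anchor_masks]


levels = anchors_per_level(anchors, anchor_masks)
for idx, level in enumerate(levels):
    print(f"output level {idx}: {level}")
```

Read this way, the first level gets the four largest anchors and the last level the four smallest, all taller than they are wide.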

configs/mot/deepsort/_base_/deepsort_reader_1088x608.yml

+4
@@ -1,3 +1,7 @@
1+
# DeepSORT does not need to be trained on the MOT dataset; the dataset is only used for evaluation.
2+
# Only the detector (like YOLOv3) needs to be trained on the MOT dataset, using bboxes only.
3+
# The gt IDs are not needed for training.
4+
15
EvalMOTReader:
26
sample_transforms:
37
- Decode: {}

configs/mot/deepsort/_base_/deepsort_yolov3_darknet53_pcb_pyramid_r101.yml

+34-32
@@ -2,55 +2,57 @@ architecture: DeepSORT
22
pretrain_weights: None
33

44
DeepSORT:
5-
detector: YOLOv3 # JDE version
5+
detector: YOLOv3 # General version
66
reid: PCBPyramid
77
tracker: DeepSORTTracker
88

9-
# JDE version for MOT dataset
9+
PCBPyramid:
10+
num_conv_out_channels: 128
11+
num_classes: 751
12+
13+
DeepSORTTracker:
14+
budget: 100
15+
max_age: 70
16+
n_init: 3
17+
metric_type: cosine
18+
matching_threshold: 0.2
19+
max_iou_distance: 0.9
20+
motion: KalmanFilter
21+
22+
23+
# General version YOLOv3
24+
# Using BBoxPostProcess and the bboxes output are scaled to the original image.
1025
YOLOv3:
1126
backbone: DarkNet
1227
neck: YOLOv3FPN
1328
yolo_head: YOLOv3Head
14-
post_process: JDEBBoxPostProcess
29+
post_process: BBoxPostProcess
30+
31+
norm_type: sync_bn
1532

1633
DarkNet:
1734
depth: 53
1835
return_idx: [2, 3, 4]
19-
freeze_norm: True
2036

21-
YOLOv3FPN:
22-
freeze_norm: True
37+
# use default config
38+
# YOLOv3FPN:
2339

2440
YOLOv3Head:
25-
anchors: [[128,384], [180,540], [256,640], [512,640],
26-
[32,96], [45,135], [64,192], [90,271],
27-
[8,24], [11,34], [16,48], [23,68]]
28-
anchor_masks: [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
29-
loss: JDEDetectionLoss
41+
anchors: [[10, 13], [16, 30], [33, 23],
42+
[30, 61], [62, 45], [59, 119],
43+
[116, 90], [156, 198], [373, 326]]
44+
anchor_masks: [[6, 7, 8], [3, 4, 5], [0, 1, 2]]
45+
loss: YOLOv3Loss
3046

31-
JDEBBoxPostProcess:
47+
BBoxPostProcess:
3248
decode:
33-
name: JDEBox
34-
conf_thresh: 0.3
49+
name: YOLOBox
50+
conf_thresh: 0.005
3551
downsample_ratio: 32
52+
clip_bbox: true
3653
nms:
3754
name: MultiClassNMS
38-
keep_top_k: 500
55+
keep_top_k: 100
3956
score_threshold: 0.01
40-
nms_threshold: 0.5
41-
nms_top_k: 2000
42-
normalized: true
43-
return_idx: false
44-
45-
PCBPyramid:
46-
num_conv_out_channels: 128
47-
num_classes: 751
48-
49-
DeepSORTTracker:
50-
budget: 100
51-
max_age: 70
52-
n_init: 3
53-
metric_type: cosine
54-
matching_threshold: 0.2
55-
max_iou_distance: 0.9
56-
motion: KalmanFilter
57+
nms_threshold: 0.45
58+
nms_top_k: 1000
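
The `nms` block for the general detector sets `score_threshold`, `nms_threshold`, `nms_top_k`, and `keep_top_k`. To show how parameters like these typically interact, here is a minimal single-class greedy NMS sketch. It is an illustration only, not Paddle's `MultiClassNMS` implementation, whose per-class handling and exact comparison rules may differ. The defaults mirror the values in this config.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_threshold=0.01, nms_threshold=0.45,
               nms_top_k=1000, keep_top_k=100):
    """Return indices of kept boxes, highest score first."""
    # 1. Drop low-score boxes, then keep at most nms_top_k candidates.
    candidates = [i for i, s in enumerate(scores) if s > score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    candidates = candidates[:nms_top_k]

    # 2. Keep a box only if it does not overlap a kept box too much.
    kept = []
    for i in candidates:
        if all(iou(boxes[i], boxes[j]) <= nms_threshold for j in kept):
            kept.append(i)

    # 3. Cap the final number of detections.
    return kept[:keep_top_k]
```

For example, with boxes `[[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]` and scores `[0.9, 0.8, 0.7]`, the second box overlaps the first with an IoU of about 0.68, so it is suppressed and the first and third boxes are kept.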
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,29 @@
1+
_BASE_: [
2+
'../../datasets/mot.yml',
3+
'../../runtime.yml',
4+
'_base_/deepsort_jde_yolov3_darknet53_pcb_pyramid_r101.yml',
5+
'_base_/deepsort_reader_1088x608.yml',
6+
]
7+
8+
EvalMOTDataset:
9+
!MOTImageFolder
10+
dataset_dir: dataset/mot
11+
data_root: MOT16/images/train
12+
keep_ori_im: True # set as True in DeepSORT
13+
14+
det_weights: https://paddledet.bj.bcebos.com/models/mot/jde_yolov3_darknet53_30e_1088x608.pdparams
15+
reid_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort_pcb_pyramid_r101.pdparams
16+
17+
DeepSORT:
18+
detector: YOLOv3
19+
reid: PCBPyramid
20+
tracker: DeepSORTTracker
21+
22+
# JDE version YOLOv3 detector for MOT dataset.
23+
# The most obvious difference is JDEBBoxPostProcess and the bboxes coordinates
24+
# output are not scaled to the original image.
25+
YOLOv3:
26+
backbone: DarkNet
27+
neck: YOLOv3FPN
28+
yolo_head: YOLOv3Head
29+
post_process: JDEBBoxPostProcess

configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml

+4-3
@@ -11,17 +11,18 @@ EvalMOTDataset:
1111
data_root: MOT16/images/train
1212
keep_ori_im: True # set as True in DeepSORT
1313

14-
det_weights: https://paddledet.bj.bcebos.com/models/mot/jde_yolov3_darknet53_30e_1088x608.pdparams
14+
det_weights: https://paddledet.bj.bcebos.com/models/pedestrian_yolov3_darknet.pdparams
1515
reid_weights: https://paddledet.bj.bcebos.com/models/mot/deepsort_pcb_pyramid_r101.pdparams
1616

1717
DeepSORT:
1818
detector: YOLOv3
1919
reid: PCBPyramid
2020
tracker: DeepSORTTracker
2121

22-
# JDE version for MOT dataset
22+
# General version YOLOv3
23+
# Using BBoxPostProcess and the bboxes output are scaled to the original image.
2324
YOLOv3:
2425
backbone: DarkNet
2526
neck: YOLOv3FPN
2627
yolo_head: YOLOv3Head
27-
post_process: JDEBBoxPostProcess
28+
post_process: BBoxPostProcess
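
The README notes in this commit tie `--scaled` to the detector's post-processing: False for `JDEBBoxPostProcess`, True for `BBoxPostProcess`. The sketch below encodes that rule when building an evaluation command. `scaled_flag` and `eval_command` are hypothetical helpers written for illustration; only the script path, the `-c` option, and `--scaled=True` come from the README commands.

```python
# Mapping taken from the README notes in this commit.
SCALED_BY_POST_PROCESS = {
    "JDEBBoxPostProcess": False,  # coordinates not scaled back to the original image
    "BBoxPostProcess": True,      # coordinates already scaled back
}


def scaled_flag(post_process):
    """Return the --scaled value implied by a post_process name."""
    try:
        return SCALED_BY_POST_PROCESS[post_process]
    except KeyError:
        raise ValueError(f"unknown post_process: {post_process!r}") from None


def eval_command(config_path, post_process):
    """Build the eval_mot.py argument list for a given config."""
    cmd = ["python", "tools/eval_mot.py", "-c", config_path]
    if scaled_flag(post_process):
        cmd.append("--scaled=True")
    return cmd


print(eval_command("configs/mot/deepsort/deepsort_yolov3_pcb_pyramid_r101.yml",
                   "BBoxPostProcess"))
```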
