Commit 8ac704f ("update"), parent e6bbbdd

1 file changed: README.md (+176, -0)
# SparseTrack

#### SparseTrack is a simple and strong multi-object tracker.

[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/sparsetrack-multi-object-tracking-by/multi-object-tracking-on-mot20-1)](https://paperswithcode.com/sota/multi-object-tracking-on-mot20-1?p=sparsetrack-multi-object-tracking-by)
[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/sparsetrack-multi-object-tracking-by/multi-object-tracking-on-mot17)](https://paperswithcode.com/sota/multi-object-tracking-on-mot17?p=sparsetrack-multi-object-tracking-by)
> [**SparseTrack: Multi-Object Tracking by Performing Scene Decomposition based on Pseudo-Depth**](https://arxiv.org/abs/2306.05238)
>
> Zelin Liu, Xinggang Wang, Cheng Wang, Wenyu Liu, Xiang Bai
>
> *[arXiv 2306.05238](https://arxiv.org/abs/2306.05238)*

## Abstract

Exploring robust and efficient association methods has always been an important issue in multiple-object tracking (MOT). Although existing tracking methods have achieved impressive performance, congestion and frequent occlusions still pose challenging problems in multi-object tracking. We reveal that performing sparse decomposition on dense scenes is a crucial step to enhance the performance of associating occluded targets. To this end, we first propose a pseudo-depth estimation method for obtaining the relative depth of targets from 2D images. Secondly, we design a depth cascading matching (DCM) algorithm, which uses the obtained depth information to convert a dense target set into multiple sparse target subsets and performs data association on these sparse subsets in order from near to far. By integrating the pseudo-depth method and the DCM strategy into the data association process, we propose a new tracker, called SparseTrack. SparseTrack provides a new perspective for solving the challenging problem of MOT in crowded scenes. Using only IoU matching, SparseTrack achieves performance comparable to state-of-the-art (SOTA) methods on the MOT17 and MOT20 benchmarks.

<p align="center"><img src="assets/DCM.png" width="500"/></p>
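For intuition, the following is a minimal sketch of the pseudo-depth and depth cascading matching (DCM) idea described above. It is an illustration only, not the repository's implementation: the pseudo-depth proxy (image height minus the box bottom edge), the fixed number of depth levels, and the greedy IoU matcher are all simplifying assumptions.

```python
# Minimal sketch of pseudo-depth + depth-cascaded IoU matching (illustration only).
# The depth proxy, level count and greedy matcher are assumptions, not the repo's code.
import numpy as np

def pseudo_depth(boxes, img_h):
    """boxes: (N, 4) array of [x1, y1, x2, y2]; smaller value = closer to the camera."""
    return img_h - boxes[:, 3]

def iou_matrix(a, b):
    """Pairwise IoU between box arrays a (N, 4) and b (M, 4)."""
    tl = np.maximum(a[:, None, :2], b[None, :, :2])
    br = np.minimum(a[:, None, 2:], b[None, :, 2:])
    wh = np.clip(br - tl, 0, None)
    inter = wh[..., 0] * wh[..., 1]
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a[:, None] + area_b[None, :] - inter + 1e-9)

def greedy_iou_match(t_idx, d_idx, tracks, dets, thr):
    """Greedily pair the highest-IoU (track, detection) combinations above thr."""
    matches = []
    if len(t_idx) and len(d_idx):
        ious = iou_matrix(tracks[t_idx], dets[d_idx])
        while ious.max() > thr:
            i, j = np.unravel_index(ious.argmax(), ious.shape)
            matches.append((t_idx[i], d_idx[j]))
            ious[i, :] = -1.0
            ious[:, j] = -1.0
    used_t = {m[0] for m in matches}
    used_d = {m[1] for m in matches}
    return matches, [t for t in t_idx if t not in used_t], [d for d in d_idx if d not in used_d]

def depth_cascade_match(tracks, dets, img_h, n_levels=3, iou_thr=0.3):
    """Split tracks/detections into pseudo-depth subsets and associate them from
    near to far; unmatched candidates fall through to the next (farther) level."""
    t_lvl = np.clip((n_levels * pseudo_depth(tracks, img_h) / img_h).astype(int), 0, n_levels - 1)
    d_lvl = np.clip((n_levels * pseudo_depth(dets, img_h) / img_h).astype(int), 0, n_levels - 1)
    matches, carry_t, carry_d = [], [], []
    for lvl in range(n_levels):  # level 0 = nearest targets
        t_idx = carry_t + list(np.where(t_lvl == lvl)[0])
        d_idx = carry_d + list(np.where(d_lvl == lvl)[0])
        found, carry_t, carry_d = greedy_iou_match(t_idx, d_idx, tracks, dets, iou_thr)
        matches += found
    return matches, carry_t, carry_d  # matches + leftover track / detection indices
```

In the actual tracker, this cascade operates on predicted track boxes inside a ByteTrack-style association pipeline; the sketch only illustrates the subset-by-subset, near-to-far IoU matching.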

## Tracking performance

### Results on MOT challenge test set

| Dataset | HOTA | MOTA | IDF1 | MT | ML | FP | FN | IDs |
|------------|-------|-------|------|------|-------|-------|------|------|
|MOT17       | 65.1 | 81.0 | 80.1 | 54.6% | 14.3% | 23904 | 81927 | 1170 |
|MOT20       | 63.4 | 78.2 | 77.3 | 69.9% | 9.2% | 25108 | 86720 | 1116 |

### Comparison on DanceTrack test set
| Method | HOTA | DetA | AssA | MOTA | IDF1 |
|------------|-------|-------|------|------|-------|
| SparseTrack | 55.5 (**+7.8**) | 78.9 (**+7.9**) | 39.1 (**+7.0**) | 91.3 (**+1.7**) | 58.3 (**+4.4**) |
| ByteTrack | 47.7 | 71.0 | 32.1 | 89.6 | 53.9 |

**Notes**:
- All inference experiments are performed on a single NVIDIA GeForce RTX 3090 GPU.
- Each experiment uses the **same detector and model weights** as [ByteTrack](https://github.com/ifzhang/ByteTrack).
- SparseTrack relies only on IoU-distance association and does not use any appearance embedding, learnable motion, or attention components.

## Installation

#### Dependencies
This project is implemented on top of [Detectron2](https://github.com/facebookresearch/detectron2) and requires compiled [OpenCV](https://opencv.org/) and [Boost](https://www.boost.org).

#### Compile the GMC (Global Motion Compensation) module
>step 1: Download [pbcvt](https://github.com/Algomorph/pyboostcvconverter) and copy [python_module.cpp](https://github.com/hustvl/SparseTrack/blob/main/python_module.cpp) into the **<[pbcvt](https://github.com/Algomorph/pyboostcvconverter)/src/>** directory.
>
>step 2: Add the required OpenCV modules to the pbcvt/CMakeLists.txt file: locate the line "find_package(OpenCV COMPONENTS REQUIRED)" and replace it with "find_package(OpenCV COMPONENTS core highgui video videoio videostab REQUIRED)".
>
>step 3: Before compiling pbcvt, update the build paths in the Makefile, mainly the entries CMAKE_SOURCE_DIR, CMAKE_BINARY_DIR, and cmake_progress_start.
>
>step 4: Compile [pbcvt](https://github.com/Algomorph/pyboostcvconverter).
>
>step 5: Copy the compiled "pbcvt.xxxxxx.so" file to the **<ROOT/SparseTrack/tracker/>** directory.

#### Install
```shell
git clone https://github.com/hustvl/SparseTrack.git
cd SparseTrack
pip install -r requirements.txt
pip install Cython
pip install cython_bbox
```

## Data preparation

Download [MOT17](https://motchallenge.net/), [MOT20](https://motchallenge.net/), [CrowdHuman](https://www.crowdhuman.org/), [Cityperson](https://github.com/Zhongdao/Towards-Realtime-MOT/blob/master/DATASET_ZOO.md), [ETHZ](https://github.com/Zhongdao/Towards-Realtime-MOT/blob/master/DATASET_ZOO.md) and put them under ROOT/ in the following structure:
```
ROOT
   |
   |——————SparseTrack(repo)
   |           └—————mix
   |                 └——————mix_17/annotations
   |                 └——————mix_20/annotations
   |                 └——————ablation_17/annotations
   |                 └——————ablation_20/annotations
   |——————MOT17
   |        └——————train
   |        └——————test
   └——————crowdhuman
   |        └——————Crowdhuman_train
   |        └——————Crowdhuman_val
   |        └——————annotation_train.odgt
   |        └——————annotation_val.odgt
   └——————MOT20
   |        └——————train
   |        └——————test
   └——————Citypersons
   |        └——————images
   |        └——————labels_with_ids
   └——————ETHZ
            └——————eth01
            └——————...
            └——————eth07
```

Then, convert the datasets to COCO format and mix the different training data:
```
cd <ROOT>/SparseTrack
python3 tools/convert_mot17_to_coco.py
python3 tools/convert_mot20_to_coco.py
python3 tools/convert_crowdhuman_to_coco.py
python3 tools/convert_cityperson_to_coco.py
python3 tools/convert_ethz_to_coco.py
```

Create the different training mix_data:
```
cd <ROOT>/SparseTrack

# training on CrowdHuman and MOT17 half train, evaluate on MOT17 half val.
python3 tools/mix_data_ablation.py

# training on CrowdHuman and MOT20 half train, evaluate on MOT20 half val.
python3 tools/mix_data_ablation_20.py

# training on MOT17, CrowdHuman, ETHZ, Citypersons, evaluate on MOT17 train.
python3 tools/mix_data_test_mot17.py

# training on MOT20 and CrowdHuman, evaluate on MOT20 train.
python3 tools/mix_data_test_mot20.py
```

## Model zoo

See the [ByteTrack model zoo](https://github.com/ifzhang/ByteTrack#model-zoo). We use the publicly available YOLOX detection weights from ByteTrack trained for MOT17, MOT20 and the ablation study.

Additionally, we conducted joint training on the MOT20 train half and CrowdHuman, and evaluated on the MOT20 val half. The resulting model is available at [yolox_x_mot20_ablation](https://drive.google.com/file/d/1F2XwyYKj1kefLPUFRHxgnpaAmEwyoocw/view?usp=drive_link).

The model trained on DanceTrack is available at [yolox_x_dancetrack](https://drive.google.com/drive/folders/1-uxcNTi7dhuDNGC5MmzXyllLzmVbzXay?usp=sharing).

## Training
All training is conducted through a unified script. You need to change **VAL_JSON** and **VAL_PATH** in [register_data.py](https://github.com/hustvl/SparseTrack/blob/main/register_data.py) (a hypothetical sketch of this edit follows the commands below), and then run:
```
# training on MOT17, CrowdHuman, ETHZ, Citypersons, evaluate on MOT17 train set.
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --num-gpus 4 --config-file mot17_train_config.py

# training on MOT20, CrowdHuman, evaluate on MOT20 train set.
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --num-gpus 4 --config-file mot20_train_config.py
```
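The **VAL_JSON** / **VAL_PATH** edit mentioned above amounts to pointing the registered validation split at your local annotation file and image root. A purely hypothetical example (the actual variable layout and file names in register_data.py may differ):

```python
# Hypothetical illustration only; check register_data.py for the real structure.
VAL_JSON = "mot17_train_half.json"   # assumed annotation file used for evaluation
VAL_PATH = "datasets/MOT17/train"    # assumed image root matching that annotation
```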

**Notes**:
For MOT20, you need to clip the bounding boxes so that they lie inside the image.

Add the clip operation at lines 138-139 in [data_augment.py](https://github.com/hustvl/SparseTrack/blob/main/datasets/data/data_augment.py), lines 118-121 and 213-221 in [mosaicdetection.py](https://github.com/hustvl/SparseTrack/blob/main/datasets/data/datasets/mosaicdetection.py), and lines 115-118 in [boxes.py](https://github.com/hustvl/SparseTrack/blob/main/utils/boxes.py).
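The exact change depends on the code at the cited lines; as a rough illustration only, clipping [x1, y1, x2, y2] boxes to the image bounds typically looks like the snippet below (the function and variable names are ours, not the repository's).

```python
import numpy as np

def clip_boxes(boxes, img_w, img_h):
    """Clamp [x1, y1, x2, y2] boxes so they lie inside an img_w x img_h image."""
    boxes = boxes.copy()
    boxes[:, 0::2] = np.clip(boxes[:, 0::2], 0, img_w - 1)  # x1, x2
    boxes[:, 1::2] = np.clip(boxes[:, 1::2], 0, img_h - 1)  # y1, y2
    return boxes
```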

## Tracking

All tracking experiments are run in the following manner. First place the model weights in **<ROOT/SparseTrack/pretrain/>**, and change **VAL_JSON** and **VAL_PATH** in [register_data.py](https://github.com/hustvl/SparseTrack/blob/main/register_data.py).
```
# tracking on mot17 train set or test set
CUDA_VISIBLE_DEVICES=0 python3 track.py --num-gpus 1 --config-file mot17_track_cfg.py

# tracking on mot20 train set or test set
CUDA_VISIBLE_DEVICES=0 python3 track.py --num-gpus 1 --config-file mot20_track_cfg.py

# tracking on mot17 val_half set
CUDA_VISIBLE_DEVICES=0 python3 track.py --num-gpus 1 --config-file mot17_ab_track_cfg.py

# tracking on mot20 val_half set
CUDA_VISIBLE_DEVICES=0 python3 track.py --num-gpus 1 --config-file mot20_ab_track_cfg.py
```

## Citation

If you find SparseTrack useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.
```bibtex
@article{SparseTrack,
  title={SparseTrack: Multi-Object Tracking by Performing Scene Decomposition based on Pseudo-Depth},
  author={Liu, Zelin and Wang, Xinggang and Wang, Cheng and Liu, Wenyu and Bai, Xiang},
  journal={arXiv preprint arXiv:2306.05238},
  year={2023}
}
```

## Acknowledgements

A large part of the code is borrowed from [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX), [FairMOT](https://github.com/ifzhang/FairMOT), [ByteTrack](https://github.com/ifzhang/ByteTrack), [BoT-SORT](https://github.com/NirAharon/BOT-SORT) and [Detectron2](https://github.com/facebookresearch/detectron2). Many thanks for their wonderful work.