
Commit 4bdb312

[Docs] Normalize config README.md. (open-mmlab#7051)
* regularize README.md
* Check README.md. Check backbone metafile
* Check backbone metafile
* Add metafile for some algorithm
* Change pre-commit-hooks to open-mmlab
* Add new metafile in model-index.yml

Co-authored-by: Zaida Zhou <[email protected]>
1 parent 25a95d2 commit 4bdb312

93 files changed (+2646, -1802 lines changed)


configs/albu_example/README.md (+10, -14)

@@ -1,24 +1,26 @@
 # Albu Example
 
-## Abstract
+> [Albumentations: fast and flexible image augmentations](https://arxiv.org/abs/1809.06839)
 
-<!-- [ABSTRACT] -->
+<!-- [OTHERS] -->
+
+## Abstract
 
 Data augmentation is a commonly used technique for increasing both the size and the diversity of labeled training sets by leveraging input transformations that preserve output labels. In computer vision domain, image augmentations have become a common implicit regularization technique to combat overfitting in deep convolutional neural networks and are ubiquitously used to improve performance. While most deep learning frameworks implement basic image transformations, the list is typically limited to some variations and combinations of flipping, rotating, scaling, and cropping. Moreover, the image processing speed varies in existing tools for image augmentation. We present Albumentations, a fast and flexible library for image augmentations with many various image transform operations available, that is also an easy-to-use wrapper around other augmentation libraries. We provide examples of image augmentations for different computer vision tasks and show that Albumentations is faster than other commonly used image augmentation tools on the most of commonly used image transformations.
 
-<!-- [IMAGE] -->
 <div align=center>
 <img src="https://user-images.githubusercontent.com/40661020/143870703-74f3ea3f-ae23-4035-9856-746bc3f88464.png" height="400" />
 </div>
 
-<!-- [PAPER_TITLE: Albumentations: fast and flexible image augmentations] -->
-<!-- [PAPER_URL: https://arxiv.org/abs/1809.06839] -->
+## Results and Models
 
-## Citation
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
+|:---------:|:-------:|:-------:|:--------:|:--------------:|:------:|:-------:|:------:|:--------:|
+| R-50 | pytorch | 1x | 4.4 | 16.6 | 38.0 | 34.5 |[config](https://github.com/open-mmlab/mmdetection/tree/master/configs/albu_example/mask_rcnn_r50_fpn_albu_1x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/albu_example/mask_rcnn_r50_fpn_albu_1x_coco/mask_rcnn_r50_fpn_albu_1x_coco_20200208-ab203bcd.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/albu_example/mask_rcnn_r50_fpn_albu_1x_coco/mask_rcnn_r50_fpn_albu_1x_coco_20200208_225520.log.json) |
 
-<!-- [OTHERS] -->
+## Citation
 
-```
+```latex
 @article{2018arXiv180906839B,
 author = {A. Buslaev, A. Parinov, E. Khvedchenya, V.~I. Iglovikov and A.~A. Kalinin},
 title = "{Albumentations: fast and flexible image augmentations}",
@@ -27,9 +29,3 @@ Data augmentation is a commonly used technique for increasing both the size and
 year = 2018
 }
 ```
-
-## Results and Models
-
-| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
-|:---------:|:-------:|:-------:|:--------:|:--------------:|:------:|:-------:|:------:|:--------:|
-| R-50 | pytorch | 1x | 4.4 | 16.6 | 38.0 | 34.5 |[config](https://github.com/open-mmlab/mmdetection/tree/master/configs/albu_example/mask_rcnn_r50_fpn_albu_1x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/albu_example/mask_rcnn_r50_fpn_albu_1x_coco/mask_rcnn_r50_fpn_albu_1x_coco_20200208-ab203bcd.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/albu_example/mask_rcnn_r50_fpn_albu_1x_coco/mask_rcnn_r50_fpn_albu_1x_coco_20200208_225520.log.json) |
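
As a point of reference for this README change (not part of the commit), the linked config integrates Albumentations through MMDetection's `Albu` pipeline wrapper. Below is a minimal sketch of that wiring; the transform values are illustrative rather than copied from `mask_rcnn_r50_fpn_albu_1x_coco.py`, so consult the config for the exact settings.

```python
# Trimmed, illustrative fragment: Albumentations transforms are listed once and
# then wrapped by an 'Albu' step that is inserted into train_pipeline. Values
# here are examples, not the exact settings of the linked config.
albu_train_transforms = [
    dict(type='ShiftScaleRotate', shift_limit=0.0625, scale_limit=0.0,
         rotate_limit=0, interpolation=1, p=0.5),
    dict(type='RandomBrightnessContrast', brightness_limit=[0.1, 0.3],
         contrast_limit=[0.1, 0.3], p=0.2),
]

albu_step = dict(
    type='Albu',
    transforms=albu_train_transforms,
    bbox_params=dict(
        type='BboxParams',
        format='pascal_voc',
        label_fields=['gt_labels'],
        min_visibility=0.0,
        filter_lost_elements=True),
    # map MMDetection result keys to the names Albumentations expects
    keymap=dict(img='image', gt_bboxes='bboxes', gt_masks='masks'),
    skip_img_without_anno=True)
```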

configs/atss/README.md (+11, -15)

@@ -1,22 +1,25 @@
-# Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection
+# ATSS
 
-## Abstract
+> [Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection](https://arxiv.org/abs/1912.02424)
 
-<!-- [ABSTRACT] -->
+<!-- [ALGORITHM] -->
+
+## Abstract
 
 Object detection has been dominated by anchor-based detectors for several years. Recently, anchor-free detectors have become popular due to the proposal of FPN and Focal Loss. In this paper, we first point out that the essential difference between anchor-based and anchor-free detection is actually how to define positive and negative training samples, which leads to the performance gap between them. If they adopt the same definition of positive and negative samples during training, there is no obvious difference in the final performance, no matter regressing from a box or a point. This shows that how to select positive and negative training samples is important for current object detectors. Then, we propose an Adaptive Training Sample Selection (ATSS) to automatically select positive and negative samples according to statistical characteristics of object. It significantly improves the performance of anchor-based and anchor-free detectors and bridges the gap between them. Finally, we discuss the necessity of tiling multiple anchors per location on the image to detect objects. Extensive experiments conducted on MS COCO support our aforementioned analysis and conclusions. With the newly introduced ATSS, we improve state-of-the-art detectors by a large margin to 50.7% AP without introducing any overhead.
 
-<!-- [IMAGE] -->
 <div align=center>
 <img src="https://user-images.githubusercontent.com/40661020/143870776-c81168f5-e8b2-44ee-978b-509e4372c5c9.png"/>
 </div>
 
-<!-- [PAPER_TITLE: Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection] -->
-<!-- [PAPER_URL: https://arxiv.org/abs/1912.02424] -->
+## Results and Models
 
-## Citation
+| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download |
+|:---------:|:-------:|:-------:|:--------:|:--------------:|:------:|:------:|:--------:|
+| R-50 | pytorch | 1x | 3.7 | 19.7 | 39.4 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/atss/atss_r50_fpn_1x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r50_fpn_1x_coco/atss_r50_fpn_1x_coco_20200209-985f7bd0.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r50_fpn_1x_coco/atss_r50_fpn_1x_coco_20200209_102539.log.json) |
+| R-101 | pytorch | 1x | 5.6 | 12.3 | 41.5 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/atss/atss_r101_fpn_1x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r101_fpn_1x_coco/atss_r101_fpn_1x_20200825-dfcadd6f.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r101_fpn_1x_coco/atss_r101_fpn_1x_20200825-dfcadd6f.log.json) |
 
-<!-- [ALGORITHM] -->
+## Citation
 
 ```latex
 @article{zhang2019bridging,
@@ -26,10 +29,3 @@ Object detection has been dominated by anchor-based detectors for several years.
 year = {2019}
 }
 ```
-
-## Results and Models
-
-| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download |
-|:---------:|:-------:|:-------:|:--------:|:--------------:|:------:|:------:|:--------:|
-| R-50 | pytorch | 1x | 3.7 | 19.7 | 39.4 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/atss/atss_r50_fpn_1x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r50_fpn_1x_coco/atss_r50_fpn_1x_coco_20200209-985f7bd0.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r50_fpn_1x_coco/atss_r50_fpn_1x_coco_20200209_102539.log.json) |
-| R-101 | pytorch | 1x | 5.6 | 12.3 | 41.5 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/atss/atss_r101_fpn_1x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r101_fpn_1x_coco/atss_r101_fpn_1x_20200825-dfcadd6f.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/atss/atss_r101_fpn_1x_coco/atss_r101_fpn_1x_20200825-dfcadd6f.log.json) |
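
For readers skimming this README change, the selection rule the ATSS paper describes can be sketched in a few lines of PyTorch (not part of the commit, and not the MMDetection implementation): per pyramid level, take the k anchors closest to the ground-truth center, set the IoU threshold to the mean plus standard deviation of those candidates' IoUs, and keep candidates above it whose centers fall inside the box. The sketch below assumes `(x1, y1, x2, y2)` boxes and uses `torchvision.ops.box_iou`; the function name is illustrative.

```python
import torch
from torchvision.ops import box_iou


def atss_positives(anchors, gt_box, level_idx, topk=9):
    """Toy sketch of the ATSS rule for a single ground-truth box.

    anchors:   (N, 4) anchor boxes in (x1, y1, x2, y2)
    gt_box:    (4,) ground-truth box
    level_idx: (N,) FPN level index of each anchor
    topk:      candidates kept per level (k in the paper)
    """
    cx = (anchors[:, 0] + anchors[:, 2]) / 2
    cy = (anchors[:, 1] + anchors[:, 3]) / 2
    gcx = (gt_box[0] + gt_box[2]) / 2
    gcy = (gt_box[1] + gt_box[3]) / 2

    # 1. On each pyramid level, keep the k anchors whose centers are closest
    #    to the ground-truth center.
    dist = ((cx - gcx) ** 2 + (cy - gcy) ** 2).sqrt()
    candidates = []
    for lvl in level_idx.unique():
        idx = (level_idx == lvl).nonzero(as_tuple=True)[0]
        k = min(topk, idx.numel())
        candidates.append(idx[dist[idx].topk(k, largest=False).indices])
    candidates = torch.cat(candidates)

    # 2. Adaptive threshold = mean + std of the candidates' IoUs with the gt.
    ious = box_iou(anchors[candidates], gt_box[None]).squeeze(1)
    thr = ious.mean() + ious.std()

    # 3. Positives: IoU above the threshold and anchor center inside the gt box.
    inside = ((cx[candidates] > gt_box[0]) & (cx[candidates] < gt_box[2]) &
              (cy[candidates] > gt_box[1]) & (cy[candidates] < gt_box[3]))
    return candidates[(ious >= thr) & inside]
```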

configs/autoassign/README.md (+16, -20)

@@ -1,32 +1,17 @@
-# AutoAssign: Differentiable Label Assignment for Dense Object Detection
+# AutoAssign
 
-## Abstract
+> [AutoAssign: Differentiable Label Assignment for Dense Object Detection](https://arxiv.org/abs/2007.03496)
+
+<!-- [ALGORITHM] -->
 
-<!-- [ABSTRACT] -->
+## Abstract
 
 Determining positive/negative samples for object detection is known as label assignment. Here we present an anchor-free detector named AutoAssign. It requires little human knowledge and achieves appearance-aware through a fully differentiable weighting mechanism. During training, to both satisfy the prior distribution of data and adapt to category characteristics, we present Center Weighting to adjust the category-specific prior distributions. To adapt to object appearances, Confidence Weighting is proposed to adjust the specific assign strategy of each instance. The two weighting modules are then combined to generate positive and negative weights to adjust each location's confidence. Extensive experiments on the MS COCO show that our method steadily surpasses other best sampling strategies by large margins with various backbones. Moreover, our best model achieves 52.1% AP, outperforming all existing one-stage detectors. Besides, experiments on other datasets, e.g., PASCAL VOC, Objects365, and WiderFace, demonstrate the broad applicability of AutoAssign.
 
-<!-- [IMAGE] -->
 <div align=center>
 <img src="https://user-images.githubusercontent.com/40661020/143870875-33567e44-0584-4470-9a90-0df0fb6c1fe2.png"/>
 </div>
 
-<!-- [PAPER_TITLE: AutoAssign: Differentiable Label Assignment for Dense Object Detection] -->
-<!-- [PAPER_URL: https://arxiv.org/abs/2007.03496] -->
-
-## Citation
-
-<!-- [ALGORITHM] -->
-
-```
-@article{zhu2020autoassign,
-title={AutoAssign: Differentiable Label Assignment for Dense Object Detection},
-author={Zhu, Benjin and Wang, Jianfeng and Jiang, Zhengkai and Zong, Fuhang and Liu, Songtao and Li, Zeming and Sun, Jian},
-journal={arXiv preprint arXiv:2007.03496},
-year={2020}
-}
-```
-
 ## Results and Models
 
 | Backbone | Style | Lr schd | Mem (GB) | box AP | Config | Download |
@@ -37,3 +22,14 @@ Determining positive/negative samples for object detection is known as label ass
 
 1. We find that the performance is unstable with 1x setting and may fluctuate by about 0.3 mAP. mAP 40.3 ~ 40.6 is acceptable. Such fluctuation can also be found in the original implementation.
 2. You can get a more stable results ~ mAP 40.6 with a schedule total 13 epoch, and learning rate is divided by 10 at 10th and 13th epoch.
+
+## Citation
+
+```latex
+@article{zhu2020autoassign,
+title={AutoAssign: Differentiable Label Assignment for Dense Object Detection},
+author={Zhu, Benjin and Wang, Jianfeng and Jiang, Zhengkai and Zong, Fuhang and Liu, Songtao and Li, Zeming and Sun, Jian},
+journal={arXiv preprint arXiv:2007.03496},
+year={2020}
+}
+```
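
As a rough illustration of the Center Weighting mentioned in the abstract (not part of the commit, and simplified relative to the paper), the category-specific prior can be pictured as a Gaussian-shaped weight over locations near a learned, shifted object center. The function name and exact parameterization below are assumptions made for this sketch.

```python
import torch


def center_weighting(points, gt_box, mu, sigma):
    """Toy sketch of a Center Weighting prior for one ground-truth box.

    points:    (N, 2) feature-map locations (x, y) inside gt_box
    gt_box:    (4,) box in (x1, y1, x2, y2)
    mu, sigma: (2,) learnable, category-specific offset and spread
    """
    center = torch.stack([(gt_box[0] + gt_box[2]) / 2,
                          (gt_box[1] + gt_box[3]) / 2])
    # Gaussian-shaped prior over the offset from the (learned, shifted) center:
    # locations near that center receive larger positive weights.
    d = (points - center - mu) / sigma
    return torch.exp(-(d ** 2).sum(dim=-1) / 2)
```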

configs/carafe/README.md (+19, -23)

@@ -1,35 +1,17 @@
-# CARAFE: Content-Aware ReAssembly of FEatures
+# CARAFE
 
-## Abstract
+> [CARAFE: Content-Aware ReAssembly of FEatures](https://arxiv.org/abs/1905.02188)
 
-<!-- [ABSTRACT] -->
+<!-- [ALGORITHM] -->
+
+## Abstract
 
 Feature upsampling is a key operation in a number of modern convolutional network architectures, e.g. feature pyramids. Its design is critical for dense prediction tasks such as object detection and semantic/instance segmentation. In this work, we propose Content-Aware ReAssembly of FEatures (CARAFE), a universal, lightweight and highly effective operator to fulfill this goal. CARAFE has several appealing properties: (1) Large field of view. Unlike previous works (e.g. bilinear interpolation) that only exploit sub-pixel neighborhood, CARAFE can aggregate contextual information within a large receptive field. (2) Content-aware handling. Instead of using a fixed kernel for all samples (e.g. deconvolution), CARAFE enables instance-specific content-aware handling, which generates adaptive kernels on-the-fly. (3) Lightweight and fast to compute. CARAFE introduces little computational overhead and can be readily integrated into modern network architectures. We conduct comprehensive evaluations on standard benchmarks in object detection, instance/semantic segmentation and inpainting. CARAFE shows consistent and substantial gains across all the tasks (1.2%, 1.3%, 1.8%, 1.1db respectively) with negligible computational overhead. It has great potential to serve as a strong building block for future research. It has great potential to serve as a strong building block for future research.
 
-<!-- [IMAGE] -->
 <div align=center>
 <img src="https://user-images.githubusercontent.com/40661020/143872016-48225685-0e59-49cf-bd65-a50ee04ca8a2.png"/>
 </div>
 
-<!-- [PAPER_TITLE: CARAFE: Content-Aware ReAssembly of FEatures] -->
-<!-- [PAPER_URL: https://arxiv.org/abs/1905.02188] -->
-
-## Citation
-
-<!-- [ALGORITHM] -->
-
-We provide config files to reproduce the object detection & instance segmentation results in the ICCV 2019 Oral paper for [CARAFE: Content-Aware ReAssembly of FEatures](https://arxiv.org/abs/1905.02188).
-
-```
-@inproceedings{Wang_2019_ICCV,
-title = {CARAFE: Content-Aware ReAssembly of FEatures},
-author = {Wang, Jiaqi and Chen, Kai and Xu, Rui and Liu, Ziwei and Loy, Chen Change and Lin, Dahua},
-booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
-month = {October},
-year = {2019}
-}
-```
-
 ## Results and Models
 
 The results on COCO 2017 val is shown in the below table.
@@ -44,3 +26,17 @@ The results on COCO 2017 val is shown in the below table.
 ## Implementation
 
 The CUDA implementation of CARAFE can be find at https://github.com/myownskyW7/CARAFE.
+
+## Citation
+
+We provide config files to reproduce the object detection & instance segmentation results in the ICCV 2019 Oral paper for [CARAFE: Content-Aware ReAssembly of FEatures](https://arxiv.org/abs/1905.02188).
+
+```latex
+@inproceedings{Wang_2019_ICCV,
+title = {CARAFE: Content-Aware ReAssembly of FEatures},
+author = {Wang, Jiaqi and Chen, Kai and Xu, Rui and Liu, Ziwei and Loy, Chen Change and Lin, Dahua},
+booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
+month = {October},
+year = {2019}
+}
+```
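
To make the abstract's two-step idea concrete, here is a naive, CUDA-free PyTorch sketch of content-aware reassembly (not part of the commit, and far slower than the official op linked in the Implementation section): a small encoder predicts a softmax-normalized k by k kernel for every upsampled location, which then reassembles a k by k neighborhood of the input. `NaiveCarafe` and its arguments are illustrative, not the mmcv/MMDetection operator.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class NaiveCarafe(nn.Module):
    """Toy sketch of CARAFE: kernel prediction followed by reassembly."""

    def __init__(self, channels, scale=2, up_kernel=5, compressed=64):
        super().__init__()
        self.scale, self.k = scale, up_kernel
        self.compress = nn.Conv2d(channels, compressed, 1)
        # predict scale^2 * k^2 kernel weights for each input location
        self.encoder = nn.Conv2d(compressed, scale ** 2 * up_kernel ** 2, 3, padding=1)

    def forward(self, x):
        b, c, h, w = x.shape
        s, k = self.scale, self.k
        # 1. Kernel prediction: one k*k kernel per output location.
        masks = self.encoder(self.compress(x))            # (b, s^2*k^2, h, w)
        masks = F.pixel_shuffle(masks, s)                 # (b, k^2, s*h, s*w)
        masks = F.softmax(masks, dim=1)                   # normalize each kernel
        # 2. Content-aware reassembly: weighted sum over a k*k neighborhood
        #    of the corresponding source location.
        patches = F.unfold(x, k, padding=k // 2)          # (b, c*k^2, h*w)
        patches = patches.view(b, c, k * k, h, w)
        patches = F.interpolate(
            patches.reshape(b, c * k * k, h, w), scale_factor=s, mode='nearest'
        ).view(b, c, k * k, s * h, s * w)
        return (patches * masks.unsqueeze(1)).sum(dim=2)  # (b, c, s*h, s*w)
```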

configs/cascade_rcnn/README.md (+22, -26)

@@ -1,38 +1,18 @@
-# Cascade R-CNN: High Quality Object Detection and Instance Segmentation
+# Cascade R-CNN
 
-## Abstract
+> [Cascade R-CNN: High Quality Object Detection and Instance Segmentation](https://arxiv.org/abs/1906.09756)
 
-<!-- [ABSTRACT] -->
+<!-- [ALGORITHM] -->
+
+## Abstract
 
 In object detection, the intersection over union (IoU) threshold is frequently used to define positives/negatives. The threshold used to train a detector defines its quality. While the commonly used threshold of 0.5 leads to noisy (low-quality) detections, detection performance frequently degrades for larger thresholds. This paradox of high-quality detection has two causes: 1) overfitting, due to vanishing positive samples for large thresholds, and 2) inference-time quality mismatch between detector and test hypotheses. A multi-stage object detection architecture, the Cascade R-CNN, composed of a sequence of detectors trained with increasing IoU thresholds, is proposed to address these problems. The detectors are trained sequentially, using the output of a detector as training set for the next. This resampling progressively improves hypotheses quality, guaranteeing a positive training set of equivalent size for all detectors and minimizing overfitting. The same cascade is applied at inference, to eliminate quality mismatches between hypotheses and detectors. An implementation of the Cascade R-CNN without bells or whistles achieves state-of-the-art performance on the COCO dataset, and significantly improves high-quality detection on generic and specific object detection datasets, including VOC, KITTI, CityPerson, and WiderFace. Finally, the Cascade R-CNN is generalized to instance segmentation, with nontrivial improvements over the Mask R-CNN.
 
-<!-- [IMAGE] -->
 <div align=center>
 <img src="https://user-images.githubusercontent.com/40661020/143872197-d99b90e4-4f05-4329-80a4-327ac862a051.png"/>
 </div>
 
-<!-- [PAPER_TITLE: Cascade R-CNN: High Quality Object Detection and Instance Segmentation] -->
-<!-- [PAPER_URL: https://arxiv.org/abs/1906.09756] -->
-
-## Citation
-
-<!-- [ALGORITHM] -->
-
-```latex
-@article{Cai_2019,
-title={Cascade R-CNN: High Quality Object Detection and Instance Segmentation},
-ISSN={1939-3539},
-url={http://dx.doi.org/10.1109/tpami.2019.2956516},
-DOI={10.1109/tpami.2019.2956516},
-journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
-publisher={Institute of Electrical and Electronics Engineers (IEEE)},
-author={Cai, Zhaowei and Vasconcelos, Nuno},
-year={2019},
-pages={1–1}
-}
-```
-
-## Results and models
+## Results and Models
 
 ### Cascade R-CNN
 
@@ -81,3 +61,19 @@ We also train some models with longer schedules and multi-scale training for Cas
 | X-101-32x4d-FPN | pytorch| 3x | 9.0 | | 46.3 | 40.1 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_mstrain_3x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_mstrain_3x_coco/cascade_mask_rcnn_x101_32x4d_fpn_mstrain_3x_coco_20210706_225234-40773067.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_32x4d_fpn_mstrain_3x_coco/cascade_mask_rcnn_x101_32x4d_fpn_mstrain_3x_coco_20210706_225234.log.json)
 | X-101-32x8d-FPN | pytorch| 3x | 12.1 | | 46.1 | 39.9 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn/cascade_mask_rcnn_x101_32x8d_fpn_mstrain_3x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_32x8d_fpn_mstrain_3x_coco/cascade_mask_rcnn_x101_32x8d_fpn_mstrain_3x_coco_20210719_180640-9ff7e76f.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_32x8d_fpn_mstrain_3x_coco/cascade_mask_rcnn_x101_32x8d_fpn_mstrain_3x_coco_20210719_180640.log.json)
 | X-101-64x4d-FPN | pytorch| 3x | 12.0 | | 46.6 | 40.3 | [config](https://github.com/open-mmlab/mmdetection/tree/master/configs/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_mstrain_3x_coco.py) | [model](https://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_mstrain_3x_coco/cascade_mask_rcnn_x101_64x4d_fpn_mstrain_3x_coco_20210719_210311-d3e64ba0.pth) &#124; [log](https://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_x101_64x4d_fpn_mstrain_3x_coco/cascade_mask_rcnn_x101_64x4d_fpn_mstrain_3x_coco_20210719_210311.log.json)
+
+## Citation
+
+```latex
+@article{Cai_2019,
+title={Cascade R-CNN: High Quality Object Detection and Instance Segmentation},
+ISSN={1939-3539},
+url={http://dx.doi.org/10.1109/tpami.2019.2956516},
+DOI={10.1109/tpami.2019.2956516},
+journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
+publisher={Institute of Electrical and Electronics Engineers (IEEE)},
+author={Cai, Zhaowei and Vasconcelos, Nuno},
+year={2019},
+pages={1–1}
+}
+```
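
The defining trait from the abstract, stages trained with increasing IoU thresholds, shows up directly in the training config. A trimmed sketch follows (not part of the commit); field names follow MMDetection's cascade_rcnn configs, but most settings are omitted, so refer to the linked configs for the full values.

```python
# Trimmed, illustrative fragment of the training config: each of the three
# R-CNN stages uses a stricter IoU threshold than the previous one
# (0.5 -> 0.6 -> 0.7), so later stages are trained on progressively
# higher-quality proposals produced by the stage before them.
train_cfg = dict(
    rcnn=[
        dict(assigner=dict(type='MaxIoUAssigner',
                           pos_iou_thr=0.5, neg_iou_thr=0.5, min_pos_iou=0.5),
             sampler=dict(type='RandomSampler', num=512, pos_fraction=0.25)),
        dict(assigner=dict(type='MaxIoUAssigner',
                           pos_iou_thr=0.6, neg_iou_thr=0.6, min_pos_iou=0.6),
             sampler=dict(type='RandomSampler', num=512, pos_fraction=0.25)),
        dict(assigner=dict(type='MaxIoUAssigner',
                           pos_iou_thr=0.7, neg_iou_thr=0.7, min_pos_iou=0.7),
             sampler=dict(type='RandomSampler', num=512, pos_fraction=0.25)),
    ])
```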
