Commit c6f8924

[HumanSeg] Add Semantic Connectivity-aware Learning and release teleconferencing dataset (PaddlePaddle#1685)

1 parent: cf56298 · 17 files changed: +433 −3 lines

README.md (+1)

```diff
@@ -8,6 +8,7 @@ English | [简体中文](README_CN.md)
 ![support os](https://img.shields.io/badge/os-linux%2C%20win%2C%20mac-yellow.svg)

 ## PaddleSeg has released the new version including the following features:

+* We published a paper on portrait segmentation named [PP-HumanSeg](./contrib/PP-HumanSeg/paper.md), and released the Semantic Connectivity-aware Learning (SCL) framework and a Large-Scale Teleconferencing Video Dataset.
 * We published a paper on interactive segmentation named [EdgeFlow](https://arxiv.org/abs/2109.09406), in which the proposed approach achieved SOTA performance on several well-known datasets, and upgraded the interactive annotation tool, [EISeg](./EISeg).
 * We released two [Matting](./contrib/Matting) algorithms, DIM and MODNet, which achieve extremely fine-grained segmentation.
 * We provided advanced features for segmentation model compression, [Knowledge Distillation](./slim/distill) and [Model Quantization](./slim/quant), which accelerate model inference in multi-device deployment.
```
README_CN.md (+1)

```diff
@@ -16,6 +16,7 @@ The PaddleSeg team will hold two live sessions themed "Industrial Image Segmentation: Applications and Practice"

 ## PaddleSeg 2.3 has been released — come and try it out

+* The PaddleSeg team published the portrait-segmentation paper [PP-HumanSeg](./contrib/PP-HumanSeg/paper.md), and open-sourced the Semantic Connectivity-aware Learning (SCL) method and a large-scale teleconferencing video dataset.
 * The PaddleSeg team published the interactive-segmentation paper [EdgeFlow](https://arxiv.org/abs/2109.09406), which achieves SOTA performance on several datasets, and upgraded the interactive segmentation tool [EISeg](./EISeg).
 * Open-sourced two [Matting](./contrib/Matting) algorithms: the classic DIM and the real-time MODNet, for fine-grained portrait segmentation.
 * Released advanced segmentation features, [model distillation](./slim/distill) and [model quantization](./slim/quant) schemes, to further improve model deployment efficiency.
```
New file (+21):

```yaml
_base_: '../_base_/cityscapes_1024x1024.yml'

batch_size: 4
iters: 40000

loss:
  types:
    - type: CrossEntropyLoss
  coef: [1.0, 0.4]

lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.025
  end_lr: 1.0e-4
  power: 0.9

model:
  type: FastSCNN
  num_classes: 19
  enable_auxiliary_loss: True
  pretrained: null
```
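For orientation, the `PolynomialDecay` scheduler configured above decays the learning rate from `learning_rate` toward `end_lr` over `iters` steps. A minimal sketch of the common polynomial-decay formulation (PaddlePaddle's implementation may differ in details; the function name here is hypothetical):

```python
# Sketch of a polynomial-decay LR schedule with the values from the
# config above (base 0.025, end 1e-4, power 0.9 over 40000 iters).
def poly_decay_lr(step, total_iters=40000, base_lr=0.025,
                  end_lr=1.0e-4, power=0.9):
    """lr = (base - end) * (1 - step/total)^power + end"""
    frac = min(step, total_iters) / total_iters
    return (base_lr - end_lr) * (1.0 - frac) ** power + end_lr

print(poly_decay_lr(0))      # base learning rate at the first iteration
print(poly_decay_lr(40000))  # fully decayed to end_lr
```

The `end_lr` floor keeps the learning rate from reaching zero at the final iteration.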
New file (+26):

```yaml
_base_: '../_base_/cityscapes_1024x1024.yml'

batch_size: 4
iters: 40000

loss:
  types:
    - type: MixedLoss
      losses:
        - type: CrossEntropyLoss
        - type: SemanticConnectivityLearning
      coef: [1, 0.01]
    - type: CrossEntropyLoss
  coef: [1.0, 0.4]

lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.025
  end_lr: 1.0e-4
  power: 0.9

model:
  type: FastSCNN
  num_classes: 19
  enable_auxiliary_loss: True
  pretrained: null
```
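The `MixedLoss` entry in the config above sums its sub-losses with the listed coefficients (here `[1, 0.01]`: full-weight cross-entropy plus a lightly weighted connectivity term). A minimal, framework-agnostic sketch of that weighting — the loss values and function name below are made-up placeholders, not PaddleSeg's implementation:

```python
# Sketch of coefficient-weighted loss mixing, as in the MixedLoss
# config above. The loss values are hypothetical placeholders.
def mixed_loss(loss_values, coefs):
    """Weighted sum of loss terms: sum_i coef_i * loss_i."""
    return sum(c * v for v, c in zip(loss_values, coefs))

ce_value = 0.8    # pretend CrossEntropyLoss output
scl_value = 2.0   # pretend SemanticConnectivityLearning output

total = mixed_loss([ce_value, scl_value], [1, 0.01])
print(total)  # 1 * 0.8 + 0.01 * 2.0
```

The small SCL coefficient keeps the connectivity term as a regularizer rather than letting it dominate the cross-entropy objective.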
New file (+17):

```yaml
_base_: '../_base_/cityscapes.yml'

model:
  type: FCN
  backbone:
    type: HRNet_W18
    align_corners: False
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/hrnet_w18_ssld.tar.gz
  num_classes: 19
  pretrained: Null
  backbone_indices: [-1]

optimizer:
  weight_decay: 0.0005

iters: 80000
batch_size: 4
```
New file (+26):

```yaml
_base_: '../_base_/cityscapes.yml'

model:
  type: FCN
  backbone:
    type: HRNet_W18
    align_corners: False
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/hrnet_w18_ssld.tar.gz
  num_classes: 19
  pretrained: Null
  backbone_indices: [-1]

optimizer:
  weight_decay: 0.0005

iters: 80000
batch_size: 4

loss:
  types:
    - type: MixedLoss
      losses:
        - type: CrossEntropyLoss
        - type: SemanticConnectivityLearning
      coef: [1, 0.05]
  coef: [1]
```
New file (+28):

```yaml
_base_: '../_base_/cityscapes.yml'

batch_size: 2
iters: 40000

model:
  type: OCRNet
  backbone:
    type: HRNet_W48
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/hrnet_w48_ssld.tar.gz
  num_classes: 19
  backbone_indices: [0]

optimizer:
  type: sgd

lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.01
  power: 0.9

loss:
  types:
    - type: CrossEntropyLoss
    - type: CrossEntropyLoss
  coef: [1, 0.4]
```
New file (+32):

```yaml
_base_: '../_base_/cityscapes.yml'

batch_size: 2
iters: 40000

model:
  type: OCRNet
  backbone:
    type: HRNet_W48
    pretrained: https://bj.bcebos.com/paddleseg/dygraph/hrnet_w48_ssld.tar.gz
  num_classes: 19
  backbone_indices: [0]

optimizer:
  type: sgd

lr_scheduler:
  type: PolynomialDecay
  learning_rate: 0.01
  power: 0.9

loss:
  types:
    - type: MixedLoss
      losses:
        - type: CrossEntropyLoss
        - type: SemanticConnectivityLearning
      coef: [1, 0.1]
    - type: CrossEntropyLoss
  coef: [1, 0.4]
```

configs/pp_humanseg_lite/README.md (+1 −1)

```diff
@@ -1,6 +1,6 @@
 # PP-HumanSeg-Lite

-A self-developed ultra-lightweight model for real-time segmentation on web and mobile devices.
+A self-developed ultra-lightweight model, ConnectNet, for real-time segmentation on web and mobile devices.

 ## Network Structure
 ![](pphumanseg_lite.png)
```

contrib/PP-HumanSeg/README.md (+3)

```diff
@@ -8,6 +8,9 @@
 <img src="https://github.com/LutaoChu/transfer_station/raw/master/conference.gif" width="70%" height="70%">
 </p>

+## News
+- [2022-1-4] The portrait-segmentation paper [PP-HumanSeg](./paper.md) was published at the WACV 2022 Workshop; the Semantic Connectivity-aware Learning (SCL) method and a large-scale teleconferencing video dataset are open-sourced.
+
 ## Contents
 - [Portrait segmentation models](#人像分割模型)
 - [Generic portrait segmentation](#通用人像分割)
```

contrib/PP-HumanSeg/paper.md (+35, new file)

# Connectivity-Aware Portrait Segmentation With a Large-Scale Teleconferencing Video Dataset

Official resource for the paper "PP-HumanSeg: Connectivity-Aware Portrait Segmentation With a Large-Scale Teleconferencing Video Dataset". [[Paper](https://arxiv.org/abs/2112.07146) | [Poster](https://paddleseg.bj.bcebos.com/dygraph/humanseg/paper/12-HAD-poster.pdf) | [YouTube](https://www.youtube.com/watch?v=FlK8R5cdD7E)]

## Semantic Connectivity-aware Learning

The SCL (Semantic Connectivity-aware Learning) framework introduces SC Loss (Semantic Connectivity-aware Loss) to improve the quality of segmentation results from the perspective of connectivity. It supports multi-class segmentation. [[Source code](../../paddleseg/models/losses/semantic_connectivity_learning.py)]

SCL improves the integrity of segmented objects and increases segmentation accuracy. Experimental results on our teleconferencing video dataset are reported in the paper; results on Cityscapes are as follows:

### Performance on Cityscapes

| Model | Backbone | Learning Strategy | GPUs × Batch Size (Per Card) | Training Iters | mIoU (%) | Config |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| OCRNet | HRNet-W48 | - | 2×2 | 40000 | 76.23 | [config](../../configs/ocrnet/ocrnet_hrnetw48_cityscapes_1024x512_40k.yml) |
| OCRNet | HRNet-W48 | SCL | 2×2 | 40000 | 78.29 (**+2.06**) | [config](../../configs/ocrnet/ocrnet_hrnetw48_cityscapes_1024x512_40k_SCL.yml) |
| FCN | HRNet-W18 | - | 2×4 | 80000 | 77.81 | [config](../../configs/fcn/fcn_hrnetw18_cityscapes_1024x512_80k_bs4.yml) |
| FCN | HRNet-W18 | SCL | 2×4 | 80000 | 78.68 (**+0.87**) | [config](../../configs/fcn/fcn_hrnetw18_cityscapes_1024x512_80k_bs4_SCL.yml) |
| Fast SCNN | - | - | 2×4 | 40000 | 56.41 | [config](../../configs/fastscnn/fastscnn_cityscapes_1024x1024_40k.yml) |
| Fast SCNN | - | SCL | 2×4 | 40000 | 57.37 (**+0.96**) | [config](../../configs/fastscnn/fastscnn_cityscapes_1024x1024_40k_SCL.yml) |

## Large-Scale Teleconferencing Video Dataset

A large-scale video portrait dataset containing 291 videos from 23 conference scenes with 14K finely labeled frames. The data can be obtained by sending an application email to [email protected].

## Citation

If our project is useful in your research, please cite:

```latex
@InProceedings{Chu_2022_WACV,
    author    = {Chu, Lutao and Liu, Yi and Wu, Zewu and Tang, Shiyu and Chen, Guowei and Hao, Yuying and Peng, Juncai and Yu, Zhiliang and Chen, Zeyu and Lai, Baohua and Xiong, Haoyi},
    title     = {PP-HumanSeg: Connectivity-Aware Portrait Segmentation With a Large-Scale Teleconferencing Video Dataset},
    booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshops},
    month     = {January},
    year      = {2022},
    pages     = {202-209}
}
```
New file (+27):

简体中文 | [English](SemanticConnectivityLearning_en.md)

## [SemanticConnectivityLearning](../../../paddleseg/models/losses/semantic_connectivity_learning.py)

The SCL (Semantic Connectivity-aware Learning) framework introduces SC Loss (Semantic Connectivity-aware Loss) to improve the quality of segmentation results from the perspective of connectivity. It supports multi-class segmentation.

Paper:
Lutao Chu, Yi Liu, Zewu Wu, Shiyu Tang, Guowei Chen, Yuying Hao, Juncai Peng, Zhiliang Yu, Zeyu Chen, Baohua Lai, Haoyi Xiong.
"PP-HumanSeg: Connectivity-Aware Portrait Segmentation with a Large-Scale Teleconferencing Video Dataset".
In WACV 2022 Workshop.
https://arxiv.org/abs/2112.07146

Running process:
Step 1. Connected-component calculation.
Step 2. Connected-component matching and SC Loss calculation.

```python
class paddleseg.models.losses.SemanticConnectivityLearning(
    ignore_index = 255,
    max_pred_num_conn = 10,
    use_argmax = True
)
```

## Semantic Connectivity Learning (SCL) Usage Guide

### Args

* **ignore_index** (int): Specifies a pixel value in the label map to be ignored; it does not contribute to the input gradient. When some pixels cannot be labeled (or are hard to label), they can be marked with a specific gray value, and the corresponding pixels are excluded from the loss computation. *Default: ``255``*
* **max_pred_num_conn** (int): Maximum number of predicted connected components. At the beginning of training there are often many connected components, which makes the computation very time-consuming; components beyond this limit do not participate in the calculation.
* **use_argmax** (bool): Whether to apply argmax to the logits.
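Step 1 above (connected-component calculation) can be illustrated with off-the-shelf tooling. A minimal sketch using `scipy.ndimage.label` — for illustration only; the actual implementation lives in `semantic_connectivity_learning.py`:

```python
import numpy as np
from scipy import ndimage

# Binary mask for one class (e.g. "person") from a predicted segmentation.
mask = np.array([
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 1],
], dtype=np.uint8)

# Label 4-connected components; `num` is the component count.
labeled, num = ndimage.label(mask)
print(num)  # 2 connected components in this mask
```

Counting components per class like this is the input to the matching step, where each predicted component is compared against ground-truth components.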
New file (+33):

English | [简体中文](SemanticConnectivityLearning_cn.md)

## [SemanticConnectivityLearning](../../../paddleseg/models/losses/semantic_connectivity_learning.py)

The SCL (Semantic Connectivity-aware Learning) framework introduces SC Loss (Semantic Connectivity-aware Loss) to improve the quality of segmentation results from the perspective of connectivity. It supports multi-class segmentation.

The original article refers to:
Lutao Chu, Yi Liu, Zewu Wu, Shiyu Tang, Guowei Chen, Yuying Hao, Juncai Peng, Zhiliang Yu, Zeyu Chen, Baohua Lai, Haoyi Xiong.
"PP-HumanSeg: Connectivity-Aware Portrait Segmentation with a Large-Scale Teleconferencing Video Dataset".
In WACV 2022 Workshop.
https://arxiv.org/abs/2112.07146

Running process:
Step 1. Connected-component calculation.
Step 2. Connected-component matching and SC Loss calculation.

```python
class paddleseg.models.losses.SemanticConnectivityLearning(
    ignore_index = 255,
    max_pred_num_conn = 10,
    use_argmax = True
)
```

## Semantic Connectivity Learning Usage Guide

### Args

* **ignore_index** (int): Specifies a pixel value in the label map to be ignored; it does not contribute to the input gradient. When some pixels cannot be labeled (or are hard to label), they can be marked with a specific gray value, and the corresponding pixels are excluded from the loss computation. *Default: ``255``*
* **max_pred_num_conn** (int): Maximum number of predicted connected components. At the beginning of training there are often many connected components, which makes the computation very time-consuming; components beyond this limit do not participate in the calculation.
* **use_argmax** (bool): Whether to apply argmax to the logits.
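Step 2 (connected-component matching) can be pictured as scoring predicted components against ground-truth components by overlap. A simplified sketch using IoU on toy 1-D masks — an illustration under assumed simplifications, not the paper's exact matching procedure:

```python
import numpy as np

def iou(a, b):
    """IoU between two boolean component masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

# Two predicted components vs. one ground-truth component (toy masks).
pred_components = [
    np.array([1, 1, 0, 0, 0], dtype=bool),
    np.array([0, 0, 0, 1, 1], dtype=bool),
]
gt_component = np.array([1, 1, 1, 0, 0], dtype=bool)

# Score each predicted component against the ground truth; components
# with zero overlap (spurious blobs) get penalized by the SC loss.
scores = [iou(p, gt_component) for p in pred_components]
print(scores)  # first component overlaps the GT, second does not
```

Intuitively, a connectivity-aware loss rewards high-overlap matched components and penalizes unmatched ones, pushing predictions toward the same connectivity structure as the ground truth.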

docs/module/loss/losses_cn.md (+3 −1)

```diff
@@ -23,4 +23,6 @@
 * ## [paddleseg.models.losses.ohem_cross_entropy_loss](./OhemCrossEntropyLoss_cn.md)

-* ## [paddleseg.models.losses.ohem_edge_attention_loss](./OhemEdgeAttentionLoss_cn.md)
+* ## [paddleseg.models.losses.ohem_edge_attention_loss](./OhemEdgeAttentionLoss_cn.md)
+
+* ## [paddleseg.models.losses.semantic_connectivity_learning](./SemanticConnectivityLearning_cn.md)
```

docs/module/loss/losses_en.md (+3 −1)

```diff
@@ -23,4 +23,6 @@ English | [简体中文](losses_cn.md)
 * ## [paddleseg.models.losses.ohem_cross_entropy_loss](./OhemCrossEntropyLoss_en.md)

-* ## [paddleseg.models.losses.ohem_edge_attention_loss](./OhemEdgeAttentionLoss_en.md)
+* ## [paddleseg.models.losses.ohem_edge_attention_loss](./OhemEdgeAttentionLoss_en.md)
+
+* ## [paddleseg.models.losses.semantic_connectivity_learning](./SemanticConnectivityLearning_en.md)
```

paddleseg/models/losses/__init__.py (+1)

```diff
@@ -33,3 +33,4 @@
 from .point_cross_entropy_loss import PointCrossEntropyLoss
 from .pixel_contrast_cross_entropy_loss import PixelContrastCrossEntropyLoss
 from .semantic_encode_cross_entropy_loss import SECrossEntropyLoss
+from .semantic_connectivity_learning import SemanticConnectivityLearning
```
