[HumanSeg] Add PP-HumanSeg14K dataset (PaddlePaddle#1708)

LutaoChu · web-flow · commit e13498b34a52 · 2022-01-12T11:52:48.000+08:00
diff --git a/README.md b/README.md
@@ -129,7 +129,7 @@ PaddleSeg is an end-to-end high-efficent development toolkit for image segmentat
           <ul>
             <li>Cross Entropy</li>
             <li>Binary CE</li>
-            <li>Bootstrapped CE</li> 
+            <li>Bootstrapped CE</li>
             <li>Point CE</li>
             <li>OHEM CE</li>
             <li>Pixel Contrast CE</li>
@@ -167,7 +167,8 @@ PaddleSeg is an end-to-end high-efficent development toolkit for image segmentat
             <li>HRF</li>
             <li>DRIVE</li>
             <li>STARE</li>
-          </ul> 
+            <li>PP-HumanSeg14K</li>
+          </ul>
         <b>Data Augmentation</b><br>
         <ul>
           <li>Flipping</li>  
@@ -182,7 +183,7 @@ PaddleSeg is an end-to-end high-efficent development toolkit for image segmentat
           <li>PaddingByAspectRatio</li>
           <li>RandomPaddingCrop</li>  
           <li>RandomCenterCrop</li>
-          <li>ScalePadding</li> 
+          <li>ScalePadding</li>
           <li>RandomNoise</li>  
           <li>RandomBlur</li>  
           <li>RandomRotation</li>  
diff --git a/README_CN.md b/README_CN.md
@@ -137,7 +137,7 @@ PaddleSeg是基于飞桨PaddlePaddle开发的端到端图像分割开发套件
           <ul>
             <li>Cross Entropy</li>
             <li>Binary CE</li>
-            <li>Bootstrapped CE</li> 
+            <li>Bootstrapped CE</li>
             <li>Point CE</li>
             <li>OHEM CE</li>
             <li>Pixel Contrast CE</li>
@@ -175,7 +175,8 @@ PaddleSeg是基于飞桨PaddlePaddle开发的端到端图像分割开发套件
             <li>HRF</li>
             <li>DRIVE</li>
             <li>STARE</li>
-          </ul> 
+            <li>PP-HumanSeg14K</li>
+          </ul>
         <b>数据增强</b><br>
         <ul>
           <li>Flipping</li>  
@@ -190,7 +191,7 @@ PaddleSeg是基于飞桨PaddlePaddle开发的端到端图像分割开发套件
           <li>PaddingByAspectRatio</li>
           <li>RandomPaddingCrop</li>  
           <li>RandomCenterCrop</li>
-          <li>ScalePadding</li> 
+          <li>ScalePadding</li>
           <li>RandomNoise</li>  
           <li>RandomBlur</li>  
           <li>RandomRotation</li>  
diff --git a/configs/fastscnn/fastscnn_cityscapes_1024x1024_40k_SCL.yml b/configs/fastscnn/fastscnn_cityscapes_1024x1024_40k_SCL.yml
@@ -8,7 +8,7 @@ loss:
     - type: MixedLoss
       losses:
         - type: CrossEntropyLoss
-        - type: SemanticConnectivityLearning
+        - type: SemanticConnectivityLoss
       coef: [1, 0.01]
     - type: CrossEntropyLoss
   coef: [1.0, 0.4]
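
To make the nested `coef` lists in these SCL configs concrete, here is a minimal pure-Python sketch. It assumes that `MixedLoss` returns the coefficient-weighted sum of its sub-losses and that the outer `coef` weights the loss of each model output in the same way. The helper name `weighted_sum` and the loss values are hypothetical and are not taken from PaddleSeg.

```python
def weighted_sum(values, coef):
    """Return sum(c * v); assumed to mirror how a `coef` list weights losses."""
    if len(values) != len(coef):
        raise ValueError("`values` and `coef` must have the same length.")
    return sum(c * v for c, v in zip(coef, values))


# Hypothetical per-loss values for one training step (illustrative only).
ce_main, sc_main = 0.80, 2.50  # CrossEntropyLoss / SemanticConnectivityLoss, main output
ce_aux = 1.10                  # CrossEntropyLoss, auxiliary output

# Inner MixedLoss with coef: [1, 0.01], as in the Fast-SCNN SCL config.
mixed_main = weighted_sum([ce_main, sc_main], coef=[1, 0.01])

# Outer coef: [1.0, 0.4] weights the main and auxiliary outputs.
total = weighted_sum([mixed_main, ce_aux], coef=[1.0, 0.4])
```

Under these assumptions the small inner coefficient keeps the connectivity term as a mild regularizer next to cross entropy.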
diff --git a/configs/fcn/fcn_hrnetw18_cityscapes_1024x512_80k_bs4_SCL.yml b/configs/fcn/fcn_hrnetw18_cityscapes_1024x512_80k_bs4_SCL.yml
@@ -21,6 +21,6 @@ loss:
     - type: MixedLoss
       losses:
         - type: CrossEntropyLoss
-        - type: SemanticConnectivityLearning
+        - type: SemanticConnectivityLoss
       coef: [1, 0.05]
   coef: [1]
diff --git a/configs/fcn/fcn_hrnetw18_pphumanseg14k.yml b/configs/fcn/fcn_hrnetw18_pphumanseg14k.yml
@@ -0,0 +1,53 @@
+train_dataset:
+  type: PPHumanSeg14K
+  dataset_root: data/PP-HumanSeg14K
+  transforms:
+    - type: ResizeStepScaling
+      min_scale_factor: 0.5
+      max_scale_factor: 2.0
+      scale_step_size: 0.25
+    - type: RandomPaddingCrop
+      crop_size: [398, 224]
+    - type: RandomHorizontalFlip
+    - type: RandomDistort
+      brightness_range: 0.4
+      contrast_range: 0.4
+      saturation_range: 0.4
+    - type: Normalize
+  mode: train
+
+val_dataset:
+  type: PPHumanSeg14K
+  dataset_root: data/PP-HumanSeg14K
+  transforms:
+    - type: Normalize
+  mode: val
+
+model:
+  type: FCN
+  backbone:
+    type: HRNet_W18
+    align_corners: False
+    pretrained: https://bj.bcebos.com/paddleseg/dygraph/hrnet_w18_ssld.tar.gz
+  num_classes: 2
+  pretrained: Null
+  backbone_indices: [-1]
+
+optimizer:
+  type: sgd
+  momentum: 0.9
+  weight_decay: 0.0005
+
+lr_scheduler:
+  type: PolynomialDecay
+  learning_rate: 0.05
+  end_lr: 0
+  power: 0.9
+
+loss:
+  types:
+    - type: CrossEntropyLoss
+  coef: [1]
+
+iters: 10000
+batch_size: 64
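
As a rough illustration of the `lr_scheduler` block in this config, the sketch below evaluates a standard non-cyclic polynomial decay with `learning_rate: 0.05`, `end_lr: 0` and `power: 0.9`. It assumes the decay runs over the full `iters: 10000`. The function name and the `decay_steps` parameter are hypothetical and this is not the PaddlePaddle implementation.

```python
def polynomial_decay(step, base_lr=0.05, end_lr=0.0, power=0.9, decay_steps=10000):
    """Standard non-cyclic polynomial decay; `decay_steps` is assumed to equal `iters`."""
    step = min(step, decay_steps)  # hold the final value once the schedule ends
    return (base_lr - end_lr) * (1 - step / decay_steps) ** power + end_lr


# Print the learning rate at a few points of the schedule.
for step in (0, 2500, 5000, 7500, 10000):
    print(step, round(polynomial_decay(step), 6))
```

With `power` below 1 the curve stays close to a linear ramp from the base rate down to `end_lr`.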
diff --git a/configs/ocrnet/ocrnet_hrnetw48_cityscapes_1024x512_40k_SCL.yml b/configs/ocrnet/ocrnet_hrnetw48_cityscapes_1024x512_40k_SCL.yml
@@ -26,7 +26,7 @@ loss:
     - type: MixedLoss
       losses:
         - type: CrossEntropyLoss
-        - type: SemanticConnectivityLearning
+        - type: SemanticConnectivityLoss
       coef: [1, 0.1]
     - type: CrossEntropyLoss
   coef: [1, 0.4]
diff --git a/contrib/PP-HumanSeg/paper.md b/contrib/PP-HumanSeg/paper.md
@@ -2,11 +2,28 @@
 Official resource for the paper *PP-HumanSeg: Connectivity-Aware Portrait Segmentation With a Large-Scale Teleconferencing Video Dataset*. [[Paper](https://arxiv.org/abs/2112.07146) | [Poster](https://paddleseg.bj.bcebos.com/dygraph/humanseg/paper/12-HAD-poster.pdf) | [YouTube](https://www.youtube.com/watch?v=FlK8R5cdD7E)]
 
 ## Semantic Connectivity-aware Learning
-SCL (Semantic Connectivity-aware Learning) framework, which introduces a SC Loss (Semantic Connectivity-aware Loss) to improve the quality of segmentation results from the perspective of connectivity. Support multi-class segmentation. [[Source code](../../paddleseg/models/losses/semantic_connectivity_learning.py)]
+SCL (Semantic Connectivity-aware Learning) framework, which introduces a SC Loss (Semantic Connectivity-aware Loss) to improve the quality of segmentation results from the perspective of connectivity. SCL can improve the integrity of segmentation objects and increase segmentation accuracy. Support multi-class segmentation. [[Source code](../../paddleseg/models/losses/semantic_connectivity_loss.py)]
+
+<p align="center">
+<img src="https://user-images.githubusercontent.com/30695251/148921096-29a4f90f-2113-4f97-87b5-19364e83b454.png" width="40%" height="40%">
+</p>
+
+### Connected Components Calculation and Matching
+<p align="center">
+<img src="https://user-images.githubusercontent.com/30695251/148931627-bfaeeecb-c260-4d00-9393-a7e52a56ce18.png" width="40%" height="40%">
+</p>
+(a) The prediction and the ground truth, i.e. P and G. (b) Connected components are generated for each of them with the CCL (connected-component labeling) algorithm. (c) Connected components are matched using the IoU value.
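
Steps (a) to (c) in the caption above can be sketched in pure Python as follows. This is a toy illustration, not the code in `semantic_connectivity_loss.py`: it assumes binary masks, 4-connectivity, and a simple rule that pairs every predicted component with every ground-truth component it overlaps. The function names are hypothetical.

```python
from collections import deque


def connected_components(mask):
    """Label 4-connected foreground regions of a binary mask (list of rows).

    Returns a list of components, each a set of (row, col) pixels.
    """
    height, width = len(mask), len(mask[0])
    seen = set()
    components = []
    for r in range(height):
        for c in range(width):
            if mask[r][c] != 1 or (r, c) in seen:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            region = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                region.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] == 1 and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            components.append(region)
    return components


def match_by_iou(pred_components, gt_components):
    """Return (pred_index, gt_index, iou) for every overlapping pair."""
    matches = []
    for i, p in enumerate(pred_components):
        for j, g in enumerate(gt_components):
            inter = len(p & g)
            if inter > 0:
                matches.append((i, j, inter / len(p | g)))
    return matches


# Toy prediction P and ground truth G (1 = person, 0 = background).
P = [[1, 1, 0, 0],
     [1, 0, 0, 1],
     [0, 0, 0, 1]]
G = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 0, 0]]

pred_cc = connected_components(P)
gt_cc = connected_components(G)
matches = match_by_iou(pred_cc, gt_cc)
```

In this toy case the second predicted component overlaps nothing in G, so it stays unmatched; isolated fragments of that kind are what a connectivity-based comparison makes visible.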
+
+### Segmentation Results
+
+<p align="center">
+<img src="https://user-images.githubusercontent.com/30695251/148931612-bfc5a7f2-f6b7-4666-b2dd-86926ea7bfd7.png" width="60%" height="60%">
+</p>
 
-SCL can improve the integrity of segmentation objects and increase segmentation accuracy. The experimental results on our Teleconferencing Video Dataset are shown in paper, and the experimental results on Cityscapes are as follows:
 
 ### Perfermance on Cityscapes
+The experimental results on our Teleconferencing Video Dataset are shown in paper, and the experimental results on Cityscapes are as follows:
+
 | Model | Backbone | Learning Strategy | GPUs * Batch Size(Per Card)| Training Iters | mIoU (%) | Config |
 |:-:|:-:|:-:|:-:|:-:|:-:|:-:|
 |OCRNet|HRNet-W48|-|2*2|40000|76.23| [config](../../configs/ocrnet/ocrnet_hrnetw48_cityscapes_1024x512_40k.yml) |
@@ -16,10 +33,15 @@ SCL can improve the integrity of segmentation objects and increase segmentation
 |Fast SCNN|-|-|2*4|40000|56.41|[config](../../configs/fastscnn/fastscnn_cityscapes_1024x1024_40k.yml)|
 |Fast SCNN|-|SCL|2*4|40000|57.37(**+0.96**)|[config](../../configs/fastscnn/fastscnn_cityscapes_1024x1024_40k_SCL.yml)|
 
-## Large-Scale Teleconferencing Video Dataset
-A large-scale video portrait dataset that contains 291 videos from 23 conference scenes with 14K fine-labeled frames. The data can be obtained by sending an application email to paddleseg@baidu.com.
 
 
+## PP-HumanSeg14K: A Large-Scale Teleconferencing Video Dataset
+A large-scale video portrait dataset that contains 291 videos from 23 conference scenes with 14K fine-labeled frames. This dataset contains various teleconferencing scenes, various actions of the participants, interference of passers-by and illumination change. The data can be obtained by sending an application email to paddleseg@baidu.com.
+
+<p align="center">
+<img src="https://user-images.githubusercontent.com/30695251/148931684-cc10c994-3bd4-4d0c-9bcc-283f9bbc6ac9.png" width="80%" height="80%">
+</p>
+
 ## Citation
 If our project is useful in your research, please citing:
 
diff --git a/docs/module/loss/SemanticConnectivityLoss_cn.md b/docs/module/loss/SemanticConnectivityLoss_cn.md
@@ -1,5 +1,5 @@
-简体中文 | [English](SemanticConnectivityLearning_en.md)
-## [SemanticConnectivityLearning](../../../paddleseg/models/losses/semantic_connectivity_learning.py)
+简体中文 | [English](SemanticConnectivityLoss_en.md)
+## [SemanticConnectivityLoss](../../../paddleseg/models/losses/semantic_connectivity_loss.py)
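 The SCL (Semantic Connectivity-aware Learning) framework introduces an SC Loss (Semantic Connectivity-aware Loss) that improves the quality of segmentation results from the perspective of connectivity. It supports multi-class segmentation.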
 
 Paper information:
@@ -12,7 +12,7 @@ SCL（Semantic Connectivity-aware Learning）框架，它引入了SC Loss (Seman
 Step 1. Connected components calculation
 Step 2. Connected components matching and SC Loss calculation
 ```python
-class paddleseg.models.losses.SemanticConnectivityLearning(
+class paddleseg.models.losses.SemanticConnectivityLoss(
             ignore_index = 255,
             max_pred_num_conn = 10,
             use_argmax = True
diff --git a/docs/module/loss/SemanticConnectivityLoss_en.md b/docs/module/loss/SemanticConnectivityLoss_en.md
@@ -1,5 +1,5 @@
-English | [简体中文](SemanticConnectivityLearning_cn.md)
-## [SemanticConnectivityLearning](../../../paddleseg/models/losses/semantic_connectivity_learning.py)
+English | [简体中文](SemanticConnectivityLoss_cn.md)
+## [SemanticConnectivityLoss](../../../paddleseg/models/losses/semantic_connectivity_loss.py)
 SCL (Semantic Connectivity-aware Learning) framework, which introduces a SC Loss (Semantic Connectivity-aware Loss)
 to improve the quality of segmentation results from the perspective of connectivity. Support multi-class segmentation.
 
@@ -14,7 +14,7 @@ Step 1. Connected Components Calculation
 Step 2. Connected Components Matching and SC Loss Calculation
 
 ```python
-class paddleseg.models.losses.SemanticConnectivityLearning(
+class paddleseg.models.losses.SemanticConnectivityLoss(
             ignore_index = 255,
             max_pred_num_conn = 10,
             use_argmax = True
diff --git a/docs/module/loss/losses_cn.md b/docs/module/loss/losses_cn.md
@@ -25,4 +25,4 @@
 
 * ## [paddleseg.models.losses.ohem_edge_attention_loss](./OhemEdgeAttentionLoss_cn.md)
 
-* ## [paddleseg.models.losses.semantic_connectivity_learning](./SemanticConnectivityLearning_cn.md)
+* ## [paddleseg.models.losses.semantic_connectivity_loss](./SemanticConnectivityLoss_cn.md)
diff --git a/docs/module/loss/losses_en.md b/docs/module/loss/losses_en.md
@@ -25,4 +25,4 @@ English | [简体中文](losses_cn.md)
 
 * ## [paddleseg.models.losses.ohem_edge_attention_loss](./OhemEdgeAttentionLoss_en.md)
 
-* ## [paddleseg.models.losses.semantic_connectivity_learning](./SemanticConnectivityLearning_en.md)
+* ## [paddleseg.models.losses.semantic_connectivity_loss](./SemanticConnectivityLoss_en.md)
diff --git a/paddleseg/datasets/__init__.py b/paddleseg/datasets/__init__.py
@@ -26,3 +26,4 @@
 from .drive import DRIVE
 from .hrf import HRF
 from .chase_db1 import CHASEDB1
+from .pp_humanseg14k import PPHumanSeg14K
diff --git a/paddleseg/datasets/pp_humanseg14k.py b/paddleseg/datasets/pp_humanseg14k.py
@@ -0,0 +1,82 @@
+# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import os
+
+from .dataset import Dataset
+from paddleseg.cvlibs import manager
+from paddleseg.transforms import Compose
+
+
+@manager.DATASETS.add_component
+class PPHumanSeg14K(Dataset):
+    """
+    This is the PP-HumanSeg14K Dataset.
+
+    This dataset was introduced in the work:
+    Chu, Lutao, et al. "PP-HumanSeg: Connectivity-Aware Portrait Segmentation with a Large-Scale Teleconferencing Video Dataset." Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2022.
+
+    This dataset is divided into training set, validation set and test set. The training set includes 8770 pictures, the validation set includes 2431 pictures, and the test set includes 2482 pictures.
+
+    Args:
+        dataset_root (str, optional): The dataset directory. Default: None.
+        transforms (list, optional): Transforms for image. Default: None.
+        mode (str, optional): Which part of dataset to use. It is one of ('train', 'val', 'test'). Default: 'train'.
+        edge (bool, optional): Whether to compute edge while training. Default: False.
+    """
+    NUM_CLASSES = 2
+
+    def __init__(self,
+                 dataset_root=None,
+                 transforms=None,
+                 mode='train',
+                 edge=False):
+        self.dataset_root = dataset_root
+        self.transforms = Compose(transforms)
+        mode = mode.lower()
+        self.mode = mode
+        self.file_list = list()
+        self.num_classes = self.NUM_CLASSES
+        self.ignore_index = 255
+        self.edge = edge
+
+        if mode not in ['train', 'val', 'test']:
+            raise ValueError(
+                "`mode` should be 'train', 'val' or 'test', but got {}.".format(
+                    mode))
+
+        if self.transforms is None:
+            raise ValueError("`transforms` is necessary, but it is None.")
+
+        if mode == 'train':
+            file_path = os.path.join(self.dataset_root, 'train.txt')
+        elif mode == 'val':
+            file_path = os.path.join(self.dataset_root, 'val.txt')
+        else:
+            file_path = os.path.join(self.dataset_root, 'test.txt')
+
+        with open(file_path, 'r') as f:
+            for line in f:
+                items = line.strip().split(' ')
+                if len(items) != 2:
+                    if mode == 'train' or mode == 'val':
+                        raise Exception(
+                            "File list format incorrect! It should be"
+                            " image_name label_name\\n")
+                    image_path = os.path.join(self.dataset_root, items[0])
+                    grt_path = None
+                else:
+                    image_path = os.path.join(self.dataset_root, items[0])
+                    grt_path = os.path.join(self.dataset_root, items[1])
+                self.file_list.append([image_path, grt_path])
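
The file-list convention that `PPHumanSeg14K.__init__` relies on (one `image_name label_name` pair per line, relative to `dataset_root`) can be tried without PaddleSeg using the sketch below. The helper `read_file_list`, its `require_label` flag and the `images/` and `annotations/` names are hypothetical; the real layout of the dataset archive is not shown in this diff.

```python
import os
import tempfile


def read_file_list(dataset_root, list_name, require_label=True):
    """Parse lines of the form `image_name label_name` into path pairs."""
    pairs = []
    with open(os.path.join(dataset_root, list_name), 'r') as f:
        for line in f:
            items = line.strip().split(' ')
            if len(items) == 2:
                label = os.path.join(dataset_root, items[1])
            elif require_label:
                raise ValueError("Expected `image_name label_name` per line.")
            else:
                label = None  # e.g. an unlabeled test list
            pairs.append((os.path.join(dataset_root, items[0]), label))
    return pairs


# Build a throwaway dataset root with hypothetical file names.
root = tempfile.mkdtemp()
with open(os.path.join(root, 'train.txt'), 'w') as f:
    f.write('images/0001.jpg annotations/0001.png\n')
    f.write('images/0002.jpg annotations/0002.png\n')

train_pairs = read_file_list(root, 'train.txt')
```

As in the class above, a line without a label is only tolerated when labels are not required, which corresponds to the `test` mode.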
diff --git a/paddleseg/models/losses/__init__.py b/paddleseg/models/losses/__init__.py
@@ -33,4 +33,4 @@
 from .point_cross_entropy_loss import PointCrossEntropyLoss
 from .pixel_contrast_cross_entropy_loss import PixelContrastCrossEntropyLoss
 from .semantic_encode_cross_entropy_loss import SECrossEntropyLoss
-from .semantic_connectivity_learning import SemanticConnectivityLearning
+from .semantic_connectivity_loss import SemanticConnectivityLoss
diff --git a/paddleseg/models/losses/semantic_connectivity_loss.py b/paddleseg/models/losses/semantic_connectivity_loss.py
@@ -22,7 +22,7 @@
 
 
 @manager.LOSSES.add_component
-class SemanticConnectivityLearning(nn.Layer):
+class SemanticConnectivityLoss(nn.Layer):
     '''
     SCL (Semantic Connectivity-aware Learning) framework, which introduces a SC Loss (Semantic Connectivity-aware Loss)
     to improve the quality of segmentation results from the perspective of connectivity. Support multi-class segmentation.
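
Because this commit renames both the class and its module, user configs or imports that still use the old names will need the same substitution. A minimal sketch of such a rewrite is shown below, assuming a plain text replacement is sufficient. The `migrate` helper is hypothetical and is not part of PaddleSeg.

```python
import re

# Old name -> new name, as renamed in this commit.
RENAMES = {
    'SemanticConnectivityLearning': 'SemanticConnectivityLoss',
    'semantic_connectivity_learning': 'semantic_connectivity_loss',
}


def migrate(text):
    """Replace the old class and module names with the names from this commit."""
    for old, new in RENAMES.items():
        # Word boundaries avoid touching longer identifiers that merely contain the name.
        text = re.sub(r'\b{}\b'.format(re.escape(old)), new, text)
    return text


old_config = (
    "losses:\n"
    "  - type: CrossEntropyLoss\n"
    "  - type: SemanticConnectivityLearning\n"
)
new_config = migrate(old_config)
```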
