README.md (+4 -3)
@@ -60,10 +60,13 @@ I regularly update [my blog in Toward Data Science](https://medium.com/@patrickl
 - [Paper Reading in 2019](https://towardsdatascience.com/the-200-deep-learning-papers-i-read-in-2019-7fb7034f05f7?source=friends_link&sk=7628c5be39f876b2c05e43c13d0b48a3)


-## 2021-09 (2)
+## 2021-09 (3)
 - [DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection?](https://arxiv.org/abs/2108.06417)[[Notes](paper_notes/dd3d.md)][mono3D, Toyota]
+- [EfficientDet: Scalable and Efficient Object Detection](https://arxiv.org/abs/1911.09070)[[Notes](paper_notes/efficientdet.md)] <kbd>CVPR 2020</kbd> [BiFPN, Tesla AI day]
 - [PnPNet: End-to-End Perception and Prediction with Tracking in the Loop](https://arxiv.org/abs/2005.14711)[[Notes](paper_notes/pnpnet.md)] <kbd>CVPR 2020</kbd> [Uber ATG]
 - [MP3: A Unified Model to Map, Perceive, Predict and Plan](https://arxiv.org/abs/2101.06806)[[Notes](paper_notes/mp3.md)] <kbd>CVPR 2021</kbd> [Uber, planning]
+- [PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection](https://arxiv.org/abs/1912.13192) <kbd>CVPR 2020</kbd> [Waymo challenge 2nd place]
+- [LiDAR R-CNN: An Efficient and Universal 3D Object Detector](https://arxiv.org/abs/2103.15297) <kbd>CVPR 2021</kbd> [TuSimple, Naiyan Wang]
 - [PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation](https://arxiv.org/abs/1711.10871) <kbd>CVPR 2018</kbd> [sensor fusion, Zoox]
 - [Deep Hough Voting for 3D Object Detection in Point Clouds](https://arxiv.org/abs/1904.09664) <kbd>ICCV 2019</kbd> [Charles Qi]
 - [StixelNet: A Deep Convolutional Network for Obstacle Detection and Road Segmentation](http://www.bmva.org/bmvc/2015/papers/paper109/paper109.pdf)
-- [PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection](https://arxiv.org/abs/1912.13192) <kbd>CVPR 2020</kbd> [Waymo challenge 2nd place]
 - [PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation](https://arxiv.org/abs/2003.14032) <kbd>CVPR 2020</kbd>
+# [EfficientDet: Scalable and Efficient Object Detection](https://arxiv.org/abs/1911.09070)
+
+_September 2021_
+
+tl;dr: BiFPN and compound (multidimensional) scaling for object detection.
+
+#### Overall impression
+This paper follows up on the work of [EfficientNet](efficientnet.md). The FPN neck is essentially a multi-scale feature fusion module: it looks for a transformation that effectively aggregates features from different levels and outputs a new list of features.
+
+#### Key ideas
+- BiFPN (bidirectional FPN) (<-- PANet <-- FPN)
+  - [PANet](panet.md) introduces a bottom-up pathway again on top of FPN.
+  - **Remove nodes** from PANet that have only one input edge.
+  - **Add a skip connection** from the original input to the output node if they are at the same level.
+  - **Repeat** the resulting BiFPN block multiple times.
+- Weighted feature fusion
+  - Baseline is to resize and sum up. Each feature may contribute with a different weight (feature-level attention).
+  - Softmax works, but a linear weighting normalization may work as well.
+- Multidimensional/compound scaling is more effective than scaling a single dimension. The dimensions are resolution, depth and width.
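The two normalization options mentioned under weighted feature fusion can be sketched as below. This is a minimal pure-Python illustration, not the authors' implementation: the function names `softmax_fusion` and `fast_normalized_fusion`, the flat-list feature representation, and the default `eps=1e-4` are assumptions made for this example, and the input features are assumed to be already resized to the same shape.

```python
import math


def _weighted_sum(features, norm_weights):
    # Element-wise weighted sum over features that share the same length.
    return [
        sum(w * v for w, v in zip(norm_weights, values))
        for values in zip(*features)
    ]


def softmax_fusion(features, weights):
    # Normalize the learnable weights with a softmax before fusing.
    exps = [math.exp(w) for w in weights]
    total = sum(exps)
    return _weighted_sum(features, [e / total for e in exps])


def fast_normalized_fusion(features, weights, eps=1e-4):
    # Linear normalization: clamp weights at zero (ReLU), then divide by
    # their sum. eps (assumed value) guards against division by zero.
    clamped = [max(w, 0.0) for w in weights]
    total = sum(clamped) + eps
    return _weighted_sum(features, [c / total for c in clamped])


if __name__ == "__main__":
    p_top_down = [1.0, 2.0]   # hypothetical feature from the top-down path
    p_lateral = [3.0, 4.0]    # hypothetical feature from the same level
    print(softmax_fusion([p_top_down, p_lateral], [1.0, 1.0]))
    print(fast_normalized_fusion([p_top_down, p_lateral], [1.0, 1.0]))
```

With equal weights both variants reduce to (almost exactly) a plain average. The linear variant avoids computing exponentials, and a negative weight simply switches the corresponding input off.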
+
+#### Technical details
+- [NAS-FPN](nas_fpn.md) has repeated irregular blocks.
+- Simply repeating FPN blocks does not bring much benefit. Repeating PANet blocks works better, and repeated BiFPN blocks yield similar results with much less computation.
+- This still needs object assignments, like [RetinaNet](retinanet.md).
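The last point refers to matching anchors to ground-truth objects before computing the loss. A rough sketch of an IoU-threshold assignment in the RetinaNet style is given below. The helper names `iou` and `assign_anchors`, the label encoding (`-1` for background, `-2` for ignored), and the thresholds 0.5 and 0.4 (the values commonly cited for RetinaNet) are assumptions for illustration, not something stated in this note.

```python
def iou(a, b):
    # Intersection over union of two boxes given as (x1, y1, x2, y2).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_anchors(anchors, gt_boxes, pos_thr=0.5, neg_thr=0.4):
    # Per anchor: index of the matched ground-truth box, -1 for background,
    # -2 for anchors that fall between the two thresholds and are ignored.
    labels = []
    for anchor in anchors:
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(gt_boxes):
            overlap = iou(anchor, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_iou >= pos_thr:
            labels.append(best_idx)
        elif best_iou < neg_thr:
            labels.append(-1)
        else:
            labels.append(-2)
    return labels


if __name__ == "__main__":
    gt = [(0.0, 0.0, 10.0, 10.0)]
    anchors = [(0.0, 0.0, 10.0, 10.0), (20.0, 20.0, 30.0, 30.0), (0.0, 0.0, 10.0, 4.5)]
    print(assign_anchors(anchors, gt))
```

The third anchor in the usage example overlaps the ground-truth box with IoU 0.45, so it is neither positive nor background and is left out of the loss.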
paper_notes/efficientnet.md (+2)
@@ -11,6 +11,8 @@ The paper proposed a simple yet principled method to scale up networks. The main

 On the other hand, the MobileNets papers ([v1](mobilenets_v1.md), [v2](mobilenets_v2.md) and [v3](mobilenets_v3.md)) go the other way round. They start with an efficient network and scale it down further. The channel and resolution scaling factors are usually smaller than 1. Note that **MobileNetv3-Large optimizes based on MnasNet**. Therefore EfficientNet-B* is really all about how to scale up MobileNet, and tells us that a beefed-up MobileNet is better than ResNet. In the original [MobileNetsv1](mobilenets_v1.md)

+This paper inspired follow-up work [EfficientDet](efficientdet.md), also by Quoc Le's team.
+
 #### Key ideas
 - The balance of width/depth/resolution can be achieved by simply scaling each of them with a constant ratio.
 - Deeper network captures richer and more complex features
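Scaling all three dimensions with a constant ratio can be sketched as follows. The coefficients `ALPHA`, `BETA` and `GAMMA` are the values commonly quoted from the EfficientNet paper's grid search; they are not stated in this note, so treat them (and the function names) as assumptions for illustration.

```python
# Assumed base ratios for depth, width and resolution (not given in this note).
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15


def compound_scale(phi):
    # Return (depth, width, resolution) multipliers for compound coefficient phi.
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_multiplier(phi):
    # FLOPs grow roughly linearly with depth and quadratically with
    # width and resolution.
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2


if __name__ == "__main__":
    for phi in range(4):
        depth, width, resolution = compound_scale(phi)
        print(phi, round(depth, 3), round(width, 3), round(resolution, 3),
              round(flops_multiplier(phi), 3))
```

With these ratios the product `ALPHA * BETA**2 * GAMMA**2` is about 1.92, so each increment of `phi` roughly doubles the compute budget.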