I regularly update [my blog on Towards Data Science](https://medium.com/@patrickllgc).
@@ -66,7 +35,7 @@
## 2022-07 (3)
- [PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark](https://arxiv.org/abs/2203.11089) [[Notes](paper_notes/persformer.md)] [BEVNet, lane line]
- [VectorMapNet: End-to-end Vectorized HD Map Learning](https://arxiv.org/abs/2206.08920) [[Notes](paper_notes/vectormapnet.md)] [BEVNet, LLD, Hang Zhao]
-- [PETR: Position Embedding Transformation for Multi-View 3D Object Detection](https://arxiv.org/abs/2203.05625) [[Notes](paper_notes/petr.md)] [BEVNet]
+- [PETR: Position Embedding Transformation for Multi-View 3D Object Detection](https://arxiv.org/abs/2203.05625) [[Notes](paper_notes/petr.md)] <kbd>ECCV 2022</kbd> [BEVNet]
- [PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images](https://arxiv.org/abs/2206.01256) [BEVNet, MegVii]
- [LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation](https://arxiv.org/abs/2206.13294) [Valeo]
paper_notes/petr.md (+2 -2)
@@ -22,8 +22,8 @@ The PETR idea resembles [CoordConv](coord_conv.md) and [CamConv](cam_conv.md), but ...
- [PETR](petr.md) converges slower than [DETR3D](detr3d.md). The authors argue that PETR learns the 3D correlation through global attention, while DETR3D perceives the 3D scene within local regions (with the help of explicit 3D-2D feature projection).
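To make the contrast concrete, below is a minimal PyTorch-style sketch of the "local" alternative being described: a DETR3D-style object query keeps a 3D reference point, projects it into the image with the camera matrix, and samples the feature only at that projected pixel, instead of attending globally over all image tokens as PETR does. This is an illustrative sketch, not code from either paper; the function name `sample_at_projection`, the `lidar2img` matrix convention, and the single-camera, single-scale setup are assumptions made for the example.

```python
import torch
import torch.nn.functional as F

def sample_at_projection(feat, ref_points_3d, lidar2img):
    """feat: (B, C, H, W) image features of one camera, assumed to be at the
        resolution that lidar2img maps to.
    ref_points_3d: (B, Q, 3) per-query 3D reference points in ego/world coordinates.
    lidar2img: (B, 4, 4) projection from those 3D coordinates to pixel coordinates.
    Returns (B, C, Q): one feature vector per query, gathered only at the projected point.
    """
    B, C, H, W = feat.shape
    ones = torch.ones_like(ref_points_3d[..., :1])
    pts = torch.cat([ref_points_3d, ones], dim=-1)                # (B, Q, 4) homogeneous
    cam = torch.einsum("bij,bqj->bqi", lidar2img, pts)            # project into the image
    uv = cam[..., :2] / cam[..., 2:3].clamp(min=1e-5)             # perspective divide
    # normalize pixel coordinates to [-1, 1] as expected by grid_sample
    grid = torch.stack([uv[..., 0] / (W - 1), uv[..., 1] / (H - 1)], dim=-1) * 2 - 1
    sampled = F.grid_sample(feat, grid.unsqueeze(2), align_corners=True)  # (B, C, Q, 1)
    return sampled.squeeze(-1)
```

In DETR3D proper this sampling is repeated over multiple cameras and feature levels and the gathered features update the queries layer by layer; the point here is only how localized the 3D-2D feature lookup is.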
#### Technical details
-- The authors argue that in [DETR3D](detr3d.md) only the image feature at the projected point is collected, which fails to perform representation learning from a global view. --> Actually this may not be that big an issue for BEV perception, especially for object detection, which requires very localized attention. **I would rather consider this an advantage** of [DETR3D](detr3d.md) and similar methods, such as [BEVFormer](bevformer.md).
-- The parameter settings in many of the experiments do not matter that much, and may in part reflect the authors' lack of domain knowledge in 3D object detection for autonomous driving. (The authors are, admittedly, top experts in 2D object detection.) For example, the Table 4 ablation study is not necessary, in particular the Z range of -10 to 10 meters.
+- The authors argue that in [DETR3D](detr3d.md) only the image feature at the projected point is collected, which fails to perform representation learning from a global view. --> Actually this may not be that big an issue for BEV perception, especially for object detection, which requires very localized attention. **I would rather consider this an advantage** of [DETR3D](detr3d.md) and similar methods, such as [BEVFormer](bevformer.md). --> Maybe adding this explicit 2D-3D link would boost the performance even further, with faster convergence?
+- The parameter settings in many of the experiments do not matter that much. For example, the Table 4 ablation study is not necessary, in particular the Z range of -10 to 10 meters.
- In Fig. 3, FC seems to stand for "fully convolutional". It is chosen to be a 1x1 convolution in the ablation study in Table 5. **What is surprising is that if 3x3 is used instead of 1x1 in the feature blending, the network cannot converge.** --> The authors argue that this breaks the correspondence between the 2D feature and the 3D position. This is fishy.
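For reference, here is a rough sketch of the step these notes keep coming back to: PETR lifts every 2D feature location to a set of candidate depths along its camera ray, normalizes the resulting 3D coordinates by the perception range, and maps them through a small 1x1-conv head (the "FC" of Fig. 3) into a 3D position embedding that is added to the image features before the DETR-style decoder. This is a hand-written approximation under assumed shapes and names (`make_3d_pe`, `img2world`, `pe_head`), not the authors' implementation.

```python
import torch
import torch.nn as nn

def make_3d_pe(feat, img2world, depth_bins, pc_range, pe_head):
    """feat: (B, C, H, W) image features from one camera.
    img2world: (B, 4, 4) mapping from (u*d, v*d, d, 1) to homogeneous world coords (assumed given).
    depth_bins: (D,) candidate depths along each pixel ray.
    pc_range: (x_min, y_min, z_min, x_max, y_max, z_max) perception range.
    pe_head: small conv head mapping D*3 channels -> C channels (the "FC" in Fig. 3).
    """
    B, C, H, W = feat.shape
    D = depth_bins.numel()
    v, u = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    u = u.float().unsqueeze(0).expand(D, H, W)
    v = v.float().unsqueeze(0).expand(D, H, W)
    d = depth_bins.view(D, 1, 1).expand(D, H, W)
    # frustum points in homogeneous image coordinates: (u*d, v*d, d, 1)
    pts = torch.stack([u * d, v * d, d, torch.ones_like(d)], dim=-1)   # (D, H, W, 4)
    pts = pts.unsqueeze(0).expand(B, D, H, W, 4)
    # lift to 3D world coordinates with the per-camera inverse projection
    world = torch.einsum("bij,bdhwj->bdhwi", img2world, pts)[..., :3]  # (B, D, H, W, 3)
    # normalize into [0, 1] by the perception range
    lo = world.new_tensor(pc_range[:3])
    hi = world.new_tensor(pc_range[3:])
    world = (world - lo) / (hi - lo)
    # stack the D*3 coordinate channels and turn them into a C-channel embedding
    coords = world.permute(0, 1, 4, 2, 3).reshape(B, D * 3, H, W)
    return feat + pe_head(coords)  # position-aware features fed to the DETR-style decoder

# The "FC" head of Fig. 3: the ablation finds 1x1 convs work, while 3x3 does not converge.
# D = 64 depth bins and C = 256 feature channels are assumed values for this example.
pe_head = nn.Sequential(nn.Conv2d(64 * 3, 256, 1), nn.ReLU(), nn.Conv2d(256, 256, 1))
```

A 1x1 head keeps each position embedding a function of its own ray's coordinates only; a 3x3 kernel would mix coordinates from neighboring rays, which seems to be the "broken 2D-3D correspondence" the authors blame for the non-convergence.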