This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥
-
Updated
Sep 23, 2024
This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥
A list of recent papers, libraries and datasets about 3D shape/scene analysis (by topics, updating).
Lightweight models for real-time semantic segmentationon PyTorch (include SQNet, LinkNet, SegNet, UNet, ENet, ERFNet, EDANet, ESPNet, ESPNetv2, LEDNet, ESNet, FSSNet, CGNet, DABNet, Fast-SCNN, ContextNet, FPENet, etc.)
PyTorch implementation of multi-task learning architectures, incl. MTI-Net (ECCV2020).
😎 A list of awesome scene understanding papers.
Official PyTorch implementation of FB-BEV & FB-OCC - Forward-backward view transformation for vision-centric autonomous driving perception
🔥🔥Official Repository for Multi-Human-Parsing (MHP)🔥🔥
[ECCV'20] Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling
Benchmarking Panoptic Scene Graph Generation (PSG), ECCV'22
Implementation of CVPR'20 Oral: Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image
A paper list of RGBD semantic segmentation (processing)
Code of ICLR2023 paper "TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding" and ECCV2022 paper "Inverted Pyramid Multi-task Transformer for Dense Scene Understanding"
[AAAI 2020] Towards Ghost-free Shadow Removal via Dual Hierarchical Aggregation Network and Shadow Matting GAN
[CVPR 2024] "LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning"; an interactive Large Language 3D Assistant.
[CVPR 2024] DiffuScene: Denoising Diffusion Models for Generative Indoor Scene Synthesis
[ECCV'18] 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation
Implementation of CVPR'21: RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction
Pytorch code for ECCV'22 paper. ShAPO: Implicit Representations for Multi-Object Shape, Appearance and Pose Optimization
Generative Camera Dolly: Extreme Monocular Dynamic Novel View Synthesis (ECCV 2024 Oral) - Official Implementation
Add a description, image, and links to the scene-understanding topic page so that developers can more easily learn about it.
To associate your repository with the scene-understanding topic, visit your repo's landing page and select "manage topics."