vision-rd

Research papers on temporal ML models for wildfire smoke detection and related topics.

Getting Started

Prerequisites

uv for Python dependency management
AWS credentials configured for access to the S3 bucket (s3://pyro-survey-research/dvc/)
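
A quick preflight check for the two prerequisites above can be sketched in Python. This is only an illustration: the function name `missing_prerequisites` is hypothetical, and it looks for AWS credentials in just two common places (the `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` environment variables and `~/.aws/credentials`). Other setups such as SSO profiles or instance roles are not detected, and the check does not verify access to the bucket itself.

```python
import os
import shutil
from pathlib import Path


def missing_prerequisites(env=None, home=None):
    """Return a list of problems; an empty list means both checks passed."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    problems = []

    # uv must be discoverable on PATH.
    if shutil.which("uv", path=env.get("PATH")) is None:
        problems.append("uv not found on PATH")

    # Assumption: credentials live in env vars or the shared credentials file.
    has_env_keys = bool(env.get("AWS_ACCESS_KEY_ID")) and bool(
        env.get("AWS_SECRET_ACCESS_KEY")
    )
    has_shared_file = (home / ".aws" / "credentials").is_file()
    if not (has_env_keys or has_shared_file):
        problems.append("no AWS credentials in environment or ~/.aws/credentials")

    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("MISSING:", problem)
```

Running the script prints one line per missing prerequisite and nothing when both are found.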

Installation

git clone <repo-url>
cd vision-rd
make install        # installs DVC and dependencies from uv.lock
make pull           # downloads PDFs and notes from S3

Adding a new paper

Drop the PDF into pdfs/ using the naming convention Year-Short-Title-Author.pdf
Add a row to papers.csv
Write reading notes in notes/ as year-short-title.md
Update SUMMARY.md with a description, and add the paper to the Papers table in README.md
Track and push the changes:

uv run dvc add pdfs/ notes/
make push
git add pdfs.dvc notes.dvc papers.csv SUMMARY.md README.md
git commit -m "Add paper: <title>"
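
The naming conventions and the papers.csv step above can be sketched as a few small helpers. Everything here is hypothetical: the function names are illustrative, the example file names are not taken from the repository, and `append_row` reuses whatever header papers.csv already has rather than assuming its columns. It also assumes the file ends with a newline and uses LF line endings.

```python
import csv
import re
from pathlib import Path


def slug(text):
    """Split text into alphanumeric words, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9]+", text)


def pdf_name(year, short_title, author):
    """Build a name following Year-Short-Title-Author.pdf."""
    words = [w[:1].upper() + w[1:] for w in slug(short_title) + slug(author)]
    return "-".join([str(year)] + words) + ".pdf"


def notes_name(year, short_title):
    """Build a name following year-short-title.md."""
    words = [w.lower() for w in slug(short_title)]
    return "-".join([str(year)] + words) + ".md"


def append_row(csv_path, row):
    """Append row (a dict) to csv_path using the header already in the file."""
    csv_path = Path(csv_path)
    with csv_path.open(newline="", encoding="utf-8") as f:
        header = next(csv.reader(f))
    # Refuse keys that are not existing columns instead of silently dropping them.
    unknown = set(row) - set(header)
    if unknown:
        raise ValueError(f"columns not in {csv_path.name}: {sorted(unknown)}")
    with csv_path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=header, lineterminator="\n")
        writer.writerow(row)
```

For example, `pdf_name(2022, "SmokeyNet", "Dewangan")` gives `2022-SmokeyNet-Dewangan.pdf` and `notes_name(2022, "SmokeyNet")` gives `2022-smokeynet.md`. Columns missing from `row` are written as empty fields.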

Available commands

make install        Install dependencies from uv.lock
make pull           Pull PDF data and notes from S3 via DVC
make push           Push PDF data and notes to S3 via DVC

Structure

vision-rd/
├── README.md        # This file
├── SUMMARY.md       # Narrative summary grouped by theme
├── papers.csv       # Structured metadata for all papers
├── pdfs/            # PDF files (DVC-tracked)
├── notes/           # Per-paper reading notes (DVC-tracked)
├── Makefile         # install / pull / push
├── pyproject.toml   # Python dependencies (dvc[s3])
└── uv.lock          # Lockfile
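
Given this layout, a small consistency check can list PDFs that have no reading notes yet. This is a sketch under assumptions: the function name `pdfs_without_notes` is hypothetical, it expects pdfs/ and notes/ to have been pulled already, and it pairs files by assuming that a note's stem (year-short-title) is a case-insensitive prefix of the PDF's stem (Year-Short-Title-Author), as the two naming conventions suggest.

```python
from pathlib import Path


def pdfs_without_notes(root="."):
    """Return names of PDFs in pdfs/ with no matching file in notes/."""
    root = Path(root)
    note_stems = [p.stem.lower() for p in (root / "notes").glob("*.md")]
    missing = []
    for pdf in sorted((root / "pdfs").glob("*.pdf")):
        stem = pdf.stem.lower()
        # A note matches when its stem equals the PDF stem
        # or is a prefix of it that ends at a hyphen.
        if not any(stem == n or stem.startswith(n + "-") for n in note_stems):
            missing.append(pdf.name)
    return missing


if __name__ == "__main__":
    for name in pdfs_without_notes():
        print("no notes for", name)
```

Files that deviate from the naming conventions will be reported as missing, so treat the output as a prompt to look rather than a definitive answer.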

Papers (28)

Year	Paper	Category	Architecture / Focus	PDF	Notes
2020	Lightweight Student LSTM (Jeong et al.)	Temporal	YOLOv3 + LSTM, teacher-student distillation	pdf	notes
2020	ELASTIC-YOLOv3 + Fire-Tube (Park & Ko)	Temporal	YOLOv3 + fire-tube + BoF + random forest	pdf	notes
2021	TimeSformer (Bertasius et al.)	Video Foundation	Divided space-time attention	pdf	notes
2021	ViViT (Arnab et al.)	Video Foundation	Video Vision Transformer, 4 factorizations	pdf	notes
2021	LSTR (Xu et al.)	Online Detection	Long Short-Term Transformer with long- and short-term memory	pdf	notes
2022	Nemo / DETR (Yazdi et al.)	Spatial	DETR for wildfire smoke, open-source benchmark	pdf	notes
2022	SlowFastMTB (Choi et al.)	Temporal	SlowFast + MTB bounding box algorithm	pdf	notes
2022	SmokeyNet (Dewangan et al.)	Temporal	CNN (ResNet34) + LSTM + ViT on tiled frames	pdf	notes
2022	TeSTra (Zhao & Krähenbühl)	Online Detection	Temporal smoothing kernels, O(1) per frame	pdf	notes
2022	VideoMAE (Tong et al.)	Video Foundation	Masked video autoencoder, data-efficient	pdf	notes
2023	VideoMAE V2 (Wang et al.)	Video Foundation	Dual masking, billion-scale, progressive training	pdf	notes
2024	Beyond Few-Shot OD Survey (Li et al.)	Few-Shot	5 categories of few-shot detection	pdf	notes
2024	FLAME (Gragnaniello et al.)	Temporal	DNN + GMM background subtraction + tracking FSM	pdf	notes
2024	MATR (Song et al.)	Online Detection	Memory-augmented Transformer for streaming	pdf	notes
2024	PyroNear2025 Dataset (Lostanlen et al.)	Dataset	150k annotations, 50k images, 640 wildfires	pdf	notes
2024	Smoke-DETR (Sun & Cheng)	Spatial	RT-DETR + ECPConv + EMA + MFFPN	pdf	notes
2024	SmokeBench (Qi et al.)	Benchmark	Multimodal LLM evaluation on FIgLib	pdf	notes
2024	Ultra-lightweight (Chaturvedi et al.)	Spatial	Conv-Transformer, 0.6M params, edge deploy	pdf	notes
2024	Video Anomaly Survey (Liu et al.)	Survey	10-year survey, reconstruction + MIL methods	pdf	notes
2024	YOLOv10 (Wang et al.)	General	NMS-free YOLO, edge-friendly	pdf	notes
2025	CCPE Swin (Wang et al.)	Spatial	Swin + Cross Contrast Patch Embedding	pdf	notes
2025	Comprehensive DL Review (Elhanashi et al.)	Survey	CNNs, RNNs, YOLO, transformers, spatiotemporal	pdf	notes
2025	Datasets 20-Year Review (Haeri Boroujeni et al.)	Survey	29 fire/smoke datasets across modalities	pdf	notes
2025	Few-Shot Remote Sensing (Zhang et al.)	Few-Shot	Domain adaptation with limited labels	pdf	notes
2025	RT-DETR-Smoke (Wang et al.)	Spatial	RT-DETR + CoordAtt + WShapeIoU, 445 FPS	pdf	notes
2025	Small Object Detection Survey	Survey	Multi-scale, super-resolution, attention	pdf	notes
2025	ViT on the Edge Survey	Survey	Pruning, quantization, knowledge distillation	pdf	notes
2026	ViT + 3D-CNN (Lilhore et al.)	Temporal	ViT + 3D-CNN + Transformer encoder	pdf	notes

Related Repos

time-wildfire -- Temporal smoke detection with EfficientNet, 3D ResNet, VideoMAE, ViViT, CNN+Transformer backbones + SAM3 tracking
