Community tool for long-video batch processing with SAM2 (simple-sam2)

Hi SAM2 team,

I built a lightweight wrapper around the SAM2 video predictor that makes
long-video segmentation practical. SAM2 currently loads the entire video
into memory, which makes it impractical for longer clips.

simple-sam2 adds:
- Batch processing: only `batch_size` frames are held in memory at once, keeping
  GPU/CPU memory usage constant regardless of video length

- Unified prompt API: mask, points, and box in a single call (SAM2 natively
  takes a mask and points/box prompts separately)

- Direct video file input: pass any .mp4/.avi/.mov file directly; frames are
  extracted automatically via OpenCV, with no manual preprocessing needed

- Selective frame range processing: specify `start_frame_idx` and `end_frame_idx`
  to segment only part of the video instead of the entire clip, which is
  useful for long videos where only a specific segment is of interest

- Persistent storage layout: extracted frames and output masks are organized
  under a canonical directory structure, so re-runs skip re-extraction and
  masks are always easy to locate

- Carry-over mask mechanism: object identity is maintained across batch
  boundaries by propagating the last frame's mask as the seed for the next batch
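
For anyone curious how batching, the frame range, and the carry-over mask fit together, here is a rough sketch of the idea. It is a simplified illustration, not the actual simple-sam2 code: `batch_ranges`, `propagate_in_batches`, and the `segment_batch` callback are hypothetical names, and it assumes non-overlapping batches and an inclusive `end_frame_idx`.

```python
from typing import Callable, Iterator, List, Tuple, TypeVar

Mask = TypeVar("Mask")  # stand-in for whatever mask type the predictor returns


def batch_ranges(
    start_frame_idx: int, end_frame_idx: int, batch_size: int
) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (first, last) frame index pairs covering the range.

    Assumption: batches do not overlap and end_frame_idx is inclusive.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if end_frame_idx < start_frame_idx:
        raise ValueError("end_frame_idx must be >= start_frame_idx")
    first = start_frame_idx
    while first <= end_frame_idx:
        last = min(first + batch_size - 1, end_frame_idx)
        yield first, last
        first = last + 1


def propagate_in_batches(
    start_frame_idx: int,
    end_frame_idx: int,
    batch_size: int,
    initial_mask: Mask,
    segment_batch: Callable[[int, int, Mask], List[Mask]],
) -> Iterator[Tuple[int, Mask]]:
    """Segment a frame range batch by batch, carrying the last mask forward.

    `segment_batch(first, last, seed)` is a hypothetical callback that loads
    only frames first..last, prompts the first frame with `seed`, and returns
    one mask per frame. Only one batch of masks is alive at a time.
    """
    seed = initial_mask
    for first, last in batch_ranges(start_frame_idx, end_frame_idx, batch_size):
        masks = segment_batch(first, last, seed)
        if len(masks) != last - first + 1:
            raise ValueError("segment_batch must return one mask per frame")
        for offset, mask in enumerate(masks):
            yield first + offset, mask
        # Carry-over: the last mask of this batch seeds the next batch.
        seed = masks[-1]
```

Because `propagate_in_batches` is a generator, the caller can write each mask to disk as it arrives instead of collecting masks for the whole video.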

PyPI: https://pypi.org/project/simple-sam2/
GitHub: https://github.com/varun-kolluru/simple_sam2

Sharing here in case it's useful to others in the community. Happy to 
answer questions or take feedback.
