Community tool for long-video batch processing with SAM2 (simple-sam2) #757

@varun-kolluru

Hi SAM2 team,

I built a lightweight wrapper around the SAM2 video predictor that addresses
long-video segmentation. SAM2 currently loads the entire video into memory,
which makes it impractical for longer clips.

simple-sam2 adds:

  • Batch processing: only batch_size frames are held in memory at once, keeping
    GPU/CPU memory usage constant regardless of video length

  • Unified prompt API: mask, points, and box in a single call (SAM2 natively
    requires mask OR points+box separately)

  • Direct video file input: pass any .mp4/.avi/.mov directly; frames are
    extracted automatically via OpenCV, with no manual preprocessing needed

  • Selective frame range processing: specify start_frame_idx and end_frame_idx
    to segment only a portion of the video instead of processing the entire clip,
    useful for long videos where only a specific segment is of interest

  • Persistent storage layout: extracted frames and output masks are organized
    under a canonical directory structure, so re-runs skip re-extraction and
    masks are always easy to locate

  • Carry-over mask mechanism: object identity is maintained across batch
    boundaries by propagating the last frame's mask as the seed for the next batch
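
To make the batching, frame-range, and carry-over ideas above concrete, here is a minimal sketch in plain Python. It is not the simple-sam2 implementation or API: `batch_ranges`, `segment_in_batches`, and the `segment_batch` callback are hypothetical names chosen for illustration, and the callback stands in for whatever runs the SAM2 video predictor on one batch. Only `batch_size`, `start_frame_idx`, and `end_frame_idx` are taken from the description.

```python
from typing import Callable, Iterator, List, Tuple, TypeVar

Mask = TypeVar("Mask")  # stand-in for whatever mask type the model returns


def batch_ranges(
    start_frame_idx: int, end_frame_idx: int, batch_size: int
) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (first, last) frame index pairs covering the range."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    first = start_frame_idx
    while first <= end_frame_idx:
        last = min(first + batch_size - 1, end_frame_idx)
        yield first, last
        first = last + 1


def segment_in_batches(
    start_frame_idx: int,
    end_frame_idx: int,
    batch_size: int,
    initial_mask: Mask,
    segment_batch: Callable[[int, int, Mask], List[Mask]],
) -> Iterator[Tuple[int, Mask]]:
    """Yield (frame_idx, mask) pairs, one batch at a time.

    `segment_batch(first, last, seed)` is a hypothetical callback that
    segments frames first..last (inclusive), seeded with `seed`, and
    returns one mask per frame in order.
    """
    seed = initial_mask
    for first, last in batch_ranges(start_frame_idx, end_frame_idx, batch_size):
        batch_masks = segment_batch(first, last, seed)
        for offset, mask in enumerate(batch_masks):
            yield first + offset, mask
        # Carry-over: the last mask of this batch seeds the next batch.
        seed = batch_masks[-1]
```

Because `segment_in_batches` is a generator, a caller can write each mask to disk as it arrives and keep at most one batch of results alive at a time, which is the property that keeps memory use independent of video length in this sketch.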

PyPI: https://pypi.org/project/simple-sam2/
GitHub: https://github.com/varun-kolluru/simple_sam2

Sharing here in case it's useful to others in the community. Happy to
answer questions or take feedback.
