Hi, thank you for sharing your code! It's of great value to the community.
I am interested in using your model for online video understanding, working from a stream of images rather than batch-processing a video file. Is there a way to achieve this without retraining your model? For example, by decomposing the sequence and forwarding latent tokens as context? Essentially, I want to be able to comment on live video streams with TTS.
Cheers,
Theo
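To make the idea concrete, here is a minimal sketch of what I have in mind: keep a rolling cache of per-frame latent tokens and condition each new comment on the accumulated context. The functions `encode_frame` and `decode_comment` are hypothetical stand-ins, not part of your repository's API.

```python
from collections import deque

WINDOW = 4  # number of past frames whose latents are kept as context

def encode_frame(frame):
    # Placeholder frame encoder: a real implementation would run the
    # model's visual backbone and return that frame's latent tokens.
    return [f"tok({frame})"]

def decode_comment(context_tokens):
    # Placeholder decoder: a real implementation would condition the
    # language head (and then TTS) on the accumulated latent tokens.
    return f"comment over {len(context_tokens)} tokens"

def stream_comments(frames, window=WINDOW):
    # Rolling latent-token cache: old frame latents are evicted
    # automatically once the window is full.
    cache = deque(maxlen=window)
    for frame in frames:
        cache.append(encode_frame(frame))
        context = [tok for latents in cache for tok in latents]
        yield decode_comment(context)

comments = list(stream_comments(range(6)))
```

Is something along these lines feasible with the released checkpoints, or does the model assume it sees the full clip at once?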