Boombox in Python MVP #71

Open
Noarkhh opened this issue Mar 26, 2025 · 2 comments · May be fixed by #88 or #92
@Noarkhh
Contributor

Noarkhh commented Mar 26, 2025

The objective is to create some sort of Python package that could start and interact with Boombox.

Running Elixir in Python

This interoperability is crucial for our goal of having a Python API for Boombox.

Existing solutions:

  • ErlPort - runs Python as an Erlang port. Probably not a good fit for our case, since it starts Python from Erlang, not the other way around.
  • Pyrlang - runs Python as an Erlang node.

Python multimedia libraries

One of the most valuable things Boombox could provide is taking some input and exposing the raw video from it as a data structure (e.g. NumPy arrays), which could then be easily passed to a model.
Boombox would also allow for getting raw frames from Python and storing or streaming them in some way.

Existing solutions:

  • scikit-video - provides an ffmpeg wrapper that allows for anything that ffmpeg can do. Their API is a generator of raw video frames for input and regular function calls for output.
  • OpenCV - pretty powerful; it can handle basically any popular container format as well as some more established network protocols like RTMP, RTSP and most likely any other one that ffmpeg can handle.
  • PyAV - another ffmpeg wrapper, which also allows for audio operations.

I think the selling point of Boombox could be support for things that ffmpeg struggles with, most notably WebRTC.

We concluded that creating a POC would be a good idea.

@Noarkhh Noarkhh added this to Smackore Mar 26, 2025
@Noarkhh Noarkhh moved this to Todo in Smackore Mar 26, 2025
@Noarkhh Noarkhh self-assigned this Mar 26, 2025
@Noarkhh Noarkhh moved this from Todo to In Progress in Smackore Mar 26, 2025
@Noarkhh Noarkhh changed the title Boombox in Python Boombox in Python POC Apr 1, 2025
@darthez
Contributor

darthez commented Apr 1, 2025

Let's have a first evaluation of where we are by April 15th.

First experiment to conduct:

  • Run Boombox.run(input_mp4, output_mp4) in Python

@Noarkhh
Contributor Author

Noarkhh commented Apr 15, 2025

Progress

  • A simple proxy for communicating with Boombox via messages. It's based on the Stream interface, which is not ideal, but it will do for now. One shortcoming worth noting is that right now, when Boombox produces packets, there is a lag of a single frame. For most use cases this won't be an issue, but the proxy should probably be rewritten in the future.
  • A Boombox class in Python that utilizes Pyrlang to communicate with the Erlang node that Boombox is running on. It operates by exchanging messages with the previously mentioned proxy (Gateway).
  • A successful experiment: streaming some generated NumPy arrays to Boombox and broadcasting them with HLS.

To do

Improve how the messages between Pyrlang and Boombox gateway are structured.

There has to be some sort of serialization of the image data. The default data structure for images in Python will most likely be a NumPy array, and in Elixir we'll most likely stick to the currently used Vix.Vips.Image.t() struct. When an image is sent from Python to Elixir we need to serialize it to binary data, since we can't just send a NumPy array and we can't create the Vips struct on the Python side.

The binary data is not enough on its own; we also need to send some metadata:

  • width
  • height
  • channels
  • timestamp - it's not crucial for deserializing the image, but Boombox requires timestamps, so it also has to be provided. We need to decide whether we want the Python user to set timestamps manually, assume a fixed framerate, or generate the timestamps based on the time the image is processed.
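
To make the timestamp options a bit more concrete, here is a rough sketch of the two automatic variants (fixed framerate and processing time). The function names and the choice of milliseconds as the unit are my assumptions for illustration, nothing here is decided yet; the manual variant is simply the user passing their own value.

```python
import time
from fractions import Fraction
from typing import Iterator


def fixed_rate_timestamps(fps: Fraction, start_ms: int = 0) -> Iterator[int]:
    """Yield timestamps (assumed unit: milliseconds) for a constant framerate."""
    frame_index = 0
    while True:
        # Derive each value from the frame index so rounding does not accumulate.
        yield start_ms + int(frame_index * 1000 / fps)
        frame_index += 1


def wall_clock_timestamps() -> Iterator[int]:
    """Yield milliseconds elapsed since the first timestamp was requested."""
    start = time.monotonic()
    while True:
        yield int((time.monotonic() - start) * 1000)
```

The fixed-rate variant is deterministic and fits generated or file-based frames, while the wall-clock variant is probably closer to what a live source (e.g. a camera loop) would need.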

We can just send a map with these fields and the raw data; that should suffice.
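
A minimal sketch of what such a map could look like on the Python side. The function names, the key names and the 8-bit row-major (height, width, channels) layout are assumptions for illustration, not an agreed format. To keep the sketch dependency-free it works on plain bytes; with NumPy the payload would come from `arr.tobytes()` and be restored with `np.frombuffer(data, dtype=np.uint8).reshape(height, width, channels)`.

```python
from typing import Any, Dict, Tuple


def pack_frame(data: bytes, width: int, height: int, channels: int,
               timestamp_ms: int) -> Dict[str, Any]:
    """Bundle raw pixel data with the metadata needed to decode it."""
    expected = width * height * channels  # assumes one byte per sample
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(data)}")
    return {
        "width": width,
        "height": height,
        "channels": channels,
        "timestamp": timestamp_ms,
        "data": data,
    }


def unpack_frame(frame: Dict[str, Any]) -> Tuple[bytes, Tuple[int, int, int], int]:
    """Return (raw data, (height, width, channels), timestamp) from a frame map."""
    shape = (frame["height"], frame["width"], frame["channels"])
    if len(frame["data"]) != shape[0] * shape[1] * shape[2]:
        raise ValueError("frame data does not match its metadata")
    return frame["data"], shape, frame["timestamp"]
```

Validating the byte count against the metadata on both ends should make mismatched frames fail early instead of producing garbled images.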

This structure would only be applicable for video, but it'll be my focus for now, since most use cases use only video.

Allow for sending a stream from Boombox to Python

I have only tested generating a stream in Python and sending it to Boombox, not the other way around. It should work fine, but it still needs to be tested. The previous section also applies here: I think images should have the same structure regardless of which way they are sent.

Add an API in Python that will accurately map to the Elixir API

We need an API that will allow Python users to create a Boombox instance with specified input and output. This API should resemble the Elixir one as closely as possible while still being "pythonic". The Elixir API is based completely on primitive types, so we won't need to worry about creating any struct equivalents. The only problematic type is the atom, since there is no obvious analogue in Python. Most likely a string would be a good substitute. That creates a minor inconvenience for type translation, since depending on the field, a Python string has to be translated to either an Elixir string or an atom. This inconvenience has already been solved in some sense when creating the CLI for Boombox, so maybe it can be generalized and reused here.

I'm also wondering whether we want to utilize some data structures with static fields to improve the experience, but that may be beyond the scope of this POC.
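
A rough sketch of how the string/atom translation could work, assuming endpoints are written in Python as plain strings or tuples with an optional options dict. The `Atom` class here is a hypothetical stand-in for whatever atom type the bridge ends up using, and the `("mp4", path, {"transport": ...})` shape and the `ATOM_VALUED_OPTIONS` set are illustrative assumptions, not the actual Boombox option list.

```python
from typing import Any, Dict, Tuple, Union


class Atom(str):
    """Hypothetical marker type: a string that should become an Elixir atom."""


# Assumption for illustration: options whose *values* are atoms in Elixir.
ATOM_VALUED_OPTIONS = frozenset({"transport"})

Endpoint = Union[str, Tuple[Any, ...]]


def encode_options(options: Dict[str, Any]) -> Dict[Atom, Any]:
    """Option names always become atoms; values only for selected options."""
    return {
        Atom(key): (
            Atom(value)
            if key in ATOM_VALUED_OPTIONS and isinstance(value, str)
            else value
        )
        for key, value in options.items()
    }


def encode_endpoint(spec: Endpoint) -> Endpoint:
    """Translate a Python endpoint spec into atom-tagged terms."""
    if isinstance(spec, str):
        return spec  # a plain path or URL stays a string
    if isinstance(spec, tuple) and spec:
        kind, *args = spec
        encoded = [encode_options(a) if isinstance(a, dict) else a for a in args]
        # The first tuple element names the endpoint type, so it becomes an atom.
        return (Atom(kind), *encoded)
    raise TypeError(f"unsupported endpoint spec: {spec!r}")
```

With this, `encode_endpoint(("mp4", "out.mp4", {"transport": "file"}))` would tag `"mp4"`, `"transport"` and `"file"` as atoms while leaving `"out.mp4"` a regular string. Keeping the per-field rules in one table is the part that could perhaps be shared with the CLI logic.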

Release

I need to figure out how to release this as a Python package. Pyrlang is not on PyPI, which may be a problem. We also need to either have Boombox compile during installation or ship it precompiled. This needs some more research and consulting.

@Noarkhh Noarkhh changed the title Boombox in Python POC Boombox in Python MVP Apr 22, 2025
@Noarkhh Noarkhh linked a pull request May 13, 2025 that will close this issue
@Noarkhh Noarkhh linked a pull request May 13, 2025 that will close this issue
@Noarkhh Noarkhh removed a link to a pull request May 13, 2025
@Noarkhh Noarkhh linked a pull request May 13, 2025 that will close this issue
@Noarkhh Noarkhh removed a link to a pull request May 13, 2025
This was linked to pull requests May 13, 2025
Projects
Status: In Progress
2 participants