
Degirum Detector Integration #17159

Open · ChirayuRai wants to merge 2 commits into base: dev

Conversation

ChirayuRai

Added support for using DeGirum and its PySDK within Frigate.

Proposed change

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New feature
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code
  • Documentation Update

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • The code has been formatted using Ruff (ruff format frigate)

…e, updated requirements with degirum_headless

netlify bot commented Mar 15, 2025

Deploy Preview for frigate-docs ready!

🔨 Latest commit: 13cd5eb
🔍 Latest deploy log: https://app.netlify.com/sites/frigate-docs/deploys/67d85aa2f453170008baffbf
😎 Deploy Preview: https://deploy-preview-17159--frigate-docs.netlify.app

@NickM-27
Collaborator

Running a detector on separate hardware is not recommended for Frigate, as it prioritizes real-time detection. What inference speed are you seeing?

@ChirayuRai
Author

Running a detector on separate hardware is not recommended for Frigate, as it prioritizes real-time detection. What inference speed are you seeing?

If a user wants to use some local hardware, they just have to run a local AI server, and it'll be handled locally.

But for usage with the cloud, I'm getting pretty good FPS overall. For instance, with a mobilenet detection model and Orca as the hardware, I'm getting on average about 690 FPS and an average frame time of 1.3 ms. When using a slower model like an OpenVINO CPU model, frame times are about 10 ms with roughly 100 FPS.

@NickM-27
Collaborator

It would probably be good to see some screenshots of the Frigate interface with this detector running.

@ChirayuRai
Author

ChirayuRai commented Mar 17, 2025

It would probably be good to see some screenshots of the Frigate interface with this detector running.

Sure! Is there any specific page or setup you want to see? Or literally just the live view running while the DeGirum detector is selected?

@hawkeye217
Collaborator

To start, screenshots of the system metrics and camera metrics pages would be helpful.

@ChirayuRai
Author

ChirayuRai commented Mar 18, 2025

To start, screenshots of the system metrics and camera metrics pages would be helpful.

Sure! I have a few combinations here:

Using the cloud location with an Orca processor (the in-house DeGirum AI accelerator), along with a mobilenet detection model:
scrn-2025-03-17-14-30-46
scrn-2025-03-17-14-30-56

Using the cloud location with a Hailo processor, along with a yolov8s model:
scrn-2025-03-17-14-54-01
scrn-2025-03-17-14-54-13

Using the cloud location with an OpenVINO NPU, along with a yolov11n model:
scrn-2025-03-17-16-54-21
scrn-2025-03-17-16-54-34

Using an AI server available on the local network (but not on my local machine), which has an OpenVINO NPU and is running a yolov11n model:
scrn-2025-03-17-16-22-44
scrn-2025-03-17-16-22-55

Using an AI server on my local machine, which has an OpenVINO NPU (off the Core Ultra 7 155H, to be exact) and is running a yolov11n model:
scrn-2025-03-17-16-41-17
scrn-2025-03-17-16-41-29

I would like to note that detections had random spikes because I or someone else would stand in front of the camera for whatever reason. Let me know if you want to see anything else!

@hawkeye217
Collaborator

Your inference times are lower than most local hardware based GPUs, which makes me think the values you're seeing are suspect.

Can you clarify what you meant when you said "I would like to note that detections had random spikes because I or someone else would stand in front of the camera for whatever reason."?

@ChirayuRai
Author

ChirayuRai commented Mar 18, 2025

Your inference times are lower than most local hardware based GPUs, which makes me think the values you're seeing are suspect.

Do you mean the inference times are much faster than expected, which makes them suspect? Or that they're much slower than expected, making it hard to understand how I got those FPS numbers in the benchmarking I did?

Can you clarify what you meant when you said "I would like to note that detections had random spikes because I or someone else would stand in front of the camera for whatever reason."?

I noticed that in the camera portion, the "detections" line would remain at about half of the FPS. However, when a person was in frame, there were spikes in that detection number. For instance, in my Hailo screenshots, right at the end you can see that the detections line just spikes and stays up for a little bit. Someone was in view of the camera for that.

@hawkeye217
Collaborator

The inference times are much faster than expected. Even with a very good internet connection (< 5ms ping time), I'd expect network latency to take much more than 2ms.

I'm not familiar with how DeGirum works, but I'm guessing that because of the way you've written the detector and put everything into the detect_raw function, predict_batch has no new work to do and then next(self.predict_batch) runs and returns None immediately. Then, when there is actual work to do (someone standing in front of the camera), your inference time spikes massively.

@NickM-27
Collaborator

Inference times vary depending on model and hardware, of course; you can see some examples listed on the recommended hardware page: https://docs.frigate.video/frigate/hardware

The fact that the spikes are so large during activity suggests that the testing is not representative of real-world performance.

@ChirayuRai
Author

ChirayuRai commented Mar 18, 2025

I can try standing in front of the camera the entire time to see if continuous detections spike the inference latency? Maybe that will give a better indication of real-world performance. Also, as a side note, I was testing all of these on site, right next to the cloud servers/AI servers, so my ping was probably much lower than someone testing these inferences from further away.

@NickM-27
Collaborator

The concern is that if objects are not being detected then inference speed shouldn't differ so much.

That would be a good starting point to understanding what performance would be, but the concern is that there is an incorrect behavior here. You may want to look at the deepstack detector for more reference on a similar implementation.

@ChirayuRai
Author

ChirayuRai commented Mar 19, 2025

The concern is that if objects are not being detected then inference speed shouldn't differ so much.

Ok, so if I understand the frames/detections graph correctly, if a person was to stand in front of the camera for the entire time we're using a properly functioning detector, then the FPS line and detections line should be the same? And if detections are properly occurring, that means the inference speed would be exactly what is expected in real world performance. So, if I can show a graph with 30 fps, 30 detections, then the inference speed graph should be close to real world, right?

That would be a good starting point to understanding what performance would be, but the concern is that there is an incorrect behavior here. You may want to look at the deepstack detector for more reference on a similar implementation.

I could write another PR that uses just a model.predict call instead of predict_batch, and that would completely mimic the behavior of deepstack. With a regular model.predict, or an API call like deepstack's, we have to establish a connection to whatever server is being pinged, get a response, and then close the connection, once per frame.

With predict_batch, there is just one websocket connection, established when predict_batch consumes the first element of the generator/iterator that's passed in, and that connection stays open as long as the passed-in generator/iterator continues to yield/return objects. On top of that, the results are potentially inaccurate since we're not forcing synchronization at all, so the results returned for frame x might actually have been the results for frame x - 2 or something.

I can add a tracking algorithm to help overcome that? Or I can stick to trying to make it more of a synchronous approach if that's preferred.
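
For reference, a rough sketch of the two call patterns being compared, assuming a PySDK-style model object that exposes predict() and predict_batch() as referenced in this PR; the frame queue and postprocess callback are hypothetical stand-ins for the surrounding plumbing:

```python
import queue


def detect_raw_per_frame(model, tensor_input, postprocess):
    # deepstack-style: one connect/infer/respond/close cycle per frame
    result = model.predict(tensor_input)
    return postprocess(result)


def start_streaming(model, frame_queue: queue.Queue):
    # predict_batch keeps a single connection open and lazily consumes frames
    # from the generator, yielding results as they become available
    def frame_source():
        while True:
            yield frame_queue.get()

    return model.predict_batch(frame_source())
```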

@NickM-27
Collaborator

Ok, so if I understand the frames/detections graph correctly, if a person was to stand in front of the camera for the entire time we're using a properly functioning detector, then the FPS line and detections line should be the same? And if detections are properly occurring, that means the inference speed would be exactly what is expected in real world performance. So, if I can show a graph with 30 fps, 30 detections, then the inference speed graph should be close to real world, right?

No, multiple detections can be run on the same frame, so detections can be higher than the camera fps. The main problem you want to avoid is skipped fps; if the detections fps is high but the skipped fps is also high, that indicates a problem.

So the results returned for frame x might actually have been the results for frame x - 2 or something.

that should not be possible given that detectors in Frigate are synchronous

@ChirayuRai
Author

that should not be possible given that detectors in Frigate are synchronous

We overcome this by just returning an empty detection result if nothing was returned by predict_batch in time, but ultimately all that's running in a separate thread. So it ends up being async. I could enforce syncing if needed though.

if the detections fps is high but the skipped fps is also high, that indicates a problem

Got it, will work on trying to minimize skipped FPS then.

@NickM-27
Collaborator

We overcome this by just returning an empty detection result if nothing was returned by predict_batch in time, but ultimately all that's running in a separate thread. So it ends up being async. I could enforce syncing if needed though.

The frame that is passed in as the input_tensor absolutely must match the data that is returned with the detected objects; otherwise, object tracking will be very wrong.

@ChirayuRai
Author

ChirayuRai commented Mar 19, 2025

Currently I don't have any object tracking implemented; it's purely async. If I were to add some object tracking, then yeah, I would have to align the frames. Or I could try to just make it block until we have a response. Either way, I understand that some form of syncing needs to occur, and I'm working on implementing exactly that!

@NickM-27
Collaborator

Currently I don't have any object tracking implemented; it's purely async. If I were to add some object tracking, then yeah, I would have to align the frames. Or I could try to just make it block until we have a response. Either way, I understand that some form of syncing needs to occur, and I'm working on implementing exactly that!

I think you misunderstand, Frigate does the object tracking already.

@NickM-27
Collaborator

Also, can you elaborate on "We overcome this by just returning an empty detection result if nothing was returned by predict_batch in time"? We should be making a best effort to detect objects on every frame; otherwise this will also affect object tracking.

@ChirayuRai
Author

Also, can you elaborate on "We overcome this by just returning an empty detection result if nothing was returned by predict_batch in time"?

Essentially, detect_raw is currently just putting truncated_input into a queue, and that queue is being fed into predict_batch. Right after that, we ask predict_batch to return whatever inference results it has from this queue. However, if there are no inference results, predict_batch just returns None, and detect_raw then returns the empty detection result (the numpy zero array of size 20 x 6). If we do have a proper inference result, we reformat it and return it from detect_raw. But this approach doesn't block detect_raw or make sure that res actually has inference results.

As an example, let's say it takes about 5 frames for an inference result to be properly returned. On our first frame, let's call it frame X, we go through detect_raw and put the truncated_input into our queue. From there, that queue is passed into predict_batch, which then starts the whole inference cycle on the truncated_input from that frame. Now, it would only be 5 frames AFTER the initial frame X that we get the proper inference results for frame X. At that point, res would evaluate to the inference results for frame X; until then, it would evaluate to None. So essentially, predict_batch operates asynchronously from detect_raw, meaning our frames could be out of sync if the inference results from predict_batch don't get returned by the time next(predict_batch) is called. I hope that clears things up. Let me know if any more explanation is needed.
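
Condensing that description into a sketch (a paraphrase of the flow described above, not the PR's code verbatim; model is assumed to expose predict_batch() as referenced in this PR, and preprocess/reformat are hypothetical stand-ins for the input/output handling):

```python
import queue

import numpy as np


class DescribedFlowSketch:
    def __init__(self, model, preprocess, reformat):
        self.preprocess = preprocess
        self.reformat = reformat
        self.frame_queue = queue.Queue()
        # one connection stays open while this generator keeps yielding frames;
        # assumes predict_batch consumes the generator lazily
        self.batch_results = model.predict_batch(self._frame_source())

    def _frame_source(self):
        while True:
            yield self.frame_queue.get()

    def detect_raw(self, tensor_input):
        truncated_input = self.preprocess(tensor_input)
        self.frame_queue.put(truncated_input)
        # per the description above, this yields None when no inference
        # result is ready yet, i.e. it does not wait on the current frame
        res = next(self.batch_results)
        if res is None:
            # empty detection result: the 20 x 6 numpy zero array
            return np.zeros((20, 6), np.float32)
        # res may belong to a frame submitted several detect_raw calls earlier
        return self.reformat(res)
```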

@NickM-27
Collaborator

Okay, thanks, that makes sense. detect_raw absolutely needs to be blocked such that the tensor_input that is passed in and the returned object data match.
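
One possible way to satisfy that (a sketch only, not the PR's final implementation): keep the class from the sketch above but make detect_raw wait for the result of the frame it just submitted, under the assumption that predict_batch yields exactly one result per input, in submission order:

```python
    def detect_raw(self, tensor_input):
        self.frame_queue.put(self.preprocess(tensor_input))
        # wait for the result of the frame just submitted, so tensor_input and
        # the returned detections always refer to the same frame (assumes
        # predict_batch yields one result per input, in order)
        res = next(self.batch_results)
        if res is None:  # defensive fallback; should not happen when blocking
            return np.zeros((20, 6), np.float32)
        return self.reformat(res)
```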
