Degirum Detector Integration #17159
base: dev
Conversation
…e, updated requirements with degirum_headless
Running a detector on separate hardware is not recommended for Frigate, as it prioritizes real-time detection. What inference speed are you seeing?
If a user wants to use local hardware, they just have to run a local AI server and it'll be handled locally. But for usage with the cloud, I'm getting pretty good FPS overall. For instance, with a MobileNet detection model and ORCA as the hardware, I'm getting on average about 690 FPS and an average frame time of 1.3 ms. When using a slower model, like an OpenVINO CPU model, frame times are about 10 ms with roughly 100 FPS.
It would probably be good to see some screenshots of the Frigate interface with this detector running.
Sure! Is there any specific page or setup you want to see? Or literally just the live view running while the DeGirum detector is selected?
To start, screenshots of the system metrics and camera metrics pages would be helpful.
Your inference times are lower than most local hardware-based GPUs, which makes me think the values you're seeing are suspect. Can you clarify what you meant when you said "I would like to note that detections had random spikes because I or someone else would stand in front of the camera for whatever reason"?
Do you mean the inference times are much faster than expected, so they're suspect? Or much slower than expected, making it hard to understand how I got those FPS numbers in my benchmarking?
I noticed that in the camera portion, the "detections" line would remain at about half of the FPS. However, when a person was in frame, there were spikes in that detection number. For instance, in my Hailo screenshots, right at the end you can see that the detections line spikes and stays up for a little bit. Someone was in view of the camera for that.
The inference times are much faster than expected. Even with a very good internet connection (< 5 ms ping time), I'd expect network latency to take much more than 2 ms. I'm not familiar with how DeGirum works, but I'm guessing that because of the way you've written the detector and put everything into the
Inference times vary depending on the model and hardware, of course; you can see some examples listed on the recommended hardware page: https://docs.frigate.video/frigate/hardware The fact that the spikes are so large during activity suggests that the testing is not representative of real-world performance.
I can try standing in front of the camera the entire time to see whether detections drive up the inference latency. Maybe that would be more indicative of real-world performance? Also, as a side note, I was testing all of this on site, right next to the cloud servers/AI servers, so my ping was probably much lower than someone running these inferences from farther away.
The concern is that inference speed shouldn't differ so much when objects are not being detected. Standing in frame would be a good starting point for understanding what performance would be, but the concern is that there is incorrect behavior here. You may want to look at the deepstack detector for more reference on a similar implementation.
Ok, so if I understand the frames/detections graph correctly: if a person were to stand in front of the camera the entire time with a properly functioning detector, the FPS line and the detections line should match? And if detections are properly occurring, the inference speed shown would be what's expected in real-world use. So if I can show a graph with 30 FPS and 30 detections, the inference speed graph should be close to real-world, right?
I could write another PR that uses just a model.predict call instead of predict_batch, and that would completely mimic the behavior of deepstack. With a regular model.predict, or an API call as with deepstack, we have to establish a connection to whatever server is being pinged, get a response, and then close the connection, once per frame. With predict_batch, there is a single websocket connection that is established when we call predict_batch on the first element of the generator/iterator that's passed in, and this connection stays open as long as that generator/iterator continues to yield/return objects. On top of that, the results are potentially inaccurate, since we're not forcing synchronization at all: the results returned for frame x might actually be the results for frame x - 2 or something. I can add a tracking algorithm to help overcome that? Or I can stick to trying to make it more of a synchronous approach, if that's preferred.
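To make the streaming pattern described above concrete, here is a minimal, self-contained sketch of the predict_batch-style flow: a queue-backed generator feeds frames to a batch-predict call that stays "connected" for as long as the generator keeps yielding. The names (frame_source, MockBatchModel, run_pipeline) are hypothetical, and MockBatchModel stands in for DeGirum's real model object; actual inference would happen server-side over one websocket.

```python
import queue
import threading

def frame_source(frame_queue):
    """Generator that yields frames as they arrive; a predict_batch-style
    API keeps its single connection open for as long as this yields."""
    while True:
        frame = frame_queue.get()
        if frame is None:  # sentinel: end the stream, closing the connection
            return
        yield frame

class MockBatchModel:
    """Stand-in for a streaming model (e.g. DeGirum's predict_batch)."""
    def predict_batch(self, frames):
        # One pass over the iterator = one long-lived "connection".
        for frame in frames:
            yield {"frame": frame, "detections": []}

def run_pipeline(frames):
    frame_queue = queue.Queue()
    results = []
    model = MockBatchModel()

    def consume():
        # Results arrive asynchronously on this worker thread.
        for result in model.predict_batch(frame_source(frame_queue)):
            results.append(result)

    worker = threading.Thread(target=consume)
    worker.start()
    for f in frames:
        frame_queue.put(f)
    frame_queue.put(None)  # close the stream
    worker.join()
    return results
```

Note that nothing here ties a result back to the frame that produced it, which is exactly the synchronization gap discussed above.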
No, multiple detections can be run on the same frame, so detections can be higher than camera fps. The main problem you want to avoid is skipped fps; if the detections fps is high but the skipped fps is also high, that indicates a problem.
That should not be possible, given that detectors in Frigate are synchronous.
We overcome this by just returning an empty detection result if nothing was returned by predict_batch in time, but ultimately all of that runs in a separate thread, so it ends up being async. I could enforce syncing if needed, though.
Got it, will work on trying to minimize skipped FPS then.
The frame that is passed in as the input_tensor absolutely must match the data that is returned with the detected objects; otherwise object tracking will be very wrong.
Currently I don't have any object tracking implemented; it's purely async. If I were to add some object tracking, then yes, I would have to align the frames. Or I could try to just make it block until we have a response. Either way, I understand that some form of syncing needs to occur, and I'm working on implementing exactly that!
I think you misunderstand, Frigate does the object tracking already.
Also, can you elaborate on
Essentially, the As an example, let's say that for an inference result to be properly returned, it would take about 5 frames. On our first frame, let's call it frame X, we go through
Okay, thanks, that makes sense. detect_raw absolutely needs to be blocked such that the tensor_input that is passed in and the returned object data match.
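The blocking behavior asked for here can be sketched as follows. This is not the PR's actual code; SyncDetector, infer_fn, and the queue layout are illustrative. The idea is that a worker thread may still do the network I/O, but detect_raw() does not return until that frame's result comes back (or a timeout fires, in which case it reports no detections), so tensor_input and the returned data always correspond.

```python
import queue
import threading

class SyncDetector:
    """Sketch of a synchronous detect_raw: one frame in, that frame's
    result out, even though inference runs on a worker thread."""

    def __init__(self, infer_fn, timeout=1.0):
        self.infer_fn = infer_fn      # anything callable: frame -> detections
        self.timeout = timeout
        self.inputs = queue.Queue()
        self.outputs = queue.Queue()
        threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        # In a real detector this thread would hold the predict_batch
        # websocket; here it just calls the backend directly.
        while True:
            frame = self.inputs.get()
            self.outputs.put(self.infer_fn(frame))

    def detect_raw(self, tensor_input):
        self.inputs.put(tensor_input)
        try:
            # Block until the result for this exact frame comes back.
            return self.outputs.get(timeout=self.timeout)
        except queue.Empty:
            return []  # timed out: report no detections for this frame
```

Because detect_raw submits exactly one frame and then waits, results can never be off by a frame the way an unsynchronized predict_batch consumer can be; the cost is that a slow response stalls the detector rather than silently mismatching frames.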
Added support for use of DeGirum and PySDK within Frigate.
Proposed change
Type of change
Additional information
Checklist
The code has been formatted using Ruff (ruff format frigate)