Better Raspberry Pi server performance #2172
Conversation
```cpp
{},
// Fallback options
{},
std::make_optional<encoder_t::option_t>("qp"s, &config::video.qp),
```
Does the encoder really not support CBR/VBR bitrate control? QP shouldn't be provided if CBR or VBR is available.
Probably, I was just copying what the others did as a first step. I'll give it a try.
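As background for that, here is a minimal sketch (my illustration, not Sunshine's or this PR's code) of how CBR-style rate control is usually requested from FFmpeg's h264_v4l2m2m encoder: through the generic AVCodecContext fields rather than a per-frame QP option. The function name, resolution, and frame rate are assumptions, and whether the V4L2 driver honors these fields is hardware-dependent.

```cpp
extern "C" {
#include <libavcodec/avcodec.h>
}

// Hypothetical helper: configure h264_v4l2m2m for CBR-style encoding.
AVCodecContext *make_cbr_encoder(int width, int height, int64_t bitrate) {
  const AVCodec *codec = avcodec_find_encoder_by_name("h264_v4l2m2m");
  if (!codec) {
    return nullptr;  // encoder not compiled into this FFmpeg build
  }

  AVCodecContext *ctx = avcodec_alloc_context3(codec);
  ctx->width = width;
  ctx->height = height;
  ctx->time_base = AVRational {1, 60};
  ctx->framerate = AVRational {60, 1};
  ctx->pix_fmt = AV_PIX_FMT_YUV420P;

  // CBR-ish: target, ceiling, and buffer size all derived from one rate.
  ctx->bit_rate = bitrate;
  ctx->rc_max_rate = bitrate;
  ctx->rc_buffer_size = static_cast<int>(bitrate / 10);

  if (avcodec_open2(ctx, codec, nullptr) < 0) {
    avcodec_free_context(&ctx);
    return nullptr;
  }
  return ctx;
}
```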
This is a Pi 4, I assume? I don't think the Pi 5 has any hardware encoders anymore.
Yeah, it's all in the RGB->YUV color conversion code, which is expected since it's doing all the color conversion on the CPU. I guess it's nice that it's multi-threaded now. You can adjust the "Minimum CPU Thread Count" on the Advanced tab in the UI if you want to play with the amount of concurrency there.

What your encoding pipeline looks like now:

What you want is more like what we do with VAAPI:

Most of that pipeline is simple and already written in Sunshine. The tricky part will be getting that second DMA-BUF to write into and/or exporting the render target as a DMA-BUF. Since there's no standard way to create a DMA-BUF, that part tends to be highly API-specific. For VAAPI, we import the underlying DMA-BUF of the VA surface as the render target for our color conversion. For CUDA, we create a blank texture to use as the render target and use the CUDA-GL interop APIs to import that texture as a CUDA resource for NVENC to read.

Where to start is probably writing something like this for

Then for your encoder definition you probably want something like this:
Since FFmpeg's hwcontext_drm.c doesn't support frame allocation, you'll need to figure out how to do that yourself and provide a buffer pool for frame allocation. Finally, for the encoding side, you'll want to do something similar to what I did in 8182f59 for supporting KMS->GL->CUDA with the
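To make that buffer-pool requirement a bit more concrete, here is a hedged sketch (mine, not the PR's) of wrapping an already-exported DMA-BUF file descriptor as an AV_PIX_FMT_DRM_PRIME AVFrame. AVDRMFrameDescriptor is FFmpeg's real type from libavutil/hwcontext_drm.h; the linear NV12 layout, the single-object assumption, and the function name are mine.

```cpp
extern "C" {
#include <libavutil/buffer.h>
#include <libavutil/frame.h>
#include <libavutil/hwcontext_drm.h>
#include <libavutil/mem.h>
#include <libavutil/pixfmt.h>
}
#include <drm_fourcc.h>  // DRM_FORMAT_NV12

// Hypothetical helper: wrap a DMA-BUF fd as a DRM_PRIME frame, assuming a
// single buffer object holding a linear NV12 image (Y plane then UV plane).
AVFrame *wrap_dmabuf_as_frame(int fd, int width, int height, int pitch) {
  auto *desc = static_cast<AVDRMFrameDescriptor *>(
      av_mallocz(sizeof(AVDRMFrameDescriptor)));
  desc->nb_objects = 1;
  desc->objects[0].fd = fd;
  desc->objects[0].size = static_cast<size_t>(pitch) * height * 3 / 2;

  desc->nb_layers = 1;
  desc->layers[0].format = DRM_FORMAT_NV12;
  desc->layers[0].nb_planes = 2;
  desc->layers[0].planes[0] = {0, 0, pitch};  // Y plane at offset 0
  desc->layers[0].planes[1] = {0, static_cast<ptrdiff_t>(pitch) * height, pitch};  // UV plane

  AVFrame *frame = av_frame_alloc();
  frame->format = AV_PIX_FMT_DRM_PRIME;
  frame->width = width;
  frame->height = height;
  frame->data[0] = reinterpret_cast<uint8_t *>(desc);
  // The buffer ref owns the descriptor; a real pool would also close the fd
  // and recycle the underlying DMA-BUF in this free callback.
  frame->buf[0] = av_buffer_create(
      reinterpret_cast<uint8_t *>(desc), sizeof(*desc),
      [](void *, uint8_t *data) { av_free(data); }, nullptr, 0);
  return frame;
}
```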
Many thanks for the detailed reply. Sounds like this could be an interesting exercise. I may be wrong, but I think playback scenarios have managed to avoid GL altogether (what Kodi calls Direct to Plane and mpv calls HW-overlay). Is that not possible here?
I think that color conversion hardware is only accessible on the scanout path (and it's YUV->RGB, not RGB->YUV). Some encoders do have the ability to accept RGB frames and perform the conversion to YUV internally (using dedicated hardware or a shader), but I don't think the Pi's encoder supports RGB input.
I'm not familiar with the details of the encoder, but the following will also need to be updated.
Sunshine/docs/source/about/advanced_usage.rst
Lines 1142 to 1161 in 42aec26

```rst
`encoder <https://localhost:47990/config/#encoder>`__
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

**Description**
   Force a specific encoder.

**Choices**

.. table::
   :widths: auto

   ========= ===========
   Value     Description
   ========= ===========
   nvenc     For NVIDIA graphics cards
   quicksync For Intel graphics cards
   amdvce    For AMD graphics cards
   vaapi     Use Linux VA-API (AMD, Intel)
   software  Encoding occurs on the CPU
   ========= ===========
```

Sunshine/src_assets/common/assets/web/configs/tabs/Advanced.vue
Lines 93 to 96 in 42aec26

```vue
<template #linux>
  <option value="nvenc">NVIDIA NVENC</option>
  <option value="vaapi">VA-API</option>
</template>
```

Sunshine/tests/unit/test_video.cpp
Line 55 in 42aec26

```cpp
std::make_tuple(video::vaapi.name, &video::vaapi),
```
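For illustration, assuming the new encoder object ends up exposed as video::v4l2m2m (a hypothetical name, not confirmed by this PR), the test list would gain a matching tuple:

```cpp
// Hypothetical addition, mirroring the existing vaapi entry:
std::make_tuple(video::v4l2m2m.name, &video::v4l2m2m),
```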
For the final two bullet points, we probably need a way to detect if running on a Raspberry Pi. This may give some hints: https://stackoverflow.com/questions/70395696/predefined-macro-to-determine-if-running-on-a-raspberry
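Building on that Stack Overflow suggestion, one runtime check is to read the device-tree model string, which is symlinked at /proc/device-tree/model. A minimal sketch, with a made-up helper name:

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: report whether we are running on a Raspberry Pi
// by inspecting the device-tree model string exposed by the kernel.
bool is_raspberry_pi() {
  std::ifstream model_file("/proc/device-tree/model");
  if (!model_file) {
    return false;  // not a device-tree platform, so not a Pi
  }
  std::string model {std::istreambuf_iterator<char>(model_file),
                     std::istreambuf_iterator<char>()};
  return model.find("Raspberry Pi") != std::string::npos;
}
```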
Lastly, do we need to make any changes to our ffmpeg build-deps repo?

Edit: I think it's already enabled via: https://github.com/LizardByte/build-deps/blob/1e16ab273175976a4623e248fde89ff20b549a1f/.github/workflows/build-ffmpeg.yml#L39
Thanks for those pointers. My initial PoC seemingly needed John Cox's ffmpeg patchset for the Raspberry Pi. It's a rather heavy patchset, but Gentoo isn't the only party invested in keeping it updated. John does a good job by himself anyway. Whether it will be needed in the end will depend on the architecture we go for.

I did spend quite a long time looking into this after cgutman gave me some pointers. I was really struggling with the DMA-BUF part of it, as v4l2m2m seems to work quite differently to VAAPI. I also considered doing it a different way, using the Pi's ISP for the pixel format conversion. ffmpeg has some support for it already. This might be simpler and even more efficient, but it would also be Pi-specific. v4l2m2m seems preferable, as it is supported by many SoCs.

It's been a while since I had time to work on this. It's something I'd really like to do, but Gentoo maintenance usually takes priority.
Understood. I will convert this to a draft for now; whenever you are ready, feel free to mark it as ready for review again.
Description
Now I reveal what I really want to use Sunshine for. As a server on the Raspberry Pi! Why would I want such a thing? Surely it makes more sense as a client? Normally yes, but when combined with the PiStorm project, things get very interesting.
As you might imagine, PiStorm is very CPU-intensive, so for this to be feasible, Sunshine needs to use as little CPU as possible. The first step here was obviously to get hardware video encoding to work. The Pi does not support VAAPI or CUDA, but fortunately, this still turned out to be very easy.
These initial changes to add a V4L2M2M encoder did not work for me at first, as Sunshine claimed that an IDR frame was not produced. Digging around in the internals, it looked very much to me like requesting IDR frames should work on the Pi. As a shot in the dark, I applied John Cox's ffmpeg patchset for the Raspberry Pi. This patchset, which I recently applied to Gentoo's ffmpeg package, enables efficient zero-copy video playback on the Pi. With this, I have seen 1080p videos go from a stuttery mess to being buttery smooth. Being playback-focused, I really didn't expect it to help, but I was delighted when it suddenly sprang to life!
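For context, the usual FFmpeg-level way to request a keyframe is to mark the input frame before sending it to the encoder; whether the v4l2m2m wrapper then maps that onto V4L2's force-keyframe control depends on the FFmpeg version, which is plausibly why the patchset changed the behavior here. A minimal sketch, illustrative rather than this PR's code:

```cpp
extern "C" {
#include <libavutil/frame.h>
}

// Hypothetical helper: hint that this input frame should be coded as
// intra. Whether the v4l2m2m driver emits a true IDR in response is
// driver- and FFmpeg-version-dependent.
void request_keyframe(AVFrame *frame) {
  frame->pict_type = AV_PICTURE_TYPE_I;
}
```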
The quality isn't fantastic though, and it's still using 275% CPU. I utilised gprof to find where it's spending all the effort.
This is not my area of expertise, but it looks like finding the right format might be the key here. I'd appreciate any help you can provide. I know that John Cox's patchset adds support for Pi-specific SAND formats, but I don't know whether they are usable in this context.
Type of Change
.github/...)

Checklist
Branch Updates
LizardByte requires that branches be up-to-date before merging. This means that after any PR is merged, this branch must be updated before it can be merged. You must also Allow edits from maintainers.