Skip to content

Encoder psnr and detach fixes#218

Merged
zlatinski merged 3 commits intomainfrom
encoder-psnr-and-detach-fixes
Mar 30, 2026
Merged

Encoder psnr and detach fixes#218
zlatinski merged 3 commits intomainfrom
encoder-psnr-and-detach-fixes

Conversation

@zlatinski
Copy link
Copy Markdown
Contributor

No description provided.

…o prevent UAF

QueueFramesForAssembly traverses the dependantFrames linked list and
queues each frame as an independent AssemblyWorkItem. Two worker threads
can process parent (A) and child (B) concurrently. When worker T2
finishes B, it calls Reset(true) which frees B's resources and control
block. When worker T1 later finishes A, its Reset(true) calls
ReleaseChildrenFrames(A->dependantFrames) which drops A's shared_ptr
reference to B — but B's control block was already freed by T2.

This manifests as a heap-use-after-free in std::_Sp_counted_base::
_M_release(), detected by ASAN under the Khronos VVS encoder test suite.

Fix: set item.frameInfo->dependantFrames = nullptr before calling
Reset(true) in ReleaseAssemblyItem. Since children are independently
managed as separate work items in the assembly queue, the parent
should not attempt to release them during its own cleanup.

Verified: full Khronos VVS test suite with ASAN (halt_on_error=1) —
zero ASAN errors across all 73 tests.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
VkVideoEncoderPsnr was added in a32ba59 with the old COM-style
intrusive refcounting (AddRef/Release/GetRefCount/m_refCount) which
no longer exists after the VkVideoRefCountBase shared_ptr migration.

- Remove AddRef(), Release(), GetRefCount() overrides and m_refCount
- Make destructor public (shared_ptr needs accessible dtor)
- Change `psnr = new VkVideoEncoderPsnr()` to `psnr.reset(new ...)`
- Fix .Get() → .get() (4 sites in VkVideoEncoder.cpp,
  VkVideoEncoderAV1.cpp, VkVideoEncoderPsnr.cpp)

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
Address review findings from the PSNR commit (a32ba59):

1. Disable debug frame dump (VkVideoEncoderPsnr.cpp:297):
   VKVENC_DEBUG_DUMP_PSNR_FRAMES set from 1 to 0. With the flag
   enabled, every encoded frame wrote two raw YUV files to disk
   (input_frame_NNNNN.yuv + recon_frame_NNNNN.yuv), causing massive
   disk I/O and filesystem pollution in production runs.

2. Add staging image layout transition (VkVideoEncoderPsnr.cpp,
   CaptureOutput): inserted VkImageMemoryBarrier2 transitioning the
   staging image from UNDEFINED → TRANSFER_DST_OPTIMAL before
   CmdCopyImage. The staging image was acquired with UNDEFINED layout
   but the copy assumed TRANSFER_DST_OPTIMAL — undefined behavior per
   Vulkan spec (VUID-vkCmdCopyImage-dstImageLayout). Works by accident
   on NVIDIA but would break on other implementations.

3. Store m_encoderConfig as shared_ptr (VkVideoEncoderPsnr.h:68):
   was raw EncoderConfig* from shared_ptr::get(). If the caller's
   shared_ptr is released during encoder reconfiguration or shutdown,
   m_encoderConfig becomes dangling. Every subsequent call to Enabled(),
   CaptureInput(), ComputeFramePsnr() would dereference freed memory.
   Holding a shared_ptr copy guarantees the EncoderConfig stays alive.

4. Remove race-prone m_psnrInput* member fallback
   (VkVideoEncoderPsnr.cpp, CaptureInput + ComputeFramePsnr):
   CaptureInput() copied input data to BOTH member vectors
   (m_psnrInputY/U/V) AND per-frame vectors (psnrFrameData). With
   pipelining, CaptureInput() for frame N+1 overwrites the member
   vectors while ComputeFramePsnr() for frame N hasn't run yet —
   the fallback would compare frame N's recon against frame N+1's
   input, producing wrong PSNR. Fix: write directly to per-frame
   psnrFrameData only; return early if per-frame data is missing.
   Member input vectors removed from the class.

5. Initialize accumulator members (VkVideoEncoderPsnr.h:76-79):
   m_psnrSum/U/V and m_psnrFrameCount had no default initializers.
   Reading them before Configure() (e.g., during early shutdown or
   from GetAveragePsnrY()) is undefined behavior per C++ standard.
   Added = 0.0 / = 0 default initializers.

6. Move #includes above forward declarations (VkVideoEncoder.h:47):
   VkVideoEncoderPsnr.h and VkVideoCrc.h were included after the
   forward declarations of VkVideoEncoderH264/H265/AV1. Placed
   includes first per standard C++ header ordering convention.

7. Guard m_crc.SignalFrameEnd() (VkVideoEncoder.cpp:1162):
   was called unconditionally on every frame even when CRC is
   disabled. Added m_crc.Enabled() check to match the pattern
   used elsewhere (e.g., m_psnr && m_psnr->Enabled() guards).

8. Restore private visibility (VkVideoEncoderPsnr.h):
   member variables were left public after the previous commit
   changed protected→public for the destructor. Internal state
   (m_vkDevCtx, m_encoderConfig, m_psnrReconImagePool, etc.) should
   not be accessible from outside the class.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
@zlatinski zlatinski merged commit baba158 into main Mar 30, 2026
10 of 11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant