tests: add Khronos VVS test framework (cherry-picked from upstream) by zlatinski · Pull Request #215 · nvpro-samples/vk_video_samples

zlatinski · 2026-03-23T18:32:30Z

Cherry-pick the complete Vulkan Video Samples test framework from the Khronos upstream repository (KhronosGroup/Vulkan-Video-Samples, branch main, commits 9d588d9e..fc17607f).

This brings in the unified Python test runner and all incremental improvements, without modifying any library/decoder/encoder/filter C++ code (those changes were reverted to avoid mixing test infrastructure with code changes that need separate review).

- tests/vvs_test_runner.py: unified entry point for encode + decode tests
- tests/decode_samples.json: ~40 decoder test definitions (H.264/H.265/AV1/VP9)
- tests/encode_samples.json: ~15 encoder test definitions (H.264/H.265/AV1)
- tests/skipped_samples.json: per-driver skip list (nvidia, nvk, anv, radv, amd)
- tests/libs/: framework library modules:
  - video_test_framework_base.py (base classes)
  - video_test_framework_decode.py (decoder framework)
  - video_test_framework_encode.py (encoder framework)
  - video_test_driver_detect.py (GPU driver auto-detection)
  - video_test_fetch_sample.py (asset downloading + SHA256 verification)
  - video_test_result_reporter.py (result reporting + JSON export)
  - video_test_config_base.py, video_test_platform_utils.py, video_test_utils.py
- tests/unit_tests/: pytest self-tests (CLI, skip list, filtering, configs, status)
- tests/generate_sample_md5.py: MD5 generation for new test samples
- tests/manage_samples_list.py: sample list management
- tests/README.md: comprehensive documentation (532 lines)
- tests/conftest.py: pytest configuration
- .github/workflows/test.yml: CI workflow (lint + unit tests + codec tests)

9d588d9e tests: introduce testing framework
b6088f42 tests: rename video_test_framework_codec.py to vvs_test_runner.py
ad7ed0e4 tests: add extended test framework support
4a657d88 tests: add encode resolution boundary tests
4748f8ee tests: only download resources for tests that will actually run
f50a3ebb tests: bypass skip list when test is explicitly requested with -t
b815a5d8 tests: display skipped tests in running list and fix summary counts
a57bc453 tests: add codec filter support to --list-samples
c2343556 tests: update skip list after film grain and error handling fixes
(and 21 other incremental improvements)

fc17607f filter: fix YCBCR2RGBA shader compilation error
6114990a common: Fix 10-bit/12-bit sample normalization
0dbd2ba7 Config: rename --no-device-fallback to --noDeviceFallback
(test-side changes from these commits ARE included)

The Khronos test framework (vvs_test_runner.py) hard-codes --verbose in every decoder and encoder command via video_test_framework_base.py and video_test_framework_encode.py. The NVIDIA fork's binaries did not recognize this flag, causing all 76 Khronos test samples to fail with "Unknown argument --verbose" (exit code 255).

Both binaries already had the `verbose` member variable and used it throughout for conditional output; they just lacked the CLI argument entry to set it.

Decoder (DecoderConfig.h):
- Add {"--verbose", ...} entry after --verboseValidate, matching the Khronos DecoderConfig.h layout exactly.

Encoder (VkEncoderConfig.cpp):
- Add --verbose to the printHelp() usage string.
- Add argument parsing (args[i] == "--verbose") before the catch-all else block.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

The Khronos test framework (vvs_test_runner.py) hard-codes --noDeviceFallback in every decoder and encoder command to prevent GPU fallback when testing with --deviceID. The NVIDIA fork's binaries did not recognize this flag, causing all 73 Khronos tests to fail with "Unknown argument --noDeviceFallback" (exit 255). This is the same class of issue as the --verbose fix in the previous commit.

Decoder (DecoderConfig.h):
- Add noDeviceFallback member variable (uint32_t : 1)
- Initialize to false in reset()
- Add CLI flag entry {"--noDeviceFallback", ...} after --deviceUuid

Encoder (VkEncoderConfig.h + VkEncoderConfig.cpp):
- Add noDeviceFallback member variable (uint32_t : 1)
- Initialize to false in constructor
- Add argument parsing and help text entry

Note: The actual device fallback logic from the Khronos VulkanDeviceContext is not ported. The flag is accepted and stored, but the selection behavior remains unchanged. This is sufficient for Khronos test framework compatibility since the flag's purpose is to prevent fallback on multi-GPU systems, which the NVIDIA fork does not implement.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

On GPU-less CI runners (GitHub Actions), every Khronos test failed with exit 255 because the NVIDIA fork's binaries returned -1 (=255) on Vulkan init failure. The test framework maps exit 69 to NOT_SUPPORTED (pass), but treats 255 as FAIL.

Port the exit code mechanism from the Khronos repo:

VkVSCommon.h (new file):
- VVS_EXIT_UNSUPPORTED = EX_UNAVAILABLE (69)
- IsVideoUnsupportedResult(): checks for VK_ERROR_FEATURE_NOT_PRESENT, VK_ERROR_INCOMPATIBLE_DRIVER, VK_ERROR_EXTENSION_NOT_PRESENT, and all video-specific KHR errors
- ExitCodeFromVkResult(): maps VkResult to exit code
- CHECK_VULKAN_FEATURE macro

Encoder Main.cpp:
- Include VkVSCommon.h
- Replace all 'return -1' with proper exit codes:
  - VVS_EXIT_UNSUPPORTED for VkResult indicating missing HW/driver
  - EXIT_FAILURE for other errors
- Replace assert()s with fprintf(stderr, ...) for CI-friendly output
- 7 VVS_EXIT_UNSUPPORTED return points matching Khronos layout

Decoder Main.cpp:
- Include VkVSCommon.h
- Same pattern: IsVideoUnsupportedResult check at every VkResult failure → VVS_EXIT_UNSUPPORTED
- 4 VVS_EXIT_UNSUPPORTED return points (InitVulkanDecoderDevice, InitPhysicalDevice display path, InitPhysicalDevice headless path, CreateVulkanDevice headless path)
- Replace assert()s with fprintf(stderr, ...)

This ensures that on CI without a GPU, the test framework sees exit 69 and reports NOT_SUPPORTED instead of FAIL, making the CI green for GPU-less runners.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
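The exit-code contract described above can be sketched in a few lines of Python. This is only an illustration: the function names and the assumption that exit 0 maps to PASS are hypothetical, and only the three VkResult names spelled out in the commit message are listed (the video-specific KHR errors are left out on purpose).

```python
# Hypothetical sketch of the exit-code contract; names are illustrative and
# are not the framework's or the fork's actual API.
EX_UNAVAILABLE = 69                 # sysexits.h value used for VVS_EXIT_UNSUPPORTED
VVS_EXIT_UNSUPPORTED = EX_UNAVAILABLE
EXIT_SUCCESS, EXIT_FAILURE = 0, 1

# Only the VkResult names listed in the commit message.
UNSUPPORTED_RESULTS = {
    "VK_ERROR_FEATURE_NOT_PRESENT",
    "VK_ERROR_INCOMPATIBLE_DRIVER",
    "VK_ERROR_EXTENSION_NOT_PRESENT",
}


def exit_code_from_vk_result(result_name: str) -> int:
    """Binary side: turn a VkResult name into a process exit code."""
    if result_name == "VK_SUCCESS":
        return EXIT_SUCCESS
    if result_name in UNSUPPORTED_RESULTS:
        return VVS_EXIT_UNSUPPORTED
    return EXIT_FAILURE


def classify_exit_code(code: int) -> str:
    """Runner side: turn an exit code into a test status (assumed mapping)."""
    code &= 0xFF  # a process that returns -1 is observed as 255
    if code == EXIT_SUCCESS:
        return "PASS"
    if code == VVS_EXIT_UNSUPPORTED:
        return "NOT_SUPPORTED"
    return "FAIL"
```

With this mapping, `classify_exit_code(-1)` yields FAIL, which is the pre-fix behavior on GPU-less runners, while a binary that exits with 69 is reported as NOT_SUPPORTED.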

The VVS_EXIT_UNSUPPORTED (exit 69) fix from commit 39435dc was only applied to the demo apps (vk-video-dec-test, vk-video-enc-test). The officially supported test apps still returned -1 (=255) on Vulkan init failure, causing GPU-less CI to report FAIL instead of NOT_SUPPORTED.

Apply the same pattern to all 3 test apps:
- vulkan-video-dec-test: 10 return points fixed
- vulkan-video-simple-dec-test: 4 return points fixed
- vulkan-video-enc-test: 1 return point fixed + early exit on failure

Pattern: IsVideoUnsupportedResult(result) → VVS_EXIT_UNSUPPORTED, other errors → EXIT_FAILURE. Replaced assert() with fprintf(stderr) for CI-friendly output.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>


The Khronos test framework passes --profile <name> for all non-default encoder tests. The H.264 and H.265 codec-specific parsers (DoParseArguments) did not handle this flag, causing 11/21 encoder tests to fail with exit 255. AV1 already had --profile parsing (no change needed).

H.264 (VkEncoderConfigH264.cpp): Accept baseline/main/high/high444 (or 0/1/2/3) and set profileIdc.

H.265 (VkEncoderConfigH265.cpp): Accept main/main10/mainstill/range/scc (or 0/1/2/3/4) and set profile.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
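As a rough illustration of the name-or-index form accepted for H.264, the lookup might look like the sketch below. The function name is hypothetical and the real parser is C++; the profile_idc numbers are the values defined by the H.264 specification, and whether the fork maps to exactly these values is an assumption.

```python
# Hypothetical sketch of a name-or-index profile lookup for H.264.
# The names and their order come from the commit message; the profile_idc
# values are the ones defined by the H.264 specification.
H264_PROFILE_NAMES = ["baseline", "main", "high", "high444"]
H264_PROFILE_IDC = {"baseline": 66, "main": 77, "high": 100, "high444": 244}


def parse_h264_profile(value: str) -> int:
    """Accept a profile name or its index (0..3) and return profile_idc."""
    name = value
    if value.isdigit():
        index = int(value)
        if index >= len(H264_PROFILE_NAMES):
            raise ValueError(f"profile index out of range: {value}")
        name = H264_PROFILE_NAMES[index]
    if name not in H264_PROFILE_IDC:
        raise ValueError(f"unknown H.264 profile: {value}")
    return H264_PROFILE_IDC[name]
```

Rejecting unknown values explicitly, rather than falling through to a generic "unknown argument" path, keeps the failure message specific to the profile flag.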

The H.264 encoder's adaptiveTransformMode defaulted to ENABLE, which set transform_8x8_mode_flag=true in the PPS regardless of profile. H.264 Main profile only supports 4x4 transform (spec Table A-2). This triggered NVIDIA driver assertion:

"Main profile doesn't support Adaptive 8x8 transform"
"Invalid PPS ID used when fetching the encoded PPS"

Fix: check profile before setting transform_8x8_mode_flag. For profiles below High, force transform_8x8_mode_flag=false. Also changed InitProfileLevel() default from use8x8Transform=true to false, with autoselect only enabling it for High profile and above.

Fixes encode_h264_main_profile, encode_h264_ip_gop, and encode_h264_small_frame Khronos test failures.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
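The profile gate can be sketched as follows. This is a simplified model restricted to the four profile names the encoder accepts on the command line (baseline, main, high, high444); the function name is hypothetical and the real check lives in the C++ encoder config.

```python
# Hypothetical sketch: gate the 8x8 transform on the selected profile.
# Ordered from lowest to highest among the four names accepted by --profile.
PROFILE_ORDER = ["baseline", "main", "high", "high444"]
HIGH_RANK = PROFILE_ORDER.index("high")


def resolve_transform_8x8(profile: str, requested: bool) -> bool:
    """Return the transform_8x8_mode_flag value to write into the PPS."""
    rank = PROFILE_ORDER.index(profile)  # raises ValueError for unknown names
    if rank < HIGH_RANK:
        return False  # profiles below High: force the flag off
    return requested  # High and above: honor the request
```

A Main-profile stream therefore never carries the flag, even if the adaptive transform mode was left enabled.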

…d::terminate

VkVideoEncoder::DeinitEncoder() now calls WaitForThreadsToComplete() before destroying resources. Without this, when the encoder test app exits, the shared_ptr destructor chain destroys the std::vector<std::thread> with joinable threads still running, which per the C++ spec calls std::terminate().

The demo encoder app was unaffected because its main() explicitly calls DeinitEncoder() before the shared_ptr drops. The test app (vulkan-video-enc-test) relies on shared_ptr destructor ordering, which never reached DeinitEncoder() before the thread vector destructor.

This fixes all 17 encoder test crashes in the Khronos VVS test suite when running with the officially supported test binaries.

Tested: H.264/H.265/AV1 encoder tests × 5 runs each, all clean exits.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

For single-tile OBU_FRAME types where tile_start_and_end_present_flag is not set, consumed_bits() returns 0 (valid). Changed assert > 0 to assert >= 0. consumedBytes=0 means the entire payload is tile data.

Fixes decode_av1_argon_test787 and decode_av1_720x480_tile_group.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

The YCBCR2RGBA compute shader failed to compile with multiple errors:
- 'outputImageRGB' undeclared identifier
- 'normalizeYCbCr' no matching overloaded function
- 'shiftCbCr' no matching overloaded function

Root cause: the refactored InitYCBCR2RGBA had two bugs:

1. Missing output format override (GetOutputFormat): The YCBCR2RGBA filter converts YCbCr→RGBA, so the output format must be RGBA (R8G8B8A8_UNORM or R16G16B16A16_UNORM). Without the override, the output format was the decoder's NV12, causing ShaderGenerateImagePlaneDescriptors to generate 'outputImageY' and 'outputImageCbCr' plane bindings instead of 'outputImageRGB'.

2. normalizeYCbCr(uvec3) vs vec3 mismatch: The refactored shader generated normalizeYCbCr(uvec3 yuv) but called it with vec3 from imageLoad (which returns float). The Khronos version uses vec3 throughout for the image-based YCBCR2RGBA path.

After the shader compilation failed, vkCreateComputePipelines returned a null shader module, and the driver dereferenced it at offset 0x118 → SIGSEGV in __VkShaderModule::GetShaderCodeHash().

Fix:
- Add GetOutputFormat() that overrides output to RGBA for YCBCR2RGBA
- Call it in the constructor to set m_outputFormat correctly
- Rewrite InitYCBCR2RGBA main() to match the proven Khronos version: inline fetch/convert/store, vec3 normalizeYCbCr signature

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

Replace fence-based file dump sync with timeline semaphores and add an async threaded dump pool to eliminate decode pipeline stalls.

Problem: The file dump (VkVideoFrameToFile) waited on frameCompleteFence, which gets wait+reset by QueuePictureForDecode when the DPB recycles a slot. If the decoder pipeline runs deep enough, the fence is reset for a new frame before the dump reads, causing an infinite wait on the wrong frame's fence.

Fix 1: TL semaphore for dump sync.
Wait on the forward timeline semaphore (frameCompleteDoneSemValue) instead of the fence. TL values are monotonically increasing and tied to decode order; they cannot be reset by slot recycling. After reading the frame, signal the consumer-done TL semaphore so the decoder knows the frame is released.

Fix 2: Threaded dump pool (VkVideoDumpPool).
Add a thread pool (4 workers) for async frame dumping, matching the TRV FileDumper pattern:
- Non-blocking queueFrame() returns immediately from decode loop
- Workers: TL semaphore wait → pixel read → ordered file write → signal release TL semaphore
- Display order enforced via m_nextWriteOrder + condition variable
- Dedicated TL semaphore registered as external consumer via AddExternalConsumer for proper slot reuse waiting

Fix 3: ClearParent() pool node state reset.
ClearParent() unconditionally set m_cmdBufState = CmdBufStateReset, discarding CmdBufStateSubmitted without resetting the fence. On pool node reacquire, ResetCommandBuffer() saw Reset and skipped the fence wait+reset, leaving it signaled from the previous submission. This triggered the videoDecodeCompleteFence assertion. Fix: preserve actual command buffer state across pool release/reacquire cycles.

Fix 4: Consumer semaphore deadlock in dump-only path.
The dump pool incorrectly set hasConsummerSignalSemaphore, causing the decode submit to wait on consumerCompleteSemaphore. In the dump-only path (--noPresent), no graphics consumer signals this semaphore → infinite deadlock. The dump pool uses its own dedicated TL release semaphore via AddExternalConsumer; the hasConsummerSignalSemaphore flag is only needed when graphics presentation is active.

Fix 5: Debug serialization flags rewritten for TL semaphores.
- checkDecodeIdleSync: now waits on both the decode fence and TL value (previously skipped fence wait when TL semaphore was non-null)
- syncCpuAfterStaging: now waits on frameCompleteFence + filter TL value (previously waited on the wrong fence: the pool node fence signaled by decode, not filter)

Files changed:
- VkVideoFrameToFile.cpp: TL semaphore wait instead of fence
- VkVideoDumpPool.h: New threaded dump pool class
- VulkanVideoProcessor.cpp: Dump pool creation + external consumer
- VulkanCommandBufferPool.h: ClearParent() state fix
- VulkanVideoFrameBuffer.cpp: Phase 2 comments
- VkVideoDecoder.cpp: Debug flag rewrites, remove instrumentation

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
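The "ordered file write" step of the dump pool (a next-write-order counter guarded by a condition variable) can be illustrated with a small Python sketch. Everything here is a stand-in: the class and method names are hypothetical, a random sleep replaces the semaphore wait and pixel read, and a list replaces the output file. It assumes frames are queued with consecutive order numbers starting at 0.

```python
import random
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class OrderedDumpPool:
    """Toy model: workers finish in any order, writes happen in display order."""

    def __init__(self, num_workers: int = 4):
        self._executor = ThreadPoolExecutor(max_workers=num_workers)
        self._cond = threading.Condition()
        self._next_write_order = 0
        self.written = []  # stand-in for the output file

    def queue_frame(self, order: int, payload: str):
        """Non-blocking: hands the frame to a worker and returns at once."""
        return self._executor.submit(self._dump, order, payload)

    def _dump(self, order: int, payload: str) -> None:
        # Stand-in for "semaphore wait + pixel read", which may take a
        # different amount of time for each frame.
        time.sleep(random.uniform(0.0, 0.01))
        with self._cond:
            # Block until every earlier frame has been written.
            self._cond.wait_for(lambda: self._next_write_order == order)
            self.written.append(payload)
            self._next_write_order += 1
            self._cond.notify_all()

    def close(self) -> None:
        self._executor.shutdown(wait=True)


pool = OrderedDumpPool()
for i in range(20):
    pool.queue_frame(i, f"frame{i}")
pool.close()
```

Because the executor hands out queued frames in submission order, the frame with the lowest unwritten order is always already running in some worker, so the waiting workers cannot block each other indefinitely.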

…phore

The graphics presentation now creates its own timeline semaphore and registers it via AddExternalConsumer(SEM_SYNC_TYPE_IDX_PRESENTER). QueuePictureForDecode waits on this semaphore before reusing a DPB slot, ensuring the graphics pipeline has finished reading the frame.

This follows the same pattern as the dump pool's external consumer registration, providing consistent slot reuse protection for all consumers (dump, presentation, and potentially encoder).

Changes:
- VulkanFrame.h: added m_presenterReleaseSemaphore and consumer index
- VulkanFrame.cpp: create TL semaphore in AttachShell (not AttachQueue, so --noPresent mode doesn't register a semaphore nobody signals). Signal it from the graphics queue submit using the value from externalConsumerDoneValues[].
- VkVideoQueue.h: added virtual AddExternalConsumer() to interface
- VulkanVideoProcessor.h: override with uint64_t signature

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

Fence routing:
- videoDecodeCompleteFence = VK_NULL_HANDLE when filter enabled
- Filter signals frameCompleteFence as the last producer
- Filter pool node fence not used by decoder (syncWithHost=false)
- Fence assertion guarded for VK_NULL_HANDLE
- fieldPic debug uses frameCompleteFence

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

The compute filter output images were created with only a combined image view (no per-plane views). The compute shader writes Y and CbCr through separate bindings (5=Y, 6=CbCr) which require per-plane storage views. Without these, the shader wrote through the combined view for both bindings, corrupting the CbCr channel (greenish/random chroma on display).

Fix: create a YCbCr conversion for the filter output image spec with VK_IMAGE_CREATE_EXTENDED_USAGE_BIT | VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT. This enables the 6-param VkImageResourceView::Create to produce both:
1. A combined sampled view (for display YCbCr sampling)
2. Per-plane storage views (for compute shader write)

Also pass planeUsageOverride=VK_IMAGE_USAGE_STORAGE_BIT when the image spec has ycbcrConversion AND storage usage, so the per-plane views are created with the correct usage flags.

Validation layer: no new errors introduced. Pre-existing VUID errors (tiling-08717, image-01762) unchanged.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

…al fix)

The decoder and filter shared a single TL semaphore but signaled from different queues (decode family 3, compute family 0). With pipelining, the decode queue ran ahead, pushing the TL value past what the filter needed to signal next. This backward signal (e.g., filter tries to signal 114 but TL is already at 337) was silently dropped, leaving frameCompleteFence unsignaled forever.

Fix: use the filter pool node's BINARY semaphore for decode→filter handoff. The decoder signals it, the filter waits on it. Each decode/filter pair has its own binary semaphore, so there is no ordering conflict. The TL semaphore is now signaled ONLY by the filter (compute queue), so values are always monotonically increasing.

Decode submit → signals binary semaphore (pool node GetSemaphore())
Filter submit → waits binary semaphore → signals TL @ filterCompleteTimelineValue

Without the filter, the decoder signals the TL directly (unchanged).

Verified on NVIDIA RTX 5080:
- av1_superres 1080p: PASS (481 frames, 188 FPS), was ALWAYS CRASH
- h265_itu_slist_a: PASS (66 frames, 828 FPS), was ALWAYS CRASH
- h264_4k: PASS (27 frames, 73 FPS)
- h265_2160p: PASS (31 frames, 1416 FPS)
- HEVC 10-bit 4K: PASS (30 frames, 8.5 FPS)
- vp9_tile_1x2, vp9_svc: PASS
- h264_clip_a: PASS (31 frames, 3725 FPS)

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
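The backward-signal problem can be illustrated with a toy model of a forward-only counter. This is not the Vulkan API: the class is hypothetical and simply mirrors the behavior the commit reports, where a signal to a value at or below the current one has no effect.

```python
class TimelineModel:
    """Toy model of a timeline counter that only ever moves forward."""

    def __init__(self):
        self.value = 0
        self.dropped = []  # signals that had no effect

    def signal(self, value: int) -> bool:
        if value <= self.value:
            self.dropped.append(value)  # backward signal: lost
            return False
        self.value = value
        return True


# Shared timeline, two producers: the decode queue runs ahead of the filter.
shared = TimelineModel()
shared.signal(337)                  # decode queue, far ahead
filter_ok = shared.signal(114)      # filter's signal for an older frame

# Filter-only timeline: a single producer signals in increasing order.
filter_only = TimelineModel()
results = [filter_only.signal(v) for v in (113, 114, 115)]
```

In the shared case `filter_ok` is `False` and 114 lands in `shared.dropped`; with a single producer every signal takes effect, which is the property the binary-semaphore handoff restores.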

Replace assert with bounds check for ColorPrimaries, TransferCharacteristics, and MatrixCoefficients arrays. AV1 streams (e.g., argon_test1019) can have values beyond the array bounds. Now prints "Unknown" instead of crashing.

Fixes decode_av1_argon_test1019 (was OOB crash, now PASS).

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
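A minimal sketch of the bounds-checked lookup, assuming a simple name table indexed by the value read from the bitstream. The table contents and the function name are placeholders, not the decoder's actual strings.

```python
# Placeholder table; the real decoder has one table per field
# (ColorPrimaries, TransferCharacteristics, MatrixCoefficients).
EXAMPLE_NAMES = ["name0", "name1", "name2"]


def lookup_name(table, index: int) -> str:
    """Return the table entry, or "Unknown" for out-of-range values."""
    if 0 <= index < len(table):
        return table[index]
    return "Unknown"
```

Since the index comes straight from stream data, an out-of-range value is an input condition to report, not a programming error to assert on.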

Fixes:

1. VkVideoDecoder.cpp: removed stale checkDecodeIdleSync block (was incorrectly placed before filter submit). Added pipeline stall (checkIdleSync=true) at end of DecodePictureWithParameters: waits on frameCompleteFence + filter TL after all stages complete using WaitAndResetFence. Added SYNC-FAIL diagnostic for fence timeout.

2. VkVideoDumpPool.h: moved bytesWritten inside the callback block, added zero-write guard with printf warning.

With pipeline stall enabled, filter+dump produces correct frames but at reduced FPS (serialized). The stall confirms the sync issue is in the pipelined path: the display/dump reads incomplete filter output when no CPU wait is present.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

…pute

The filter output images started with VK_IMAGE_LAYOUT_UNDEFINED and were never transitioned to VK_IMAGE_LAYOUT_GENERAL before the compute shader wrote to them. On NVIDIA, a transition from UNDEFINED clears/invalidates image contents, causing the green/corrupt CbCr seen in display and dump.

Every frame triggered [LAYOUT-BUG]: all 17 DPB slots had UNDEFINED layout on the filter output image.

Fix: record a VkImageMemoryBarrier2 (UNDEFINED → GENERAL) in the filter's command buffer (compute queue) before RecordCommandBuffer. Only triggered when currentImageLayout == UNDEFINED (first use or after InvalidateImageLayout on reconfigure).

The DPB images had proper UNDEFINED → DECODE_DPB transitions (line 1046), but the filter output images were missed.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>

Replace the COM-style VkVideoRefCountBase intrusive ref-counting with standard C++ std::shared_ptr across the entire codebase (68 files, 448 insertions, 1001 deletions).

Core changes:
- VkSharedBaseObj<T> is now a using-alias for std::shared_ptr<T>
- VkVideoRefCountBase reduced to a plain polymorphic base with virtual destructor (no enable_shared_from_this, no refcount)
- AddRef/Release/m_refCount removed from all ~21 subclasses

Pool lifecycle:
- Custom deleters return nodes to pool via weak_ptr<Pool>
- Explicit bitmask tracking replaces use_count() polling for node availability (use_count is formally approximate per C++ std)
- Pool nodes hold strong parent ref during checkout to prevent use-after-free if pool is destroyed while nodes are checked out
- All pool custom deleters hardened: node access moved inside poolWeak.lock() guard to prevent use-after-free

API boundary cleanup:
- Eliminated makeSharedFromRaw() bridge at encoder and decoder call sites using shared_ptr aliasing constructors
- ReferencedObjectsInfo changed from raw pointers to shared_ptrs
- FindByRawPtr() searches all registered parameter objects instead of just the last-of-type array (fixes lossy PPS/SPS lookup)
- dependency_data_s: replaced memset with value-initialization (struct now contains shared_ptr, memset was UB)

Circular reference fix:
- VkParserVideoPictureParameters → StdVideoPictureParametersSet → client back-reference formed a cycle that leaked on shutdown
- Fix: ReleaseClientObject() virtual breaks the cycle in Reset()
- VkVideoDecoder::Deinitialize() calls Reset() during shutdown
- Verified zero NVDBG_MALLOC leaks for all codecs

Pure virtual destructor fix:
- VulkanVideoDecoder::~VulkanVideoDecoder() called Deinitialize() which invokes pure virtual FreeContext(), which is UB after vtable unwind
- Fix: call Deinitialize() in each derived class destructor (H264, H265, AV1, VP9) where the vtable is still intact
- Same fix applied to IVulkanVideoParser base class
- Fixes 10KB context leak from failed FreeContext() dispatch

Parser PPS ownership:
- H264 parser m_pps replaced from non-owning no-op-deleter shared_ptr to proper ownership copy from PPS table

Minor:
- DecoderFrameProcessorState exposes typed VulkanFrame accessor
- FilterTestApp: revert dynamic_cast back to static_cast (type is compile-time invariant, no RTTI needed)
- DeviceWaitIdle() added to VulkanDeviceContext destructor

Validated with ASan: zero leaks across H264, H265, AV1, VP9 decode and all encoder paths. Filter tests 49/49 pass.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
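The idea of tracking pool-node availability with an explicit bitmask, instead of inferring it from reference counts, can be sketched as below. This is a language-neutral illustration in Python with hypothetical names; the actual pools are C++ and return nodes through shared_ptr custom deleters.

```python
class BitmaskPool:
    """Toy pool: bit i of the mask is set while node i is free."""

    def __init__(self, size: int):
        self._size = size
        self._free_mask = (1 << size) - 1  # all nodes start out free

    def acquire(self):
        """Check out the lowest-numbered free node, or None if exhausted."""
        if self._free_mask == 0:
            return None
        lowest_bit = self._free_mask & -self._free_mask
        index = lowest_bit.bit_length() - 1
        self._free_mask &= ~lowest_bit
        return index

    def release(self, index: int) -> None:
        """Return a node; stands in for what a custom deleter would do."""
        if not 0 <= index < self._size:
            raise ValueError(f"node index out of range: {index}")
        bit = 1 << index
        if self._free_mask & bit:
            raise ValueError(f"node {index} is not checked out")
        self._free_mask |= bit

    def is_available(self, index: int) -> bool:
        return bool(self._free_mask & (1 << index))
```

Availability is then an explicit state change made at checkout and return, so it does not depend on how many references to a node happen to exist at the moment of the query.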

The decoder library (vulkan_video_decoder.cpp) previously created its own VulkanDeviceContext internally, which called LoadVk() to load a separate Vulkan loader dispatch table. When the test app passed its own VkDevice handle (created through the app's loader), the library tried to call vkGetDeviceQueue through its own incompatible dispatch table, causing a VUID-vkGetDeviceQueue-device-parameter crash.

This dual-loader problem manifested as the decode_h264_clip_a_hw_load_balancing test crash, where --enableHwLoadBalancing triggered numDecodeQueues=-1, exercising the vkGetDeviceQueue path more aggressively.

Fix: extend CreateVulkanVideoDecoder() with a new first parameter VulkanDeviceContext* pVkDevCtxt. When non-null, the library uses the caller's device context directly, sharing the Vulkan loader dispatch table, device, queues, and all state. When null, the library creates its own VulkanDeviceContext internally (preserving the old behavior for standalone usage like vulkan-video-simple-dec).

Implementation details:
- VulkanVideoDecoderImpl::m_vkDevCtxt changed from a by-value member to a pointer (m_pVkDevCtxt) + std::unique_ptr<VulkanDeviceContext> (m_ownDevCtxt) for the self-managed case
- Constructor takes optional VulkanDeviceContext*; when null, allocates its own and points m_pVkDevCtxt at it
- Initialize() checks m_pVkDevCtxt->getDevice() != VK_NULL_HANDLE to skip device creation when the context is already fully initialized
- All internal usage changed from m_vkDevCtxt. to m_pVkDevCtxt->

Callers updated:
- vulkan-video-dec-test: passes &vkDevCtxt (app-created device)
- vulkan-video-simple-dec: passes nullptr (library creates device)

Tested: Khronos VVS test suite with test apps, 98.6% pass rate (0 crashes, up from 71.6% before encoder+decoder fixes). The hw_load_balancing test now passes.

Signed-off-by: Tony Zlatinski <tzlatinski@nvidia.com>
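The "use the caller's context if given, otherwise own one" arrangement can be sketched as follows. The classes are hypothetical Python stand-ins for the C++ types; a plain attribute replaces the VkDevice handle and `None` plays the role of VK_NULL_HANDLE.

```python
class DeviceContext:
    """Stand-in for a device context; device is None until created."""

    def __init__(self, device=None):
        self.device = device

    def create_device(self) -> None:
        self.device = "library-created-device"  # placeholder handle


class DecoderImpl:
    """Borrows the caller's context when given, otherwise owns a new one."""

    def __init__(self, dev_ctx=None):
        # Owned context exists only in the self-managed case.
        self._own_ctx = DeviceContext() if dev_ctx is None else None
        self._ctx = dev_ctx if dev_ctx is not None else self._own_ctx

    @property
    def context(self) -> DeviceContext:
        return self._ctx

    @property
    def owns_context(self) -> bool:
        return self._own_ctx is not None

    def initialize(self) -> None:
        # Skip device creation when the context is already initialized.
        if self._ctx.device is None:
            self._ctx.create_device()


# App-managed: the decoder reuses the app's context and device.
app_ctx = DeviceContext(device="app-created-device")
shared_decoder = DecoderImpl(app_ctx)
shared_decoder.initialize()

# Standalone: the decoder creates and owns its context.
standalone_decoder = DecoderImpl()
standalone_decoder.initialize()
```

Keeping a separate owning member next to the working pointer means all internal code goes through one access path, while lifetime management applies only when the library created the context itself.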

zlatinski force-pushed the khronos-test-framework-cherry-picked branch from 5a12ae5 to 979b869 Compare March 23, 2026 23:28

zlatinski added 20 commits March 27, 2026 11:37

zlatinski force-pushed the khronos-test-framework-cherry-picked branch from 979b869 to f74ac39 Compare March 27, 2026 19:35

zlatinski merged commit 72c8a8e into main Mar 30, 2026
7 checks passed

zlatinski merged 20 commits into main from khronos-test-framework-cherry-picked
