
vulkan int8 packing quantize dequantize requantize#3731

Merged
nihui merged 25 commits into Tencent:master from nihui:vulkan-int
Jun 26, 2025

Conversation

@nihui
Member

@nihui nihui commented Apr 24, 2022

No description provided.

@codecov-commenter

codecov-commenter commented Apr 24, 2022

Codecov Report

Attention: Patch coverage is 94.41118% with 28 lines in your changes missing coverage. Please review.

Project coverage is 95.81%. Comparing base (7557f5c) to head (2110e77).
Report is 3 commits behind head on master.

Files with missing lines Patch % Lines
src/gpu.cpp 83.78% 12 Missing ⚠️
src/allocator.cpp 9.09% 10 Missing ⚠️
src/layer/vulkan/dequantize_vulkan.cpp 98.44% 2 Missing ⚠️
src/layer/vulkan/quantize_vulkan.cpp 98.36% 2 Missing ⚠️
src/layer/vulkan/requantize_vulkan.cpp 98.54% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3731      +/-   ##
==========================================
- Coverage   95.81%   95.81%   -0.01%     
==========================================
  Files         831      834       +3     
  Lines      264948   265433     +485     
==========================================
+ Hits       253851   254315     +464     
- Misses      11097    11118      +21     

☔ View full report in Codecov by Sentry.

@nihui nihui changed the title [WIP] vulkan int8 packing quantize [WIP] vulkan int8 packing quantize dequantize Apr 24, 2022
@nihui nihui changed the title [WIP] vulkan int8 packing quantize dequantize [WIP] vulkan int8 packing quantize dequantize requantize Apr 25, 2022
@github-actions github-actions bot added the test label Jun 23, 2025
@github-actions

github-actions bot commented Jun 23, 2025

The binary size change of libncnn.so (bytes)

architecture base size pr size difference
x86_64 15593968 15661624 +67656 ⚠️
armhf 6602932 6657988 +55056 ⚠️
aarch64 9983872 9989392 +5520 ⚠️

@nihui nihui requested a review from Copilot June 25, 2025 09:29


@nihui nihui changed the title [WIP] vulkan int8 packing quantize dequantize requantize vulkan int8 packing quantize dequantize requantize Jun 25, 2025
@nihui nihui requested a review from Copilot June 25, 2025 09:34
Contributor

Copilot AI left a comment


Pull Request Overview

This pull request adds comprehensive int8 support to Vulkan operations across ncnn, including quantization, dequantization, requantization, and packing. Key changes include new int8 pipelines and shader modules, modifications to GPU utility functions to handle int8, and corresponding test updates for the quantize/dequantize and packing operations.

Reviewed Changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated no comments.

Show a summary per file
File Description
tests/test_quantize.cpp & oom variant Replaces the existing scale_data creation for 1D tests (uses a constant 1)
tests/test_packing.cpp Renaming and restructuring of packing functions to separate FP32 and int8 tests
src/net.cpp Added handling for int8 arithmetic adjustments
Multiple shader files (e.g., requantize_*.comp, quantize_*.comp) New shader code supporting int8 operations
src/gpu.cpp, src/allocator.cpp Updates to utility operators, image format selection, and int8 support
src/command.cpp Adjustments to record int8 transfers
Other Vulkan layer files (requantize_vulkan.*, quantize_vulkan.*, dequantize_vulkan.*, packing_vulkan.cpp) Pipeline creation and operator updates to incorporate int8 support
Comments suppressed due to low confidence (3)

tests/test_quantize.cpp:27

  • The change from using a.w to a constant '1' for creating scale_data in 1D cases is significant; please add an inline comment clarifying the reasoning for this modification to aid future maintainers.
        if (a.dims == 1) scale_data.create(1);

src/allocator.cpp:900

  • For elempack == 8, using VK_FORMAT_R8G8B8A8_SINT may not correctly represent 8 individual int8 channels since this format typically supports 4 channels; please verify that the chosen image format meets the int8 packing requirements or update the format selection accordingly.
        if (elempack == 8) format = VK_FORMAT_R8G8B8A8_SINT;

tests/test_packing.cpp:217

  • [nitpick] The renaming from 'test_packing_gpu_buffer' to 'test_packing_gpu_fp32' and the addition of an int8 variant are nontrivial changes; please ensure that all test calls and related documentation are updated to reflect this new naming and functionality split.
}

@nihui nihui merged commit 9f832c1 into Tencent:master Jun 26, 2025
105 of 108 checks passed
