Fix stack buffer overflows in NCHW reduce rewrite and ynnpack shim functions#9851
Open
mohammadmseet-hue wants to merge 1 commit intogoogle:masterfrom
Open
Conversation
…nctions Bug 1: rewrite_reduction_axes_for_nchw (src/subgraph/static-reduce.c) uses int64_t original_reduction_axes[4] (32 bytes) but copies num_reduction_axes * sizeof(int64_t) bytes via memcpy. The function is only called for 4-dim NCHW tensors, but num_reduction_axes can be up to XNN_MAX_TENSOR_DIMS (6), since xnn_define_static_reduce_v2 only checks num_reduction_axes <= 6, not num_reduction_axes <= input_num_dims. With num_reduction_axes=5 or 6, memcpy writes 40-48 bytes into a 32-byte stack buffer, corrupting adjacent stack memory. Bugs 2-4: ynnpack/xnnpack/subgraph.cc shim functions xnn_define_static_expand_dims, xnn_define_static_reduce, and xnn_define_static_reduce_v2 copy user-provided axes into stack-allocated arrays of size XNN_MAX_TENSOR_DIMS (6) without any bounds check on the count parameter. The original C implementations in src/subgraph/copy.c and src/subgraph/static-reduce.c validate num_dims/num_reduction_axes <= XNN_MAX_TENSOR_DIMS before the copy, but the ynnpack shims omit these checks. Passing num_new_axes or num_reduction_axes > 6 writes past the stack buffer. Fix: Add bounds checks before all affected memcpy/loop operations.
Contributor
Author
ASAN TracesBug 1:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fix 4 stack buffer overflow vulnerabilities caused by missing bounds checks on array copy operations.
Bug 1:
rewrite_reduction_axes_for_nchw(src/subgraph/static-reduce.c:26)original_reduction_axesis declared asint64_t[4](32 bytes) butnum_reduction_axescan be up toXNN_MAX_TENSOR_DIMS(6). The function is only called for 4-dim NCHW tensors (guarded at line 132), butxnn_define_static_reduce_v2validatesnum_reduction_axes <= 6, notnum_reduction_axes <= input_num_dims. Withnum_reduction_axes = 5or6,memcpywrites 40-48 bytes into a 32-byte stack buffer.Additionally, the loop at line 31 indexes
NCHW_AXES_MAPPING[4]andmask[4]with values derived from the overflowed buffer, causing further OOB reads and writes.Fix: Clamp
num_reduction_axesto 4 at the start of the function, since it only operates on 4-dimensional NCHW tensors.Bugs 2-4: ynnpack shim missing bounds checks (
ynnpack/xnnpack/subgraph.cc)Three shim functions copy user-provided axes into stack-allocated arrays of size
XNN_MAX_TENSOR_DIMS(6) without validating the count parameter:xnn_define_static_expand_dimsint32_t[6]num_new_axes <= 6xnn_define_static_reduceint64_t[6]num_reduction_axes <= 6xnn_define_static_reduce_v2int32_t[6]num_reduction_axes <= 6The original C implementations (
src/subgraph/copy.c:429,src/subgraph/static-reduce.c:257,342) validate these bounds before the copy. The ynnpack shims omit these checks, allowing stack buffer overflow when the count exceeds 6.Fix: Add
> XNN_MAX_TENSOR_DIMSbounds checks before the copy loops, matching the original C implementations.