Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/16607

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 81fc23b with merge base e638059. BROKEN TRUNK - The following jobs failed but were present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.
This PR needs a
Force-pushed from 727b730 to 5d8c391.
Summary: This diff fixes the Conv1d w8a32 operator by adding a transformation to the `val` attribute of the `other_inputs[0].meta` dictionary. Specifically, the `permute` operation is applied to the `original_val` tensor under the `fake_mode` context, and the resulting `transposed_val` is assigned to `transposed_inputs.meta["val"]`.

Reviewed By: mcremon-meta

Differential Revision: D89863750
Pull request overview
This PR fixes the Conv1d w8a32 operator by adding metadata propagation for transposed tensors and adding input validation to prevent unsupported configurations.
Changes:
- Added metadata propagation for the `val` attribute when creating transposed inputs and weights in the Conv1d w8a32 operator
- Added validation in patterns.py to bail early when the input length doesn't equal the kernel size (marked as "not yet supported")
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| backends/cadence/aot/quantizer/fusion_pass.py | Adds proper fake_mode-aware metadata propagation for transposed_inputs and transposed_weights when transforming Conv1d tensors from NCL to NLC format |
| backends/cadence/aot/quantizer/patterns.py | Adds validation to reject Conv1d operations where input length doesn't equal kernel size (3), marking this configuration as not yet supported |
```python
# Bail if length != kernel size - Not yet supported
if inputs_shape[-1] != cnn_weights_shape[2]:
    return (
        PartitionAnchors(
            empty=True,
        ),
        conv_layer,
    )
```
This check restricts the w8a32_conv pattern to only match when the input length equals the kernel size (3). While the comment indicates this is intentionally not yet supported, this is quite restrictive. Standard convolution operations typically support input lengths greater than or equal to the kernel size. The reference implementation in ref_implementations.py (lines 926-970) and the test in test_ref_implementations.py (lines 1156-1166, which exercise length=5 with kernel=3) both support arbitrary input lengths. Consider whether this restriction is necessary, or if it should be relaxed to allow input_length >= kernel_size to enable the optimization in more cases.
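For reference, the standard Conv1d output-length formula shows why input lengths greater than the kernel size are ordinarily valid. This is a generic sketch of the textbook formula, not code from the PR:

```python
def conv1d_out_len(in_len, kernel, stride=1, padding=0, dilation=1):
    # Standard Conv1d output-length formula (as documented for torch.nn.Conv1d).
    return (in_len + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

# The reference tests' case: length=5 with kernel=3 yields 3 output elements.
assert conv1d_out_len(5, 3) == 3  # 5 - 3 + 1
```

With defaults (stride=1, no padding, no dilation) this reduces to `L - K + 1`, which is well defined for any `L >= K`, not just `L == K`.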
```python
# Propagate val metadata for transposed_inputs
if "val" in other_inputs[0].meta:
    original_val = other_inputs[0].meta["val"]
    fake_mode = original_val.fake_mode
    if fake_mode is not None:
        with fake_mode:
            transposed_val = torch.ops.aten.permute.default(
                original_val, [0, 2, 1]
            )
        transposed_inputs.meta["val"] = transposed_val
    else:
        transposed_inputs.meta["val"] = torch.ops.aten.permute.default(
            original_val, [0, 2, 1]
        )
copy_node_metadata(transposed_inputs, other_inputs[0])
```
```python
transposed_weights = graph_module.graph.call_function(
    torch.ops.aten.permute.default,
    (weights_inputs[0], [2, 0, 1]),  # NCL -> LNC
)
# Propagate val metadata for transposed_weights
if "val" in weights_inputs[0].meta:
    original_val = weights_inputs[0].meta["val"]
    fake_mode = original_val.fake_mode
    if fake_mode is not None:
        with fake_mode:
            transposed_val = torch.ops.aten.permute.default(
                original_val, [2, 0, 1]
            )
        transposed_weights.meta["val"] = transposed_val
    else:
        transposed_weights.meta["val"] = torch.ops.aten.permute.default(
            original_val, [2, 0, 1]
        )
```
The metadata propagation logic for transposed_inputs (lines 435-448) and transposed_weights (lines 455-468) is duplicated with only minor variations. This pattern also appears elsewhere in the codebase (e.g., lines 164-176, 189-200, 376-385, 653-671). Consider extracting this into a helper function to reduce code duplication and improve maintainability. The helper function could take parameters like the node, transformation operation, and transformation arguments.
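One possible shape for that helper, as a sketch only: the name `propagate_transformed_val` and the duck-typed node interface are assumptions, not code from this PR.

```python
def propagate_transformed_val(dst_node, src_node, transform):
    # Copy src_node.meta["val"] into dst_node.meta["val"], applying
    # `transform` (e.g. a permute) under the value's fake mode when present.
    if "val" not in src_node.meta:
        return
    original_val = src_node.meta["val"]
    fake_mode = getattr(original_val, "fake_mode", None)
    if fake_mode is not None:
        with fake_mode:
            dst_node.meta["val"] = transform(original_val)
    else:
        dst_node.meta["val"] = transform(original_val)
```

The duplicated blocks would then collapse to calls like `propagate_transformed_val(transposed_inputs, other_inputs[0], lambda v: torch.ops.aten.permute.default(v, [0, 2, 1]))`.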
Force-pushed from 5d8c391 to 7850292.
Force-pushed from 7850292 to 24f179a.
Force-pushed from 24f179a to d870e83.
Force-pushed from d870e83 to 54e7f94.
Force-pushed from 54e7f94 to f01d1a1.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
```python
# Bail if length != kernel size - Not yet supported
if inputs_shape[-1] != cnn_weights_shape[2]:
```
The new input-length guard inputs_shape[-1] != cnn_weights_shape[2] looks incorrect for Conv1d: inputs_shape[-1] is the sequence length (L), while cnn_weights_shape[2] is the kernel size (K=3). cadence::quantized_w8a32_conv (and its meta/ref implementations) support L > K (output length is L - K + 1), and existing tests cover L=5, K=3. This condition would incorrectly bail out for normal Conv1d inputs and prevent the pattern from ever matching.
Consider removing this guard, or (if needed) only bailing when L < K (or other truly unsupported cases).
```diff
-# Bail if length != kernel size - Not yet supported
-if inputs_shape[-1] != cnn_weights_shape[2]:
+# Bail only when the input length is smaller than the kernel size.
+# Conv1d supports input lengths greater than the kernel size.
+if inputs_shape[-1] < cnn_weights_shape[2]:
```
```python
inputs = conv_layer.args[0]
if "tensor_meta" in inputs.meta:
    inputs_shape = inputs.meta["tensor_meta"].shape
    # Bail if length != kernel size - Not yet supported
```
This new shape-validation block is gated by if hasattr(cnn_weights.meta, "tensor_meta") above. Since fx.Node.meta is a dict, hasattr(..., "tensor_meta") will always be false, so none of the weight/input shape checks (including the new input-length check) will run.
Use a dict-key check (e.g., "tensor_meta" in cnn_weights.meta) so the validations actually execute when metadata is available.
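A minimal illustration of the pitfall, in plain Python and independent of fx:

```python
meta = {"tensor_meta": "..."}  # fx.Node.meta is a plain dict

# hasattr looks for an *attribute* on the dict object, never its keys,
# so a guard written this way is always False:
assert hasattr(meta, "tensor_meta") is False

# A membership test inspects the keys and behaves as intended:
assert "tensor_meta" in meta
```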
```python
# Propagate val metadata for transposed_inputs
if "val" in other_inputs[0].meta:
    original_val = other_inputs[0].meta["val"]
    fake_mode = original_val.fake_mode
    if fake_mode is not None:
        with fake_mode:
            transposed_val = torch.ops.aten.permute.default(
                original_val, [0, 2, 1]
            )
        transposed_inputs.meta["val"] = transposed_val
```
get_args_and_kwargs_mixed_w8a32_conv now conditionally propagates meta["val"] for the inserted permute nodes (and adds a fake_mode fallback). There doesn't appear to be any unit/integration test coverage exercising QuantFusion on a Conv1d->quantized_w8a32_conv rewrite, so regressions here (e.g., missing/incorrect meta causing later passes to fail) may go unnoticed.
Add a small test that runs QuantFusion on a minimal Conv1d graph and asserts the resulting graph contains the expected permutes + cadence::quantized_w8a32_conv, and that the graph can run FakeTensor/meta propagation without errors.
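The graph-side assertion in such a test often reduces to an ordered-subsequence check over the node op names. A generic sketch of that check (the helper name and op strings are illustrative, not executorch APIs):

```python
def has_ops_in_order(graph_op_names, expected):
    # True iff `expected` appears as an ordered subsequence of the graph's
    # op names; `x in iterator` consumes the iterator up to the match.
    it = iter(graph_op_names)
    return all(op in it for op in expected)

# A fused Conv1d graph would contain the input/weight permutes around
# the quantized op:
ops = ["permute", "permute", "quantized_w8a32_conv"]
assert has_ops_in_order(ops, ["permute", "quantized_w8a32_conv"])
assert not has_ops_in_order(["conv1d"], ["quantized_w8a32_conv"])
```

The remaining half of the suggested test, running FakeTensor/meta propagation over the rewritten graph, would catch exactly the missing-`meta["val"]` regressions this PR fixes.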
Force-pushed from f01d1a1 to 065e5a9.
Force-pushed from 065e5a9 to 51c81d0.
Force-pushed from 51c81d0 to 5132ba0.
Force-pushed from 5132ba0 to b96a064.
Force-pushed from b96a064 to 81fc23b.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
```python
inputs = conv_layer.args[0]
if "tensor_meta" in inputs.meta:
    inputs_shape = inputs.meta["tensor_meta"].shape
    # Bail if length != kernel size - Not yet supported
    if inputs_shape[-1] != cnn_weights_shape[2]:
        return (
            PartitionAnchors(
                empty=True,
            ),
            conv_layer,
        )
```
The new anchor-bail condition inputs_shape[-1] != cnn_weights_shape[2] incorrectly restricts quantized_w8a32 Conv1d fusion to cases where input length equals kernel size (3). The operator’s fake/meta kernel and reference implementation support general input lengths (output length in_length - kernel + 1), and existing tests exercise in_length=5 with kernel=3 (see backends/cadence/aot/tests/test_ref_implementations.py:1170+). This check will prevent valid fusions and likely regress model coverage; it should be removed or replaced with the actual supported constraint (if any).
```diff
-inputs = conv_layer.args[0]
-if "tensor_meta" in inputs.meta:
-    inputs_shape = inputs.meta["tensor_meta"].shape
-    # Bail if length != kernel size - Not yet supported
-    if inputs_shape[-1] != cnn_weights_shape[2]:
-        return (
-            PartitionAnchors(
-                empty=True,
-            ),
-            conv_layer,
-        )
```