Skip to content

Conversation

@GuoningHuang
Copy link

@GuoningHuang GuoningHuang commented Sep 6, 2025

Fuse tosa.mul + tosa.sigmoid into a linalg-based SiLU op to enable efficient IR lowering and further backend optimization.

Copy link
Member

@linuxlonelyeagle linuxlonelyeagle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rewrite the title as [frontend] Implement SiLU fusion with mul + sigmoid pattern

graph.op_groups = {}
graph.op_groups["subgraph0"] = new_op_group
graph.group_map_device = {"subgraph0": device}
graph.group_map_device = {"subgraph0": device}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add endline

# CHECK: %{{.*}} = linalg.generic
# CHECK: return %{{.*}}
# CHECK: %{{.*}} = tensor.empty() : tensor<4x4xf32>
# CHECK: %{{.*}} = linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel", "parallel"]} ins(%arg0 : tensor<4x4xf32>) outs(%0 : tensor<4x4xf32>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my view, this is a completely useless check. You should also capture the SSA value.

@linuxlonelyeagle
Copy link
Member

Keep the PR description concise—ideally under 50 characters. You should explain what you did and why you did it. This description will appear in the git log.

@GuoningHuang GuoningHuang changed the title frontend: implement SiLU fusion with mul + sigmoid pattern [frontend] Implement SiLU fusion with mul + sigmoid pattern Sep 9, 2025
@GuoningHuang
Copy link
Author

@linuxlonelyeagle Hi, thanks a lot for your suggestions!

Title: I'll update it to "[frontend] Implement SiLU fusion with mul + sigmoid pattern" for clarity and log consistency.

Newline: I'll add the missing newline at the end of the file.

Test check: I’ll revise the test in tests/Python/test_silu.py to also capture the SSA value as you suggested.

PR description: I'll shorten the description to under 50 characters and make sure it succinctly states “what” was done and “why”.

# CHECK: %{{.*}} = linalg.generic
# CHECK: return %{{.*}}
# CHECK: %{{.*}} = tensor.empty() : tensor<4x4xf32>
# CHECK: %[[RES:.*]] = linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel", "parallel"]} ins(%arg0 : tensor<4x4xf32>) outs(%0 : tensor<4x4xf32>) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# CHECK-LABEL: func @forward
#       CHECK:   %{{.*}} = tensor.empty() : tensor<4x4xf32>
#       CHECK:   %[[RES:.*]] = linalg.generic {indexing_maps = [#map, #map], iterator_types = ["parallel", "parallel"]} ins(%arg0 : tensor<4x4xf32>) outs(%0 : tensor<4x4xf32>) {
....
#       CHECK:   return %[[RES]] : tensor<4x4xf32>
#       CHECK: }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above is an example I provided.

  • Eliminate unnecessary content from checks.
  • Ensure the format of the check is sufficiently elegant.
  • I believe you should also capture the SSA values within the generic op block.

Copy link
Author

@GuoningHuang GuoningHuang Sep 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this ok?

# CHECK-LABEL: func.func @forward
# CHECK: %[[EMPTY:.*]] = tensor.empty() : tensor<4x4xf32>
# CHECK: %[[RES:.*]] = linalg.generic {.*} ins(%arg0 : tensor<4x4xf32>) outs(%[[EMPTY]] : tensor<4x4xf32>) {
# CHECK:   ^bb0(%in: f32, %out: f32):
# CHECK:     %[[NEG:.*]] = arith.negf %in : f32
# CHECK:     %[[EXP:.*]] = math.exp %[[NEG]] : f32
# CHECK:     %[[ONE:.*]] = arith.constant 1.000000e+00 : f32
# CHECK:     %[[ADD:.*]] = arith.addf %[[EXP]], %[[ONE]] : f32
# CHECK:     %[[DIV:.*]] = arith.divf %in, %[[ADD]] : f32
# CHECK:     linalg.yield %[[DIV]] : f32
# CHECK: } -> tensor<4x4xf32>
# CHECK: return %[[RES]] : tensor<4x4xf32>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take a close look at the example I provided: the colons in CHECK-LABEL and CHECK are aligned.

Copy link
Author

@GuoningHuang GuoningHuang Sep 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thank you for pointing that out! Is this version ok? If so, I'll update my PR accordingly:

# CHECK-LABEL: func.func @forward
#       CHECK:   %[[EMPTY:.*]] = tensor.empty() : tensor<4x4xf32>
#       CHECK:   %[[RES:.*]] = linalg.generic {.*} ins(%arg0 : tensor<4x4xf32>) outs(%[[EMPTY]] : tensor<4x4xf32>) {
#       CHECK:   ^bb0(%in: f32, %out: f32):
#       CHECK:       %[[NEG:.*]] = arith.negf %in : f32 
#       CHECK:       %[[EXP:.*]] = math.exp %[[NEG]] : f32
#       CHECK:       %[[ONE:.*]] = arith.constant 1.000000e+00 : f32
#       CHECK:       %[[ADD:.*]] = arith.addf %[[EXP]], %[[ONE]] : f32
#       CHECK:       %[[DIV:.*]] = arith.divf %in, %[[ADD]] : f32
#       CHECK:       linalg.yield %[[DIV]] : f32
#       CHECK:    } -> tensor<4x4xf32>
#       CHECK: return %[[RES]] : tensor<4x4xf32>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok,I have update my PR accordingly.

@zhanghb97
Copy link
Member

@GuoningHuang, please check the CI error.

@zhanghb97 zhanghb97 added the ci Related to CI/CD label Sep 18, 2025
@GuoningHuang
Copy link
Author

@zhanghb97 CI error has been fixed. Thanks for pointing it out!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci Related to CI/CD final review

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants