[AArch64] Consider COPY between disjoint register classes as expensive #167661

guy-david · 2025-11-12T09:40:27Z

The motivation is to allow passes such as MachineLICM to hoist trivial FMOV instructions out of loops. Previously, MachineLICM did not hoist them even when the RHS was a constant.
On most architectures, these moves have a latency of 2-6 cycles, so they are certainly not as cheap as a 0-1 cycle move.

llvmbot · 2025-11-12T09:41:01Z

@llvm/pr-subscribers-backend-aarch64

Author: Guy David (guy-david)


Full diff: https://github.com/llvm/llvm-project/pull/167661.diff

2 Files Affected:

(modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+24)
(added) llvm/test/CodeGen/AArch64/licm-regclass-copy.mir (+55)

diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 4b4073365483e..6482091c2cc70 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -1043,6 +1043,27 @@ static bool isCheapImmediate(const MachineInstr &MI, unsigned BitSize) {
   return Is.size() <= 2;
 }
 
+// Check if a COPY instruction is cheap.
+static bool isCheapCopy(const MachineInstr &MI,
+                        const AArch64RegisterInfo &RI) {
+  assert(MI.isCopy() && "Expected COPY instruction");
+  const MachineRegisterInfo &MRI = MI.getMF()->getRegInfo();
+
+  // Cross-register-class copies (e.g., between GPR and FPR) are expensive on
+  // AArch64, typically requiring an FMOV instruction with a 2-6 cycle latency.
+  auto getRegClass = [&](Register Reg) -> const TargetRegisterClass * {
+    return Reg.isVirtual() ? MRI.getRegClass(Reg)
+           : Reg.isPhysical() ? RI.getMinimalPhysRegClass(Reg)
+           : nullptr;
+  };
+  const TargetRegisterClass *DstRC = getRegClass(MI.getOperand(0).getReg());
+  const TargetRegisterClass *SrcRC = getRegClass(MI.getOperand(1).getReg());
+  if (DstRC && SrcRC && !RI.getCommonSubClass(DstRC, SrcRC))
+    return false;
+
+  return MI.isAsCheapAsAMove();
+}
+
 // FIXME: this implementation should be micro-architecture dependent, so a
 // micro-architecture target hook should be introduced here in future.
 bool AArch64InstrInfo::isAsCheapAsAMove(const MachineInstr &MI) const {
@@ -1056,6 +1077,9 @@ bool AArch64InstrInfo::isAsCheapAsAMove(const MachineInstr &MI) const {
   default:
     return MI.isAsCheapAsAMove();
 
+  case TargetOpcode::COPY:
+    return isCheapCopy(MI, RI);
+
   case AArch64::ADDWrs:
   case AArch64::ADDXrs:
   case AArch64::SUBWrs:
diff --git a/llvm/test/CodeGen/AArch64/licm-regclass-copy.mir b/llvm/test/CodeGen/AArch64/licm-regclass-copy.mir
new file mode 100644
index 0000000000000..287379774f519
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/licm-regclass-copy.mir
@@ -0,0 +1,55 @@
+# NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+# RUN: llc -mtriple=aarch64 -run-pass=early-machinelicm -verify-machineinstrs -o - %s | FileCheck %s
+
+# This test verifies that cross-register-class copies (e.g., between GPR and FPR)
+# ARE hoisted out of loops by MachineLICM, as they translate to expensive
+# instructions like FMOV (2-6 cycles) on AArch64.
+
+---
+name: cross_regclass_copy_hoisted
+tracksRegLiveness: true
+registers:
+  - { id: 0, class: gpr64 }
+  - { id: 1, class: gpr64 }
+  - { id: 2, class: fpr64 }
+body: |
+  ; CHECK-LABEL: name: cross_regclass_copy_hoisted
+  ; CHECK: bb.0:
+  ; CHECK-NEXT:   successors: %bb.1(0x80000000)
+  ; CHECK-NEXT:   liveins: $x0, $d0
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   %0:gpr64 = COPY $x0
+  ; CHECK-NEXT:   %2:fpr64 = COPY $d0
+  ; CHECK-NEXT:   %1:gpr64 = COPY %2
+  ; CHECK-NEXT:   B %bb.1
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.1:
+  ; CHECK-NEXT:   successors: %bb.1(0x40000000), %bb.2(0x40000000)
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   %0:gpr64 = ADDXri %0, 1, 0
+  ; CHECK-NEXT:   $xzr = SUBSXri %0, 100, 0, implicit-def $nzcv
+  ; CHECK-NEXT:   Bcc 11, %bb.1, implicit $nzcv
+  ; CHECK-NEXT:   B %bb.2
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT: bb.2:
+  ; CHECK-NEXT: {{  $}}
+  ; CHECK-NEXT:   $x0 = COPY %1
+  ; CHECK-NEXT:   RET_ReallyLR implicit $x0
+  bb.0:
+    liveins: $x0, $d0
+    %0:gpr64 = COPY $x0
+    %2:fpr64 = COPY $d0
+    B %bb.1
+
+  bb.1:
+    ; This COPY between FPR64 and GPR64 should be hoisted
+    %1:gpr64 = COPY %2:fpr64
+    %0:gpr64 = ADDXri %0:gpr64, 1, 0
+    $xzr = SUBSXri %0:gpr64, 100, 0, implicit-def $nzcv
+    Bcc 11, %bb.1, implicit $nzcv
+    B %bb.2
+
+  bb.2:
+    $x0 = COPY %1:gpr64
+    RET_ReallyLR implicit $x0
+...
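
For readers who want to follow the check in isCheapCopy outside the LLVM tree, here is a minimal standalone sketch. It assumes a toy model in which a register class is just a bitmask of registers and "has a common subclass" means the two masks intersect. The names RegClassMask, hasCommonSubClassModel, and isCheapCopyModel are hypothetical and not part of LLVM; the real getCommonSubClass operates on TargetRegisterClass objects, so this only approximates the shape of the decision.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in: a register class modeled as a bitmask of registers.
using RegClassMask = std::uint32_t;

// Illustrative masks only; they do not correspond to real AArch64 encodings.
constexpr RegClassMask GPRModel = 0x00FFu;    // toy "general-purpose" class
constexpr RegClassMask GPRLowModel = 0x000Fu; // toy subset of GPRModel
constexpr RegClassMask FPRModel = 0xFF00u;    // toy "floating-point" class

// Assumption: two classes "share a subclass" when their masks intersect.
inline bool hasCommonSubClassModel(RegClassMask A, RegClassMask B) {
  return (A & B) != 0;
}

// Mirrors the shape of the patch: if both classes are known and share
// nothing, the copy is treated as expensive; otherwise the generic
// answer (passed in here as GenericCheap) is used.
inline bool isCheapCopyModel(std::optional<RegClassMask> Dst,
                             std::optional<RegClassMask> Src,
                             bool GenericCheap) {
  if (Dst && Src && !hasCommonSubClassModel(*Dst, *Src))
    return false;
  return GenericCheap;
}
```

In this model a copy between GPRModel and FPRModel is never cheap, while a copy between GPRModel and GPRLowModel, or one where a class is unknown, falls back to the generic answer, which corresponds to the `DstRC && SrcRC` guard in the patch.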

github-actions · 2025-11-12T09:42:15Z

✅ With the latest revision this PR passed the C/C++ code formatter.


nasherm

LGTM

fhahn · 2025-11-12T11:38:51Z

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp

+    return Reg.isVirtual()    ? MRI.getRegClass(Reg)
+           : Reg.isPhysical() ? RI.getMinimalPhysRegClass(Reg)
+                              : nullptr;


This might be more readable if broken up into if statements with early returns rather than nested ?: operators.
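
A rough sketch of what the early-return form could look like, written so that it compiles outside LLVM. The types Reg and RegClass and the helpers lookupVirtual and lookupPhysical are hypothetical stand-ins for Register, TargetRegisterClass, MRI.getRegClass, and RI.getMinimalPhysRegClass; only the control flow is the point here.

```cpp
#include <cassert>

// Hypothetical stand-ins so the sketch is self-contained.
struct RegClass {
  const char *Name;
};

struct Reg {
  enum Kind { None, Virtual, Physical } K;
  bool isVirtual() const { return K == Virtual; }
  bool isPhysical() const { return K == Physical; }
};

static const RegClass VirtClass{"virtual-class"};
static const RegClass PhysClass{"physical-class"};

// Stand-ins for MRI.getRegClass(Reg) and RI.getMinimalPhysRegClass(Reg).
static const RegClass *lookupVirtual(Reg) { return &VirtClass; }
static const RegClass *lookupPhysical(Reg) { return &PhysClass; }

// Same behavior as the nested conditional, expressed with early returns.
static const RegClass *getRegClassEarlyReturn(Reg R) {
  if (R.isVirtual())
    return lookupVirtual(R);
  if (R.isPhysical())
    return lookupPhysical(R);
  return nullptr; // neither virtual nor physical, e.g. no register
}
```

In the actual patch the same shape would apply inside the lambda, with the real MRI and RI calls in place of the stand-ins.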

fhahn · 2025-11-12T11:38:53Z

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp

+
+  // Cross-register-class copies (e.g., between GPR and FPR) are expensive on
+  // AArch64, typically requiring an FMOV instruction with a 2-6 cycle latency.
+  auto getRegClass = [&](Register Reg) -> const TargetRegisterClass * {


nit: the LLVM coding standards use an upper-case first letter for variables.

Suggested change

-  auto getRegClass = [&](Register Reg) -> const TargetRegisterClass * {
+  auto GetRegClass = [&](Register Reg) -> const TargetRegisterClass * {

fhahn · 2025-11-12T11:39:26Z

llvm/test/CodeGen/AArch64/licm-regclass-copy.mir

+    %0:gpr32 = MOVi32imm 2143289344
+    %1:fpr32 = COPY %0:gpr32
+    %2:fpr32 = FMOVS0


Could you also add a test for a physical register?

guy-david requested review from ahmedbougacha and jroelofs November 12, 2025 09:40

llvmbot added the backend:AArch64 label Nov 12, 2025

guy-david force-pushed the users/guy-david/aarch64-fmov-not-cheap branch from 258e4cf to 0714e8b Compare November 12, 2025 10:31

nasherm approved these changes Nov 12, 2025

fhahn reviewed Nov 12, 2025
